// Copyright (C) 2019 The Qt Company Ltd.
// SPDX-License-Identifier: LicenseRef-Qt-Commercial OR GFDL-1.3-no-invariants-only

/*!
    \page qttest-best-practices.html

    \title Qt Test Best Practices

    \brief Guidelines for creating Qt tests.

    We recommend that you add Qt tests for bug fixes and new features. Before
    you try to fix a bug, add a \e {regression test} (ideally automatic) that
    fails before the fix, exhibiting the bug, and passes after the fix. While
    you're developing new features, add tests to verify that they work as
    intended.
    Conforming to a set of coding standards will make it more likely for
    Qt autotests to work reliably in all environments. For example, some
    tests need to read data from disk. If no standards are set for how this
    is done, some tests won't be portable: a test that assumes its test-data
    files are in the current working directory only works for an in-source
    build. In a shadow build (outside the source directory), the test will
    fail to find its data.

    The following sections contain guidelines for writing Qt tests:
    \list
        \li \l {General Principles}
        \li \l {Writing Reliable Tests}
        \li \l {Improving Test Output}
        \li \l {Writing Testable Code}
        \li \l {Setting up Test Machines}
    \endlist
    \section1 General Principles

    The following sections provide general guidelines for writing unit tests:

    \list
        \li \l {Verify Tests}
        \li \l {Give Test Functions Descriptive Names}
        \li \l {Write Self-contained Test Functions}
        \li \l {Test the Full Stack}
        \li \l {Make Tests Complete Quickly}
        \li \l {Use Data-driven Testing}
        \li \l {Use Coverage Tools}
        \li \l {Select Appropriate Mechanisms to Exclude Tests}
        \li \l {Avoid Q_ASSERT}
    \endlist
    \section2 Verify Tests

    Write and commit your tests along with your fix or new feature on a new
    branch. Once you're done, you can check out the branch on which your work
    is based, and then check out from your new branch just the files that
    contain your new tests. This enables you to verify that the tests do fail
    on the prior branch, and therefore actually do catch a bug or test a new
    feature.

    For example, the workflow to fix a bug in the \c QDateTime class could be
    like this if you use the Git version control system:
    \list 1
        \li Create a branch for your fix and test:
            \c {git checkout -b fix-branch 5.14}
        \li Write a test and fix the bug.
        \li Build and test with both the fix and the new test, to verify that
            the new test passes with the fix.
        \li Add the fix and test to your branch:
            \c {git add tests/auto/corelib/time/qdatetime/tst_qdatetime.cpp src/corelib/time/qdatetime.cpp}
        \li Commit the fix and test to your branch:
            \c {git commit -m 'Fix bug in QDateTime'}
        \li To verify that the test actually catches something for which you
            needed the fix, check out the branch you based your own branch on:
            \c {git checkout 5.14}
        \li Check out only the test file from the fix branch:
            \c {git checkout fix-branch -- tests/auto/corelib/time/qdatetime/tst_qdatetime.cpp}

            Only the test now comes from the fix-branch. The rest of the
            source tree is still on 5.14.
        \li Build and run the test to verify that it fails on 5.14, and
            therefore does indeed catch a bug.
        \li You can now return to the fix branch:
            \c {git checkout fix-branch}
        \li Alternatively, you can restore your work tree to a clean state on
            5.14:
            \c{git checkout HEAD -- tests/auto/corelib/time/qdatetime/tst_qdatetime.cpp}
    \endlist
    When you're reviewing a change, you can adapt this workflow to check that
    the change does indeed come with a test for the problem it fixes.
    \section2 Give Test Functions Descriptive Names

    Naming test cases is important. The test name appears in the failure report
    for a test run. For data-driven tests, the name of the data row also appears
    in the failure report. The names give those reading the report a first
    indication of what has gone wrong.

    Test function names should make it obvious what the function is trying to
    test. Do not simply use the bug-tracking identifier, because the identifiers
    become obsolete if the bug-tracker is replaced. Also, some bug-trackers may
    not be accessible to all users. When the bug report may be of interest to
    later readers of the test code, you can mention it in a comment alongside a
    relevant part of the test.
    Likewise, when writing data-driven tests, give the test-cases descriptive
    names that indicate what aspect of the functionality each focuses on. Do
    not simply number the test-cases, or use bug-tracking identifiers. Someone
    reading the test output will have no idea what the numbers or identifiers
    mean. You can add a comment on the test-row that mentions the bug-tracking
    identifier, when relevant. It's best to avoid spaces and characters that
    may be significant to command-line shells on which you may want to run
    tests. This makes it easier to specify the test and tag on \l{Qt Test
    Command Line Arguments}{the command-line} of your test program, for
    example to limit a test run to just one test-case.
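    For illustration, the data function of a hypothetical date-parsing test
    (the class and row values here are made up) might tag each row with the
    condition it exercises rather than a number:

    \code
    void tst_DateParser::parse_data()
    {
        QTest::addColumn<QString>("input");
        QTest::addColumn<bool>("isValid");

        // Descriptive tags, safe to type on a shell command-line:
        QTest::newRow("empty-string") << QString() << false;
        QTest::newRow("leap-day-in-leap-year") << QStringLiteral("2020-02-29") << true;
        // A comment can record the bug-tracking identifier, when relevant.
        QTest::newRow("leap-day-in-non-leap-year") << QStringLiteral("2019-02-29") << false;
    }
    \endcode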
    \section2 Write Self-contained Test Functions

    Within a test program, test functions should be independent of each other
    and they should not rely upon previous test functions having been run. You
    can check this by running the test function on its own with
    \c {tst_foo testname}.
    Do not re-use instances of the class under test in several tests. Test
    instances (for example widgets) should not be member variables of the
    test class, but should preferably be instantiated on the stack to ensure
    proper cleanup even if a test fails, so that tests do not interfere with
    each other.
    If your test involves making global changes, take care to ensure the prior
    state is restored at the end of the test, whether it passes or fails. Since
    a failure prevents the code after the failing check from running, restoring
    the prior state at the end of the test doesn't work when the test fails.
    The robust way to restore even on failure is to instantiate an RAII object
    whose destructor restores the prior state. This can often be conveniently
    done using \l qScopeGuard, for example

    \snippet code/src_qtestlib_qtestcase.cpp 36

    before the first call to \l QLocale::setDefault() in a test that needs to
    control the locale used by the code under test.
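    As a rough sketch of the same pattern (the test class and locale values
    are only illustrative), the guard remembers the prior default locale and
    restores it when the function returns, whether the test passes or fails:

    \code
    void tst_Something::localizedOutput()
    {
        const QLocale prior; // default-constructed: the current default locale
        auto restoreLocale = qScopeGuard([prior] { QLocale::setDefault(prior); });

        QLocale::setDefault(QLocale(QLocale::German, QLocale::Germany));
        // ... checks that depend on the German locale ...
    }
    \endcode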
    \section2 Test the Full Stack

    If an API is implemented in terms of pluggable or platform-specific
    backends that do the heavy-lifting, make sure to write tests that cover
    the code-paths all the way down into the backends. Testing the upper
    layer API parts using a mock backend is a nice way to isolate errors in
    the API layer from the backends, but it is complementary to tests that
    run the actual implementation with real-world data.
    \section2 Make Tests Complete Quickly

    Tests should not waste time by being unnecessarily repetitious, by using
    inappropriately large volumes of test data, or by introducing needless
    idle time.

    This is particularly true for unit testing, where every second of extra
    unit test execution time makes CI testing of a branch across multiple
    targets take longer. Remember that unit testing is separate from load and
    reliability testing, where larger volumes of test data and longer test
    runs are expected.

    Benchmark tests, which typically execute the same test multiple times,
    should be located in a separate \c tests/benchmarks directory and they
    should not be mixed with functional unit tests.
    \section2 Use Data-driven Testing

    \l{Chapter 2: Data Driven Testing}{Data-driven tests} make it easier to add
    new tests for boundary conditions found in later bug reports.

    Using a data-driven test rather than testing several items in sequence in
    a test saves repetition of very similar code and ensures later cases are
    tested even when earlier ones fail. It also encourages systematic and
    uniform testing, because the same tests are applied to each data sample.

    When a test is data-driven, you can specify its data-tag along with the
    test-function name, as \c{function:tag}, on the command-line of the test to
    run the test on just one specific test-case, rather than all test-cases of
    the function. This can be used for either a global data tag or a local tag,
    identifying a row from the function's own data; you can even combine them as
    \c{function:global:local}.
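    Continuing the hypothetical date-parsing example above, the test function
    fetches each column and applies the same checks to every row registered in
    the corresponding \c{_data()} function:

    \code
    void tst_DateParser::parse()
    {
        QFETCH(QString, input);
        QFETCH(bool, isValid);

        // The same verification runs for every row added in parse_data().
        QCOMPARE(QDate::fromString(input, Qt::ISODate).isValid(), isValid);
    }
    \endcode

    With such a test, running \c{tst_dateparser parse:leap-day-in-leap-year}
    would execute just that one row.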
    \section2 Use Coverage Tools

    Use a coverage tool such as \l {Coco} or \l {gcov} to help write tests
    that cover as many statements, branches, and conditions as possible in
    the function or class being tested. The earlier this is done in the
    development cycle for a new feature, the easier it will be to catch
    regressions later when the code is refactored.
    \section2 Select Appropriate Mechanisms to Exclude Tests

    It is important to select the appropriate mechanism to exclude inapplicable
    tests.

    Use \l QSKIP() to handle cases where a whole test function is found at
    run-time to be inapplicable in the current test environment. When just a
    part of a test function is to be skipped, a conditional statement can be
    used, optionally with a \c qDebug() call to report the reason for skipping
    the inapplicable part.

    When there are known test failures that should eventually be fixed,
    \l QEXPECT_FAIL is recommended, as it supports running the rest of the
    test, when possible. It also verifies that the issue still exists, and
    lets the code's maintainer know if they unwittingly fix it, a benefit
    which is gained even when using the \l {QTest::}{Abort} flag.
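    As a brief sketch of both mechanisms (the helper function and failure
    reason below are only placeholders), a test function might look like this:

    \code
    void tst_Something::remoteRoundTrip()
    {
        // serverAvailable() is a hypothetical helper for this sketch.
        if (!serverAvailable())
            QSKIP("Test server is not reachable in this environment");

        // Known failure, kept visible until the underlying bug is fixed;
        // Continue lets the rest of the test function still run.
        QEXPECT_FAIL("", "IPv6 path is known to fail, see bug tracker", Continue);
        QVERIFY(connectOverIpv6());

        QVERIFY(connectOverIpv4());
    }
    \endcode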
    Test functions or data rows of a data-driven test can be limited to
    particular platforms, or to particular features being enabled, using
    \c{#if}. However, beware of \l moc limitations when using \c{#if} to
    skip test functions. The \c moc preprocessor does not have access to
    all the built-in macros of the compiler that are often used for
    compiler feature detection. Therefore, \c moc might get a different
    result for a preprocessor condition from that seen by the rest of your
    code. This may result in \c moc generating meta-data for a test slot that
    the actual compiler skips, or omitting the meta-data for a test slot that
    is actually compiled into the class. In the first case, the test will
    attempt to run a slot that is not implemented. In the second case, the
    test will not attempt to run a test slot even though it should.
    If an entire test program is inapplicable on a specific platform, or is
    only applicable when a particular feature is enabled, the best approach
    is to use the parent directory's build configuration to avoid building
    the test. For example, if the \c tests/auto/gui/someclass test is not
    valid for \macos, wrap its inclusion as a subdirectory in
    \c{tests/auto/gui/CMakeLists.txt} in a platform check:

    \badcode
    if(NOT APPLE)
        add_subdirectory(someclass)
    endif()
    \endcode
    or, if using \c qmake, add the following line to \c tests/auto/gui.pro:

    \badcode
    mac*: SUBDIRS -= someclass
    \endcode

    See also \l {Chapter 6: Skipping Tests with QSKIP}
    {Skipping Tests with QSKIP}.
    \section2 Avoid Q_ASSERT

    The \l Q_ASSERT macro causes a program to abort whenever the asserted
    condition is \c false, but only if the software was built in debug mode.
    In both release and debug-and-release builds, \c Q_ASSERT does nothing.

    \c Q_ASSERT should be avoided because it makes tests behave differently
    depending on whether a debug build is being tested, and because it causes
    a test to abort immediately, skipping all remaining test functions and
    returning incomplete or malformed test results.

    It also skips any tear-down or tidy-up that was supposed to happen at the
    end of the test, and might therefore leave the workspace in an untidy state,
    which might cause complications for further tests.

    Instead of \c Q_ASSERT, the \l QCOMPARE() or \l QVERIFY() macro variants
    should be used. They cause the current test to report a failure and
    terminate, but allow the remaining test functions to be executed and the
    entire test program to terminate normally. \l QVERIFY2() even allows a
    descriptive error message to be recorded in the test log.
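    For example, a check that might otherwise have been written as an assert
    can record what went wrong in the test log (the values shown are only
    illustrative):

    \code
    // Instead of: Q_ASSERT(list.size() == 2);
    QCOMPARE(list.size(), 2);          // reports both values on failure

    // Instead of: Q_ASSERT(file.open(QIODevice::ReadOnly));
    QVERIFY2(file.open(QIODevice::ReadOnly),
             qPrintable(QStringLiteral("Cannot open %1: %2")
                        .arg(file.fileName(), file.errorString())));
    \endcode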
    \section1 Writing Reliable Tests

    The following sections provide guidelines for writing reliable tests:

    \list
        \li \l {Avoid Side-effects in Verification Steps}
        \li \l {Avoid Fixed Timeouts}
        \li \l {Beware of Timing-dependent Behavior}
        \li \l {Avoid Bitmap Capture and Comparison}
    \endlist
    \section2 Avoid Side-effects in Verification Steps

    When performing verification steps in an autotest using \l QCOMPARE(),
    \l QVERIFY(), and so on, side-effects should be avoided. Side-effects
    in verification steps can make a test difficult to understand. Also,
    they can easily break a test in ways that are difficult to diagnose
    when the test is changed to use \l QTRY_VERIFY(), \l QTRY_COMPARE() or
    \l QBENCHMARK(). These can execute the passed expression multiple times,
    thus repeating any side-effects.

    When side-effects are unavoidable, ensure that the prior state is restored
    at the end of the test function, even if the test fails. This commonly
    requires use of an RAII (resource acquisition is initialization) class
    that restores state when the function returns, or a \c cleanup() method.
    Do not simply put the restoration code at the end of the test. If part of
    the test fails, such code will be skipped and the prior state will not be
    restored.
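    As a small illustration (assuming a \l QSignalSpy named \c spy), an
    expression with a side-effect is risky inside a macro that may evaluate
    it repeatedly:

    \code
    // Risky: takeFirst() removes the recorded emission, so a QTRY_ macro
    // re-evaluating this expression would operate on different data each time.
    QTRY_COMPARE(spy.takeFirst().at(0).toInt(), 42);

    // Safer: retry a side-effect-free condition, then inspect the data.
    QTRY_COMPARE(spy.count(), 1);
    QCOMPARE(spy.at(0).at(0).toInt(), 42);
    \endcode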
    \section2 Avoid Fixed Timeouts

    Avoid using hard-coded timeouts, such as \c QTest::qWait(), to wait for
    some condition to become true. Consider using the \l QSignalSpy class,
    the \l QTRY_VERIFY() or \l QTRY_COMPARE() macros, or the \c QSignalSpy
    class in conjunction with the \c QTRY_ macro variants.

    The \c qWait() function can be used to set a delay for a fixed period
    between performing some action and waiting for some asynchronous behavior
    triggered by that action to be completed. For example, changing the state
    of a widget and then waiting for the widget to be repainted. However,
    such timeouts often cause failures when a test written on a workstation is
    executed on a device, where the expected behavior might take longer to
    complete. Increasing the fixed timeout to a value several times larger
    than needed on the slowest test platform is not a good solution, because
    it slows down the test run on all platforms, particularly for table-driven
    tests.

    If the code under test issues Qt signals on completion of the asynchronous
    behavior, a better approach is to use the \l QSignalSpy class to notify
    the test function that the verification step can now be performed.

    If there are no Qt signals, use the \c QTRY_COMPARE() and \c QTRY_VERIFY()
    macros, which periodically test a specified condition until it becomes true
    or some maximum timeout is reached. These macros prevent the test from
    taking longer than necessary, while avoiding breakages when tests are
    written on workstations and later executed on embedded platforms.

    If there are no Qt signals, and you are writing the test as part of
    developing a new API, consider whether the API could benefit from the
    addition of a signal that reports the completion of the asynchronous
    behavior.
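    As a sketch of the signal-based approach (the \c Downloader class, its
    \c finished signal, and \c url are only illustrative), the test waits for
    completion instead of sleeping for a fixed period:

    \code
    Downloader downloader;
    QSignalSpy finishedSpy(&downloader, &Downloader::finished);

    downloader.start(url);

    // Waits until the signal is emitted, up to a five-second timeout,
    // instead of pausing for a hard-coded interval with qWait().
    QVERIFY(finishedSpy.wait(5000));
    QCOMPARE(finishedSpy.count(), 1);
    \endcode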
    \section2 Beware of Timing-dependent Behavior

    Some test strategies are vulnerable to timing-dependent behavior of certain
    classes, which can lead to tests that fail only on certain platforms or that
    do not return consistent results.

    One example of this is text-entry widgets, which often have a blinking
    cursor that can make comparisons of captured bitmaps succeed or fail
    depending on the state of the cursor when the bitmap is captured. This,
    in turn, may depend on the speed of the machine executing the test.

    When testing classes that change their state based on timer events, the
    timer-based behavior needs to be taken into account when performing
    verification steps. Due to the variety of timing-dependent behavior, there
    is no single generic solution to this testing problem.

    For text-entry widgets, potential solutions include disabling the cursor
    blinking behavior (if the API provides that feature), waiting for the
    cursor to be in a known state before capturing a bitmap (for example, by
    subscribing to an appropriate signal if the API provides one), or
    excluding the area containing the cursor from the bitmap comparison.
    \section2 Avoid Bitmap Capture and Comparison

    While verifying test results by capturing and comparing bitmaps is sometimes
    necessary, it can be quite fragile and labor-intensive.

    For example, a particular widget may have a different appearance on different
    platforms or with different widget styles, so reference bitmaps may need to
    be created multiple times and then maintained in the future as Qt's set of
    supported platforms evolves. Making changes that affect the bitmap thus
    means having to recreate the expected bitmaps on each supported platform,
    which would require access to each platform.

    Bitmap comparisons can also be influenced by factors such as the test
    machine's screen resolution, bit depth, active theme, color scheme,
    widget style, active locale (currency symbols, text direction, and so
    on), font size, transparency effects, and choice of window manager.

    Where possible, use programmatic means, such as verifying properties of
    objects and variables, instead of capturing and comparing bitmaps.
    \section1 Improving Test Output

    The following sections provide guidelines for producing readable and
    helpful test output:

    \list
        \li \l {Test for Warnings}
        \li \l {Avoid Printing Debug Messages from Autotests}
        \li \l {Write Well-structured Diagnostic Code}
    \endlist
    \section2 Test for Warnings

    Just as when building your software, if test output is cluttered with
    warnings you will find it harder to notice a warning that really is a clue
    to the emergence of a bug. It is thus prudent to regularly check your test
    logs for warnings, and other extraneous output, and investigate the
    causes. When they are signs of a bug, you can make warnings trigger test
    failure.

    When the code under test \e should produce messages, such as warnings
    about misguided use, it is also important to test that it \e does produce
    them when so used. You can test for expected messages from the code under
    test, produced by \l qWarning(), \l qDebug(), \l qInfo() and friends,
    using \l QTest::ignoreMessage(). This will verify that the message is
    produced and filter it out of the output of the test run. If the message
    is not produced, the test will fail.

    If an expected message is only output when Qt is built in debug mode, use
    \l QLibraryInfo::isDebugBuild() to determine whether the Qt libraries were
    built in debug mode. Using \c{#ifdef QT_DEBUG} is not enough, as it will
    only tell you whether \e{the test} was built in debug mode, and that does
    not guarantee that the \e{Qt libraries} were also built in debug mode.

    Your tests can (since Qt 6.3) verify that they do not trigger calls to
    \l qWarning() by calling \l QTest::failOnWarning(). This takes the warning
    message to test for or a \l QRegularExpression to match against warnings; if
    a matching warning is produced, it will be reported and cause the test to
    fail. For example, a test that should produce no warnings at all can call
    \c{QTest::failOnWarning(QRegularExpression(u".*"_s))}, which will match any
    warning.
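    For example (the \c thing object and the warning text are only
    illustrative), a test can both require an expected warning and fail on any
    unexpected one:

    \code
    void tst_Something::warnsOnNegativeSize()
    {
        // Any warning other than the expected one fails the test (Qt 6.3+).
        QTest::failOnWarning(QRegularExpression(u".*"_s));

        // The expected warning must be produced, and is filtered from the log.
        QTest::ignoreMessage(QtWarningMsg, "Ignoring negative size");
        thing.setSize(-1);
    }
    \endcode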
    You can also set the environment variable \c QT_FATAL_WARNINGS to cause
    warnings to be treated as fatal errors. See \l qWarning() for details; this
    is not specific to autotests. If warnings would otherwise be lost in vast
    test logs, the occasional run with this environment variable set can help
    you to find and eliminate any that do arise.
    \section2 Avoid Printing Debug Messages from Autotests

    Autotests should not produce any unhandled warning or debug messages.
    This will allow the CI Gate to treat new warning or debug messages as
    test failures.

    Adding debug messages during development is fine, but these should be
    either disabled or removed before a test is checked in.
    \section2 Write Well-structured Diagnostic Code

    Any diagnostic output that would be useful if a test fails should be part
    of the regular test output rather than being commented-out, disabled by
    preprocessor directives, or enabled only in debug builds. If a test fails
    during continuous integration, having all of the relevant diagnostic output
    in the CI logs could save you a lot of time compared to enabling the
    diagnostic code and testing again, especially if the failure was on a
    platform that you don't have on your desktop.

    Diagnostic messages in tests should use Qt's output mechanisms, such as
    \c qDebug() and \c qWarning(), rather than \c stdio.h or \c iostream
    output mechanisms. The latter bypass Qt's message handling and prevent the
    \c -silent command-line option from suppressing the diagnostic messages.
    This could result in important failure messages being hidden in a large
    volume of debugging output.
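    A minimal illustration (the \c machine object is hypothetical):

    \code
    // Preferred: routed through Qt's message handler, so -silent can suppress it.
    qDebug() << "current state:" << machine.state();

    // Avoid: bypasses Qt's message handling entirely.
    std::cout << "current state: " << machine.state() << std::endl;
    \endcode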
    \section1 Writing Testable Code

    The following sections provide guidelines for writing code that is easy to
    test:

    \list
        \li \l {Break Dependencies}
        \li \l {Compile All Classes into Libraries}
    \endlist
    \section2 Break Dependencies

    The idea of unit testing is to use every class in isolation. Since many
    classes instantiate other classes directly, it is not always possible to
    instantiate one class separately. Therefore, you should use a technique
    called \e {dependency injection} that separates object creation from
    object use. A factory is responsible for building object trees; other
    objects manipulate these objects through abstract interfaces.

    This technique works well for data-driven applications. For GUI
    applications, this approach can be difficult as objects are frequently
    created and destructed. To verify the correct behavior of classes that
    depend on abstract interfaces, \e mocking can be used. For example, see
    \l {Googletest Mocking (gMock) Framework}.
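    As a compact sketch of dependency injection (all names here are invented
    for illustration), the class under test depends only on an abstract
    interface, so a test can substitute a lightweight fake:

    \code
    class PaymentGateway                      // abstract interface
    {
    public:
        virtual ~PaymentGateway() = default;
        virtual bool charge(int cents) = 0;
    };

    class Checkout
    {
    public:
        explicit Checkout(PaymentGateway *gateway) : m_gateway(gateway) {}
        bool complete(int cents) { return m_gateway->charge(cents); }
    private:
        PaymentGateway *m_gateway;
    };

    // In the test, inject a fake instead of the real network-backed gateway.
    class FakeGateway : public PaymentGateway
    {
    public:
        bool charge(int cents) override { lastAmount = cents; return true; }
        int lastAmount = -1;
    };

    void tst_Checkout::completeChargesGateway()
    {
        FakeGateway gateway;
        Checkout checkout(&gateway);
        QVERIFY(checkout.complete(499));
        QCOMPARE(gateway.lastAmount, 499);
    }
    \endcode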
    \section2 Compile All Classes into Libraries

    In small to medium sized projects, a build script typically lists all
    source files and then compiles the executable in one go. This means that
    the build scripts for the tests must list the needed source files again.

    It is easier to list the source files and the headers only once, in a
    script to build a static library. Then the \c main() function will be
    linked against the static library to build the executable, and the tests
    will be linked against the same static library.
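    A rough CMake sketch of this layout (target and file names are only
    illustrative):

    \badcode
    # Listed once, built as a static library.
    add_library(myapp_core STATIC parser.cpp scheduler.cpp)

    # The application adds only main() and links the library.
    add_executable(myapp main.cpp)
    target_link_libraries(myapp PRIVATE myapp_core)

    # Each test links the same library instead of re-listing sources.
    add_executable(tst_parser tst_parser.cpp)
    target_link_libraries(tst_parser PRIVATE myapp_core Qt::Test)
    \endcode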
    For projects where the same source files are used in building several
    programs, it may be more appropriate to build the shared classes into
    a dynamically-linked (or shared object) library that each program,
    including the test programs, can load at run-time. Again, having the
    compiled code in a library helps to avoid duplication in the description
    of which components to combine to make the various programs.
    \section1 Setting up Test Machines

    The following sections discuss common problems caused by test machine setup:

    \list
        \li \l {Screen Savers}
        \li \l {System Dialogs}
        \li \l {Display Usage}
        \li \l {Window Managers}
    \endlist

    All of these problems can typically be solved by the judicious use of
    virtualisation.
    \section2 Screen Savers

    Screen savers can interfere with some of the tests for GUI classes, causing
    unreliable test results. Screen savers should be disabled to ensure that
    test results are consistent and reliable.
    \section2 System Dialogs

    Dialogs displayed unexpectedly by the operating system or other running
    applications can steal input focus from widgets involved in an autotest,
    causing unreproducible failures.

    Examples of typical problems include online update notification dialogs
    on macOS, false alarms from virus scanners, scheduled tasks such as virus
    signature updates, software updates pushed out to workstations, and chat
    programs popping up windows on top of the stack.
    \section2 Display Usage

    Some tests use the test machine's display, mouse, and keyboard, and can
    thus fail if the machine is being used for something else at the same
    time or if multiple tests are run in parallel.

    The CI system uses dedicated test machines to avoid this problem, but if
    you don't have a dedicated test machine, you may be able to solve this
    problem by running the tests on a second display.

    On Unix, one can also run the tests on a nested or virtual X-server, such as
    Xephyr. For example, to run the entire set of tests on Xephyr, execute the
    following commands:

    \code
    Xephyr :1 -ac -screen 1920x1200 >/dev/null 2>&1 &
    sleep 5
    DISPLAY=:1 icewm >/dev/null 2>&1 &
    cd tests/auto
    make
    DISPLAY=:1 make -k -j1 check
    \endcode

    Users of NVIDIA binary drivers should note that Xephyr might not be able to
    provide GLX extensions. Forcing Mesa libGL might help:

    \code
    export LD_PRELOAD=/usr/lib/mesa-diverted/x86_64-linux-gnu/libGL.so.1
    \endcode

    However, when tests are run on Xephyr and the real X-server with different
    libGL versions, the QML disk cache can make the tests crash. To avoid this,
    use \c QML_DISABLE_DISK_CACHE=1.

    Alternatively, use the offscreen plugin:

    \code
    TESTARGS="-platform offscreen" make check -k -j1
    \endcode
    \section2 Window Managers

    On Unix, at least two autotests (\c tst_examples and \c tst_gestures)
    require a window manager to be running. Therefore, if running these
    tests under a nested X-server, you must also run a window manager
    in that X-server.

    Your window manager must be configured to position all windows on the
    display automatically. Some window managers, such as Tab Window Manager
    (twm), have a mode for manually positioning new windows, and this prevents
    the test suite from running without user interaction.

    \note Tab Window Manager is not suitable for running the full suite of
    Qt autotests, as the \c tst_gestures autotest causes it to forget its
    configuration and revert to manual window placement.
*/