In the world of user interface (UI) testing, we have all come across tests and UI behavior that are not 100% repeatable. In other words, sometimes tests are “flaky.” The challenge with these tests is determining what causes the flakiness and finding ways to resolve it, so that we can test the UI reliably without raising false alarms or creating test gaps.
The two most common situations that lead to flakiness are external factors over which you have no control, and the application or website under test not being in the expected state when you test it. Among the external factors, network conditions, different client types, and test infrastructure limitations are the most common causes. Diagnosing these factors usually boils down to comparing the physical testing environment of the flaky test with that of the non-flaky one.
More challenging are those flaky tests that concern the application or website under test not being in the expected state when the test is run. This can be caused, for example, by someone having changed the source code between the first time a test suite was run and the next time. In these situations, you will typically see errors such as InvalidElementStateException. The Selenium 2.0 documentation includes a full list of the exceptions that can be thrown when running a test, along with a description of the conditions that can cause them. Unlike the physical factors that can lead to flakiness, these “unexpected state” factors are easily diagnosed by inspecting the stack trace and isolating the specific components that led to the exception.
Dealing with Flakiness
There are a couple of ways to approach flaky behavior.
If you get a TimeoutException, which can be caused by external factors such as network latency, you can increase the implicit timeout and see whether this improves the test behavior. Keep in mind, however, that this increases overall test run time and should not be considered a permanent solution. If increasing the implicit timeout helps resolve the flakiness, then within your test you should use explicit waits or fluent waits, or a combination of the two, to make your tests more efficient and more adaptive to changing test conditions.
An “explicit wait” waits up to an explicit amount of time for a single component to fulfill a condition, such as an element appearing on a page. In code, a timeOut value sets the amount of time within which the condition should be fulfilled.
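To make the mechanism concrete, here is a minimal plain-Java sketch of what an explicit wait does. It is not Selenium code: in Selenium you would normally use WebDriverWait together with ExpectedConditions. The class ExplicitWaitSketch, the method waitUntil, and the 50 ms polling pause are assumptions made for illustration only.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Plain-Java illustration of an explicit wait. This is NOT Selenium code:
// the class, method, and constant names are hypothetical.
class ExplicitWaitSketch {

    // Assumed fixed pause between checks of the condition.
    private static final long POLL_MILLIS = 50;

    /**
     * Checks the condition repeatedly until it holds or timeOut elapses.
     * Returns true if the condition was met in time, false on timeout.
     */
    static boolean waitUntil(BooleanSupplier condition, Duration timeOut) {
        long deadline = System.nanoTime() + timeOut.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true; // done early: no need to wait out the full timeout
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timeOut elapsed without the condition holding
            }
            try {
                Thread.sleep(POLL_MILLIS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for "the element is visible": true after roughly 200 ms.
        BooleanSupplier elementVisible =
                () -> System.currentTimeMillis() - start >= 200;
        boolean met = waitUntil(elementVisible, Duration.ofSeconds(2));
        System.out.println("Condition met before timeout: " + met);
    }
}
```

Unlike a fixed sleep, the wait returns as soon as the condition holds, so a generous timeOut costs time only when the condition is never met.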
A FluentWait “fluently” waits for a single component to fulfill a condition, checking at a set polling interval and ignoring specified exceptions until a total timeout elapses. For example, with a polling interval of 5 seconds and a timeout of 30 seconds, the maximum wait time is 30 seconds and the minimum wait time is 0 seconds, in case the element is present and visible on the first try. The exceptions to be ignored can also be defined, for targeted handling of exceptions during the wait.
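The three knobs of a fluent wait (total timeout, polling interval, ignored exceptions) can be sketched in plain Java as follows. This only imitates the behavior described above and is not Selenium's FluentWait: the class FluentWaitSketch, the method waitFor, and the use of IllegalStateException to signal a timeout are assumptions, and java.util.NoSuchElementException stands in for Selenium's exception of the same name.

```java
import java.time.Duration;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Plain-Java imitation of a fluent wait. NOT Selenium's FluentWait:
// all names here are hypothetical.
class FluentWaitSketch {

    /**
     * Calls lookup until it returns a non-null value. Exceptions whose type
     * is listed in ignored are swallowed and the lookup is retried after
     * pollingInterval; any other exception propagates immediately.
     * Throws IllegalStateException once timeout has elapsed.
     */
    static <T> T waitFor(Supplier<T> lookup,
                         Duration timeout,
                         Duration pollingInterval,
                         List<Class<? extends RuntimeException>> ignored) {
        long deadline = System.nanoTime() + timeout.toNanos();
        RuntimeException lastIgnored = null;
        while (true) {
            try {
                T value = lookup.get(); // first try happens with zero delay
                if (value != null) {
                    return value;
                }
            } catch (RuntimeException e) {
                boolean isIgnored =
                        ignored.stream().anyMatch(type -> type.isInstance(e));
                if (!isIgnored) {
                    throw e; // not on the ignore list: fail right away
                }
                lastIgnored = e;
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException(
                        "Timed out after " + timeout, lastIgnored);
            }
            try {
                Thread.sleep(pollingInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger attempts = new AtomicInteger();
        // Stand-in for an element lookup that fails twice before succeeding.
        Supplier<String> findElement = () -> {
            if (attempts.incrementAndGet() < 3) {
                throw new NoSuchElementException("not rendered yet");
            }
            return "submit-button";
        };
        String element = waitFor(findElement,
                Duration.ofSeconds(3),   // total timeout
                Duration.ofMillis(100),  // polling interval
                List.of(NoSuchElementException.class));
        System.out.println("Found " + element
                + " after " + attempts.get() + " attempts");
    }
}
```

Listing the ignored exception types explicitly keeps unexpected failures visible: anything not on the list fails the wait at once instead of being retried until the timeout.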
Use Retry Rules
If waiting patiently doesn’t resolve your test flakiness, as a last resort you can use the JUnit test framework’s TestRule interface or the TestNG test framework’s IRetryAnalyzer interface to rerun tests that have failed without interrupting your test flow. However, use these very carefully, as rerunning failed tests can mask both flaky tests and flaky product features.
This example shows how to use a test rule, along with a custom annotation, @Retry, to mark individual tests to retry.
This annotation is used to mark the tests to retry.
This is the usage within the test class.
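The rule-plus-annotation idea can be sketched without JUnit on the classpath. In JUnit 4 the retry loop would live in a TestRule, inside the Statement it returns; here a small reflective runner plays that role so the example stays self-contained. Everything named below (RetryRuleSketch, the maxAttempts attribute on @Retry, and FlakyCartTest) is hypothetical and is not the original article's code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Stand-in for a JUnit rule: reruns a test method only if it carries @Retry.
class RetryRuleSketch {

    /** Runs the named no-argument method and returns the attempt that passed. */
    static int run(Object testInstance, String methodName) {
        Method method;
        try {
            method = testInstance.getClass().getDeclaredMethod(methodName);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such test: " + methodName, e);
        }
        method.setAccessible(true);
        Retry retry = method.getAnnotation(Retry.class);
        // Unmarked tests get exactly one attempt.
        int maxAttempts = (retry == null) ? 1 : Math.max(1, retry.maxAttempts());
        Throwable lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                method.invoke(testInstance);
                return attempt;
            } catch (InvocationTargetException e) {
                lastFailure = e.getCause(); // the failure thrown by the test
                System.err.println(methodName + ": attempt " + attempt
                        + " failed: " + lastFailure);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot invoke " + methodName, e);
            }
        }
        throw new AssertionError(methodName + " failed after "
                + maxAttempts + " attempt(s)", lastFailure);
    }

    public static void main(String[] args) {
        int attempt = run(new FlakyCartTest(), "loadsCart");
        System.out.println("loadsCart passed on attempt " + attempt);
    }
}

// Hypothetical marker annotation: tests carrying it may be rerun on failure.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Retry {
    int maxAttempts() default 3;
}

// Hypothetical test class: loadsCart fails twice, then passes.
class FlakyCartTest {
    int calls = 0;

    @Retry(maxAttempts = 3)
    void loadsCart() {
        calls++;
        if (calls < 3) {
            throw new IllegalStateException("cart not loaded on call " + calls);
        }
    }

    // No @Retry here, so a failure is reported after a single attempt.
    void unmarkedTest() {
        calls++;
        throw new IllegalStateException("fails and is never retried");
    }
}
```

Because retries are opt-in per method, a failing test without the annotation still fails on its first attempt. Logging every failed attempt, as the sketch does on stderr, keeps the flakiness visible instead of hiding it behind an eventual pass.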
You can use this analyzer class to retry failed tests, and reference it from your TestNG tests.
Sample TestNG RetryAnalyzer class
Usage within the test class
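As a rough illustration of the contract a retry analyzer fulfills, here is a plain-Java imitation. In TestNG itself the analyzer implements IRetryAnalyzer, whose retry method receives an ITestResult, and it is attached to a test through the retryAnalyzer attribute of @Test. The sketch replaces both with hypothetical stand-ins (a test name string and a small execute loop) so that it compiles without TestNG.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Plain-Java imitation of a retry analyzer. NOT TestNG code:
// the class and method names are hypothetical.
class RetryAnalyzerSketch {

    private final int maxRetries;
    private int retriesUsed = 0;

    RetryAnalyzerSketch(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    /** Asked after each failure; returning true means "run the test again". */
    boolean retry(String failedTestName) {
        if (retriesUsed < maxRetries) {
            retriesUsed++;
            System.err.println("Retrying " + failedTestName
                    + " (" + retriesUsed + "/" + maxRetries + ")");
            return true;
        }
        return false; // retry budget exhausted: let the failure stand
    }

    /**
     * Tiny stand-in for the framework's loop around a test.
     * Returns true if the test eventually passed, false if it failed
     * and the analyzer declined another retry.
     */
    static boolean execute(String testName, Runnable test,
                           RetryAnalyzerSketch analyzer) {
        while (true) {
            try {
                test.run();
                return true;
            } catch (RuntimeException | AssertionError failure) {
                if (!analyzer.retry(testName)) {
                    return false;
                }
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Stand-in for a flaky test: fails on its first two runs.
        Runnable flakyTest = () -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("flaky failure");
            }
        };
        boolean passed = execute("flakyTest", flakyTest,
                new RetryAnalyzerSketch(2));
        System.out.println("passed=" + passed + " after " + calls.get() + " runs");
    }
}
```

Keeping the retry count small and logging each retry limits how much real flakiness, in the test or in the product, the analyzer can hide.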