TeamCity offers a number of ways to help you work with tests. First of all, it presents the test results for each build execution so it’s easy to analyze what happened and why something has failed. Besides that, TeamCity includes advanced features like code coverage, flaky test detection, metadata in tests, and the ability to re-order the test execution. In this blog post, we provide a brief overview of these features.
It all starts with presenting the results. On the build configuration home page, for each build, we can see the number of tests that were executed. We can also see the numbers for failed, ignored, and muted tests.
If we choose a single build and navigate to the build overview page, we will see some more data. For instance, for build number 14 we can see that there are 39 passed tests, and 1 test (system.CrashControllerTests) is ignored.
Further on, the Tests tab displays the full list of the tests that were executed:
It is possible to use the controls at the top of the list for grouping and filtering the results. Since TeamCity preserves the history of the tests, we can also see the historical data for their duration:
The duration is associated with the build agent on which the test was executed. The graph immediately shows us whether any agents are performing poorly for the specific test. We may want to fix something in the environment for that agent, or we can also exclude the agent entirely for this build configuration by adjusting the Agent Requirements.
Flaky test detection
Sometimes, seemingly at random, the same test of the same code will produce diﬀerent results. We call those tests flaky. Flaky tests can be caused by various factors: environment affecting the test or application code, concurrency issues, etc.
It’s really great if we can detect such unstable tests sooner and fix them faster. Analyzing the historical test reports manually to find out, if there are any flaky tests would take ages. TeamCity detects flaky tests and displays them on the dedicated tab for a given project.
For every single test, we can view the history, assign an investigation, or temporarily mute the failure. For instance, we can mute all the flaky tests for the duration while someone is fixing them. Muting the tests will allow the build to pass so that the build chain can continue the execution. Once the tests are fixed, we can unmute them again.
Here’s an example of a flaky test. At the screenshot below, we can see that the test status was changing from passed to failed and back a few times, even though there were no related changes in the version control. Hence, the test was marked as flaky so that we could easily find it in the test report.
The test results are associated with the build agent which helps to diagnose the failures better. For instance, the test might be failing only on a specific build agent. Then maybe the failure is related to the environment and not the test logic itself.
Code coverage is a combination of metrics that measure how your code is covered by unit tests. To get the code coverage information displayed in TeamCity for the supported tools, first we need to configure it in the dedicated section of a Build Runner settings page.
For instance, for Maven build step, at the bottom of the configuration page, we can find the Code Coverage section. We can choose the code coverage runner and specify which binaries should be inspected.
After the test execution, the coverage metrics are presented on the build overview page. We can view the full report on the Code Coverage tab.
Metadata in tests
Sometimes you need to provide more data with the test results. For instance, include a screenshot, add some numeric data, or just provide any description. This is useful when we need to analyze why the test failed and any additional information can help save precious time.
In TeamCity, a test may be associated with some information – the metadata. The types of data that we can associate with the test is text, numeric values, images, and files that are published as build artifacts.
We can use service messages to associate the data with the test. The data is then rendered with the test results:
For more about metadata in tests, read this blog post.
Risk group tests re-ordering
A test suite with a large number of tests takes some time to execute. Sometimes, for large suites, the execution time can be many hours. What if the very last test fails and we want to rerun the test suite? Would we need to wait a few more hours just to learn that the same test fails again? Instead, it would be nice to run the recently failed tests first.