Improving performance and scalability of your TeamCity server
When we released TeamCity 1.0 in October 2006, we had about 20 agents on our local server. Since then TeamCity has come a long way: we’ve added a lot of features and kept growing our agent pool. Now we have 110 agents connected, a tripled user base, bigger artifacts, and a significantly increased number of check-ins to the version control systems monitored by TeamCity.
To support this number of agents in our installation, we’ve made several configuration changes. In this blog post I would like to share some general recommendations on how to achieve better performance and scalability with your own TeamCity installation.
First of all, switch your server to an external database. We are using MySQL with InnoDB storage. TeamCity can also work with Oracle, MSSQL, PostgreSQL and Sybase. If you choose Oracle or MSSQL, it is better to install them on a separate server, as these databases usually require a lot of resources. MySQL and PostgreSQL are much more lightweight and can be installed on the same machine as the TeamCity server, but we recommend using a dedicated disk for the database files. If you are not sure which database to use, or you do not have an experienced DBA to help you, choose MySQL (with InnoDB storage): it’s fast, easy to install and use, and will most likely suit your needs.
TeamCity is a mostly I/O-bound system. The more agents you have, the more builds are executed concurrently, which means more build logs and artifacts are stored on disk and more data is saved in the database.
To achieve the best performance, install several disks in your server. At the very least, separate the OS and TeamCity binaries from the TeamCity data directory (.BuildServer) by placing them on different disks. The .BuildServer directory itself can be further split across different disks, as TeamCity can perform significant I/O activity there. In particular, the following directories can be accessed frequently:
.BuildServer/system/caches – this directory holds search indexes and version control specific caches; in the case of Git or Mercurial, TeamCity stores cloned repositories there
.BuildServer/system/artifacts – this directory contains artifacts and is accessed each time a build publishes its artifacts or retrieves them via artifact dependencies
.BuildServer/system/messages – if you have many agents, this is probably the most frequently used directory, as it holds build logs
<TeamCity installation directory>/temp – this directory stores temporary files created by TeamCity; it is also used as a temporary directory for external processes started by the server, like hg, p4 and so on
The artifacts and messages directories can hold huge amounts of data. Given that, the following setup seems reasonable: use a large disk for the whole .BuildServer directory, a medium-size disk for the caches directory and another medium-size disk for temp. If you really need faster access to artifacts, you can further separate the artifacts and messages directories.
In our case we have 2x2 TB disks combined in RAID1 where we placed the whole .BuildServer directory except .BuildServer/system/caches, and a 400 GB disk for .BuildServer/system/caches. We also have 2x145 GB SAS disks in RAID1 for the MySQL database files.
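Before deciding how to split the data directory across disks, it can help to measure how much space each of the busy directories actually takes on your server. The following is a minimal sketch in Python, not a TeamCity tool: the helper names `directory_size_bytes` and `report_sizes` are made up for this illustration, and the path in the example assumes the data directory is `.BuildServer` in the home directory of the user running the server.

```python
import os
from pathlib import Path


def directory_size_bytes(root):
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable (builds may be running); skip it.
                pass
    return total


def report_sizes(system_dir, names=("caches", "artifacts", "messages")):
    """Return {subdirectory name: size in bytes} for the subdirectories that exist."""
    system_dir = Path(system_dir)
    return {
        name: directory_size_bytes(system_dir / name)
        for name in names
        if (system_dir / name).is_dir()
    }


if __name__ == "__main__":
    # Assumed location; adjust to your own TeamCity data directory.
    system = Path.home() / ".BuildServer" / "system"
    sizes = report_sizes(system)
    for name, size in sorted(sizes.items(), key=lambda kv: -kv[1]):
        print(f"{name:10s} {size / 1024 ** 3:8.2f} GB")
```

Sizes alone do not tell you which directory is accessed most often, so treat the result as an input for disk capacity planning rather than as an I/O profile.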
As for the CPU, most likely you won’t need anything extraordinary. Choose the CPU that offers the best value for the money. Note that it also makes sense to have several CPUs: TeamCity is a multi-threaded application and is able to leverage their power.
Our server has one Intel Xeon E5520 CPU (8M Cache, 2.26 GHz, 5.86 GT/s) and I would say it is enough for us. We also have the MySQL database installed on the same server, so this CPU is shared between TeamCity and MySQL.
As for memory, it always makes sense to have a server where you can add more of it easily. It also makes sense to have 64-bit hardware, a 64-bit OS and a 64-bit JVM. For example, on Windows a 32-bit JVM can’t allocate more than 1.2-1.3 GB of memory simply because of OS limitations.
Unfortunately, a Java application started by a 64-bit JVM requires 20%-50% more memory. For example, if you ran TeamCity with a 32-bit JVM and 1 GB Xmx, you need to increase Xmx to about 1.5 GB with a 64-bit JVM. This is the price of better scalability.
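The arithmetic behind this rule of thumb is simple enough to sketch. The function below is only an illustration of the 20%-50% overhead range mentioned above; the name `estimate_64bit_xmx_mb` and its parameters are made up for this example, and the result is a starting estimate, not a measured requirement.

```python
def estimate_64bit_xmx_mb(xmx_32bit_mb, overhead_low=0.20, overhead_high=0.50):
    """Return a (low, high) range of Xmx values in MB for a 64-bit JVM.

    The overhead bounds are the rule-of-thumb 20%-50% from the text,
    applied to the heap size that was sufficient under a 32-bit JVM.
    """
    if xmx_32bit_mb <= 0:
        raise ValueError("xmx_32bit_mb must be positive")
    low = round(xmx_32bit_mb * (1 + overhead_low))
    high = round(xmx_32bit_mb * (1 + overhead_high))
    return low, high


if __name__ == "__main__":
    # A server that ran with -Xmx1024m under a 32-bit JVM.
    low, high = estimate_64bit_xmx_mb(1024)
    print(f"Try a value between -Xmx{low}m and -Xmx{high}m")
```

For the 1 GB example, the upper bound works out to 1536 MB, which matches the 1.5 GB figure above.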
Let’s return to TeamCity memory usage. Probably the most common factors affecting memory usage are the length of the build history and the number of tests per build. TeamCity caches test results so that it can quickly find the build where a test failed for the first time, or where it was fixed. If you have a relaxed cleanup policy and prefer to keep many builds in the history, and these builds have many tests (tens of thousands), expect higher than usual memory usage.
If you are using Git, you should also expect higher memory usage on the server side. Opening a Git repository and accessing objects stored in the Git database may require a lot of memory. That was the reason why we moved some of the Git operations outside of the TeamCity Java process.
It may seem obvious, but if you have a large number of active users (hundreds) and many agents, you may also see an increased memory footprint. The reason is that more active threads process user and agent requests concurrently, which means more memory is allocated on the thread stacks. With a large number of users you may also need to increase the number of connections in the database pool. This impacts memory too, as each connection holds various caches.
I would say that for production purposes, if TeamCity is running under a 64-bit JVM, 2 GB Xmx should be enough to start with. If needed, you’ll be able to increase it later. Since TeamCity 6.5, thread dumps taken from the Diagnostics page include memory usage information. The memory information block is at the bottom of the thread dump and looks like this:
Memory usage:
Code Cache: used = 35.43Mb committed = 35.75Mb max = 48.0Mb used/max = 73.81%
PS Eden Space: used = 542.27Mb committed = 1088.18Mb max = 1091.68Mb used/max = 49.67%
PS Survivor Space: used = 18.39Mb committed = 38.0Mb max = 38.0Mb used/max = 48.42%
PS Old Gen: used = 1036.67Mb committed = 1653.50Mb max = 2333.37Mb used/max = 44.43%
PS Perm Gen: used = 122.66Mb committed = 202.12Mb max = 250.0Mb used/max = 49.07%
Total: used = 1755.44Mb max = 3761.06Mb used/max = 46.67%
In general, the total percentage of used memory should be about 50%-60% most of the time. If it stays closer to 80%-90% for several hours, consider increasing Xmx.
Our own server has 12 GB of RAM: 4 GB are dedicated to the TeamCity server and 2 GB are used by MySQL. The rest is mostly unused or holds OS file caches (Windows Server 2008 R2).
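If you collect thread dumps regularly, this check can be automated. The sketch below parses a memory usage block in the format shown above and lists the pools whose used/max ratio is at or above a threshold. It is an illustration only: the function names and the regular expression are assumptions of this example, not part of TeamCity, and the sample text reuses the figures from the block above.

```python
import re

# Matches lines such as:
#   "PS Old Gen: used = 1036.67Mb committed = 1653.50Mb max = 2333.37Mb ..."
# The "committed" field is optional because the Total line does not have it.
POOL_RE = re.compile(
    r"([A-Za-z ]+?):\s+used = ([\d.]+)Mb"
    r"(?:\s+committed = [\d.]+Mb)?"
    r"\s+max = ([\d.]+)Mb"
)

SAMPLE = """Memory usage:
Code Cache: used = 35.43Mb committed = 35.75Mb max = 48.0Mb used/max = 73.81%
PS Eden Space: used = 542.27Mb committed = 1088.18Mb max = 1091.68Mb used/max = 49.67%
PS Survivor Space: used = 18.39Mb committed = 38.0Mb max = 38.0Mb used/max = 48.42%
PS Old Gen: used = 1036.67Mb committed = 1653.50Mb max = 2333.37Mb used/max = 44.43%
PS Perm Gen: used = 122.66Mb committed = 202.12Mb max = 250.0Mb used/max = 49.07%
Total: used = 1755.44Mb max = 3761.06Mb used/max = 46.67%
"""


def parse_memory_block(text):
    """Return {pool name: (used_mb, max_mb)} for every pool found in text."""
    pools = {}
    for match in POOL_RE.finditer(text):
        name = match.group(1).strip()
        pools[name] = (float(match.group(2)), float(match.group(3)))
    return pools


def pools_over(text, threshold=0.80):
    """Return the names of pools whose used/max ratio is >= threshold."""
    return [
        name
        for name, (used, maximum) in parse_memory_block(text).items()
        if maximum > 0 and used / maximum >= threshold
    ]


if __name__ == "__main__":
    used, maximum = parse_memory_block(SAMPLE)["Total"]
    print(f"Total heap usage: {used / maximum:.0%}")
    print("Pools at or above 80%:", pools_over(SAMPLE))
```

A single dump is only a snapshot. To apply the advice above, run the check on dumps taken over several hours and react only if the Total ratio stays high.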
We are not using superfast hardware locally, as we prefer to notice bottlenecks sooner. We keep paying attention to overall server performance, and each release should be better, or at least not worse, in this respect. If you are still using TeamCity 5 or even 4 and are experiencing performance issues, consider upgrading to version 6.5; in general it should work faster than previous versions. Do not hesitate to contact us if you have performance problems. If you choose to do so, try to gather as many details as possible: http://confluence.jetbrains.net/display/TCD6/Reporting+Issues#ReportingIssues-HangsandThreadDumps.
I hope these recommendations will be useful for those who need some estimates for the server hardware, and for those who would like to increase server performance but do not know where to start.