Transition to native Git in TeamCity 2022.04 brings 10x fetch time reduction to IntelliJ Platform
Starting from version 2022.04, TeamCity switched to native Git on the server side for Git VCS connections. The switch should positively impact both performance and overall experience of working with Git repositories on the TeamCity server side.
In this blog post, we’ll talk about the reasons for the switch and our own experience with this feature.
In the case of the agent-side checkout, TeamCity agents have always used a native Git executable to check out sources. This functionality spawns a separate process for native Git commands such as ls-remote, fetch, and checkout, so it requires the Git executable to be present on the agent. For SSH communication, depending on the configuration, native Git may use OpenSSH, PuTTY, or the JCraft JSch Java library as the SSH client.
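To make the "separate process" idea concrete, here is a minimal Python sketch of spawning native Git with a chosen SSH client. It is not TeamCity's implementation: the helper names build_git_env and run_git are hypothetical, and the only Git-specific pieces it relies on are the standard GIT_SSH_COMMAND and GIT_TERMINAL_PROMPT environment variables.

```python
import os
import subprocess
from typing import Dict, Mapping, Optional, Sequence


def build_git_env(base_env: Mapping[str, str],
                  ssh_command: Optional[str] = None) -> Dict[str, str]:
    """Return a copy of base_env prepared for a non-interactive Git child process.

    Hypothetical helper for illustration only.
    """
    env = dict(base_env)
    if ssh_command is not None:
        # GIT_SSH_COMMAND tells native Git which SSH client command to run.
        env["GIT_SSH_COMMAND"] = ssh_command
    # Never block on a credentials prompt when running unattended.
    env["GIT_TERMINAL_PROMPT"] = "0"
    return env


def run_git(args: Sequence[str],
            cwd: Optional[str] = None,
            ssh_command: Optional[str] = None,
            timeout: float = 600.0) -> str:
    """Run `git <args>` in a separate process and return its stdout.

    Raises subprocess.CalledProcessError on a non-zero exit code and
    FileNotFoundError if no Git executable is on the PATH.
    """
    result = subprocess.run(
        ["git", *args],
        cwd=cwd,
        env=build_git_env(os.environ, ssh_command),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout
```

With this sketch, a call such as run_git(["ls-remote", url], ssh_command="ssh -o BatchMode=yes") would list the remote refs using OpenSSH in batch mode, where url is whatever repository URL you want to query.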
Meanwhile, the TeamCity server also requires a copy of a Git repository to detect newly pushed commits. These newly found commits later pop up as pending and are included in the following builds. In a regular scenario on the server side, TeamCity first checks if the local repository has the same state (a set of branches pointing to a set of commits) as the remote one. If the local state is out-of-date, missing commits should be obtained from the remote.
In Git terms, the TeamCity server performs an ls-remote operation, possibly followed by a fetch. Other server-side functionality also works with remote repositories, such as the VCS labeling feature, which creates Git tags, and versioned settings, which can push commits to the remote.
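The check described above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not TeamCity's actual logic: parse_ls_remote and stale_refs are hypothetical names, the local state is assumed to be already available as a mapping from ref name to commit hash, and a ref counts as out-of-date whenever the remote hash differs from the local one.

```python
from typing import Dict, List


def parse_ls_remote(output: str) -> Dict[str, str]:
    """Parse `git ls-remote` output into a {ref name: commit hash} mapping.

    Each non-empty output line has the form "<hash><TAB><ref>".
    """
    refs: Dict[str, str] = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split("\t", 1)
        refs[ref.strip()] = sha.strip()
    return refs


def stale_refs(local: Dict[str, str], remote: Dict[str, str]) -> List[str]:
    """Return the remote refs that are missing locally or point elsewhere."""
    return sorted(ref for ref, sha in remote.items() if local.get(ref) != sha)


# Sample data: "main" has moved on the remote and "feature" is a new branch.
sample_output = (
    "1111111111111111111111111111111111111111\trefs/heads/main\n"
    "2222222222222222222222222222222222222222\trefs/heads/feature\n"
)
local_state = {"refs/heads/main": "0000000000000000000000000000000000000000"}
remote_state = parse_ls_remote(sample_output)

# A fetch is only needed when at least one ref is out-of-date.
fetch_needed = bool(stale_refs(local_state, remote_state))
```

When the list returned by stale_refs is empty, the local repository already matches the remote and the more expensive fetch can be skipped.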
Approach with Eclipse JGit and JCraft JSch
Historically, to work with both local and remote repositories on the TeamCity server side, we chose the Eclipse JGit Java library. For SSH transport and operations with SSH keys, we used the JCraft JSch Java library. Both libraries did a very good job. However, they have performance, memory-related, and other limitations.
Among the main concerns and complaints were:
- Slow fetch for large repositories, which could drag out the overall process of checking for changes on the TeamCity server.
- Memory issues when performing fetch inside the TeamCity process, which required more complicated out-of-process solutions.
- Notable delay between creating a branch in a Git repository and the moment when TeamCity detects it and starts showing it in the branch list for a build configuration, providing the ability to start a build in the new branch.
- Inability to receive a pack file with any object larger than 2 GB, which requires switching the whole repository to Git LFS.
Besides, it was often problematic to compare errors reported by the mentioned libraries with messages coming from native Git under the same conditions.
Finally, the official JCraft JSch library is no longer maintained and consequently does not support modern secure algorithms and ciphers. Although an actively developed fork has become available recently, we decided to use the native Git approach.
Native Git approach
These issues brought about the idea of trying native Git with OpenSSH for SSH transport on the TeamCity server side (as we said before, agents have already been using native Git). After a significant amount of work which took a few months, we were finally able to test the native Git approach internally, and the results were better than we had expected.
Our current solution is a hybrid approach. For convenience, we use Eclipse JGit when working with local repositories, e.g. when iterating over the commit history in the local repository or collecting modified files and their content. But when it comes to remote communication, such as obtaining the current state of a remote repository, the TeamCity server spawns a process with the native Git executable instead of running JGit inside a Java process.
JetBrains’ successful experience
As you might know, we at JetBrains dogfood our products as much as possible. Our internal TeamCity server, which builds most JetBrains products, currently has more than 3,000 VCS roots (mainly Git). One of the biggest Git repositories checked out on our TeamCity server is, of course, the IntelliJ Platform, hosted in JetBrains Space.
Switching to native Git together with a few other related fixes in 2022.04 solved all the above-mentioned problems on our internal TeamCity server and also showed an impressive fetch time reduction for our Linux node, which is responsible for VCS operations.
The following image shows fetch time statistics for the IntelliJ Platform repository, collected over a period covering the switch from JGit to native Git. The fetch time decreased by almost 10 times!
Note: if you’re using Windows, the performance improvements might not be as impressive, but they can still be noticeable.
Try native Git on your server
Before you try native Git on your TeamCity server, we suggest taking a look at the known issues.
To enable native Git, navigate to Administration | Diagnostics and open the Git tab there.
On this page, you can test the connection via native Git for any of the VCS roots on your server. If you choose to test all VCS roots, TeamCity will check whether they successfully connect via JGit and then will test their connection via native Git. This measure helps ensure that none of your pipelines will break after switching to native Git. If the connection test is successful, then you can enable native Git support on your server.
Give it a try and tell us about your experience! We’re looking forward to your feedback.