Code Duplicates Search

Posted on by lisar

As we mentioned in our previous blog post, TeamCity has a number of ways to verify the code quality. Searching for repetitive blocks of code – known as code duplicates – is one of them.

The redundant and seemingly harmless pieces of code may eventually lead to an unmanageable project code base. Our own team is a really good example: searching for code duplicates in the IntelliJ IDEA project locally took about 40 minutes, and it was almost impossible to constantly monitor the state of the code base. But as soon as we implemented the server-side search for code duplicates:

  • all of the team members had access to the duplicates report and could make the code base better structured and controlled;
  • unnecessary pieces of code were removed, thus reducing the overall size of the code base;
  • the code base became much more manageable;
  • the developers were able keep on coding as usual because the duplicates were searched for on the server, and their own machines resources were not used.

TeamCity has supported searching for Java code duplicates for quite some time. Support for finding C# duplicate code will be available in TeamCity 3.0. We are working on the possibility of configuring the Java duplicates finder to work not only via an IntelliJ IDEA project file, but also by using the Eclipse project file or a Maven POM file. Setting up a build configuration with the Duplicates finder runner ensures that all of the integrated code is constantly checked for duplicates.

When specifying this runner settings for Java projects, you need to point TeamCity to the project .ipr file and set a number of options, in particular:

  • searching or ignoring the duplicates in the test sources of your project if it does not make much sense to verify tests for duplicates
  • whether to regard usages of different variables, fields, methods, types, or literals in otherwise structurally similar code as duplicate
  • the minimum relative size/complexity of a code piece that will be regarded as a duplicate

The duplicates search is, as usual, performed on one of the available build agents, and when a build is finished, a detailed report is available in the web UI. Working with code duplicates results is similar to working with the server-side inspection results: after the build is finished, you can browse the duplicates list (build results page, Duplicates tab) and opt to show all or only the newly discovered duplicates.

Duplicates report

The top left panel displays a list of found duplicates and their instances with the file names. Use the arrows to examine the selected duplicate in one of the lower panes.

From the duplicates report page you can jump to the source code in your IDE (IntelliJ IDEA, Eclipse or Microsoft Visual Studio). Click the Open in active IDE button, and you will navigate directly to the file containing the duplicate instance and be able to quickly revise the issues found. In IntelliJ IDEA you can work with found duplicates either in the editor or in a tool window.

Keep an eye on our Early Access Program!

View the example of a server-side duplicates search at our demo site.

Technorati tags: , , , , , , , , ,

Comments below can no longer be edited.

1 Response to Code Duplicates Search

  1. Sandra says:

    December 17, 2008

    Thank you for the nice post.
    I also tried this: http://www.digeus.com/products/duplicatefinder/index.html
    Do you know anything of it? It might be very useful.

Subscribe

Subscribe for updates