IntelliJ IDEA
IntelliJ IDEA – the Leading Java and Kotlin IDE, by JetBrains
Fumigating the IDEA Ultimate code using dataflow analysis
IntelliJ IDEA ships with thousands of Java code inspections. Most of them work like very advanced regular expressions: they look for specific patterns in the code which look like typos, are not very performant, confusing, or overly verbose. However one of them is not of that kind. Its name is “Constant conditions & exceptions” which sounds somewhat confusing. In fact, it performs data flow analysis of Java methods using a technique called “symbolic execution” and reveals suspicious facts about the code. Here are some examples of the facts it can reveal:
- Dereference may cause a
NullPointerException
on some code path - Condition is always true or false in all possible code paths
- Array index is always out of bounds in all possible code paths
- Type cast will always fail
These facts do not always signal a bug. For example, one may want to add an always-true statement for code clarity, like in this case:
if (obj == null) { … } else if (obj != null) { // 'obj != null' is always true reported here … }
However often such warnings signal that code contains a typo or the code author did not examine the code behavior fully. This inspection may help to discover really weird bugs which are not covered by any patterns. Sometimes even we, the IDE authors, could not imagine the code snippet in which “Constant conditions & exceptions” warning was issued.
Over the years we have been constantly improving this inspection. It used to be mostly concentrated on nullability and type analysis. Later we added Optional presence analysis, integral ranges analysis, and array bounds checking. In the previous major release 2017.3, we improved integral ranges analysis and added basic support for Stream API and Optional call chains.
The first big testing of “Constant conditions & exceptions” is performed on the source code of IDEA Ultimate itself. It’s quite a huge project as the codebase contains tens of thousands of classes, written by dozens of developers, over more than 15 years. After the next inspection improvement, it’s always interesting to discover yet another bug in some ancient class. Let’s see what we have found after improving the dataflow analysis for the upcoming release.
One of the most interesting dataflow analysis improvements in 2018.1 is the tracking of relations between variables. In older IDEA versions the inspection tracked the variables equality or inequality, so if(a == b && a != b)
was highlighted as always false. When we started tracking integral ranges, new warnings appeared in the code like if(a > 0 && a <= 0)
, because the first condition implies that the a
value belongs to the range [1..MAX_VALUE]
while the second condition requires the range [MIN_VALUE..0]
and these ranges never intersect. However, up until now, there was no warning on an if(a > b && a <= b)
condition because we knew nothing more about the value of these two variables. Now we have started to track which variable is greater, so in 2018.1 this snippet will be highlighted as well. This might look pretty trivial and such mistakes would unlikely remain unnoticed in the code, but remember that dataflow analysis doesn’t just track specific code patterns.
First, it can easily detect an opposite case, and it was actually found in some SQL tables processing code:
Yes, the method starts with an always true condition, so it always returns the empty list, because the idx
variable is always either less than the myAncestorRefs
array length, or greater, or equal to. The following ten lines of code were never executed. It’s obviously a typo: idx < 0
was intended instead of idx < myAncestorRefs.length
.
Speaking of array lengths, it appeared that the new feature nicely couples with already existing arrays bounds checking. The following code snippet was discovered in our Groovy plugin:
Now at the array access expression, the inspection knows that the i
value is always greater than the initializers
array length. Thus an ArrayIndexOutOfBoundsException
is inevitable if this code is executed. Clearly the condition was mistakenly flipped.
Another array-related problem was discovered in the source code of some Java inspection which processes catch blocks:
As you can see, the index
bound is checked after it’s used to access an array element. Thus the condition can never be false: if it’s false, then ArrayIndexOutOfBoundsException
would be thrown right before.
Now in some cases, we track values from array initializer expressions. This has allowed us to discover some redundant code like this:
Long ago the result
one-element array was used inside the command
lambda to record that editor activation is necessary. Later during code evolution, the editor activation became unnecessary, but the redundant condition was not reported by any existing inspection. Now we clearly see this.
Boolean methods starting with “is” are now assumed to return the same result if called twice in a row on the same qualifier value. This is a heuristic, which is not necessarily true, but if it produces a false-positive warning, then it’s probably a sign that the code would be confusing not only for the IDE, but for other teammates as well. Here’s a bug which was discovered in CSS selectors handling code thanks to this improvement:
Even though the qualifiers are different here, they were assigned to the same value previously. Thus we assume that isUniversalSelector
would return the same result. The typo is actually not in the condition, but in a line above: of course it should be remoteSelectorElement = (CssSimpleSelector)remoteSelectorElements[j]
.
Since 2018.1 we added another kind of warning. When a variable is assigned to a value, we warn you if it already contains the same value on every possible execution path. Again, it doesn’t always indicate a bug, but it may help to remove redundant code. However real bugs also could be found with the help of this warning. For example, the following code was found in our HTML formatting tests:
The problem is three lines above the warning: the first line of the code fragment should not have “= 2
” at the end. As a result, previous indent size is not restored in the finally block.
Also, this warning sometimes allows you to spot duplicates in the field or array initialization series like here:
This line is probably redundant, or something else was meant to be here instead, but it’s clearly a mistake.
Trivial improvement in this
reference handling revealed a bug in our tabs UI control:
As known, the disjunction operator is short-circuited: if the first condition is true, then the second is not checked. If the first condition is false, then component
equals to this
, but this
is never null, which means the component
is not null as well. Thus the whole condition is always true. It’s a very common mistake when ||
was used instead of &&
and vice versa.
Finally, we improved Stream API processing inside dataflow analysis. In particular, it now works for incomplete stream chains as well (that is, stream chains not ending with the terminal operation). No actual bugs were discovered because of this improvement, but several redundant code occurrences were found. For example, a method like this appears in our code coverage processing:
The Pair.create
method is inferred as @NotNull
, thus x != null
condition is known to be redundant. The inspection is aware that x
has the same value which was returned by the previous lambda expression.
The “Constant conditions & exceptions” inspection surprisingly often finds bugs which are lurking in the big project for years. It’s even more helpful when you’re writing new yet untested code. While sometimes the warnings produced by this inspection might seem annoying, it really helps to prevent new bugs.
We already started a EAP program for 2018.1, so you can try the new features right now. We are constantly improving this inspection, reducing false-positives and increasing the number of valuable reports. We have more ideas to implement in future versions of IntelliJ IDEA, so stay tuned.