IntelliJ IDEA
IntelliJ IDEA – the Leading Java and Kotlin IDE, by JetBrains
IntelliJ IDEA’s Static Analysis vs. the Human Brain
A while ago, I was studying the output of IntelliJ IDEA’s static analyzer for Java code and came across an interesting case. Since the code fragment was not open source, I’ve anonymized it and removed the external dependencies. Let’s just say it looked something like this:
private static List process(Mapoptions, List inputs) { List res = new ArrayList<>(); int cur = -1; for (String str : inputs) { if (str.startsWith("-")) if (options.containsKey(str)) { if (cur == -1) cur = options.get(str); } else if (options.containsKey("+" + str)) { if (cur == -1) cur = res.isEmpty() ? -1 : res.remove(res.size() - 1); if (cur != -1) res.add(cur + str.length()); } } return res; }
As you can see, there is nothing special about this code: just a bunch of transformations and other operations. Nevertheless, the static analyzer didn’t like it. We can see two warnings here:
The IDE says that the res.isEmpty()
condition is always true, and, when checking cur
, it reports a meaningless assignment operation as this variable already has the same value. It’s easy to see that the problem with the assignment is a consequence of the first problem. If res.isEmpty()
really is always true, then the line is reduced to:
if (cur == -1) cur = -1;
That’s redundant indeed. But why is the expression always true? res
is a list populated in the same loop. The IDE doesn’t know how many times the loop iterates and which branches will be visited, as this depends on the input parameters. For example, if an element was added to res
on the previous iteration, the list won’t be empty.
This was the first time I saw this code, and I spent a lot of time thinking about it. Initially, I was almost sure I’d stumbled upon a bug in the analyzer and I would have to fix it. Let’s see if that’s true.
First, let’s mark all of the lines where the method state changes. There is a change in the variable cur
or the changing list res
:
private static List process(Mapoptions, List inputs) { List res = new ArrayList<>(); int cur = -1; for (String str : inputs) { if (str.startsWith("-")) if (options.containsKey(str)) { if (cur == -1) cur = options.get(str); // A } else if (options.containsKey("+" + str)) { if (cur == -1) cur = res.isEmpty() ? -1 : // B res.remove(res.size() - 1); // C if (cur != -1) res.add(cur + str.length()); // D } } return res; }
Lines ‘A’ and ‘B’ (‘B’ is the first branch of the conditional operator) change the variable cur; ‘D’ changes the list; аnd ‘C’ (the second branch of the conditional operator) changes both the list and the variable cur
. It is important whether or not cur
equals -1 and whether the list is empty. That is, we have to watch four states:
Line ‘A’ changes cur
if it was equal to -1 before. We don’t know if the result will be -1 or not. Therefore, there are two possible options:
Line ‘B’ also works only if cur
equals -1. But, as we’ve already seen, it’s not doing anything. However, let’s mark this edge on the graph to get the whole picture:
Line ‘C’, like the previous ones, works when cur == -1
and changes it arbitrarily (just as ‘A’ does). Additionally, it can turn a non-empty res
list into an empty one, or leave it non-empty if there was more than one element in it.
Finally, line ‘D’ increases the list size: it can turn an empty list into a non-empty one or enlarge a non-empty list. It can’t make a non-empty list empty, though:
What does this give us? Literally nothing. It’s confusing why res.isEmpty()
is allegedly always true.
Actually, we started out incorrectly. In this case, it’s not enough to check the state of each variable separately. Correlating states play an important role here. Fortunately, 2+2 = 2*2
, so we only have four of them:
I have marked the initial state with a double border when entering the method. Well, let’s try again. ‘A’ changes cur
or keeps it the same at any res, while res
doesn’t change:
‘B’ only works when cur == -1 && res.isEmpty()
and doesn’t do anything. So, we add:
‘C’ only works when cur == -1 && !res.isEmpty()
. Both cur
and res
change arbitrarily: after ‘C’ we find ourselves in any state:
At last, ‘D’ can start at cur != -1 && res.isEmpty()
and make the list non-empty, or start with cur != -1 && !res.isEmpty()
and not go anywhere:
At first glance, it seems like it’s only getting worse: the graph has become more complex and it’s unclear how to use it. Nevertheless, we’re actually close to finding a solution. Arrows now show all of our method’s possible execution flows. Since we know where we’ve started, let’s follow the arrows:
And here’s where it starts getting strange. We can’t get into the lower left-hand corner. And since we can’t get into it, we can’t follow any of the ‘C’ arrows. So, line ‘C’ is completely unreachable, while ‘B’ can be executed. That’s only possible if the condition res.isEmpty()
really is always true! The IntelliJ IDEA analyzer is completely right. Sorry, analyzer, I should never have thought you were buggy. It’s that you’re so smart, it’s hard for me, a mortal, to keep up with you.
Our analyzer isn’t based on any hyped-up AI technologies. Instead, it uses control flow analysis and data flow analysis approaches, both of which have been around for over 50 years. It does sometimes arrive at unexpected conclusions, but this is understandable: machines have long surpassed humans when it comes to drawing graphs and walking through them.
There’s an important unresolved issue here: it’s not enough just to tell someone they have an error in their program. The artificial brain has to explain to the human brain why it has solved something the way it has, and go the extra mile to dumb it down. If you have any bright ideas about how to do this, our team would love to hear from you and look into working together!
One of the acceptance tests is in front of you. For this example, the IDE should automatically generate a helpful explanation. It can be text, a graph, a tree, a picture with kittens – anything a human can wrap their head around.
P.S. We still don’t know what the method author meant anyway and how the code should look. The people in charge of the subsystem told me that this part has been kind of abandoned, and they have no idea how to fix the code or maybe just remove it.