IntelliJ IDEA’s Static Analysis vs. the Human Brain

Posted on by Tagir Valeev

A while ago, I was studying the output of IntelliJ IDEA’s static analyzer for Java code and came across an interesting case. Since the code fragment was not open source, I’ve anonymized it and removed the external dependencies. Let’s just say it looked something like this:

private static List process(Map options, List inputs) {
    List res = new ArrayList<>();
    int cur = -1;
    for (String str : inputs) {
        if (str.startsWith("-"))
            if (options.containsKey(str)) {
                if (cur == -1) cur = options.get(str);
            else if (options.containsKey("+" + str)) {
                if (cur == -1) cur = res.isEmpty() ? -1 :
                        res.remove(res.size() - 1);
                if (cur != -1) res.add(cur + str.length());
    return res;

As you can see, there is nothing special about this code: just a bunch of transformations and other operations. Nevertheless, the static analyzer didn’t like it. We can see two warnings here:


The IDE says that the res.isEmpty() condition is always true, and, when checking cur, it reports a meaningless assignment operation as this variable already has the same value. It’s easy to see that the problem with the assignment is a consequence of the first problem. If res.isEmpty() really is always true, then the line is reduced to:

if (cur == -1) cur = -1;

That’s redundant indeed. But why is the expression always true? res is a list populated in the same loop. The IDE doesn’t know how many times the loop iterates and which branches will be visited, as this depends on the input parameters. For example, if an element was added to res on the previous iteration, the list won’t be empty.

This was the first time I saw this code, and I spent a lot of time thinking about it. Initially, I was almost sure I’d stumbled upon a bug in the analyzer and I would have to fix it. Let’s see if that’s true.

First, let’s mark all of the lines where the method state changes. There is a change in the variable cur or the changing list res:

private static List process(Map options, List inputs) {
    List res = new ArrayList<>();
    int cur = -1;
    for (String str : inputs) {
        if (str.startsWith("-"))
            if (options.containsKey(str)) {
                if (cur == -1) cur = options.get(str); // A
            else if (options.containsKey("+" + str)) {
                if (cur == -1) cur = res.isEmpty() ? -1 : // B 
                        res.remove(res.size() - 1); // C
                if (cur != -1) res.add(cur + str.length()); // D
    return res;

Lines ‘A’ and ‘B’ (‘B’ is the first branch of the conditional operator) change the variable cur; ‘D’ changes the list; аnd ‘C’ (the second branch of the conditional operator) changes both the list and the variable cur. It is important whether or not cur equals -1 and whether the list is empty. That is, we have to watch four states:


Line ‘A’ changes cur if it was equal to -1 before. We don’t know if the result will be -1 or not. Therefore, there are two possible options:


Line ‘B’ also works only if cur equals -1. But, as we’ve already seen, it’s not doing anything. However, let’s mark this edge on the graph to get the whole picture:


Line ‘C’, like the previous ones, works when cur == -1 and changes it arbitrarily (just as ‘A’ does). Additionally, it can turn a non-empty res list into an empty one, or leave it non-empty if there was more than one element in it.


Finally, line ‘D’ increases the list size: it can turn an empty list into a non-empty one or enlarge a non-empty list. It can’t make a non-empty list empty, though:


What does this give us? Literally nothing. It’s confusing why res.isEmpty() is allegedly always true.
Actually, we started out incorrectly. In this case, it’s not enough to check the state of each variable separately. Correlating states play an important role here. Fortunately, 2+2 = 2*2, so we only have four of them:


I have marked the initial state with a double border when entering the method. Well, let’s try again. ‘A’ changes cur or keeps it the same at any res, while res doesn’t change:


‘B’ only works when cur == -1 && res.isEmpty() and doesn’t do anything. So, we add:


‘C’ only works when cur == -1 && !res.isEmpty(). Both cur and res change arbitrarily: after ‘C’ we find ourselves in any state:


At last, ‘D’ can start at cur != -1 && res.isEmpty() and make the list non-empty, or start with cur != -1 && !res.isEmpty() and not go anywhere:


At first glance, it seems like it’s only getting worse: the graph has become more complex and it’s unclear how to use it. Nevertheless, we’re actually close to finding a solution. Arrows now show all of our method’s possible execution flows. Since we know where we’ve started, let’s follow the arrows:


And here’s where it starts getting strange. We can’t get into the lower left-hand corner. And since we can’t get into it, we can’t follow any of the ‘C’ arrows. So, line ‘C’ is completely unreachable, while ‘B’ can be executed. That’s only possible if the condition res.isEmpty() really is always true! The IntelliJ IDEA analyzer is completely right. Sorry, analyzer, I should never have thought you were buggy. It’s that you’re so smart, it’s hard for me, a mortal, to keep up with you.

Our analyzer isn’t based on any hyped-up AI technologies. Instead, it uses control flow analysis and data flow analysis approaches, both of which have been around for over 50 years. It does sometimes arrive at unexpected conclusions, but this is understandable: machines have long surpassed humans when it comes to drawing graphs and walking through them.
There’s an important unresolved issue here: it’s not enough just to tell someone they have an error in their program. The artificial brain has to explain to the human brain why it has solved something the way it has, and go the extra mile to dumb it down. If you have any bright ideas about how to do this, our team would love to hear from you and look into working together!
One of the acceptance tests is in front of you. For this example, the IDE should automatically generate a helpful explanation. It can be text, a graph, a tree, a picture with kittens – anything a human can wrap their head around.

P.S. We still don’t know what the method author meant anyway and how the code should look. The people in charge of the subsystem told me that this part has been kind of abandoned, and they have no idea how to fix the code or maybe just remove it.

Comments below can no longer be edited.

15 Responses to IntelliJ IDEA’s Static Analysis vs. the Human Brain

  1. Sergei Tachenov says:

    November 19, 2019

    And I know of an exactly opposite case, when it looks like IDEA is right, but it’s totally wrong and the suggested change breaks everything. In short, it reported an unnecessary not-null check when a final field was initialized to a non-null value. Except that if that check happens in a method called from a base class constructor… which is not obvious at all. See IDEA-221035 for an example. Thankfully, it has been fixed already.

  2. Stef van Schie says:

    November 19, 2019

    I think the graph you’ve shown here excellently represents the control flow of a method and immediately explains why something can’t be reached. Of course, something like A, B and C don’t really say much when you don’t know which letter stands for which line, but that could easily be fixed by showing the entire line, or even presenting a legend. I’ve run into a similar case where IntelliJ IDEA was too smart for me, to the point where someone at JetBrains had to explain to me why this warning was shown. If a diagram as shown in here was shown along with the warning or could even be shown with some kind of action at any time, that’d be a huge help.

    • Christian says:

      November 20, 2019

      That’s a very nice plugin-idea! I don’t know if the required information is public via API but if yes it should not be too hard to produce such graphs.

  3. Salvatore Pelligra says:

    November 19, 2019

    What about this flow?

    Given inputs to be [“-1”, “-2”, “-3”, “-4”]
    Given options to be {“-1”: 2, “+-2”: 3, “-3”: -1, “+-4”: 4 }

    Then we we have:

    cur = -1, res = []

    inp = “-1”
    cur = 2, res = []

    inp = “-2”
    cur = 3, res = [5]

    inp = “-3”
    cur = -1, res = [5]

    As you can see, with next input we’ll hit the C case:
    inp = “-4”
    cur = -1, res = []

    • Salvatore Pelligra says:

      November 19, 2019

      Ah, nevermind, I didn’t see the `if cur == -1` guard on case A! Feel free to delete the previous comment 🙂

  4. Mark Yagnatinsky says:

    November 19, 2019

    Ideally, I’d wish for something like “once res.add is reached, cur will never equal -1”.
    I have no idea how to implement something like that though.
    For those who haven’t seen it, this is a fun read though:

  5. Mark Yagnatinsky says:

    November 19, 2019

    I just tried re-writing this method slightly, using only transformations that IntelliJ can do for me. In particular I turned this:

    if (cur == -1) cur = res.isEmpty() ? -1 : res.remove(res.size() – 1);

    into this:

    if (cur == -1 && !res.isEmpty()) res.remove(res.size() – 1);

    IntelliJ now I click “find cause” on the warning that is generated, and it’s actually surprisingly good! It even includes a mention of the line containing the “if” statement which guards the add! So now I do have some concrete advice: “simply” (ha-ha) generate the same warning without forcing the user to perform the mechanical refactoring that I did.

    • Tagir Valeev says:

      November 20, 2019

      Thank you! We will think how to improve “Find the cause” feature in this case.

  6. Reece Dunn says:

    November 19, 2019

    I find the clang scan-build output (and similar tools like coverity) somewhat useful for following the logic, but can easily get lost in the logic flow. It would be useful to improve on this, making it easier to follow. For example:
    1. Have a table with the file, line, column, and message, allowing you to have the inference steps in order and easily navigating them from the IDE.
    2. Have the analysis messages available in the gutter via a tooltip, with a way of navigating to the previous/next inference step.
    3. Follow the inference state of the variables/methods (e.g. the `cur = -1`/`cur != -1` in the example in this post) — this would allow you to follow the entire state of the inference at each point in the chain.

    Maybe a UI similar to the Debugging UI, so you could have the call stack (inference chain), variables/watches, etc. Maybe having the step in/out/over/back for navigating the inference chain.

  7. Mathew says:

    November 20, 2019

    I’ve been a long time IDEA user and i also studied abit on compilers which use static Code analysis for such purposes to determine the reachability of code so I absolutely appreciate the convenience IDEA provides and I trust static code analysis and definitely IDEA hints.

    Coding should be fun and deployment should be painless. Checkout what darklang is doing. If the process of developing apps is simple enough, more people can start to do programming, not just IT professionals.

    Maybe if IDEA has an explain feature which let’s you pick a line and just churns out the CFGs with directed arrows. Or fill in the possible values of a variable at a particular line.

    I’ve seen many developers run a program(which might be slow) just to find out the value of a variable, to see what it looks like without having to think. Programmers are lazy and just want to see. IDEA is by far the best IDE, until something like darklang catches on.

  8. ManoDestra says:

    November 20, 2019

    The code is not written according to coding standards which makes it incredibly unmaintainable and prone to error. Not surprised it was abandoned.

    • opticyclic says:

      December 29, 2019

      I was going to say the same thing.
      It is very hard to read with variables named “str” and “res” and the weird mix of inlined if statements with ternary operators split over multiple lines.

      It highlights the point that code is read more than it is written and readability is very important in maintainability.

  9. Daniel Cuadra says:

    November 20, 2019

    I find myself that Checkmarx has a stunning presentation model for bugs and security issues, which allows a developer to be walked through how the system flows until you hit the issue. It is very nice because it gives you the start point, then it starts stacking until you hit the issue; it even shows some variables current value as they would be mutating, so it becomes evident how and when a bug will trigger.

    It doesn’t have any visual diagrams or anything, it’s more like if another developer was explaining you a reproducible bug during code review.

  10. Taso says:

    November 21, 2019

    You know what would be really cool. Jetbrains provide this powerful tool as a CLI decoupled from the big IDE. I want to run these on our CI pipeline but it is pretty difficult


Subscribe for updates