{"id":27488,"date":"2019-11-19T08:01:16","date_gmt":"2019-11-19T08:01:16","guid":{"rendered":"https:\/\/blog.jetbrains.com\/idea\/?p=20669"},"modified":"2019-11-19T10:13:23","modified_gmt":"2019-11-19T10:13:23","slug":"intellij-ideas-static-analysis-vs-the-human-brain","status":"publish","type":"idea","link":"https:\/\/blog.jetbrains.com\/zh-hans\/idea\/2019\/11\/intellij-ideas-static-analysis-vs-the-human-brain","title":{"rendered":"IntelliJ IDEA\u2019s Static Analysis vs. the Human Brain"},"content":{"rendered":"<p>A while ago, I was studying the output of IntelliJ IDEA\u2019s static analyzer for Java code and came across an interesting case. Since the code fragment was not open source, I\u2019ve anonymized it and removed the external dependencies. Let&#8217;s just say it looked something like this:<\/p>\n<pre>private static List process(Map<String, Integer> options, List inputs) {\r\n    List res = new ArrayList<>();\r\n    int cur = -1;\r\n    for (String str : inputs) {\r\n        if (str.startsWith(\"-\"))\r\n            if (options.containsKey(str)) {\r\n                if (cur == -1) cur = options.get(str);\r\n            }\r\n            else if (options.containsKey(\"+\" + str)) {\r\n                if (cur == -1) cur = res.isEmpty() ? -1 :\r\n                        res.remove(res.size() - 1);\r\n                if (cur != -1) res.add(cur + str.length());\r\n            }\r\n    }\r\n    return res;\r\n}<\/pre>\n<p><!--more-->As you can see, there is nothing special about this code: just a bunch of transformations and other operations. Nevertheless, the static analyzer didn\u2019t like it. We can see two warnings here:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-screen.png\" rel=\"attachment wp-att-20680\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-20680\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-screen.png\" alt=\"screen\" width=\"2776\" height=\"1172\" \/><\/a><\/p>\n<p>The IDE says that the <code>res.isEmpty()<\/code> condition is always true, and, when checking <code>cur<\/code>, it reports a meaningless assignment operation as this variable already has the same value. It&#8217;s easy to see that the problem with the assignment is a consequence of the first problem. If <code>res.isEmpty()<\/code> really is always true, then the line is reduced to:<\/p>\n<pre>if (cur == -1) cur = -1;<\/pre>\n<p>That&#8217;s redundant indeed. But why is the expression always true? <code>res<\/code> is a list populated in the same loop. The IDE doesn&#8217;t know how many times the loop iterates and which branches will be visited, as this depends on the input parameters. For example, if an element was added to <code>res<\/code> on the previous iteration, the list won&#8217;t be empty.<\/p>\n<p>This was the first time I saw this code, and I spent a lot of time thinking about it. Initially, I was almost sure I\u2019d stumbled upon a bug in the analyzer and I would have to fix it. Let&#8217;s see if that&#8217;s true.<\/p>\n<p>First, let&#8217;s mark all of the lines where the method state changes. There is a change in the variable <code>cur<\/code> or the changing list <code>res<\/code>:<\/p>\n<pre>private static List process(Map<String, Integer> options, List inputs) {\r\n    List res = new ArrayList<>();\r\n    int cur = -1;\r\n    for (String str : inputs) {\r\n        if (str.startsWith(\"-\"))\r\n            if (options.containsKey(str)) {\r\n                if (cur == -1) cur = options.get(str); \/\/ A\r\n            }\r\n            else if (options.containsKey(\"+\" + str)) {\r\n                if (cur == -1) cur = res.isEmpty() ? -1 : \/\/ B \r\n                        res.remove(res.size() - 1); \/\/ C\r\n                if (cur != -1) res.add(cur + str.length()); \/\/ D\r\n            }\r\n    }\r\n    return res;\r\n}<\/pre>\n<p>Lines &#8216;A&#8217; and &#8216;B&#8217; (&#8216;B&#8217; is the first branch of the conditional operator) change the variable cur; &#8216;D&#8217; changes the list; \u0430nd &#8216;C&#8217; (the second branch of the conditional operator) changes both the list and the variable <code>cur<\/code>. It is important whether or not <code>cur<\/code> equals -1 and whether the list is empty. That is, we have to watch four states:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-sep5.png\" rel=\"attachment wp-att-20673\"><img decoding=\"async\" class=\"alignnone size-medium wp-image-20673\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-sep5.png\" alt=\"sep5\" width=\"500\" \/><\/a><\/p>\n<p>Line &#8216;A&#8217; changes <code>cur<\/code> if it was equal to -1 before. We don&#8217;t know if the result will be -1 or not. Therefore, there are two possible options:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-sep4.png\" rel=\"attachment wp-att-20672\"><img decoding=\"async\" class=\"alignnone size-medium wp-image-20672\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-sep4.png\" alt=\"sep4\" width=\"500\" \/><\/a><\/p>\n<p>Line &#8216;B&#8217; also works only if <code>cur<\/code> equals -1. But, as we&#8217;ve already seen, it&#8217;s not doing anything. However, let&#8217;s mark this edge on the graph to get the whole picture:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-sep3.png\" rel=\"attachment wp-att-20671\"><img decoding=\"async\" class=\"alignnone size-medium wp-image-20671\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-sep3.png\" alt=\"sep3\" width=\"500\" \/><\/a><\/p>\n<p>Line &#8216;C&#8217;, like the previous ones, works when <code>cur == -1<\/code> and changes it arbitrarily (just as &#8216;A&#8217; does). Additionally, it can turn a non-empty <code>res<\/code> list into an empty one, or leave it non-empty if there was more than one element in it.<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-sep2.png\" rel=\"attachment wp-att-20670\"><img decoding=\"async\" class=\"alignnone size-full wp-image-20670\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-sep2.png\" alt=\"sep2\" width=\"500\" \/><\/a><\/p>\n<p>Finally, line &#8216;D&#8217; increases the list size: it can turn an empty list into a non-empty one or enlarge a non-empty list. It can&#8217;t make a non-empty list empty, though:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-sep1.png\" rel=\"attachment wp-att-20681\"><img decoding=\"async\" class=\"alignnone size-medium wp-image-20681\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-sep1.png\" alt=\"sep1\" width=\"500\" \/><\/a><\/p>\n<p>What does this give us? Literally nothing. It&#8217;s confusing why <code>res.isEmpty()<\/code> is allegedly always true.<br \/>\nActually, we started out incorrectly. In this case, it\u2019s not enough to check the state of each variable separately. Correlating states play an important role here. Fortunately, <code>2+2 = 2*2<\/code>, so we only have four of them:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-full5.png\" rel=\"attachment wp-att-20679\"><img decoding=\"async\" class=\"alignnone size-medium wp-image-20679\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-full5.png\" alt=\"full5\" width=\"500\" \/><\/a><\/p>\n<p>I have marked the initial state with a double border when entering the method. Well, let&#8217;s try again. &#8216;A&#8217; changes <code>cur<\/code> or keeps it the same at any res, while <code>res<\/code> doesn\u2019t change:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-full4.png\" rel=\"attachment wp-att-20678\"><img decoding=\"async\" class=\"alignnone size-medium wp-image-20678\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-full4.png\" alt=\"full4\" width=\"500\" \/><\/a><\/p>\n<p>&#8216;B&#8217; only works when <code>cur == -1 &amp;&amp; res.isEmpty()<\/code> and doesn&#8217;t do anything. So, we add:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-full3.png\" rel=\"attachment wp-att-20677\"><img decoding=\"async\" class=\"alignnone size-medium wp-image-20677\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-full3.png\" alt=\"full3\" width=\"500\" \/><\/a><\/p>\n<p>&#8216;C&#8217; only works when <code>cur == -1 &amp;&amp; !res.isEmpty()<\/code>. Both <code>cur<\/code> and <code>res<\/code> change arbitrarily: after &#8216;C&#8217; we find ourselves in any state:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-full2.png\" rel=\"attachment wp-att-20676\"><img decoding=\"async\" class=\"alignnone size-medium wp-image-20676\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-full2.png\" alt=\"full2\" width=\"500\" \/><\/a><\/p>\n<p>At last, &#8216;D&#8217; can start at <code>cur != -1 &amp;&amp; res.isEmpty()<\/code> and make the list non-empty, or start with <code>cur != -1 &amp;&amp; !res.isEmpty()<\/code> and not go anywhere:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-full1.png\" rel=\"attachment wp-att-20675\"><img decoding=\"async\" class=\"alignnone size-medium wp-image-20675\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-full1.png\" alt=\"full1\" width=\"500\" \/><\/a><\/p>\n<p>At first glance, it seems like it&#8217;s only getting worse: the graph has become more complex and it\u2019s unclear how to use it. Nevertheless, we&#8217;re actually close to finding a solution. Arrows now show all of our method&#8217;s possible execution flows. Since we know where we\u2019ve started, let&#8217;s follow the arrows:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-dijkstra.png\" rel=\"attachment wp-att-20674\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-medium wp-image-20674\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2019\/11\/idea-dijkstra.png\" alt=\"dijkstra\" width=\"500\" height=\"165\" \/><\/a><\/p>\n<p>And here&#8217;s where it starts getting strange. We can&#8217;t get into the lower left-hand corner. And since we can&#8217;t get into it, we can&#8217;t follow any of the &#8216;C&#8217; arrows. So, line &#8216;C&#8217; is completely unreachable, while &#8216;B&#8217; can be executed. That&#8217;s only possible if the condition <code>res.isEmpty()<\/code> really is always true! The IntelliJ IDEA analyzer is completely right. Sorry, analyzer, I should never have thought you were buggy. It\u2019s that you&#8217;re so smart, it&#8217;s hard for me, a mortal, to keep up with you.<\/p>\n<p>Our analyzer isn\u2019t based on any hyped-up AI technologies. Instead, it uses control flow analysis and data flow analysis approaches, both of which have been around for over 50 years. It does sometimes arrive at unexpected conclusions, but this is understandable: machines have long surpassed humans when it comes to drawing graphs and walking through them.<br \/>\nThere&#8217;s an important unresolved issue here: it&#8217;s not enough just to tell someone they have an error in their program. The artificial brain has to explain to the human brain why it has solved something the way it has, and go the extra mile to dumb it down. If you have any bright ideas about how to do this, our team would love to hear from you and look into working together!<br \/>\nOne of the acceptance tests is in front of you. For this example, the IDE should automatically generate a helpful explanation. It can be text, a graph, a tree, a picture with kittens \u2013 anything a human can wrap their head around.<\/p>\n<p>P.S. We still don\u2019t know what the method author meant anyway and how the code should look. The people in charge of the subsystem told me that this part has been kind of abandoned, and they have no idea how to fix the code or maybe just remove it.<\/p>\n","protected":false},"author":542,"featured_media":27494,"comment_status":"open","ping_status":"open","template":"","categories":[826],"tags":[2678,3369,40,155,2538],"cross-post-tag":[],"acf":[],"_links":{"self":[{"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/idea\/27488"}],"collection":[{"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/idea"}],"about":[{"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/types\/idea"}],"author":[{"embeddable":true,"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/users\/542"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/comments?post=27488"}],"version-history":[{"count":0,"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/idea\/27488\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/media\/27494"}],"wp:attachment":[{"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/media?parent=27488"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/categories?post=27488"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/tags?post=27488"},{"taxonomy":"cross-post-tag","embeddable":true,"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/cross-post-tag?post=27488"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}