{"id":26427,"date":"2018-01-24T10:06:22","date_gmt":"2018-01-24T10:06:22","guid":{"rendered":"https:\/\/blog.jetbrains.com\/idea\/?p=17006"},"modified":"2018-01-25T03:30:39","modified_gmt":"2018-01-25T03:30:39","slug":"fumigating-the-idea-ultimate-code-using-dataflow-analysis","status":"publish","type":"idea","link":"https:\/\/blog.jetbrains.com\/zh-hans\/idea\/2018\/01\/fumigating-the-idea-ultimate-code-using-dataflow-analysis","title":{"rendered":"Fumigating the IDEA Ultimate code using dataflow analysis"},"content":{"rendered":"<p>IntelliJ IDEA ships with thousands of Java code inspections. Most of them work like very advanced regular expressions: they look for specific patterns in the code which look like typos, are not very performant, confusing, or overly verbose. However one of them is not of that kind. Its name is \u201cConstant conditions &amp; exceptions\u201d which sounds somewhat confusing. In fact, it performs data flow analysis of Java methods using a technique called \u201csymbolic execution\u201d and reveals suspicious facts about the code. Here are some examples of the facts it can reveal:<\/p>\n<ul>\n<li>Dereference may cause a <code>NullPointerException<\/code> on some code path<\/li>\n<li>Condition is always true or false in all possible code paths<\/li>\n<li>Array index is always out of bounds in all possible code paths<\/li>\n<li>Type cast will always fail<\/li>\n<\/ul>\n<p>These facts do not always signal a bug. For example, one may want to add an always-true statement for code clarity, like in this case:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"default\" data-enlighter-title=\"\">if (obj == null) {\r\n  \u2026\r\n} \r\nelse if (obj != null) { \/\/ &#039;obj != null&#039; is always true reported here\r\n  \u2026\r\n}<\/pre>\n<p><!--more--><br \/>\nHowever often such warnings signal that code contains a typo or the code author did not examine the code behavior fully. This inspection may help to discover really weird bugs which are not covered by any patterns. Sometimes even we, the IDE authors, could not imagine the code snippet in which \u201cConstant conditions &amp; exceptions\u201d warning was issued.<\/p>\n<p>Over the years we have been constantly improving this inspection. It used to be mostly concentrated on nullability and type analysis. Later we added Optional presence analysis, integral ranges analysis, and array bounds checking. In the previous major release 2017.3, we improved integral ranges analysis and added basic support for Stream API and Optional call chains.<\/p>\n<p>The first big testing of \u201cConstant conditions &amp; exceptions\u201d is performed on the source code of IDEA Ultimate itself. It\u2019s quite a huge project as the codebase contains tens of thousands of classes, written by dozens of developers, over more than 15 years. After the next inspection improvement, it\u2019s always interesting to discover yet another bug in some ancient class. Let&#8217;s see what we have found after improving the dataflow analysis for the upcoming release.<\/p>\n<p>One of the most interesting dataflow analysis improvements in 2018.1 is the tracking of relations between variables. In older IDEA versions the inspection tracked the variables equality or inequality, so <code>if(a == b &amp;&amp; a != b)<\/code> was highlighted as always false. When we started tracking integral ranges, new warnings appeared in the code like <code>if(a &gt; 0 &amp;&amp; a &lt;= 0)<\/code>, because the first condition implies that the <code>a<\/code> value belongs to the range <code>[1..MAX_VALUE]<\/code> while the second condition requires the range <code>[MIN_VALUE..0]<\/code> and these ranges never intersect. However, up until now, there was no warning on an <code>if(a &gt; b &amp;&amp; a &lt;= b)<\/code> condition because we knew nothing more about the value of these two variables. Now we have started to track which variable is greater, so in 2018.1 this snippet will be highlighted as well. This might look pretty trivial and such mistakes would unlikely remain unnoticed in the code, but remember that dataflow analysis doesn\u2019t just track specific code patterns.<\/p>\n<p>First, it can easily detect an opposite case, and it was actually found in some SQL tables processing code:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-1_getancestorrefs.png\" rel=\"attachment wp-att-17011\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-17011\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-1_getancestorrefs.png\" alt=\"idx &lt; myAncestorRefs.length || idx &gt;= myAncestorRefs.length\" width=\"750\" height=\"127\" \/><\/a><\/p>\n<p>Yes, the method starts with an always true condition, so it always returns the empty list, because the <code>idx<\/code> variable is always either less than the <code>myAncestorRefs<\/code> array length, or greater, or equal to. The following ten lines of code were never executed. It\u2019s obviously a typo: <code>idx &lt; 0<\/code> was intended instead of <code>idx &lt; myAncestorRefs.length<\/code>.<\/p>\n<p>Speaking of array lengths, it appeared that the new feature nicely couples with already existing arrays bounds checking. The following code snippet was discovered in our Groovy plugin:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-2_getinitializerfor.png\" rel=\"attachment wp-att-17013\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-17013\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-2_getinitializerfor.png\" alt=\"if (initializers.length &lt; i) return initializers[i];\" width=\"750\" height=\"133\" \/><\/a><\/p>\n<p>Now at the array access expression, the inspection knows that the <code>i<\/code> value is always greater than the <code>initializers<\/code> array length. Thus an <code>ArrayIndexOutOfBoundsException<\/code> is inevitable if this code is executed. Clearly the condition was mistakenly flipped.<\/p>\n<p>Another array-related problem was discovered in the source code of some Java inspection which processes catch blocks:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-3_issuperclassexceptioncaughtlater.png\" rel=\"attachment wp-att-17014\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-17014\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-3_issuperclassexceptioncaughtlater.png\" alt=\"while (catchSections[index] != catchSection &amp;&amp; index &lt; catchSections.length)\" width=\"750\" height=\"82\" \/><\/a><\/p>\n<p>As you can see, the <code>index<\/code> bound is checked after it\u2019s used to access an array element. Thus the condition can never be false: if it\u2019s false, then <code>ArrayIndexOutOfBoundsException<\/code> would be thrown right before.<\/p>\n<p>Now in some cases, we track values from array initializer expressions. This has allowed us to discover some redundant code like this:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-4_copyclassesimpl.png\" rel=\"attachment wp-att-17015\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-17015\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-4_copyclassesimpl.png\" alt=\"boolean[] result = new boolean[] {false}; &lt;...&gt; if (result[0])\" width=\"750\" height=\"186\" \/><\/a><\/p>\n<p>Long ago the <code>result<\/code> one-element array was used inside the <code>command<\/code> lambda to record that editor activation is necessary. Later during code evolution, the editor activation became unnecessary, but the redundant condition was not reported by any existing inspection. Now we clearly see this.<\/p>\n<p>Boolean methods starting with \u201cis\u201d are now assumed to return the same result if called twice in a row on the same qualifier value. This is a heuristic, which is not necessarily true, but if it produces a false-positive warning, then it\u2019s probably a sign that the code would be confusing not only for the IDE, but for other teammates as well. Here\u2019s a bug which was discovered in CSS selectors handling code thanks to this improvement:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-5_getsourcerange.png\" rel=\"attachment wp-att-17017\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-17017\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-5_getsourcerange.png\" alt=\"localSelectorElement.isUniversalSelector() != remoteSelectorElement.isUniversalSelector()\" width=\"750\" height=\"234\" \/><\/a><\/p>\n<p>Even though the qualifiers are different here, they were assigned to the same value previously. Thus we assume that <code>isUniversalSelector<\/code> would return the same result. The typo is actually not in the condition, but in a line above: of course it should be <code>remoteSelectorElement = (CssSimpleSelector)remoteSelectorElements[j]<\/code>.<\/p>\n<p>Since 2018.1 we added another kind of warning. When a variable is assigned to a value, we warn you if it already contains the same value on every possible execution path. Again, it doesn\u2019t always indicate a bug, but it may help to remove redundant code. However real bugs also could be found with the help of this warning. For example, the following code was found in our HTML formatting tests:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-6_testindent.png\" rel=\"attachment wp-att-17018\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-17018\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-6_testindent.png\" alt=\"int indentSize = htmlIndentOptions.INDENT_SIZE = 2;\" width=\"750\" height=\"216\" \/><\/a><\/p>\n<p>The problem is three lines above the warning: the first line of the code fragment should not have \u201c<code>= 2<\/code>\u201d at the end. As a result, previous indent size is not restored in the finally block.<\/p>\n<p>Also, this warning sometimes allows you to spot duplicates in the field or array initialization series like here:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-7_abc.png\" rel=\"attachment wp-att-17019\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-17019\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-7_abc.png\" alt=\"defaults[CONSTANT_Int] = ints;\" width=\"750\" height=\"129\" \/><\/a><\/p>\n<p>This line is probably redundant, or something else was meant to be here instead, but it\u2019s clearly a mistake.<\/p>\n<p>Trivial improvement in <code>this<\/code> reference handling revealed a bug in our tabs UI control:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-8__findInfo.png\" rel=\"attachment wp-att-17020\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-17020\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-8__findInfo.png\" alt=\"while (component != this || component != null)\" width=\"750\" height=\"108\" \/><\/a><\/p>\n<p>As known, the disjunction operator is short-circuited: if the first condition is true, then the second is not checked. If the first condition is false, then <code>component<\/code> equals to <code>this<\/code>, but <code>this<\/code> is never null, which means the <code>component<\/code> is not null as well. Thus the whole condition is always true. It\u2019s a very common mistake when <code>||<\/code> was used instead of <code>&amp;&amp;<\/code> and vice versa.<\/p>\n<p>Finally, we improved Stream API processing inside dataflow analysis. In particular, it now works for incomplete stream chains as well (that is, stream chains not ending with the terminal operation). No actual bugs were discovered because of this improvement, but several redundant code occurrences were found. For example, a method like this appears in our code coverage processing:<\/p>\n<p><a href=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-9_getmissingbranches.png\" rel=\"attachment wp-att-17021\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-17021\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2018\/01\/idea-9_getmissingbranches.png\" alt=\"mapToObj(idx -&gt; Pair.create(...)).filter((x) -&gt; x != null &amp;&amp; ...);\" width=\"750\" height=\"98\" \/><\/a><\/p>\n<p>The <code>Pair.create<\/code> method is <a href=\"https:\/\/blog.jetbrains.com\/idea\/2014\/10\/automatic-notnullnullablecontract-inference-in-intellij-idea-14\/\">inferred<\/a> as <code>@NotNull<\/code>, thus <code>x != null<\/code> condition is known to be redundant. The inspection is aware that <code>x<\/code> has the same value which was returned by the previous lambda expression.<\/p>\n<p>The \u201cConstant conditions &amp; exceptions\u201d inspection surprisingly often finds bugs which are lurking in the big project for years. It\u2019s even more helpful when you\u2019re writing new yet untested code. While sometimes the warnings produced by this inspection might seem annoying, it really helps to prevent new bugs.<\/p>\n<p>We already <a href=\"https:\/\/blog.jetbrains.com\/idea\/2018\/01\/intellij-idea-starts-2018-1-early-access-program\/\">started a EAP program<\/a> for 2018.1, so you can try the new features right now. We are constantly improving this inspection, reducing false-positives and increasing the number of valuable reports. We have more ideas to implement in future versions of IntelliJ IDEA, so stay tuned.<\/p>\n","protected":false},"author":542,"featured_media":0,"comment_status":"open","ping_status":"open","template":"","categories":[826,808],"tags":[1522,685],"cross-post-tag":[],"acf":[],"_links":{"self":[{"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/idea\/26427"}],"collection":[{"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/idea"}],"about":[{"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/types\/idea"}],"author":[{"embeddable":true,"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/users\/542"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/comments?post=26427"}],"version-history":[{"count":0,"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/idea\/26427\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/media?parent=26427"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/categories?post=26427"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/tags?post=26427"},{"taxonomy":"cross-post-tag","embeddable":true,"href":"https:\/\/blog.jetbrains.com\/zh-hans\/wp-json\/wp\/v2\/cross-post-tag?post=26427"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}