Tips & Tricks

Living with Microsoft C++ Compiler Bugs and Ambiguities

It’s no secret that the Microsoft Visual C++ compiler has lots of non-standard behaviors. What’s even more unfortunate is that libraries then come to rely on those behaviors. Ultimately, a tool vendor has no choice but to support all of those peculiarities. Here are a few that we’ve come across:

Eliminated Types

In addition to using C++ macros to generate actual executable code, you can exploit a Microsoft-specific peculiarity to generate comments as well. This is possible thanks to simple concatenation. Define a CONCAT macro as follows:

#define CONCAT(A, B) A##B

Now, simply concatenating two slashes together generates a // line comment:

#define START_LINE_COMMENT CONCAT(/,/)
START_LINE_COMMENT this is a line comment;

This shouldn’t really work: the result of the ## operator must be a valid preprocessing token, and in a conforming compiler comments are removed before macros are expanded. Similarly, you can start a block comment. However, you cannot terminate such a comment with a macro call, only with a literal */:

#define START_BLOCK_COMMENT CONCAT(/,*)
START_BLOCK_COMMENT  This is a block comment! */
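
For contrast, here is a minimal sketch of token pasting that the standard does permit, because each paste produces a single valid token. The names make_answer, shift_left_by_two and check_concat are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

#define CONCAT(A, B) A##B

// Pasting two identifier fragments yields one identifier token: make_answer.
inline int CONCAT(make_, answer)() { return 42; }

// Pasting '<' and '<' yields the single operator token '<<'.
inline int shift_left_by_two(int v) { return v CONCAT(<, <) 2; }

// Hypothetical self-check for this sketch.
inline void check_concat() {
    assert(make_answer() == 42);
    assert(shift_left_by_two(3) == 12);
}
```

Pasting / and / has no such valid result, which is why the comment trick depends on MSVC-specific behavior.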

So why is this relevant? Because if you look at, for example, the <oaidl.h> header, you will see declarations similar to the following:

union 
{
  VARIANT_BOOL boolVal;
  _VARIANT_BOOL bool;
}

Depending on the version of the compiler, _VARIANT_BOOL is either a typedef for VARIANT_BOOL or a macro that expands to a // comment marker and thereby eliminates the bool member:

#if !__STDC__ && (_MSC_VER <= 1000)
/* For backward compatibility */
typedef VARIANT_BOOL _VARIANT_BOOL;
#else
/* ANSI C/C++ reserve bool as keyword */
#define _VARIANT_BOOL    /##/
#endif
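
A similar "this member may vanish" effect can be obtained portably if a conditional macro is acceptable: the macro expands either to a full declaration or to nothing. The following is only a sketch of that idea. KEEP_LEGACY_BOOL_MEMBER, LEGACY_BOOL_MEMBER, VARIANT_BOOL_SKETCH, SampleVariant and round_trip are hypothetical names that do not appear in <oaidl.h>.

```cpp
#include <cassert>

typedef short VARIANT_BOOL_SKETCH; // stand-in type, used by this sketch only

#if defined(KEEP_LEGACY_BOOL_MEMBER)
// Legacy mode: the macro expands to a real member declaration.
#define LEGACY_BOOL_MEMBER(name) VARIANT_BOOL_SKETCH name;
#else
// Default mode: the macro expands to nothing, so the member disappears.
#define LEGACY_BOOL_MEMBER(name)
#endif

union SampleVariant {
    VARIANT_BOOL_SKETCH boolVal;
    LEGACY_BOOL_MEMBER(legacyBool)
    long lVal;
};

// Writes and reads back the member that exists in both modes.
inline VARIANT_BOOL_SKETCH round_trip(VARIANT_BOOL_SKETCH b) {
    SampleVariant v;
    v.boolVal = b;
    return v.boolVal;
}

// Hypothetical self-check for this sketch.
inline void check_round_trip() { assert(round_trip(1) == 1); }
```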

__VA_ARGS__ Chaos

Because the Microsoft compiler refuses to expand __VA_ARGS__ into a list of separate arguments, the following definitions behave differently depending on the compiler:

#define A2(a1, a2) ((a1)+(a2))
#define A_VA(...) A2(__VA_ARGS__)

Now consider the statement printf("%d\n", A_VA(1, 2));. On GCC, Clang and similar, this would correctly expand to

printf("%d\n", ((1)+(2)));

However, the Microsoft compiler would instead expand it to

printf("%d\n", ((1, 2)+()));

Interestingly, Microsoft does not admit this is a bug. I guess it’s easier that way – if it’s not a bug, no need to fix it, right?
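
A workaround commonly seen in portable code is to wrap the inner call in an extra expansion step. The sketch below assumes the behavior described above. EXPAND, A_VA_PORTABLE and sum_via_variadic_macro are illustrative names, and this is not necessarily what any particular library does.

```cpp
#include <cassert>

#define A2(a1, a2) ((a1) + (a2))

// EXPAND forces one more rescan, so the comma-separated contents of
// __VA_ARGS__ are seen as separate arguments when A2 is finally invoked.
#define EXPAND(x) x
#define A_VA_PORTABLE(...) EXPAND(A2(__VA_ARGS__))

inline int sum_via_variadic_macro() { return A_VA_PORTABLE(1, 2); }

// Hypothetical self-check for this sketch.
inline void check_variadic_macro() { assert(sum_via_variadic_macro() == 3); }
```

On a standard-compliant preprocessor the extra EXPAND step is harmless, since the result is the same ((1) + (2)).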

On the other hand, the Boost library is aware of this bug, so it checks explicitly whether the compiler is MSVC and, if it is, does things differently. On the ReSharper end, even though ReSharper C++ is adapted to MSVC (e.g., in terms of the _MSC_VER and _MSC_FULL_VER definitions), we try to handle the __VA_ARGS__ case the way standard-compliant compilers do which, unfortunately, leads to errors when parsing Boost.

ReSharper C++ meanwhile knows how to handle the MSVC case too, and this is precisely what we do when parsing the Microsoft standard libraries. However, if we adopted the MS-centric approach for all libraries, ReSharper C++ would fail in all sorts of cross-platform projects which are edited in VS but are built using NMake. And we cannot behave in a Microsoft-compliant way in Boost, because errors occur not only in Boost itself but also in locations where the affected constructs are used. This is why in some Boost files ReSharper C++ pretends to be Clang and not MSVC.

Binding Rvalues to Lvalue References

Binding an rvalue to a non-const lvalue reference is an error according to the C++ standard, but the Microsoft compiler allows it via an extension. Here is an example:

struct X{};
int update_X(X&);
int main()
{
    update_X(X());
}

The code above will not compile on either GCC or Clang.
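
For comparison, here is a minimal sketch of two standard-compliant ways to express the same intent. The updates member and the functions update_named_object and update_temporary are hypothetical additions for illustration.

```cpp
#include <cassert>

struct X { int updates = 0; };

// Same shape as the example above: takes a non-const lvalue reference.
inline int update_X(X& x) { return ++x.updates; }

// Option 1: give the object a name, so the argument is an lvalue.
inline int update_named_object() {
    X x;
    return update_X(x);
}

// Option 2: accept temporaries explicitly through an rvalue reference.
// Inside the function, the named parameter `tmp` is itself an lvalue.
inline int update_temporary(X&& tmp) { return update_X(tmp); }

// Hypothetical self-check for this sketch.
inline void check_updates() {
    assert(update_named_object() == 1);
    assert(update_temporary(X()) == 1);
}
```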

So you might think that extending support for this situation in ReSharper is not a problem. Well, it turns out, things are a little bit trickier. Consider the following code:

struct Y {};
struct X
{
    operator Y();
};
int some_func(X&);
int some_func(Y);
int main()
{
    some_func(X());
}

Which overload does main() call? The C++ standard says that some_func(Y) must be chosen, since some_func(X&) would bind an rvalue to an lvalue reference, which is not allowed. If that binding were allowed unconditionally, some_func(X&) would be chosen instead, because it needs no user-defined conversion. The problem is that this would make a standard-compliant program behave differently under MSVC, so in this specific case MSVC disallows the binding.

So, after some trial and error, we found a proper way to handle this: we first perform standard-compliant overload resolution, i.e., disallowing the binding of rvalues to lvalue references. Then, if and only if we fail to find a viable function, we run overload resolution again, this time allowing the binding to happen.
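
The two-pass idea can be sketched with a toy model. This is not ReSharper's implementation: Candidate, resolve and the helper functions are hypothetical, and "first viable" stands in for the real "best viable" ranking.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of one overload candidate for a given call.
struct Candidate {
    std::string signature;
    bool needs_rvalue_to_lvalue_ref; // viable only with the MSVC extension
};

// Returns the chosen signature, or an empty string if nothing is viable.
inline std::string resolve(const std::vector<Candidate>& candidates) {
    // Pass 1: standard rules only, so extension-only candidates are skipped.
    for (const Candidate& c : candidates)
        if (!c.needs_rvalue_to_lvalue_ref) return c.signature;
    // Pass 2: reached only if pass 1 found nothing; allow the extension.
    if (!candidates.empty()) return candidates.front().signature;
    return std::string();
}

// The some_func(X()) call from the text: the standard-viable overload wins.
inline std::string resolve_some_func() {
    std::vector<Candidate> cs;
    cs.push_back(Candidate{"some_func(X&)", true});
    cs.push_back(Candidate{"some_func(Y)", false});
    return resolve(cs);
}

// The update_X(X()) call from the text: only the extension makes it viable.
inline std::string resolve_update_X() {
    std::vector<Candidate> cs;
    cs.push_back(Candidate{"update_X(X&)", true});
    return resolve(cs);
}

// Hypothetical self-check for this sketch.
inline void check_resolve() {
    assert(resolve_some_func() == "some_func(Y)");
    assert(resolve_update_X() == "update_X(X&)");
}
```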

As you can see, the tricky part here is figuring out how various MSVC extensions interact with the C++ standard and the way they are implemented. Expect a blog post on them too, soon!

Dreaded Two-Phase Lookup

The C++ standard states that lookup in template code must happen in two phases:

  • The first phase happens when the code of the template is parsed. All non-dependent names (those that do not depend on template parameters) are looked up straight away.
  • The second phase occurs when the template is being instantiated and this is when all dependent names are being looked up.
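
As a minimal illustration of the two phases, the sketch below contains one non-dependent and one dependent name. The names helper, Sized and Lookup are hypothetical.

```cpp
#include <cassert>

// Declared before the template, so phase-one lookup can find it.
inline int helper(int) { return 1; }

struct Sized {
    int size() const { return 7; }
};

template <class T>
struct Lookup {
    // `helper(0)` does not depend on T: it is resolved when the template
    // definition is parsed (phase one).
    int non_dependent() const { return helper(0); }

    // `t.size()` depends on T: it is resolved only when Lookup<T> is
    // instantiated with a concrete type (phase two).
    int dependent(const T& t) const { return t.size(); }
};

// Hypothetical self-check for this sketch.
inline void check_lookup() {
    assert(Lookup<Sized>().non_dependent() == 1);
    assert(Lookup<Sized>().dependent(Sized()) == 7);
}
```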

So, for example, the following code must produce an error according to the C++ standard.

template <class T>
struct X {
   void method() {
      a = 10; // a cannot be resolved
   }
};

This code should not compile because the variable a is not a dependent name and should be found at the declaration parsing stage. However, this is not the case with MSVC, which will happily compile the code and complain only when X is instantiated and method() is called, i.e. in something like

void test() {
  X<int>().method();
}

But now things get even trickier: if we declare a after the template declaration but before the instantiation, everything works fine in MSVC, even though the code must still fail according to the C++ standard:

template <class T>
struct X {
   void method() {
      a = 10; // a cannot be resolved
   }
};
int a = 0; // whoops!
void test() {
  X<int>().method();
}

It gets even more confusing when base classes are involved:

struct A { int a; };
struct B { int b; };
template <class T>
struct X : A, T {
  int x;
  void method() {
     x = 10; // ok, x is found in X
     a = 10; // ok, a is found in A
     b = 10; // should be an error: b is a non-dependent name, and lookup does not search the dependent base T
  }
};
void test() {
  X<B> x;
}

This code compiles without problem in MSVC, but should result in an error. The workaround to make the code standard-compliant is to access the variable b as:

this->b = 10;
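
Put together, a standard-compliant version of the example could look like the sketch below. The return value and the run_method helper are additions made only so that the result can be checked.

```cpp
#include <cassert>

struct A { int a; };
struct B { int b; };

template <class T>
struct X : A, T {
  int x;
  int method() {
     x = 10;        // found in X
     a = 10;        // found in the non-dependent base A
     this->b = 10;  // `this` is type-dependent, so b is looked up in phase two
     return x + a + this->b;
  }
};

inline int run_method() {
  X<B> obj;
  return obj.method();
}

// Hypothetical self-check for this sketch.
inline void check_method() { assert(run_method() == 30); }
```

Qualifying the name as T::b is another spelling that is often used for the same purpose.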

In ReSharper, we use the standard-compliant version of the lookup, so we show an error when b is accessed without the this-> qualification:

[Screenshot: ReSharper C++ unqualified dependent lookup error]

However, a large number of programs written for MSVC adopt this style. And why wouldn’t they? After all, this seems like a very natural way to write code and, without proper compiler diagnostics, they would never have a clue that the code is not standard-compliant.

This is why we show warnings for these cases instead of errors by default. However, this behavior can be changed in the ReSharper settings (Code Editing → C++ → Inspections) by unchecking the appropriate flag:

[Screenshot: ReSharper C++ Treat resolve errors as warnings in template code]

We highly encourage everyone willing to write standard-compliant code to uncheck this flag!

In Closing

This may have been a bit of a discouraging post with respect to MSVC, but do not worry — in a subsequent post we’ll take a look at MSVC-specific constructs which, while not necessarily standard-compliant, are nevertheless interesting and often quite usable. ■
