C++ Annotated August 2021: Practical Modules, C++20 Attribute to Help with EBO, Valgrind, Intel Compiler, and CLion News
Our monthly C++ Annotated digest and its companion, the No Diagnostic Required show, released a new episode with August news!
If you have already subscribed, feel free to skip to the news. If you are new, you can explore all the formats we offer. Choose to read, listen, or watch our essential digest of this month’s C++ news:
- Read the monthly digest on our blog (use the form on the right to subscribe to the whole blog).
- Subscribe to C++ Annotated emails by filling out this form.
- Watch the No Diagnostic Required show on YouTube. To be notified of new episodes, follow us on Twitter.
- Listen to our podcast – just search for “No Diagnostic Required” in your favorite podcast player (see the list of supported players).
August news
- Language news
- Learning
- Tools
- And finally, 10 years of C++ support in JetBrains tools!
Watch the August episode of No Diagnostic Required below, or keep reading for all the latest news!
Language News
D2422R0 “Remove nodiscard annotations from the standard library specification”
This is another paper that was recently covered on CppCast, and it appears to be an unusual one at first glance. That was certainly my reaction when I first heard it discussed on CppCast, and that seemed to be Rob and Jason’s reaction, too.
It’s worth digging into this a bit more to find out what’s really being said in this paper. It’s definitely a case where the devil is in the details and a little more background is insightful, so let’s dive in.
[[nodiscard]] is an attribute introduced in C++17 as a hint that compilers or other tools can choose to check whether the return value is used or explicitly cast to void by the caller. It can be applied to a function (including member functions), in which case it refers to the return value of that function, or to a type (including enums), in which case it applies to any function that returns values of that type – great for error codes or types. In C++20 it was extended to allow a message to be included that explains why the value should not be discarded.
Along with error types, functions that have no side-effects (are “pure”) also seem like good use cases. Back in May, we talked about a proposal for adding [[nodiscard]] to the iterators library for this reason. There have been other proposals, many already accepted into the working draft for C++23, that add it to other functions, like memory allocation functions and empty() member functions.
These all seem like great uses of [[nodiscard]] – almost no-brainers, even (although there were some less obvious cases).
So what’s the problem? Rewind to where we said these should be considered “hints”. This is a definite case of “no diagnostic required”. On the other hand, standard library authors are entirely free to add [[nodiscard]] anywhere they see fit, and many have done so. Microsoft’s stdlib contains over 400 such annotations, for example. Beyond that, even without [[nodiscard]], tools (including compilers) are allowed to issue warnings when return values are discarded if they think there is a good case for doing so.
So that raises the question: should something that doesn’t specify anything be put into a specification? The only possible change is that some additional warnings may be issued. Useful warnings, for sure (assuming they’re not false positives), but at what cost?
The downsides include:
- Committee time. This is not as trivial as it sounds. The committee is already overstretched and these proposals take more time to review and process than you might think. If the goal is to get to 100% coverage on functions/types that should be annotated it may be worth the hit – at least at some point – but, as we’ll see, that may not be an appropriate goal.
- The danger of false positives and other unintended side-effects or bugs. The best way to minimize these is to learn from implementation experience. That can be put into motion now, by proposing patches to stdlib implementations to add [[nodiscard]] in the appropriate places. Then we wait.
- A partial job, especially as coverage gets higher, raises questions about those places that have not received the annotation. Does it mean they have side effects? Is the return value unimportant?
- Verbosity. All these annotations add up, especially when added alongside const, constexpr, noexcept, and so on. There’s a can of worms here, but an important one, which we’ll come back to.
To sum up so far, D2422 is not proposing that we don’t want warnings for misuse of these functions and types. Rather it is saying that specifying them in the standard is not the best or most practical way to do so, so let’s slow down and reconsider. At the very least, let’s implement them directly in the major stdlibs, see how it shakes out, then eventually standardize the existing practice.
There may also be better ways that are being discussed within the committee, but for which there are no papers written yet. It’s too early to say what will come of these, but it seems important not to run in the opposite direction just yet. One of these ideas is: specifying (possibly in a separate document) en masse categories of functions for which [[nodiscard]]-like behavior in the tooling may be applied (e.g. all empty() methods) rather than specifying that they each have the [[nodiscard]] attribute. Another possibility is something like a [[discardable]] attribute that we could start applying in cases where we know we want that ability. Then, existing compiler flags such as -Wunused become more useful due to less noise, but remain an opt-in option. There are also references to a proposal to add a [[pure]] attribute. These are all ideas and opinions in the mix, but the underlying thought is that annotating [[nodiscard]] everywhere is not a practical approach, and therefore spending committee time on doing so is inefficient at best and self-limiting at worst!
P2372r2 “Fixing locale handling in chrono formatters”,
P2419r0 “Clarify handling of encodings in localized formatting of chrono types”
After talking so much about one proposal (which itself is talking about the removal of proposals already adopted), let’s talk a bit about this pair of papers that, on the surface, relate to how std::format works with std::chrono. In fact these are really about text encoding. We’ve talked many times about std::format and at least once about std::chrono, but not enough about text encoding. It’s a really tricky, deep subject that is often overlooked, especially by application developers who very often like to pretend that it doesn’t exist at all! This is not helped by the standard library traditionally brushing all the handling under the locales carpet. The string classes (even std::wstring) say nothing at all about encodings. There’s a lot of ongoing, largely unsung, work being done by SG16 (the Unicode Study Group) to bring stronger, fuller Unicode support to C++. Some of that work has been bearing fruit in the current standards, but the biggest payoffs are quite a way off yet. In the meantime, let’s at least try to remember that using UTF-8 doesn’t mean we don’t have to worry about text encodings.
P2388R1 “Minimum Contract Support: either Ignore or Check_and_abort”
Last month we looked at R0 of this paper, which at the time was called “Abort-only contract support”. It’s not just the name that has changed, though. The paper has been substantially re-organized and expanded, with most review actions addressed and wording added. There is a joke within the committee that it’s easy to encourage paper authors to explore “further work in this direction”, but in this case, I see this as a sign that things are actively moving along and this is being taken seriously. Maybe some form of Contracts support is on track for C++23, after all?
Now, back to standards that can be used today…
Empty Base Class Optimisation, no_unique_address and unique_ptr
If you haven’t heard about this new C++20 attribute – [[no_unique_address]] – you definitely should check the article by Bartlomiej Filipek. He talks about Empty Base Class Optimization first, which allows saving memory on empty structs via inheritance. The idea is that if you know that your class is empty, then you can inherit from that class, and the compiler won’t enlarge your derived class.
However, C++20 brought an easier way to achieve a similar effect. A new attribute indicates that a unique address is not required for a non-static data member of a class. Interestingly, ABI might be affected as the usage of the [[no_unique_address]] attribute changes the struct layout. This situation is discussed in the GitHub issue for Microsoft STL. It was noticed there that Clang listed this attribute as supported in v.9, but not when targeting Windows. And while Stephan T. Lavavej stated that the attribute is now supported in MSVC, it’s too late to request Clang support to take advantage of it before the C++20 ABI lockdown.
Don’t explicitly instantiate std templates
The main takeaway from the new article by Arthur O’Dwyer, which C++ developers should treat as a rule, is in its title. With a good set of examples (from simple ones to real-life ones) he explains why it’s bad. The major idea is that implicit instantiation is lazy and instantiates only the required parts, while explicit instantiation does so for every member. The standard library relies on this laziness a lot, so when you explicitly instantiate std templates you get yourself into trouble.
For Clang users, there is an attribute that is probably not so widely known, which helps protect code from such explicit instantiations and related issues: exclude_from_explicit_instantiation.
C++20 modules with GCC 11
As modules-related topics are so hot now, we’ll discuss a few recent posts dedicated to this long-awaited C++20 feature. Niall Cooling discusses C++20 modules in his new article, focusing on two different approaches to organizing the module structure: single-file modules, and separate interface and implementation files.
He builds a very basic Hello, World! example using these approaches and GCC 11, providing tips on constructing modules with this compiler specifically. There are some differences from Microsoft’s implementation, so check them out carefully if you have tried modules with MSVC before. Pay special attention to the lack of file name extensions and common agreements.
Niall shows how to build a simple module, export functions, namespaces, and types, import things, and build everything together. Module partitions are left out of the blog post scope and will be discussed at a later date.
Moving a project to C++ named Modules
The Microsoft team has also published a post on new module practices, which you can use as an example-based tutorial on building a named module for the existing code. The original project is published on GitHub, so you can play with it along with reading the article. Interestingly, the project is CMake-based, but to introduce modules you’ll have to switch to msbuild (which can be generated from CMake), as CMake still doesn’t have support for C++20 modules.
The newly created modules mimic the project’s include directories structure, and the modules are created from the corresponding header files. But the most interesting part is dedicated to the modularization of the third-party code, and this is where some non-trivial work is required. For example, static constants have to be wrapped with functions so that they can be imported later. This is because an internal linkage entity can’t be exported.
All the efforts are rewarded in the end with a significant compilation time improvement!
Valgrind
After posting in the craft::cpp blog about sanitizers, Marin Peko did a dive into the Valgrind tool. You might say it’s too old, but it can still work better than sanitizers in a few use cases. The most obvious one is catching issues in a library whose source code is inaccessible. Sanitizers require recompilation, so they capitulate here immediately, while Valgrind works and provides meaningful results. Suppression files help tune these results and make them even more useful. Another case is searching for memory errors with AddressSanitizer while at the same time detecting uninitialized memory reads with MemorySanitizer. Combining the two in one build is simply not possible, but no such problem exists with Valgrind.
Another interesting observation is that Valgrind can be customized. Its core part loads the software and disassembles it, and the tool plugin adds the instrumentation and assembles it back. This other part can serve different purposes: checking for memory leaks, detecting data races in multi-threaded applications, analyzing the heap memory usage, and so on. Valgrind is actually not a tool, but a family of tools based on the same core.
There are known limitations of the Valgrind approach and they are discussed in the article. For example, the execution times and memory usage are significantly larger than in the sanitizers case. Valgrind also won’t help you catch overflows in stack and global variables. This is because it only has access to the heap allocations performed by the malloc function. Before making a choice between Sanitizers and Valgrind, read through the article to learn the Valgrind basics.
Intel C/C++ compilers complete adoption of LLVM
Big news was announced by Intel – they moved their compiler to the LLVM infrastructure. Intel’s compiler might not be in the top-3 most used C++ compilers (Clang, GCC, MSVC), but it is still very popular and obviously an essential choice to get the best performance on Intel processors.
Moving to LLVM infrastructure is definitely a trend among C++ tool vendors. There are obvious reasons for that: there is a huge community caring for and contributing to it, and it’s fully open-source, which makes it a perfect choice for tooling. Intel got the latest C++ language standards nearly for free as a result of the migration.
Intel recommends users migrate to a new compiler as the old one will soon be moved to a legacy mode with no updates. The migration guide with many useful details is published. The Intel migration announcement also shares a set of benchmarks showing the compile time and performance benefits of the new Intel LLVM compiler.
CLion 2021.3 Roadmap
Following the CLion 2021.2 release in July, we published the vision for our next CLion release coming at the end of 2021. Our major focus is still on performance and eliminating freezes. In addition to that we’ll do our best to simplify user configuration efforts in several ways:
- Bundle the MinGW toolchain in CLion installer on Windows so you need fewer manual downloads and installations when starting with CLion on that platform.
- Add the ability to configure the toolchain environment via script (for example, if you use a script to initialize all the compiler environmental variables, including the addition of the bin and lib paths).
- Bundle Ninja and use it as the default generator for CMake projects, which is an essential default for many CMake projects nowadays.
- Finalize and release our long ongoing work on custom compilers. When it’s finished, you’ll be able to fill in configuration files (likely in the YAML format) for the compiler not natively supported by CLion, provide supported features, header search paths, defines, etc., and then use them in CLion to get your custom compiler “supported”.
- For Makefile project users, automatically find executables corresponding to Makefile build targets.
There are also several pain points in the debugger we plan to address, like long STL type names, “show as array” mode for pointer variables, and hex formatting for numerics. Check out the full plans in the blog post.
And finally, 10 years of C++ support in JetBrains tools!
On August 25, 2021 we celebrated 10 years of public support for C++ in JetBrains tools. It all started with AppCode. We were not expecting to come up with decent C++ support, but it turned out that it’s required for proper Objective-C++ support. So we started with macros in Objective-C++ code, STL auto-import, and some C++ completion. In later AppCode versions, we added libc++ support, correct template parsing, some C++11 features support, and Implement/Override for C++ code. But only when Google Test support landed in AppCode did it become serious enough and we started considering a standalone C++ IDE for the JetBrains family.
That’s how the idea of CLion was born. One interesting fact is that the first CLion demo was in September, 2013, by Dmitry Jemerov at JetBrains Day at FooCafé in Malmo, Sweden. Other names we considered for the IDE were CIDELight, Voidstar, Hexspeak, GottaC, and CTrait. Let us know what other facts you’d like to learn about our C++ Tools, CLion, and ReSharper C++.
About the authors
Anastasia Kazakova (@anastasiak2512) As a C/C++ software developer in Telecom and Embedded systems, Anastasia was involved in research estimating the broadband capacity of home networks and participated in launching 4G networks. She is now the Product Marketing Manager for JetBrains C++ tools.
Phil Nash (@phil_nash) Phil is the original author of the C++ test framework Catch2. As Developer Advocate at SonarSource he’s involved with SonarQube, SonarLint, and SonarCloud, particularly in the context of C++. He’s also the organizer of C++ London and C++ on Sea, as well as co-host and producer of the cpp.chat podcast. More generally, Phil is an advocate for good testing practices, TDD, and using the type system and functional techniques to reduce complexity and increase correctness. He’s previously worked in Finance and Mobile and offers training and coaching in TDD for C++.