Static Code Analysis and the Rules of Zero, Three, and Five

The Rules of Zero, Three, and Five, sometimes written 0/3/5, are a set of C++ guidelines about resource ownership and the special member functions. They exist because the alternatives are double frees, dangling pointers, and silently broken copies. In this post we’ll work through the rules by tripping over each of those bugs in turn, then look at how to enforce them automatically with static analysis. Our colleague Anna will take us through her thoughts on this.

Anna Zhukova

Anna Zhukova is a Software Developer in Qodana and brings a powerful mix of backend expertise, cross-platform skills, and a genuine passion for static analysis to the fold. She’s currently working on building and maintaining Qodana for C++ and advancing our static analysis tools to help developers ship better, more secure code faster.

What are the Rules of Zero, Three, and Five in C++?

Let’s imagine a common scenario that most non-trivial codebases have to solve: You are writing a class and that class owns a resource. A resource in this context is anything that needs to be cleaned up when no longer needed: a file handle will need closing, a pointer will need freeing, and so on. At first glance, this seems like an easy job for the destructor:

// This is BAD CODE and SHOULD NOT BE COPIED
struct TreeVertex {
    vector<TreeVertex*> children;
    ...
	
    ~TreeVertex() {
        for (auto& vertex : children) {
            delete vertex;
        }
    }
};

Seems simple enough. Note, however, that nothing prevents us from copying an instance of `TreeVertex`. What happens then?


Uh oh.

Your program blew up because when you copied a TreeVertex, you implicitly made both the original and the copy believe they own pointers to the (singular) set of children and are free to delete them if needed. At destruction time, the first object to be destroyed deletes the children, and the second one then tries to free memory that was already freed. That is undefined behavior, which in practice usually shows up as a crash or a complaint from AddressSanitizer. This is known as a “Double Free” bug – a classic amongst us memory managers.

Q: Why is nothing preventing us from copying an instance of TreeVertex?

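A: To copy a class, you need a copy constructor. We never defined one, but luckily for us (ha!) the compiler generates one whenever no user-declared copy constructor exists. The generated version simply copies each member, so the vector of raw pointers gets duplicated pointer by pointer while the children themselves do not.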
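To see the implicit copy in isolation, here is a minimal sketch. ShallowVertex is a hypothetical stand-in for TreeVertex that deliberately leaves out the destructor, so running it cannot trigger the double free; it only shows that the copy ends up holding the very same pointer as the original.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical stand-in for TreeVertex. It has no destructor on purpose,
// so copying it here cannot cause a double free.
struct ShallowVertex {
    std::vector<ShallowVertex*> children;
};

// Nobody wrote a copy constructor, yet the type is copyable.
static_assert(std::is_copy_constructible_v<ShallowVertex>);

// Returns true when the copy holds the same child pointer as the original.
bool copy_shares_children() {
    ShallowVertex child;
    ShallowVertex original;
    original.children.push_back(&child);

    ShallowVertex copy = original;  // implicit, member-wise copy
    return copy.children[0] == original.children[0];
}
```

Add a destructor that deletes the children, and those two identical pointers become two delete calls on one object.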

Anyway, a savvy reader who has watched a CppCon keynote or two will immediately notice a code smell: we are managing the lifetime of a bunch of pointers by hand, calling low-level functions like delete. Surely all life problems will go away if we use, say, a unique_ptr?

class TreeVertex {
    vector<unique_ptr<TreeVertex>> children;
public:
    ...
    
    // ~TreeVertex()  // Not even necessary anymore!
};

The savvy reader was technically correct – this is better, in the sense that the Earth no longer blows up. However, now your class can’t be copied at all, because unique_ptr solves the problem of two owners by deleting its copy constructor altogether.

We are talking about it like it’s a bad thing, but in real life this might actually be what you want. Writing a fancy copy constructor that duplicates a resource isn’t always the right decision. Say you have an OS window as your resource – would it make sense to implicitly open new windows when you accidentally passed an object by value?

If you have decided that your tree vertices are non-copyable, and that you will let unique_ptr manage your memory for you, then congratulations! You’ve just implemented the Rule of Zero: If you can avoid defining default operations, do.
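A small sketch of what the Rule of Zero version gives you in practice. OwningVertex, count_vertices, and move_demo are hypothetical names for this illustration; the class declares no special member functions at all and still cleans up after itself and supports moves.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical Rule of Zero vertex: no destructor, copy, or move declared.
struct OwningVertex {
    std::vector<std::unique_ptr<OwningVertex>> children;
};

// Counts the vertex itself plus all of its descendants.
std::size_t count_vertices(OwningVertex const& vertex) {
    std::size_t total = 1;
    for (auto const& child : vertex.children) {
        total += count_vertices(*child);
    }
    return total;
}

std::size_t move_demo() {
    OwningVertex root;
    root.children.push_back(std::make_unique<OwningVertex>());
    root.children.push_back(std::make_unique<OwningVertex>());

    // OwningVertex copy = root;           // would not compile: unique_ptr is move-only
    OwningVertex moved = std::move(root);  // implicit move constructor transfers ownership
    return count_vertices(moved);          // the root plus its two children
}
```

One caveat worth knowing: std::is_copy_constructible may still report true for a class like this, because std::vector always declares a copy constructor. The error only appears when a copy is actually attempted.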

Let’s say we do want to copy our vertices, though. Let’s also say that a lower-level API needs direct access to the array of raw child pointers, so we cannot store them as unique_ptr.

class TreeVertex {
    vector<TreeVertex*> children;
public:
    TreeVertex** children_data() { return children.data(); }
	
    TreeVertex(TreeVertex const& other) {
        children.reserve(other.children.size());
        for (auto const& vertex : other.children) {
            children.push_back(new TreeVertex(*vertex));
        }
    }

    ~TreeVertex() {
        for (auto const& vertex : children) {
            delete vertex;
        }
    }
};

The copy constructor now works okay, but this wouldn’t be C++ if there wasn’t another footgun hidden inside the initial footgun. Something that may not be obvious at first glance is that these two lines call two different functions:

TreeVertex my_vertex = other;
my_vertex = other;

The first one calls the copy constructor that we just meticulously defined. The second one calls the copy assignment operator – which, luckily for us (ha!), the compiler generated in the same member-wise way as the copy constructor:

// This is BAD ~~compiler-generated~~ CODE and SHOULD NOT BE COPIED
TreeVertex& operator=(TreeVertex const& other) {
    children = other.children;
    return *this;
}
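If you would rather see the difference than take it on faith, a tiny probe type makes it visible. Probe and the two helper functions are hypothetical and exist only for this sketch; each special member records its own name.

```cpp
#include <string>

// Hypothetical probe type that remembers which special member ran last.
struct Probe {
    std::string last_call = "default constructor";

    Probe() = default;
    Probe(Probe const&) : last_call("copy constructor") {}
    Probe& operator=(Probe const&) {
        last_call = "copy assignment";
        return *this;
    }
};

std::string call_for_initialization() {
    Probe other;
    Probe my_probe = other;  // looks like assignment, but constructs
    return my_probe.last_call;
}

std::string call_for_assignment() {
    Probe other;
    Probe my_probe;
    my_probe = other;        // my_probe already exists, so this assigns
    return my_probe.last_call;
}
```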

Notice how just by defining a destructor and following the string of bugs, we had to define three functions: the destructor, the copy constructor, and the copy assignment operator. Finally we have arrived at…

The Rule of Three

If you define a copy constructor, copy assignment, or destructor, you should define them all.

A seemingly correct copy assignment operator looks like this:

// This is BAD CODE and SHOULD NOT BE COPIED
TreeVertex& operator=(TreeVertex const& other) {
    // Clear existing children
    for (auto const& vertex : children) {
        delete vertex;
    }
    children.clear();

    // Copy other's children
    children.reserve(other.children.size());
    for (auto const& vertex : other.children) {
        children.push_back(new TreeVertex(*vertex));
    }
    return *this;
}

Let’s forget for a moment how many times we duplicated code in this simple class (we’ll come back to that). If you want to be mean to your junior devs, ask them to find a bug here.

The bug stems from the fundamental difference between a constructor and an assignment operator: the `this` object already existed before the function was called. This means that you can write something like this:

my_vertex = my_vertex;

In other words, both this and other are the same thing. And instead of doing nothing, our copy assignment operator just obliterated all the children.
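Here is the bug as a runnable sketch. NaiveVertex is a hypothetical name for the class built so far, with a default constructor and two small helpers (add_child, child_count) added so that it can be exercised. The self-assignment goes through a reference alias only to keep compilers from warning about the obvious `vertex = vertex`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical NaiveVertex: the copy constructor, destructor, and the
// "seemingly correct" copy assignment, plus helpers for the demo.
class NaiveVertex {
    std::vector<NaiveVertex*> children;
public:
    NaiveVertex() = default;

    NaiveVertex(NaiveVertex const& other) {
        children.reserve(other.children.size());
        for (auto const* vertex : other.children) {
            children.push_back(new NaiveVertex(*vertex));
        }
    }

    NaiveVertex& operator=(NaiveVertex const& other) {
        for (auto const* vertex : children) delete vertex;
        children.clear();  // if &other == this, other.children is now empty too
        children.reserve(other.children.size());
        for (auto const* vertex : other.children) {
            children.push_back(new NaiveVertex(*vertex));
        }
        return *this;
    }

    ~NaiveVertex() {
        for (auto const* vertex : children) delete vertex;
    }

    void add_child() { children.push_back(new NaiveVertex()); }
    std::size_t child_count() const { return children.size(); }
};

std::size_t children_after_self_assignment() {
    NaiveVertex vertex;
    vertex.add_child();
    vertex.add_child();

    NaiveVertex& alias = vertex;  // the same object under another name
    vertex = alias;
    return vertex.child_count();  // both children are gone
}
```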

Luckily, some smart people came up with a solution to both the self-assignment problem and the code duplication problem. It’s called the copy-and-swap idiom.

A swap function is a special kind of function in C++ that “swaps” the contents of two objects. There is no compiler magic going on here, we just all collectively agreed to use this name. The signature looks like this:

void swap(T& a, T& b);

A default template std::swap does the generic swapping using a temporary variable, but if we know how our class works we can actually do better. Let’s define our own swap as a free friend function:

class TreeVertex {
    ...
public:
    friend void swap(TreeVertex& a, TreeVertex& b) noexcept {
        using std::swap;
        swap(a.children, b.children);
    }
};

First, we added std::swap to the set of candidates for overload resolution (this is not necessary in our example, but a good general rule of thumb). Then we swap our children, letting the compiler find the most appropriate overload of swap, which turns out to be an efficient one written specifically for std::vector.

We can now rewrite our copy assignment operator like so:

TreeVertex& operator=(TreeVertex other) {
    swap(*this, other);
    return *this;
}

Notice that the signature has changed: we now take other by value, delegating the copying work to the copy constructor. Then we swap our children, and leave other to be destructed, delegating the deleting work to the destructor. No code duplication, and no bugs with self-assignment.

Q: What’s the deal with defining the body of swap inside of the class?

A: This is another idiom called hidden friends: since swap is a free function, it might be considered for overload resolution even when you are swapping things that are completely unrelated to your TreeVertex. Making the function a hidden friend prevents possible accidental implicit conversions, improves compilation times, cleans up compilation errors, and makes your cat love you.
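The two idioms can be sketched together like this. The shapes namespace, Box, and swap_generic are hypothetical names for illustration; the swapped_by_friend flag exists only to prove which overload ran.

```cpp
#include <utility>

namespace shapes {  // hypothetical namespace and type, for illustration only
struct Box {
    int width = 0;
    bool swapped_by_friend = false;

    // Hidden friend: found only through argument-dependent lookup on Box.
    friend void swap(Box& a, Box& b) noexcept {
        using std::swap;
        swap(a.width, b.width);
        a.swapped_by_friend = b.swapped_by_friend = true;
    }
};
}  // namespace shapes

// Generic code: std::swap is the fallback, ADL may find something better.
template <class T>
void swap_generic(T& a, T& b) {
    using std::swap;
    swap(a, b);
}

bool friend_swap_is_chosen() {
    shapes::Box a{1}, b{2};
    swap_generic(a, b);
    return a.swapped_by_friend && a.width == 2 && b.width == 1;
}

bool fallback_works_for_int() {
    int x = 1, y = 2;
    swap_generic(x, y);  // no custom swap for int, so std::swap is used
    return x == 2 && y == 1;
}
```

Calling std::swap(a, b) directly would skip the friend and fall back to the generic three-move version, which is why the unqualified call matters.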

Since defining swap is so useful, the Rule of Three is also sometimes called the Rule of Three and a Half.

We are almost at the end of today’s discussion, but there is one more thing we need to talk about: the move semantics introduced in C++11. If your company has not adopted C++11 yet, you should stop reading this blog and go update your CV.

Unlike a copy, a move operation is allowed to steal the contents of the other object. In fact, the standard library leaves its own moved-from objects in a valid but unspecified state, and it is wise to treat yours the same way: usually only good for destruction or reassignment. Allowing an object to be moved is a very powerful optimization, so let’s define a move constructor and a move assignment operator using our amazing swap function:

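// Steals other's children; *this starts out with an empty vector.
TreeVertex(TreeVertex&& other) noexcept {
    swap(*this, other);
}

// Note: pair this with a copy assignment that takes `TreeVertex const&`.
// Next to the by-value operator= above, assigning from an rvalue would
// be ambiguous; that operator already handles rvalues by itself.
TreeVertex& operator=(TreeVertex&& other) noexcept {
    swap(*this, other);
    return *this;
}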

Piece of cake, and now you know the modern version of the Rule of Three, the Rule of Five:

If you have a destructor or either copy function (constructor or assignment), you should probably define both copy functions and both move functions.

Also known as the Rule of Five and a Half, because of the swap function.
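Putting the pieces together, here is one way the whole class might look. SwapVertex is a hypothetical name, and the default constructor and the two helpers are additions for the demo. This sketch keeps the single by-value operator= from the copy-and-swap section and lets it serve as both copy and move assignment, since declaring a separate `operator=(SwapVertex&&)` next to it would make assignment from an rvalue ambiguous.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical SwapVertex: the class from this post, assembled in one place.
class SwapVertex {
    std::vector<SwapVertex*> children;
public:
    SwapVertex() = default;

    SwapVertex(SwapVertex const& other) {      // copy constructor: deep copy
        children.reserve(other.children.size());
        for (auto const* vertex : other.children) {
            children.push_back(new SwapVertex(*vertex));
        }
    }

    SwapVertex(SwapVertex&& other) noexcept {  // move constructor: steal
        swap(*this, other);
    }

    // The parameter is copy- or move-constructed by the caller, then
    // swapped in. Its destructor cleans up our old children.
    SwapVertex& operator=(SwapVertex other) {
        swap(*this, other);
        return *this;
    }

    ~SwapVertex() {
        for (auto const* vertex : children) delete vertex;
    }

    friend void swap(SwapVertex& a, SwapVertex& b) noexcept {
        using std::swap;
        swap(a.children, b.children);
    }

    void add_child() { children.push_back(new SwapVertex()); }
    std::size_t child_count() const { return children.size(); }
};

std::size_t children_after_self_assignment() {
    SwapVertex vertex;
    vertex.add_child();
    vertex.add_child();
    SwapVertex& alias = vertex;  // the same object under another name
    vertex = alias;              // copies first, so nothing is lost
    return vertex.child_count();
}

std::size_t children_after_move_assignment() {
    SwapVertex source;
    for (int i = 0; i < 3; ++i) source.add_child();
    SwapVertex target;
    target.add_child();
    target = std::move(source);  // move-constructs the parameter, no deep copy
    return target.child_count();
}
```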

Q: Is noexcept necessary? What does it achieve?

A: noexcept matters for performance. Standard containers like std::vector aim to provide a strong exception guarantee during reallocation, so if your move constructor might throw and a copy constructor is available, they will ignore the move entirely in favor of an expensive-but-recoverable copy.

This is mentioned in the notes for std::move_if_noexcept. Our swap is also marked noexcept. We don’t need it as such for our own code, since what’s important for us is just that it does not actually throw anything. But third-party code which conditionally optimizes based on std::is_nothrow_swappable would care about the correct label, so it’s a good practice.
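The effect is easy to measure. In this sketch, ThrowingMove and NothrowMove are hypothetical element types that differ only in the noexcept on their move constructors, and copies_during_reallocation counts how many copy-constructor calls a single vector reallocation costs. The counts assume a standard library that relocates elements in the usual move_if_noexcept fashion, which the mainstream ones do.

```cpp
#include <vector>

// Two hypothetical element types that differ only in noexcept.
struct ThrowingMove {
    static inline int copies = 0;
    ThrowingMove() = default;
    ThrowingMove(ThrowingMove const&) { ++copies; }
    ThrowingMove(ThrowingMove&&) {}  // may throw, as far as the compiler knows
};

struct NothrowMove {
    static inline int copies = 0;
    NothrowMove() = default;
    NothrowMove(NothrowMove const&) { ++copies; }
    NothrowMove(NothrowMove&&) noexcept {}
};

// Grows a one-element vector past its capacity and reports how many
// copy-constructor calls the reallocation needed.
template <class T>
int copies_during_reallocation() {
    std::vector<T> elements;
    elements.reserve(1);
    elements.emplace_back();  // constructed in place, no copy or move
    T::copies = 0;
    elements.reserve(elements.capacity() + 1);  // forces a reallocation
    return T::copies;
}
```

With one element the difference is a single copy. With a vector of deep trees, it is a deep copy of every tree on every reallocation.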

Q: Would ASan scream at us for a third time if I didn’t define these two functions?

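A: Not this time! Unlike the copy constructor and copy assignment operator, the move versions are not generated by default once you have declared a destructor or either of the copy operations. Moving such an object quietly falls back to copying it, which is slower but still correct.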
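A minimal sketch of that fallback, using a hypothetical CopyOnly type that declares a copy constructor and nothing else:

```cpp
#include <utility>

// Hypothetical type with a user-declared copy constructor and no move operations.
struct CopyOnly {
    int copies_made = 0;

    CopyOnly() = default;
    CopyOnly(CopyOnly const& other) : copies_made(other.copies_made + 1) {}
};

int copies_after_std_move() {
    CopyOnly source;
    CopyOnly target = std::move(source);  // no move constructor, so this copies
    return target.copies_made;
}
```

std::move is only a cast to an rvalue reference. Because `CopyOnly const&` can bind to an rvalue, the copy constructor is picked up without any complaint from the compiler.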

Big thanks to the excellent Stack Overflow post by GManNickG, which was my reference on this topic for the past 10 years.

Static code analysis and the Rules of Zero, Three, and Five in practice

The bugs in this post (double frees, broken self-assignment, a move constructor that is missing noexcept) can survive code review and surface in production. Static analysis catches them at commit time.

Qodana for C++ wraps Clang-Tidy (among other things), and three checks map directly onto what we walked through above:

  • cppcoreguidelines-special-member-functions is the Rule of Five at its source: if you define a destructor, copy constructor, or copy assignment operator, this check requires you to define (or `= default` / `= delete`) the rest. It’s C.21 from the C++ Core Guidelines.
  • bugprone-unhandled-self-assignment catches the exact self-assignment bug from earlier in this post: a user-defined copy assignment operator with no self-check and no copy-and-swap.
  • performance-noexcept-move-constructor flags non-noexcept move constructors and assignment operators, so std::vector reallocation actually uses your move instead of silently falling back to copy.
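For illustration, here is a sketch of a class that spells out all five special members explicitly, which is the shape C.21 describes. WindowHandle and its integer id are hypothetical stand-ins for a real OS resource, and whether a particular Clang-Tidy configuration accepts the class without warnings is an assumption worth checking against your own setup.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical move-only handle with all five special members declared.
class WindowHandle {
    int id = -1;  // -1 means "no window"; a stand-in for a real OS handle
public:
    explicit WindowHandle(int window_id) noexcept : id(window_id) {}
    ~WindowHandle() { /* a real class would close the window here */ }

    // Copying would mean opening a second window, so it is forbidden.
    WindowHandle(WindowHandle const&) = delete;
    WindowHandle& operator=(WindowHandle const&) = delete;

    // Moves transfer the handle and are marked noexcept.
    WindowHandle(WindowHandle&& other) noexcept : id(other.id) { other.id = -1; }
    WindowHandle& operator=(WindowHandle&& other) noexcept {
        if (this != &other) {
            id = other.id;  // a real class would close its old window first
            other.id = -1;
        }
        return *this;
    }

    int get() const noexcept { return id; }
};

static_assert(!std::is_copy_constructible_v<WindowHandle>);
static_assert(std::is_nothrow_move_constructible_v<WindowHandle>);

int id_after_move() {
    WindowHandle first{7};
    WindowHandle second = std::move(first);
    return second.get();
}
```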

All three are enabled by default (in the qodana.starter profile), or you can add them to your .clang-tidy to be sure:

Checks: >
  cppcoreguidelines-special-member-functions,
  bugprone-unhandled-self-assignment,
  performance-noexcept-move-constructor

Then just run Qodana — locally with qodana scan, or in CI via GitHub Actions, GitLab, Jenkins, or TeamCity. 

As for the Rule of Zero: there’s no single check that says “delete this destructor and use a unique_ptr instead” — Rule of Zero is more of a design discipline. What Qodana can do is point you toward it: cppcoreguidelines-owning-memory flags raw owning pointers, and the modernize-* family (modernize-make-unique, modernize-make-shared, modernize-avoid-c-arrays) nudges code toward RAII types that remove the need for special members in the first place. Just add them to your .clang-tidy and Qodana will pick them up automatically:

Checks: >
  ...
  modernize-make-unique,
  modernize-make-shared,
  modernize-avoid-c-arrays,
  cppcoreguidelines-owning-memory

Find out more about improving your code quality with Qodana.

P.S. When you should NOT use copy-and-swap

Despite being a safe bet for 99% of use cases, the copy-and-swap idiom has a problem: copy performance. Recall that the copy assignment operator we wrote takes other by value, which means the object and any underlying storage are copied regardless of whether that is actually necessary. A smarter copy assignment could do something like this:

TreeVertex& operator=(TreeVertex const& other) {
    if (this == &other) return *this;  // Self-assignment check

    auto const common_size = std::min(children.size(), other.children.size());

    for (size_t i = 0; i < common_size; ++i) {
        *children[i] = *other.children[i];
    }

    for (size_t i = common_size; i < children.size(); ++i) {
        delete children[i];
    }
    children.resize(common_size);
    children.reserve(other.children.size());
    for (size_t i = common_size; i < other.children.size(); ++i) {
        children.push_back(new TreeVertex(*other.children[i]));
    }

    return *this;
}

It’s certainly more complex but achieves something important: the storage that was already present in the copied-to object can be reused, partially or fully, depending on who has more children. For our example in particular, the gains are as follows:

Given N = children.size() and M = other.children.size():

                    Textbook copy-and-swap    Custom copy function
calls to new        M                         max(0, M-N)
calls to delete     N                         max(0, N-M)

Also note that these gains are recursive, since each child has its own children. Assigning a subtree to itself is completely free with the custom copy function.
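The custom column can be checked with a little instrumentation. In this sketch, ReusingVertex, the make and destroy helpers, and the global g_counts are hypothetical additions that count direct calls to new and delete. The children are leaves, so only the top-level numbers from the table show up.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical instrumentation: counts calls to new and delete.
struct AllocationCounts {
    int news = 0;
    int deletes = 0;
};
inline AllocationCounts g_counts;

class ReusingVertex {
    std::vector<ReusingVertex*> children;

    static ReusingVertex* make(ReusingVertex const& source) {
        ++g_counts.news;
        return new ReusingVertex(source);
    }
    static void destroy(ReusingVertex* vertex) {
        ++g_counts.deletes;
        delete vertex;
    }
public:
    ReusingVertex() = default;

    ReusingVertex(ReusingVertex const& other) {
        children.reserve(other.children.size());
        for (auto const* vertex : other.children) children.push_back(make(*vertex));
    }

    // The storage-reusing copy assignment from above.
    ReusingVertex& operator=(ReusingVertex const& other) {
        if (this == &other) return *this;
        auto const common_size = std::min(children.size(), other.children.size());
        for (std::size_t i = 0; i < common_size; ++i) *children[i] = *other.children[i];
        for (std::size_t i = common_size; i < children.size(); ++i) destroy(children[i]);
        children.resize(common_size);
        children.reserve(other.children.size());
        for (std::size_t i = common_size; i < other.children.size(); ++i) {
            children.push_back(make(*other.children[i]));
        }
        return *this;
    }

    ~ReusingVertex() {
        for (auto* vertex : children) destroy(vertex);
    }

    void add_child() { children.push_back(make(ReusingVertex{})); }
};

// Assigns a vertex with m leaf children to one with n leaf children and
// returns the new/delete calls made by the assignment alone.
AllocationCounts counts_for_assignment(std::size_t n, std::size_t m) {
    ReusingVertex target, source;
    for (std::size_t i = 0; i < n; ++i) target.add_child();
    for (std::size_t i = 0; i < m; ++i) source.add_child();
    g_counts = AllocationCounts{};  // ignore the setup allocations
    target = source;
    return g_counts;                // captured before target and source are destroyed
}
```

With N = 3 and M = 1 the assignment makes no calls to new and two to delete; with N = 1 and M = 3 it is two calls to new and none to delete, matching max(0, M-N) and max(0, N-M).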

Know this caveat but always remember: premature optimization is the root of all evil.


Thank you Anna Zhukova for your contribution to the blog! Anna works on Qodana for C++. Try it today if you haven’t already or view the documentation.
