Safe resource allocation and deallocation in C++ (C++11 and C++03)

Introduction

Before about 2010, the usual way to make sure a C++ program wasn't leaking memory was to run it under Valgrind. Modern C++ offers a way of writing code that mostly removes that need. If you're writing C++11 (and even if you're not) and you still have to keep checking whether your code leaks memory, then you're most likely doing it wrong. Keep reading if you want to know the approach that saves you the trouble of memory-checking all the time, and that uses true object-oriented programming.

Basically, this article is about the concept named RAII, Resource Acquisition Is Initialization, and why it's important from my perspective.

The golden rules of why you should do this

My Golden Rules in C++ development:

  1. Humans make mistakes, and only an arrogant person would claim otherwise (why else do we have something called "error handling"?)
  2. Things can go wrong, no matter how robust your program is, and how careful you are, and you can't plan every possible outcome
  3. A problem-prone program is no better than a program with a problem; so why bother writing a program with problems?

Once you embrace these 3 rules, you'll never write bad code, because your code will be ready for the worst-case scenario.

In the last few years, I haven't found a single lost byte in my Valgrind-analysed programs, and I'm talking about big projects, not single-class exercises. The difference will be clear soon.

I'm going to start from simple cases, up to more complicated scenarios.

Scenario 1: If you're creating and deleting objects under pointers

Consider the following example:

#include <iostream>

void DoSomethingElse(int* x)
{
    std::cout << *x << std::endl;
    //... do stuff with x
}
void DoSomething()
{
    int* x = new int(0); //initialized, so that printing it is well-defined
    DoSomethingElse(x);
    delete x;
}

Let me make this as clear as possible: If you ever, ever use a new followed by delete… you're breaking all the 3 rules we made up there. Why? Here are the rules and how you're breaking them:

1. You may make the mistake of forgetting to write that delete.
2. DoSomethingElse() might throw an exception, and hence that delete may not be called.
3. This is a problem-prone design, so it's a program with a problem.
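
Point 2 can be made visible without Valgrind. The following is a minimal sketch, not part of the original example: `Tracked`, `mayThrow`, `leakyDoSomething` and `countLeakedObjects` are hypothetical names invented for this illustration, and the instance counter exists only so that the leak can be observed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper type: counts how many instances are currently alive.
struct Tracked
{
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Stand-in for DoSomethingElse(): always fails, to simulate an error.
void mayThrow(Tracked*)
{
    throw std::runtime_error("something went wrong");
}

void leakyDoSomething()
{
    Tracked* x = new Tracked;
    mayThrow(x); // throws, so control never reaches the next line
    delete x;    // skipped: the object is leaked
}

// Returns the number of Tracked objects still alive after the failed call.
int countLeakedObjects()
{
    assert(Tracked::alive == 0); // nothing allocated yet
    try
    {
        leakyDoSomething();
    }
    catch (const std::runtime_error&)
    {
        // the error is handled, but the object is already lost
    }
    return Tracked::alive;
}
```

Calling `countLeakedObjects()` reports one object that nobody can delete any more, even though the code contains a matching delete for its new.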

The right way to do this: Smart pointers!

What are smart pointers?

I'm sure you've heard of them before, but if you haven't, the idea is very simple. If you define, for example, an integer like this:

void SomeFunction()
{
    int i = 0;
    // do stuff with i
} //here you're going out of the scope of the function

You never worry about deleting i. The reason is that i is a local variable: once it goes out of scope, it's destroyed automatically (for class types, this is the moment the destructor runs). Smart pointers work the same way. They wrap your pointer, so that the object it points to is deleted once the smart pointer goes out of scope. Let's look at our function DoSomething() again with smart pointers:

#include <memory>

void DoSomething()
{
    std::unique_ptr<int> x(new int(0)); //line 1
    DoSomethingElse(x.get());           //line 2
} //once gone out of scope, x will be deleted automatically,
  //the destructor of unique_ptr will delete the integer

That's all you have to change, and you're done! In line 1, you're creating a unique_ptr, which encapsulates your pointer. The reason why it's "unique" will soon become clear. Once a unique_ptr goes out of scope, it deletes the object under it, even if the scope is left because of an exception. So you don't have to worry! This way, the 3 Golden Rules are served. In line 2, we're using x.get() instead of x, because the get() method returns the raw pointer stored inside the unique_ptr. If you'd like to delete the object manually, use the method x.reset(). The method reset() can take another pointer, which the unique_ptr owns from then on after deleting the old object, or it can be called with no argument, which leaves the unique_ptr holding nullptr.

PS: unique_ptr is C++11. If you're using C++03, the boost library offers alternatives, such as boost::scoped_ptr or the emulated boost::movelib::unique_ptr from Boost.Move.
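
Here is a small sketch of that behaviour, under the assumption that the called function fails. `Tracked`, `mayThrow`, `safeDoSomething`, `aliveAfterFailure` and `resetDemo` are hypothetical names used only for this illustration; the counter makes it possible to check that nothing is leaked.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical helper type: counts how many instances are currently alive.
struct Tracked
{
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Takes a raw, non-owning pointer and simulates a failure.
void mayThrow(Tracked* t)
{
    if (t != nullptr)
    {
        throw std::runtime_error("something went wrong");
    }
}

void safeDoSomething()
{
    std::unique_ptr<Tracked> x(new Tracked);
    mayThrow(x.get()); // get() hands out the raw pointer, ownership stays with x
} // x's destructor runs during stack unwinding and deletes the object

int aliveAfterFailure()
{
    try
    {
        safeDoSomething();
    }
    catch (const std::runtime_error&)
    {
    }
    return Tracked::alive;
}

bool resetDemo()
{
    std::unique_ptr<Tracked> x(new Tracked);
    x.reset(new Tracked); // deletes the first object, then owns the second
    const bool oneAlive = (Tracked::alive == 1);
    x.reset();            // deletes the second object, x now holds nullptr
    return oneAlive && Tracked::alive == 0 && x == nullptr;
}
```

The only difference from the raw-pointer version is the type of x, yet the object is released on every path out of the function.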

Why is it called "unique"?

Generally, multiple pointers can point to the same object. So, going back to the initial example, the following is a possible scenario:

void DoSomething()
{
    int* x = new int(1);
    int* y = x; //now x and y, both, point to the same integer
    std::cout << *x << "\t" << *y << std::endl; //both will print 1
    *x = *x + 1; //add 1 to the object under x
    std::cout << *x << "\t" << *y << std::endl; //both will print 2
    delete x; //you delete only 1 object, not 2!
}

But can you do this with unique_ptr? The answer is *no*! That's why it's called unique, because it's a pointer that holds complete *ownership* of the object under it (the integer, in our case), and it's unique in that. If you try to do this:

void DoSomething()
{
    std::unique_ptr<int> x(new int);
    std::unique_ptr<int> y = x; //compile error!
}  

your program won't compile! Think about it… if this were to compile, who should delete the pointer when the function ends, x or y? It's ambiguous and dangerous. In fact, this is exactly why auto_ptr was deprecated in C++11 (and later removed in C++17). It allowed the copy shown above, but the copy silently *moved* the object out of the source pointer and left it empty. This was dangerous and semantically unclear, which is why it was deprecated.

On the other hand, you can move an object from one unique_ptr to another! Here's how:

void DoSomething()
{
    std::unique_ptr<int> x(new int);
    std::unique_ptr<int> y = std::move(x);
    //now x is empty, and y has the integer, 
    //and y is responsible for deleting the integer
}

With std::move(x), you cast x to an rvalue reference, indicating that its contents can safely be moved from. After the move, x holds nullptr.
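
The copy/move distinction can also be checked at compile time. The sketch below is illustrative only: `consume` and `moveDemo` are hypothetical names, and `consume` shows a common way of using the move, namely a function that takes ownership by accepting the unique_ptr by value.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// unique_ptr cannot be copied, only moved; both facts are checked by the compiler.
static_assert(!std::is_copy_constructible<std::unique_ptr<int>>::value,
              "unique_ptr must not be copyable");
static_assert(std::is_move_constructible<std::unique_ptr<int>>::value,
              "unique_ptr must be movable");

// Hypothetical sink function: taking the unique_ptr by value forces the
// caller to hand over ownership explicitly with std::move.
int consume(std::unique_ptr<int> p)
{
    return *p;
} // p is destroyed here and deletes the integer

int moveDemo()
{
    std::unique_ptr<int> x(new int(42));
    std::unique_ptr<int> y = std::move(x);
    assert(x == nullptr);             // a moved-from unique_ptr is empty
    assert(y != nullptr && *y == 42); // y now owns the integer
    return consume(std::move(y));     // ownership moves on into the function
}
```

At every point in this code exactly one unique_ptr owns the integer, which is the whole meaning of "unique".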

Shared pointers

Since we established that unique pointers are "unique", let's introduce the solution to the case where multiple smart pointers can point to the same object. The answer is: shared_ptr. Here's the same example:

void DoSomething()
{
    std::shared_ptr<int> x(new int(2)); //the value of *x is 2
    std::shared_ptr<int> y = x; //this is valid!
    //now both x and y point to the integer
}

Who is responsible for deleting the object now? x or y? Generally, either of them! Shared pointers work by sharing a common reference counter, which counts how many shared_ptrs point to the same object. Once the counter drops to zero (i.e., the last shared_ptr goes out of scope), that last shared_ptr deletes the object.

In fact, using the new operator manually is highly discouraged. The alternative is to use make_shared, which covers some corner cases of possible memory leaks. For example, when a shared_ptr is built from new inside a function call whose other arguments may throw an exception, then before C++17 the compiler was allowed to evaluate the new expression first, then the throwing argument, and only then the shared_ptr constructor, which leaks the object. Here's how make_shared is used:
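
The counter can be observed through use_count(). This is a minimal sketch for illustration; `Tracked` and `sharedCountDemo` are hypothetical names, and the instance counter is only there to show when the object is actually deleted.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper type: counts how many instances are currently alive.
struct Tracked
{
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

long sharedCountDemo()
{
    std::shared_ptr<Tracked> x(new Tracked);
    assert(x.use_count() == 1);
    {
        std::shared_ptr<Tracked> y = x; // second owner of the same object
        assert(x.use_count() == 2);
        assert(Tracked::alive == 1);    // still only one object
    } // y is destroyed, the counter drops to 1, the object stays alive
    assert(Tracked::alive == 1);
    const long countBeforeReset = x.use_count();
    x.reset();                          // last owner gives up: object deleted
    assert(Tracked::alive == 0);
    return countBeforeReset;
}
```

Note that use_count() is meant for this kind of inspection and debugging; program logic should normally not depend on it.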

void DoSomething()
{
    std::shared_ptr<int> x = std::make_shared<int>(2); //the value of *x is 2
    std::shared_ptr<int> y = x; //this is valid!
    //now both x and y point to the integer
}
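
To illustrate the kind of corner case make_shared avoids, here is a sketch with a class that has a multi-parameter constructor. `Connection`, `loadPriority`, `openConnection` and `makeSharedDemo` are hypothetical names made up for this example. The risky call is shown only as a comment: before C++17, the compiler was allowed to evaluate the new expression, then the other argument, and only then the shared_ptr constructor, so an exception from the other argument would leak the object.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical class with a multi-parameter constructor.
struct Connection
{
    std::string host;
    int port;
    Connection(const std::string& h, int p) : host(h), port(p) {}
};

// Hypothetical functions, used only to show the shape of the call.
int loadPriority()
{
    return 7; // imagine that this could throw
}

int openConnection(std::shared_ptr<Connection> c, int priority)
{
    return c->port + priority;
}

int makeSharedDemo()
{
    // Risky before C++17 if loadPriority() throws between the `new`
    // and the construction of the shared_ptr:
    //   openConnection(std::shared_ptr<Connection>(new Connection("localhost", 8080)),
    //                  loadPriority());

    // With make_shared, there is never a raw pointer without an owner.
    return openConnection(std::make_shared<Connection>("localhost", 8080),
                          loadPriority());
}
```

make_shared simply forwards its arguments to the constructor, so multi-parameter constructors need no special treatment.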

 

Note: Shared pointers change a fundamental aspect of C++, which is "ownership". When using shared pointers, it may be easy to lose track of the object. This is a common problem in asynchronous applications. This is a story for another day though.

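Note 2: The reference counter of shared_ptr is thread-safe, so you can hand copies of a shared_ptr to different threads with no problems. However, the thread-safety of the underlying object it points to is your responsibility, and so is any concurrent modification of one and the same shared_ptr instance.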

Scenario 2: I don't have C++11 and I can't use boost

This is a common scenario in organizations that maintain very old software. The solution to this is very easy. Write your own smart pointer class. How hard can it be? Here's a simple quick-and-dirty example that works:

#include <cstddef> //for NULL

template <typename T>
class SmartPtr
{
    T* ptr;
    // disable copying: copy-construction and assignment are private and
    // intentionally left undefined (the C++03 idiom)
    SmartPtr(const SmartPtr& other);
    SmartPtr& operator=(const SmartPtr& other);
public:
    explicit SmartPtr(T* the_ptr = NULL) : ptr(the_ptr)
    {
    }
    ~SmartPtr()
    {
        reset();
    }
    void reset(T* the_ptr = NULL)
    {
        delete ptr; //deleting a null pointer is a safe no-op
        ptr = the_ptr;
    }
    T* get() const //get the pointer
    {
        return ptr;
    }
    T& operator*() const
    {
        return *ptr;
    }
    T* operator->() const //access members of the object
    {
        return ptr;
    }
    T* release() //release ownership of the pointer
    {
        T* ptr_to_return = ptr;
        ptr = NULL;
        return ptr_to_return;
    }
};

 

and that's it! The one method I haven't explained is release(). It simply gives up the pointer without deleting it. So it's a way to tell the smart pointer: "Give me the pointer, and forget about deleting it; I'll take care of that myself".
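
std::unique_ptr offers a release() with the same meaning, so the idea can be sketched with it. In the example below, `legacyTakeOwnership` is a hypothetical function standing in for any older API that takes a raw pointer and frees it itself; `releaseDemo` is likewise made up for the illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical legacy API: takes ownership of a raw pointer and deletes it.
void legacyTakeOwnership(int* p, int* lastValueSeen)
{
    *lastValueSeen = *p;
    delete p;
}

bool releaseDemo()
{
    std::unique_ptr<int> x(new int(5));
    int seen = 0;
    // release() returns the raw pointer and leaves x empty, so the integer
    // is deleted exactly once, by the legacy function.
    legacyTakeOwnership(x.release(), &seen);
    return x == nullptr && seen == 5;
}
```

Passing x.get() instead of x.release() here would be a bug, because both the legacy function and the destructor of x would then delete the same integer.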

You can now use this class exactly like you use unique_ptr. Creating your own shared_ptr is a little more complicated though, and depends on your needs. Here are the questions you need to ask yourself when designing it:

  1. Do you need multithreading support? shared_ptr supports thread-safe reference counting.
  2. Do you need to just count references, or also track them? For some cases, one might need to track all references with something like a vector of references or a map.
  3. Do you need to support release()? Releasing is not supported in shared_ptr: since ownership depends on reference counting, there's no way to tell the other instances to release.

More requirements mean more work, especially since multithreading was not part of the C++ standard before C++11, meaning that you'll have to use system-specific APIs.
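
To give an idea of the simplest possible design, here is a minimal sketch that answers the three questions with "no": it is not thread-safe, it only counts references, and it has no release(). It is written in C++03 style. `CountedPtr` and `countedPtrDemo` are hypothetical names, and this is an illustration of plain reference counting, not a reproduction of any existing implementation.

```cpp
#include <cassert>
#include <cstddef>

// Minimal reference-counted pointer. NOT thread-safe.
template <typename T>
class CountedPtr
{
    T* ptr;
    long* count; // shared by all copies; NULL when this pointer is empty

    void drop() // give up this reference; delete the object if it was the last
    {
        if (count != NULL && --*count == 0)
        {
            delete ptr;
            delete count;
        }
        ptr = NULL;
        count = NULL;
    }
public:
    explicit CountedPtr(T* p = NULL) : ptr(p), count(NULL)
    {
        if (p != NULL)
        {
            try { count = new long(1); }
            catch (...) { delete p; throw; } // don't leak p if the counter can't be allocated
        }
    }
    CountedPtr(const CountedPtr& other) : ptr(other.ptr), count(other.count)
    {
        if (count != NULL) ++*count;
    }
    CountedPtr& operator=(const CountedPtr& other)
    {
        if (this != &other)
        {
            drop();
            ptr = other.ptr;
            count = other.count;
            if (count != NULL) ++*count;
        }
        return *this;
    }
    ~CountedPtr() { drop(); }
    T* get() const { return ptr; }
    T& operator*() const { return *ptr; }
    T* operator->() const { return ptr; }
    long use_count() const { return count != NULL ? *count : 0; }
};

long countedPtrDemo()
{
    CountedPtr<int> a(new int(10));
    {
        CountedPtr<int> b(a); // second reference to the same integer
        assert(a.use_count() == 2);
        *b = 11;
    } // b is destroyed, the integer survives because a still refers to it
    assert(*a == 11);
    return a.use_count();
}
```

Each of the three questions above adds machinery to this skeleton: thread safety needs an atomic or locked counter, tracking needs a container in place of the plain counter, and release() needs a way to reach the other instances.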

For a strictly single-threaded application with C++03, I created a shared pointer implementation that supports releasing. Here's the source code.

Scenario 3: Enable a flag, do something, then disable it again

Consider the following code, which is common in GUI applications:

void GUIClass::addTheFiles(const std::vector<FileType>& files)
{
    this->disableButton();
    for(unsigned i = 0; i < files.size(); i++)
    {
        fileManager.addFile(files[i]);
    }
    this->enableButton();
}

While this looks like a legitimate way to do things, it's not. This is absolutely no different from the pointer situation. What if adding fails, either because of a memory problem or because of some other exception? The function will exit without re-enabling that button, and your program will become unusable and the user will probably have to restart it.

Solution? Just like before. Don't do it yourself, and get the destructor of some class to do it for you. Let's do this. What do we need? We need a class that stores a function and calls it when the object goes out of scope. Consider the following class:

#include <functional>

class AutoHandle
{
    std::function<void()> func;
    bool done = false; //used to make sure the call is done only once
    // disable copying and moving
    AutoHandle(const AutoHandle& other) = delete;
    AutoHandle& operator=(const AutoHandle& other) = delete;
    AutoHandle(AutoHandle&& other) = delete;
    AutoHandle& operator=(AutoHandle&& other) = delete;
public:
    explicit AutoHandle(const std::function<void()>& the_func) : func(the_func)
    {
    }
    void doCall()
    {
        if(!done)
        {
            done = true; //set first, so that a throwing func is never called twice
            func();
        }
    }
    ~AutoHandle()
    {
        doCall(); //func must not throw here: destructors are noexcept by default
    }
};

 

Let's use it!

void GUIClass::addTheFiles(const std::vector<FileType>& files)
{
    this->disableButton();
    //the lambda contains the function to be called on exit
    AutoHandle ah([this](){this->enableButton();});
    for(unsigned i = 0; i < files.size(); i++)
    {
        fileManager.addFile(files[i]);
    }
} //Now, the function enableButton() will definitely be called on exit.

 

This way, you guarantee that enableButton() will be called when the function exits. All of this is C++11. Doing it in C++03 is not impossible, though I completely sympathize with you if you feel it's too much work for such a simple task, because:

  1. Since there's no std::function in C++03, we have to make the class a template that accepts functors (function objects)
  2. Since there are no lambda functions in C++03, we have to write a new functor for every case (depending on how much you would like to toy with templates, which is another big topic)

Just for completeness, here's how you could use AutoHandle in C++03 with a Functor:

template <typename CallFunctor, typename T>
class AutoHandle
{
    bool done; //used to make sure the call is done only once
    CallFunctor func;
    // disable copying by making assignment and copy-construction private
    // (declared but intentionally left undefined, the C++03 idiom)
    AutoHandle(const AutoHandle& other);
    AutoHandle& operator=(const AutoHandle& other);
public:
    AutoHandle(T* caller) : func(CallFunctor(caller))
    {
        done = false;
    }
    void doCall()
    {
        if(!done)
        {
            func();
            done = true;
        }
    }
    ~AutoHandle()
    {
        doCall();
    }
};

struct DoEnableButtonFunctor
{
    GUIClass* this_ptr;
    DoEnableButtonFunctor(GUIClass* thisPtr)
    {
        this_ptr = thisPtr;
    }
    void operator()()
    {
        this_ptr->enableButton();
    }
};

 

Here's how you can use this:

void GUIClass::addTheFiles(const std::vector<FileType>& files)
{
    this->disableButton();
    AutoHandle<DoEnableButtonFunctor,GUIClass> ah(this); //functor will be called on exit
    for(unsigned i = 0; i < files.size(); i++)
    {
        fileManager.addFile(files[i]);
    }
} //Now, the function enableButton() will definitely be called on exit.

Again, writing a functor for every case is a little painful, so whether it's worth it depends on the specific case. However, in C++11 projects, there's no excuse. You can easily make your code way more reliable with lambdas.

Remember, you're not bound to the destructor to do the calls. You can also call doCall() yourself anywhere (equivalent to reset() in unique_ptr). But the destructor will *guarantee* that the worst case scenario is covered if something went wrong.
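
Both paths can be exercised in a small self-contained sketch. `ScopeGuard` is a condensed guard in the spirit of AutoHandle, and `FakeButton`, `addFilesThatFails`, `addFilesEarlyEnable` and `guardDemo` are hypothetical stand-ins for the GUI code, invented so that the example runs without a GUI toolkit.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Condensed guard in the spirit of AutoHandle (the name is hypothetical).
class ScopeGuard
{
    std::function<void()> func;
    bool done;
public:
    explicit ScopeGuard(const std::function<void()>& f) : func(f), done(false) {}
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;
    void doCall()
    {
        if (!done)
        {
            done = true; // marked first, so the call happens at most once
            func();
        }
    }
    ~ScopeGuard() { doCall(); }
};

// Hypothetical stand-in for a GUI button.
struct FakeButton
{
    bool enabled;
    int enableCalls;
};

void addFilesThatFails(FakeButton& button)
{
    button.enabled = false;
    ScopeGuard guard([&button]() { button.enabled = true; ++button.enableCalls; });
    throw std::runtime_error("adding a file failed");
} // the guard re-enables the button during stack unwinding

void addFilesEarlyEnable(FakeButton& button)
{
    button.enabled = false;
    ScopeGuard guard([&button]() { button.enabled = true; ++button.enableCalls; });
    guard.doCall(); // re-enable early, by hand
    assert(button.enabled);
} // the destructor runs too, but the call is not repeated

bool guardDemo()
{
    FakeButton button = {true, 0};
    try
    {
        addFilesThatFails(button);
    }
    catch (const std::runtime_error&)
    {
    }
    const bool reenabledAfterThrow = button.enabled && button.enableCalls == 1;
    addFilesEarlyEnable(button);
    return reenabledAfterThrow && button.enabled && button.enableCalls == 2;
}
```

The button ends up enabled in both cases, and the cleanup function runs exactly once per guard.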

Scenario 4: Opening and closing resources

This could even be more dangerous than the previous cases. Consider the following:

void ReadData()
{
    int handle = OpenSerialPort("COM3");
    ReadData(handle);
    Close(handle);
}

 

This form is quite common in old libraries. I ran into it with the HDF5 library. If you read the previous sections, you'll see the problem with such usage and how to fix it. It's all the same. You *should never* close resources manually. For my HDF5 problem, I wrote a SmartHandle class that guarantees that HDF5 resources are correctly closed. Find it here. Of course, the formally right way to do this is to write a whole wrapper for the library. This may be overkill depending on your project constraints.
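
As an illustration of the idea (not the SmartHandle class mentioned above), here is a minimal sketch of a handle wrapper. The `fake_serial` namespace contains stubs that stand in for a C-style library with open/read/close functions; these stubs, the `SerialPort` class and the other names are assumptions made for this example.

```cpp
#include <cassert>
#include <stdexcept>

// Stubs standing in for a C-style library; all names here are hypothetical.
namespace fake_serial
{
    int openHandles = 0; // lets us observe whether every handle was closed

    int Open(const char*) { ++openHandles; return 3; }
    void Close(int) { --openHandles; }
    void Read(int) { throw std::runtime_error("read failed"); } // simulated failure
}

// Hypothetical wrapper: opens the handle in the constructor and closes it
// in the destructor, so closing can't be forgotten or skipped.
class SerialPort
{
    int handle;
public:
    explicit SerialPort(const char* name) : handle(fake_serial::Open(name)) {}
    SerialPort(const SerialPort&) = delete;            // a handle has one owner
    SerialPort& operator=(const SerialPort&) = delete;
    ~SerialPort() { fake_serial::Close(handle); }
    int get() const { return handle; }
};

void readData()
{
    SerialPort port("COM3");
    fake_serial::Read(port.get()); // throws
} // port is closed during stack unwinding

int openHandlesAfterFailure()
{
    try
    {
        readData();
    }
    catch (const std::runtime_error&)
    {
    }
    return fake_serial::openHandles;
}
```

A real wrapper would also have to check whether opening succeeded and decide what to do when closing fails; both depend on the library in question.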

Notes on Valgrind

If you follow these rules, you'll be 100% safe with the resources you use. You will rarely, if ever, need to use Valgrind. However, when you write the classes that we talked about (such as AutoHandle, SmartPtr, etc), it's very, very important not only to test them with Valgrind, but also to write good tests that cover every corner case. Because once you get these classes right, you never have to worry about them. If you get them wrong, the consequences could be catastrophic. Surprising? Welcome to object-oriented programming! This is exactly what "separation of concerns" means.

Conclusion

Whenever you have to do something and then undo it, keep in mind that the undoing shouldn't be done manually. Sometimes it's safe and trivial, but many times it simply leads to bad and error-prone design. I covered a few cases and different ways to tackle the issue. By following these examples, I guarantee that your code will become more compact (given that you're using C++11) and way more reliable.

Just a side-note on standards and rules

Sometimes when I discuss such issues related to modern designs, people say that they have old code and don't want to change its style, for the sake of consistency. Other times they say that this would violate some standard issued by some authority. None of these reasons shows that doing or not doing this is right or wrong. They just show that someone doesn't want to do it because they're "following the rules". To me (and I can't emphasize enough that this is my opinion), it's like asking someone "Why do you do this twice a day, how does it help you?", and getting the answer "Because my religion tells me I have to do it". While I respect all ideologies, such an answer is not a rational one that weighs the pros and cons of doing something. It doesn't tell why doing that twice a day helps that person's health, physically, mentally or otherwise. They are just following a rule *that shouldn't be discussed*. I'm a rational person and I like discussing things by putting them on the table, which helps in achieving the best outcome. If someone's answer is "I have a standard that I need to follow", then that effectively and immediately closes the discussion. There's nothing else to add. Please note that I'm not encouraging breaking the rules here. If your superior tells you how to do it, and you couldn't convince them otherwise, then just do what they tell you, because most likely your superior has a broader picture of other aspects of the project that don't depend on code only.

 
