Introduction
Before 2010 or so, the standard way to make sure a C++ program wasn't leaking memory was to run it under Valgrind. Modern C++ offers a style of programming that makes this largely unnecessary. If you're writing C++11 (and even if you're not) and you still have to keep checking whether your code leaks memory, then you're most likely doing it wrong. Keep reading if you want to know the right way, one that saves you the trouble of memory-checking all the time and uses true object-oriented programming.
Basically, this article is about the concept named RAII, Resource Acquisition Is Initialization, and why it's important from my perspective.
The golden rules of why you should do this
My Golden Rules in C++ development:
- Humans make mistakes, and only an arrogant person would claim otherwise (why else do we have something called "error handling"?)
- Things can go wrong, no matter how robust your program is, and how careful you are, and you can't plan every possible outcome
- A problem-prone program is no better than a program with a problem; so why bother writing a program with problems?
Once you embrace these 3 rules, you'll stop writing fragile code, because your code will be ready for the worst-case scenario.
In the last few years, I haven't found a single lost byte in my Valgrind-analysed programs, and I'm talking about big projects, not single classes. The difference will be clear soon.
I'm going to start from simple cases, up to more complicated scenarios.
Scenario 1: If you're creating and deleting objects under pointers
Consider the following example:
#include <iostream>

void DoSomethingElse(int* x)
{
    std::cout << *x << std::endl;
    //... do stuff with x
}
void DoSomething()
{
    int* x = new int(0); //initialized, so that printing it is well-defined
    DoSomethingElse(x);
    delete x;
}
Let me make this as clear as possible: if you ever, ever use a new followed by a delete, you're breaking all 3 rules we made up there. Why? Here are the rules and how you're breaking them:
1. You may make the mistake of forgetting to write that delete.
2. DoSomethingElse() might throw an exception, and hence that delete may not be called.
3. This is a problem-prone design, so it's a program with a problem.
The right way to do this: Smart pointers!
What are smart pointers?
I'm sure you've heard of them before, but if you haven't, the idea is very simple. If you define, for example, an integer like this:
void SomeFunction()
{
    int i = 0;
    // do stuff with i
} //here you're going out of the scope of the function
You never worry about deleting i. The reason is that once i goes out of scope, it's cleaned up automatically (for class types, through the destructor). Smart pointers are just the same. They wrap your pointer, such that the object it points to is deleted once the smart pointer goes out of scope. Let's look at our function DoSomething() again with smart pointers:
void DoSomething()
{
    std::unique_ptr<int> x(new int); //line 1
    DoSomethingElse(x.get()); //line 2
} //once gone out of scope, x will be deleted automatically,
  //the destructor of unique_ptr will delete the integer
That's all the change you have to make, and you're done! In line 1, you're creating a unique_ptr, which will encapsulate your pointer. The reason why it's "unique" will soon become clear. Once a unique_ptr goes out of scope, it'll delete the object under it. So you don't have to worry! This way, the 3 Golden Rules are served. In line 2, we're using x.get() instead of x, because the get() method returns the raw pointer stored inside the unique_ptr. If you'd like to delete the object manually, use the method x.reset(). The method reset() can take another pointer to take ownership of (after deleting the current object), or can be called with no arguments, which leaves the unique_ptr holding nullptr.
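To make the guarantee a bit more tangible, here is a minimal sketch that counts live objects instead of asking Valgrind. The names Tracked, useTracked, safeDoSomething, objectsLeftAfterCall and resetDeletesEarly are made up for this illustration and don't come from any library; the only real API used is std::unique_ptr with its get() and reset() methods.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical helper type: counts how many instances are currently alive.
struct Tracked
{
    static int liveObjects;
    Tracked()  { ++liveObjects; }
    ~Tracked() { --liveObjects; }
};
int Tracked::liveObjects = 0;

// Stand-in for DoSomethingElse(): takes a raw pointer and may fail.
void useTracked(Tracked* t, bool fail)
{
    assert(t != nullptr);
    if (fail)
        throw std::runtime_error("something went wrong");
}

void safeDoSomething(bool fail)
{
    std::unique_ptr<Tracked> t(new Tracked);
    useTracked(t.get(), fail); // get() hands out the raw pointer; ownership stays with t
} // t's destructor deletes the object on normal exit and during stack unwinding

// Returns the number of objects still alive after one call.
int objectsLeftAfterCall(bool fail)
{
    try
    {
        safeDoSomething(fail);
    }
    catch (const std::runtime_error&)
    {
        // the exception reached us, but the object was already deleted
    }
    return Tracked::liveObjects;
}

// Returns true if reset() deleted the object before the end of the scope.
bool resetDeletesEarly()
{
    std::unique_ptr<Tracked> t(new Tracked);
    const int before = Tracked::liveObjects;
    t.reset(); // deletes the object now; t holds nullptr afterwards
    return Tracked::liveObjects == before - 1 && t.get() == nullptr;
}
```

Whether the call fails or not, objectsLeftAfterCall() should report zero live objects, which is the whole point: the cleanup no longer depends on reaching a particular line.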
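PS: unique_ptr is C++11 (declared in the header <memory>). If you're using C++03, you could use the unique_ptr emulation from the Boost library (boost::movelib::unique_ptr), or boost::scoped_ptr if you don't need to transfer ownership.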
Why is it called "unique"?
Generally, multiple pointers can point to the same object. So, going back to the initial example, the following is a possible scenario:
void DoSomething()
{
int* x = new int(1);
int* y = x; //now x and y, both, point to the same integer
std::cout << *x << "\t" << *y << std::endl; //both will print 1
*x = *x + 1; //add 1 to the object under x
std::cout << *x << "\t" << *y << std::endl; //both will print 2
delete x; //you delete only 1 object, not 2!
}
But can you do this with unique_ptr? The answer is *no*! That's why it's called unique, because it's a pointer that holds complete *ownership* of the object under it (the integer, in our case), and it's unique in that. If you try to do this:
void DoSomething()
{
    std::unique_ptr<int> x(new int);
    std::unique_ptr<int> y = x; //compile error!
}
your program won't compile! Think about it… if this were to compile, who should delete the pointer when the function ends, x or y? It's ambiguous and dangerous. In fact, this is exactly why auto_ptr was deprecated in C++11 (and removed in C++17). It allowed the copy shown above, and that copy silently *moved* the object out of the source pointer. This was dangerous and unclear semantically, which is why it's deprecated.
On the other hand, you can move an object from one unique_ptr to another! Here's how:
void DoSomething()
{
std::unique_ptr<int> x(new int);
std::unique_ptr<int> y = std::move(x);
//now x is empty, and y has the integer,
//and y is responsible for deleting the integer
}
With std::move(x), you cast x to an rvalue reference, indicating that its contents can be safely moved out.
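The ownership transfer described above can be checked directly. This is only a sketch; consume() and moveTransfersOwnership() are hypothetical names chosen for the example. It relies on the standard guarantee that a moved-from unique_ptr compares equal to nullptr.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Takes ownership: whoever calls this gives up the integer.
int consume(std::unique_ptr<int> p)
{
    return *p;
} // p goes out of scope here and deletes the integer

// Returns true if ownership moved the way the text describes.
bool moveTransfersOwnership()
{
    std::unique_ptr<int> x(new int(42));
    int* raw = x.get();

    std::unique_ptr<int> y = std::move(x);
    // x is now empty; y holds the very same integer
    const bool movedToY = (x == nullptr) && (y.get() == raw);

    const int value = consume(std::move(y));
    // y is empty as well; the integer was deleted inside consume()
    return movedToY && (y == nullptr) && (value == 42);
}
```

Passing a unique_ptr by value, as consume() does, is a common way to say in the signature itself that the function takes over the object.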
Shared pointers
Since we established that unique pointers are "unique", let's look at the case where multiple smart pointers need to point to the same object. The answer is: shared_ptr. Here's the same example:
void DoSomething()
{
    std::shared_ptr<int> x(new int(2)); //the value of *x is 2
    std::shared_ptr<int> y = x; //this is valid!
    //now both x and y point to the integer
}
Who is responsible for deleting the object now? x or y? Generally, any of them! The way shared pointers work is that they have a common reference counter. They count how many shared_ptrs point to the same object, and once the counter goes to zero (i.e., the last shared_ptr goes out of scope), that last shared_ptr deletes the object.
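In fact, using the new operator manually is highly discouraged. The alternative is to use make_shared, which covers some corner cases of possible memory leaks. For example, in a call like f(std::shared_ptr<A>(new A), g()), compilers before C++17 were allowed to run new A, then g(), and only then construct the shared_ptr; if g() throws in between, the A object leaks. make_shared allocates the object and takes ownership of it in a single step. Here's how make_shared is used: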
void DoSomething()
{
    std::shared_ptr<int> x = std::make_shared<int>(2); //the value of *x is 2
    std::shared_ptr<int> y = x; //this is valid!
    //now both x and y point to the integer
}
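The reference counter can be observed through use_count(). Here is a small single-threaded sketch; Tracked and lastOwnerDeletes() are hypothetical names for this illustration. In multithreaded code use_count() is only approximate, so treat it as a debugging aid rather than something to build logic on.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper type: counts how many instances are currently alive.
struct Tracked
{
    static int liveObjects;
    Tracked()  { ++liveObjects; }
    ~Tracked() { --liveObjects; }
};
int Tracked::liveObjects = 0;

// Returns true if the object lives exactly as long as its last owner.
bool lastOwnerDeletes()
{
    std::shared_ptr<Tracked> x = std::make_shared<Tracked>();
    bool ok = (x.use_count() == 1);
    {
        std::shared_ptr<Tracked> y = x; // second owner
        ok = ok && (x.use_count() == 2);
    } // y is gone, the object is not
    ok = ok && (x.use_count() == 1) && (Tracked::liveObjects == 1);
    x.reset(); // the last owner lets go, so the object is deleted
    return ok && (Tracked::liveObjects == 0);
}
```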
Note: Shared pointers change a fundamental aspect of C++, which is "ownership". When using shared pointers, it may be easy to lose track of the object. This is a common problem in asynchronous applications. This is a story for another day though.
Note 2: The reference counter of shared_ptr is thread-safe. You can pass copies of it among threads with no problems. However, the thread-safety of the underlying object it points to is your responsibility.
Scenario 2: I don't have C++11 and I can't use boost
This is a common scenario in organizations that maintain very old software. The solution to this is very easy. Write your own smart pointer class. How hard can it be? Here's a simple quick-and-dirty example that works:
#include <cstddef> //for NULL

template <typename T>
class SmartPtr
{
    T* ptr;
    // disable copying: declared private and intentionally left undefined
    SmartPtr(const SmartPtr& other);
    SmartPtr& operator=(const SmartPtr& other);
public:
    explicit SmartPtr(T* the_ptr = NULL) : ptr(the_ptr)
    {
    }
    ~SmartPtr()
    {
        reset();
    }
    void reset(T* the_ptr = NULL) //delete the current object and take over the_ptr
    {
        if(ptr != the_ptr) //resetting to the pointer we already own must not delete it
        {
            delete ptr; //deleting NULL is a no-op
            ptr = the_ptr;
        }
    }
    T* get() const //get the pointer
    {
        return ptr;
    }
    T& operator*() const
    {
        return *ptr;
    }
    T* operator->() const //member access, like a raw pointer
    {
        return ptr;
    }
    T* release() //release ownership of the pointer
    {
        T* ptr_to_return = ptr;
        ptr = NULL;
        return ptr_to_return;
    }
};
And that's it! I haven't explained the method release() yet. It simply releases the pointer without deleting it. So it's a way to tell the smart pointer: "Give me the pointer, and forget about deleting it; I'll take care of that myself".
You can now use this class much like you use unique_ptr (minus moving, which C++03 can't express). Creating your own shared_ptr is a little more complicated though, and depends on your needs. Here are the questions you need to ask yourself when designing it:
- Do you need multithreading support? shared_ptr supports thread-safe reference counting.
- Do you need to just count references, or also track them? For some cases, one might need to track all references with something like a vector of references or a map.
- Do you need to support release()? Releasing is not supported in shared_ptr: since it depends on reference counting, there's no way to tell the other instances to release.
More requirements will require more work, especially since multithreading was not in the C++ standard prior to C++11, meaning that you're going to have to use system-specific APIs.
For a strictly single-threaded application with C++03, I created a shared pointer implementation that supports releasing. Here's the source code.
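To give an idea of the simplest possible answer to the questions above (no threads, counting only, no release()), here is a minimal sketch. It is not the implementation linked above; CountedPtr, useCount(), Tracked and countedPtrDemo() are hypothetical names made up for this example, and the code sticks to C++03 constructs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical minimal shared pointer for single-threaded C++03 code.
// No thread safety, no release(), no custom deleters.
template <typename T>
class CountedPtr
{
    T* ptr;
    long* count; // shared by all copies; NULL when nothing is owned

    void drop()
    {
        if (count != NULL && --*count == 0)
        {
            delete ptr;   // last owner: delete the object...
            delete count; // ...and the counter itself
        }
        ptr = NULL;
        count = NULL;
    }
public:
    explicit CountedPtr(T* the_ptr = NULL) : ptr(the_ptr), count(NULL)
    {
        if (ptr != NULL)
        {
            try { count = new long(1); }
            catch (...) { delete ptr; throw; } // don't leak if the counter can't be allocated
        }
    }
    CountedPtr(const CountedPtr& other) : ptr(other.ptr), count(other.count)
    {
        if (count != NULL)
            ++*count; // one more owner
    }
    CountedPtr& operator=(const CountedPtr& other)
    {
        CountedPtr tmp(other); // bump the new object's count first
        std::swap(ptr, tmp.ptr);
        std::swap(count, tmp.count);
        return *this; // tmp now holds our old object and drops it
    }
    ~CountedPtr() { drop(); }

    T* get() const { return ptr; }
    T& operator*() const { return *ptr; }
    T* operator->() const { return ptr; }
    long useCount() const { return count != NULL ? *count : 0; }
};

// Hypothetical helper type: counts how many instances are currently alive.
struct Tracked
{
    static int liveObjects;
    Tracked()  { ++liveObjects; }
    ~Tracked() { --liveObjects; }
};
int Tracked::liveObjects = 0;

// Returns true if the object is deleted exactly when the last owner lets go.
bool countedPtrDemo()
{
    CountedPtr<Tracked> a(new Tracked);
    bool ok = (a.useCount() == 1);
    {
        CountedPtr<Tracked> b(a); // second owner
        ok = ok && (a.useCount() == 2) && (a.get() == b.get());
    } // b is gone, the object is not
    ok = ok && (a.useCount() == 1) && (Tracked::liveObjects == 1);
    a = CountedPtr<Tracked>(); // the last owner lets go
    return ok && (Tracked::liveObjects == 0);
}
```

The counter lives in its own heap allocation so that every copy can reach it. Supporting threads or release() would mean protecting or extending exactly that shared piece of state, which is where the extra work mentioned above comes from.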
Scenario 3: Enable a flag, do something, then disable it again
Consider the following code, which is common in GUI applications:
void GUIClass::addTheFiles(const std::vector<FileType>& files)
{
    this->disableButton();
    for(unsigned i = 0; i < files.size(); i++)
    {
        fileManager.addFile(files[i]);
    }
    this->enableButton();
}
While this looks like a legitimate way to do things, it's not. This is absolutely no different than the pointer situation. What if adding fails, either because of a memory problem or because of some other exception? The function will exit without re-enabling that button, and your program will become unusable; the user will probably have to restart it.
Solution? Just like before: don't do it yourself, and get the destructor of some class to do it for you. What do we need? A class that calls a function we hand it when it goes out of scope. Consider the following class:
class AutoHandle
{
std::function<void()> func;
bool done = false; //used to make sure the call is done only once
// disable copying and moving
AutoHandle(const AutoHandle& other) = delete;
AutoHandle& operator=(const AutoHandle& other) = delete;
AutoHandle(AutoHandle&& other) = delete;
AutoHandle& operator=(AutoHandle&& other) = delete;
public:
AutoHandle(const std::function<void()>& the_func)
{
func = the_func;
}
void doCall()
{
if(!done)
{
func();
done = true;
}
}
~AutoHandle()
{
doCall();
}
};
Let's use it!
void GUIClass::addTheFiles(const std::vector<FileType>& files)
{
this->disableButton();
AutoHandle ah([this](){this->enableButton();}); //lambda function that contains the function to be called on exit
for(unsigned i = 0; i < files.size(); i++)
{
fileManager.addFile(files[i]);
}
} //Now, the function enableButton() will definitely be called on exit.
This way, you guarantee that enableButton() will be called when the function exits. This whole thing here is C++11, but doing it in C++03 is not impossible, though I completely sympathize with you if you feel it's too much work for such a simple task, because:
- Since there's no std::function in C++03, we have to make that class a template that accepts functors (function objects)
- Since there are no lambda functions in C++03, we have to write a new functor for every case (depending on how much you would like to toy with templates, which is another big topic)
Just for completeness, here's how you could use AutoHandle in C++03 with a functor:
template <typename CallFunctor, typename T>
class AutoHandle
{
bool done; //used to make sure the call is done only once
CallFunctor func;
// disable copying: declared private and intentionally left undefined
AutoHandle(const AutoHandle& other);
AutoHandle& operator=(const AutoHandle& other);
public:
AutoHandle(T* caller) : func(CallFunctor(caller))
{
done = false;
}
void doCall()
{
if(!done)
{
func();
done = true;
}
}
~AutoHandle()
{
doCall();
}
};
struct DoEnableButtonFunctor
{
GUIClass* this_ptr;
DoEnableButtonFunctor(GUIClass* thisPtr)
{
this_ptr = thisPtr;
}
void operator()()
{
this_ptr->enableButton();
}
};
Here's how you can use this:
void GUIClass::addTheFiles(const std::vector<FileType>& files)
{
this->disableButton();
AutoHandle<DoEnableButtonFunctor,GUIClass> ah(this); //functor will be called on exit
for(unsigned i = 0; i < files.size(); i++)
{
fileManager.addFile(files[i]);
}
} //Now, the function enableButton() will definitely be called on exit.
Again, writing a functor for every case is a little painful, but depending on the specific case, you may decide. However, in C++11 projects, there's no excuse. You can easily make your code way more reliable with lambdas.
Remember, you're not bound to the destructor to do the calls. You can also call doCall() yourself anywhere (equivalent to reset() in unique_ptr). But the destructor will *guarantee* that the worst-case scenario is covered if something goes wrong.
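Both paths, the failure path and the manual doCall() path, can be exercised in a small sketch. The AutoHandle below is a condensed copy of the C++11 class from this section so that the example compiles on its own; FakeGui, enableCalls, addAndReenableEarly(), buttonSurvivesException() and manualCallRunsOnce() are hypothetical stand-ins invented for the illustration, with the button reduced to a flag.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>

// Condensed copy of the C++11 AutoHandle, so that the sketch compiles on its own.
class AutoHandle
{
    std::function<void()> func;
    bool done = false; // makes sure the call is done only once
public:
    explicit AutoHandle(std::function<void()> the_func) : func(std::move(the_func)) {}
    AutoHandle(const AutoHandle&) = delete;
    AutoHandle& operator=(const AutoHandle&) = delete;
    void doCall()
    {
        if (!done)
        {
            func();
            done = true;
        }
    }
    ~AutoHandle() { doCall(); }
};

// Hypothetical stand-in for the GUI class: the button is just a flag.
struct FakeGui
{
    bool buttonEnabled = true;
    int enableCalls = 0;
    void disableButton() { buttonEnabled = false; }
    void enableButton()  { buttonEnabled = true; ++enableCalls; }

    void addTheFiles(bool fail)
    {
        disableButton();
        AutoHandle ah([this]() { enableButton(); });
        if (fail)
            throw std::runtime_error("adding a file failed");
    } // ah's destructor re-enables the button, exception or not

    void addAndReenableEarly()
    {
        disableButton();
        AutoHandle ah([this]() { enableButton(); });
        ah.doCall(); // re-enable early; the destructor will not call it again
    }
};

// Returns true if the button is enabled again after a failed call.
bool buttonSurvivesException()
{
    FakeGui gui;
    try
    {
        gui.addTheFiles(true);
    }
    catch (const std::runtime_error&)
    {
        // the failure is reported to the caller, the button is already back
    }
    return gui.buttonEnabled && gui.enableCalls == 1;
}

// Returns true if a manual doCall() is not repeated by the destructor.
bool manualCallRunsOnce()
{
    FakeGui gui;
    gui.addAndReenableEarly();
    return gui.buttonEnabled && gui.enableCalls == 1;
}
```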
Scenario 4: Opening and closing resources
This could even be more dangerous than the previous cases. Consider the following:
void ReadData()
{
int handle = OpenSerialPort("COM3");
ReadData(handle);
Close(handle);
}
This form is quite common in old libraries. I faced such a format with the HDF5 library. If you read the previous sections, you'll see the problem with such usage and have an idea of how to fix it. It's all the same. You *should never* close resources manually. For my HDF5 problem, I wrote a SmartHandle class that guarantees that HDF5 resources are correctly closed. Find it here. Of course, the formal right way to do this is to write a whole wrapper for the library. This may be overkill depending on your project constraints.
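As a rough idea of what such a handle wrapper can look like (this is not the SmartHandle class mentioned above), here is a minimal sketch. FakeOpen, FakeRead and FakeClose are hypothetical stand-ins for a C-style library, and PortHandle, ReadDataSafely and handlesLeftOpen are names made up for this example.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for a C-style library; a real one would talk to the OS.
static int openHandles = 0;
int FakeOpen() { ++openHandles; return 7; }
void FakeClose(int /*handle*/) { --openHandles; }
void FakeRead(int /*handle*/, bool fail)
{
    if (fail)
        throw std::runtime_error("read failed");
}

// Owns one handle: opened in the constructor, closed in the destructor.
class PortHandle
{
    int handle;
public:
    PortHandle() : handle(FakeOpen()) {}
    ~PortHandle() { FakeClose(handle); }
    // copying would close the same handle twice, so it is disabled
    PortHandle(const PortHandle&) = delete;
    PortHandle& operator=(const PortHandle&) = delete;
    int get() const { return handle; }
};

void ReadDataSafely(bool fail)
{
    PortHandle port;
    FakeRead(port.get(), fail);
} // port's destructor closes the handle, even if FakeRead() threw

// Returns the number of handles still open after one call.
int handlesLeftOpen(bool fail)
{
    try
    {
        ReadDataSafely(fail);
    }
    catch (const std::runtime_error&)
    {
        // the read failed, but the handle was closed during unwinding
    }
    return openHandles;
}
```

The wrapper follows the same pattern as unique_ptr: acquire in the constructor, release in the destructor, and forbid copies so that there is exactly one owner.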
Notes on Valgrind
If you follow these rules, you'll be safe with the resources you use, and you'll rarely ever need to use Valgrind. However, when you write the classes that we talked about (such as AutoHandle, SmartPtr, etc.), it's very, very important not only to test them with Valgrind, but also to write good tests that cover every corner case. Once you get these classes right, you never have to worry about them again. If you get them wrong, the consequences could be catastrophic. Surprising? Welcome to object-oriented programming! This is exactly what "separation of concerns" means.
Conclusion
Whenever you have to do something and then undo it, keep in mind that you shouldn't do this manually. Sometimes it's safe and trivial, but many times it leads to a simply bad and error-prone design. I covered a few cases and different ways to tackle the issue. By following these examples, I expect your code to become more compact (given that you're using C++11) and way more reliable.
Just a side-note on standards and rules
Sometimes when I discuss such issues related to modern designs, people claim that they have old code, and they don't want to change the style of the old code for consistency. Other times they claim that this would violate some standard issued by some authority. None of these reasons shows that doing or not doing this is right or wrong. They just show that someone doesn't want to do it because they're "following the rules". To me (and I can't emphasize enough that this is my opinion), it's like asking someone "Why do you do this twice a day, how does it help you?", and getting the answer "Because my religion tells me I have to do it". While I respect all ideologies, such an answer is not a rational answer that justifies the pros and cons of doing something. That answer doesn't tell why doing that twice a day helps that person's health, physically or mentally or otherwise. They're just following a rule *that shouldn't be discussed*. I'm a rational person and I like discussing things by putting them on the table, which helps in achieving the best outcome. If someone's answer is "I have a standard and I need to follow it", then that effectively and immediately closes the discussion. There's nothing else to add. Please note that I'm not encouraging breaking the rules here. If your superior tells you how to do it, and you couldn't convince them otherwise, then just do what they tell you, because most likely your superior has a broader picture of other aspects of the project that don't depend on code only.