Wonderful Coding: C++11 benchmarks

Over the last week I have been running some benchmarks to compare the new C++ standard, C++11, against its predecessor. The feature I was testing was the new move semantics and, as I expected, move semantics are way ahead in performance of the copy-based code we had to write before.

The tests were written for Linux, since almost any distro ships the tools the program depends on and, most importantly, I am using GCC to compile both versions, C++11 and the old standard. I just turn the new standard flag (-std=c++11) on or off and compile each version.
By using the same compiler to build each version, I avoid any difference that could come from using different compilers or compiler versions.

The environment

The machine where the tests were run:
Phenom II x4 955
SSD Corsair GT 160 GB
4 GB of RAM
Arch Linux Kernel 3.17.6-1-ARCH #1 SMP PREEMPT
gcc (GCC) 4.9.2 20141224 (prerelease)
GNU bash, version 4.3.33(1)-release (x86_64-unknown-linux-gnu)

Our program has:
A dummy class, called Dummy. It implements all the move semantics of the new standard and, to make things interesting, it has a double* member that points to an array of 67108864 doubles (512 MiB with 8-byte doubles), allocated dynamically in its constructor.
Some foo functions. One of them takes a Dummy object by value (so an lvalue argument gets copied), and another takes a Dummy as an rvalue reference.
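To make the description above concrete, here is a minimal sketch of what such a class could look like. It is a reconstruction under assumptions, not the exact class used in the benchmarks: the member names, the accessors and the count parameter (added so the class can be tried with a small array) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical reconstruction of the Dummy class described in the post.
class Dummy {
public:
    // The post uses 67108864 elements; the parameter is an assumption
    // added here so the class can be tried with a much smaller array.
    explicit Dummy(std::size_t count = 67108864)
        : size_(count), data_(new double[count]()) {}

    ~Dummy() { delete[] data_; }

    // Copy constructor: allocates a new array and duplicates every element.
    Dummy(const Dummy& other)
        : size_(other.size_), data_(new double[other.size_]) {
        std::copy(other.data_, other.data_ + other.size_, data_);
    }

    // Move constructor: takes over the source's array, leaves the source empty.
    Dummy(Dummy&& other) noexcept
        : size_(other.size_), data_(other.data_) {
        other.size_ = 0;
        other.data_ = nullptr;
    }

    // Copy assignment: deep copy into a fresh array, then release the old one.
    Dummy& operator=(const Dummy& other) {
        if (this != &other) {
            double* fresh = new double[other.size_];
            std::copy(other.data_, other.data_ + other.size_, fresh);
            delete[] data_;
            data_ = fresh;
            size_ = other.size_;
        }
        return *this;
    }

    // Move assignment: release our array and take over the source's one.
    Dummy& operator=(Dummy&& other) noexcept {
        if (this != &other) {
            delete[] data_;
            data_ = other.data_;
            size_ = other.size_;
            other.data_ = nullptr;
            other.size_ = 0;
        }
        return *this;
    }

    std::size_t size() const { return size_; }
    const double* data() const { return data_; }

private:
    std::size_t size_;
    double* data_;
};

// Quick sanity check with a small array (the post's size needs 512 MiB).
void dummy_sanity_check() {
    Dummy d(4);
    assert(d.size() == 4);
    assert(d.data()[0] == 0.0);  // new double[n]() zero-initializes
}
```

Setting the source's pointer to nullptr in the move operations is what keeps the two destructors from freeing the same array twice.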

The features we are testing

The idea of this section is to clarify how things worked before the new standard, and how things can work now, if we do things well, of course :)

We are going to see three features:
- Perfect forwarding.
- Move constructors.
- Move assignment.

Perfect forwarding:
Let's say we have a function that takes a Dummy object by value, and another one that takes a Dummy object as an rvalue reference.

Before the new standard, we just call foo, passing our object d1.

Here, a temporary object is created by calling Dummy's copy constructor, which does a deep copy to duplicate the array member. That is very expensive.

In C++11, we can pass std::move(d1) to foo.

The std::move call casts d1 to an rvalue, so the rvalue version of foo is called and no temporary copy is created. Dummy's move constructor then does a shallow copy: it does not duplicate the member array, it just takes over the memory address held by its source. (Strictly speaking, "perfect forwarding" in C++11 names the std::forward idiom used in templates; I am using the term loosely here for passing an rvalue with std::move.)

Based on this, for our tests, we can conclude that the rival of perfect forwarding is going to be the copy constructor, because before it we had no option other than creating the temporary object.
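The two ways of passing the object can be sketched as follows. This is only an illustration under assumptions: the Dummy here is a cut-down stand-in with counters added to make the copies and moves visible, the array is small, and the names foo_by_value and foo_by_rvalue are hypothetical (the post calls both functions foo, but a by-value overload and an rvalue-reference overload with the same name would make the std::move call ambiguous).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Cut-down stand-in for the post's Dummy; the counters are illustration only.
struct Dummy {
    static int copies;
    static int moves;
    std::size_t size;
    double* data;

    explicit Dummy(std::size_t n = 1024) : size(n), data(new double[n]()) {}
    ~Dummy() { delete[] data; }

    // Deep copy: new array, every element duplicated.
    Dummy(const Dummy& o) : size(o.size), data(new double[o.size]) {
        std::copy(o.data, o.data + o.size, data);
        ++copies;
    }

    // Shallow "steal": reuse the source's array and empty the source.
    Dummy(Dummy&& o) noexcept : size(o.size), data(o.data) {
        o.size = 0;
        o.data = nullptr;
        ++moves;
    }

    Dummy& operator=(const Dummy&) = delete;  // not needed in this sketch
    Dummy& operator=(Dummy&&) = delete;       // not needed in this sketch
};
int Dummy::copies = 0;
int Dummy::moves = 0;

// Hypothetical name: takes its argument by value, so an lvalue is copied.
std::size_t foo_by_value(Dummy d) { return d.size; }

// Hypothetical name: binds to an rvalue and moves from it.
std::size_t foo_by_rvalue(Dummy&& d) {
    Dummy local(std::move(d));  // move constructor, no array duplication
    return local.size;
}

void passing_demo() {
    Dummy d1(16);
    foo_by_value(d1);              // pre-C++11 style: deep copy of d1
    assert(d1.data != nullptr);    // d1 is untouched
    foo_by_rvalue(std::move(d1));  // C++11 style: d1's array is taken over
    assert(d1.data == nullptr);    // d1 no longer owns the array
}
```

Note that binding to Dummy&& by itself neither copies nor moves anything; the move constructor only runs when the function builds a Dummy from its parameter, as foo_by_rvalue does here.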

Move constructor
Let's say we want to create a new instance of our Dummy object, but we want to base this one on an already existing Dummy object.


[Image: Dummy interface]

Before the new standard, when we wanted to create an object based on another object, our best option was to deep copy the old one into the new one. As I said before, that is very expensive, and unnecessary if we are not going to use the old object any more.

The new standard allows us to create an object by stealing the members of the object we are basing the new one on, by calling Dummy's move constructor.

Just be careful when you refer to d1 after this, because it has given up its array. It is better if you don't.

Based on this, for our tests, we can conclude that the rival of the move constructor is going to be the copy constructor, because before move constructors we had no option other than doing a deep copy of an object.
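A small sketch of the two constructions, and of why d1 needs care after the move. The Dummy below is a simplified stand-in with a small array, and the function names are hypothetical; it assumes the move constructor nulls the source pointer, which is the usual way to write it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Simplified stand-in for the post's Dummy (small array, public members).
struct Dummy {
    std::size_t size;
    double* data;

    explicit Dummy(std::size_t n = 1024) : size(n), data(new double[n]()) {}
    ~Dummy() { delete[] data; }

    // Copy constructor: deep copy of the array.
    Dummy(const Dummy& o) : size(o.size), data(new double[o.size]) {
        std::copy(o.data, o.data + o.size, data);
    }

    // Move constructor: take the array and leave the source empty.
    Dummy(Dummy&& o) noexcept : size(o.size), data(o.data) {
        o.size = 0;
        o.data = nullptr;
    }
};

// Old standard: d2 gets its own array, d1 keeps its own.
bool copy_keeps_source_intact() {
    Dummy d1(32);
    Dummy d2(d1);
    return d1.data != nullptr && d2.data != d1.data && d2.size == 32;
}

// C++11: d2 takes over d1's array; d1 is left empty but still destructible.
bool move_leaves_source_empty() {
    Dummy d1(32);
    const double* original = d1.data;
    Dummy d2(std::move(d1));
    return d2.data == original && d1.data == nullptr && d1.size == 0;
}

void constructor_demo() {
    assert(copy_keeps_source_intact());
    assert(move_leaves_source_empty());
}
```

After the move, reading d1.data[0] would dereference a null pointer in this sketch, which is the kind of mistake the warning about d1 is aimed at.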

Move assignment
Let's say we have already created d1 and d2, and we want to copy d2 into d1. Or we just want to assign a default Dummy object to d1.

As we can see in the Dummy interface image, Dummy implements move semantics by overloading operator=.

Before the new standard, when we wanted to assign an already existing object to another existing one, we just assigned d2 to d1, which does a deep copy. The same happens when we assign an object created on the fly.

In C++11 we can overload operator= for an rvalue reference, which allows move assignment.

Since the Dummy() expression is an rvalue, the move assignment operator= is called.
And since Dummy() is an rvalue, we don't need the caution we needed with the move constructor in the previous example (where the source, d1, is an lvalue): we never hold the address of Dummy(), so we cannot make any mistakes with it afterwards ;)

Based on this, for our tests, we can conclude that the rival of move assignment is going to be copy assignment, because before move assignment we had no option other than doing a copy assignment.
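The two assignment operators can be sketched like this. Again, this is a simplified stand-in and not the benchmark code: the counters exist only to show which operator= gets selected, and the array sizes are small.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Simplified stand-in for the post's Dummy; counters are illustration only.
struct Dummy {
    static int copy_assigns;
    static int move_assigns;
    std::size_t size;
    double* data;

    explicit Dummy(std::size_t n = 1024) : size(n), data(new double[n]()) {}
    ~Dummy() { delete[] data; }
    Dummy(const Dummy&) = delete;  // construction from a Dummy is not needed here

    // Copy assignment: allocate a new array and duplicate every element.
    Dummy& operator=(const Dummy& o) {
        if (this != &o) {
            double* fresh = new double[o.size];
            std::copy(o.data, o.data + o.size, fresh);
            delete[] data;
            data = fresh;
            size = o.size;
        }
        ++copy_assigns;
        return *this;
    }

    // Move assignment: free our array and take over the source's one.
    Dummy& operator=(Dummy&& o) noexcept {
        if (this != &o) {
            delete[] data;
            data = o.data;
            size = o.size;
            o.data = nullptr;
            o.size = 0;
        }
        ++move_assigns;
        return *this;
    }
};
int Dummy::copy_assigns = 0;
int Dummy::move_assigns = 0;

void assignment_demo() {
    Dummy d1(8), d2(16);
    d1 = d2;             // d2 is an lvalue: copy assignment, deep copy
    assert(d1.data != d2.data && d1.size == 16);
    d1 = Dummy(4);       // temporary is an rvalue: move assignment
    assert(d1.size == 4);
    d1 = std::move(d2);  // explicit move from a named object
    assert(d2.data == nullptr && d1.size == 16);
}
```

Without the operator=(Dummy&&) overload, the line d1 = Dummy(4) would still compile for a class with only copy assignment, but it would go through the deep copy.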

The benchs

Well, finally we have the results of the benchmarks. We ran them using the time command, which reports the wall clock time, the user time, and the sys time.
The wall clock time is the elapsed time as a human perceives it. The user time is the CPU time the program spends in user space of the operating system. The sys time is the CPU time the program spends in kernel space (system calls).
Each test was run 20 times, and I took the average.
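The numbers in this post come from the time command, not from code inside the program. If you prefer to measure from within C++, the averaging step could look roughly like the sketch below. The helper name average_wall_ms is hypothetical, and it only measures wall clock time with std::chrono, so it gives no user/sys split.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs `work` `runs` times and returns the mean
// wall clock time per run, in milliseconds.
template <typename Work>
double average_wall_ms(Work work, int runs = 20) {
    assert(runs > 0);
    typedef std::chrono::steady_clock clock;
    double total_ms = 0.0;
    for (int i = 0; i < runs; ++i) {
        const clock::time_point start = clock::now();
        work();
        const clock::time_point stop = clock::now();
        total_ms +=
            std::chrono::duration<double, std::milli>(stop - start).count();
    }
    return total_ms / runs;
}
```

A call such as average_wall_ms([] { /* code under test */ }) would then return the mean over 20 runs. steady_clock is used because it never jumps backwards, unlike the system clock.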

First Test: Perfect Forwarding vs Temporaries

Unbelievable, no? It is because perfect forwarding does not duplicate memory.

Second test: Move Constructor vs Copy Constructor

Same monster difference. Same reason.

Third test: Move Assignment vs Copy Assignment

And finally we can see that the same happens with the assignments.

Summary

Overload operator= for rvalue references, and provide a move constructor, to achieve move semantics.
Use move semantics whenever the situation allows it. They are way faster than duplicating blocks of memory.

You can download the benchmarks and run them on your computer: c9xvsc++11

Sunday, January 18, 2015
