C++ Concurrency and Utilities

Posted on
Programming Classes C++ Concurrency RAII

I have been reading Bjarne Stroustrup’s “A tour of C++” recently and wanted to make notes on it’s last section: Concurrency and utilities Below are my notes.

Resource Management

It is important that a program doesn’t leak resources. This may cause security vulnerabilities, as much as program efficiency drop by orders of magnitude.

To ensure resources are correctly used, we can handle them using the RAII (Resource Aquisition Is Initialization) technique. The RAII technique binds the lifecycle of a resource that must be acquired before use to the lifetime of an object. An example of this technique can be seen next:

mutex m; // Used to protect access to shared data

void f(){
  lock_guard<mutex>lck{m}; // Ackuire the mutex m.
                           // This is the constructor
                           // that aquires the mutex
                           // following RAII technique
}

A thread will not proceed until lck constructor has acquired it’s mutex, m. The corresponding destructor releases the resource.

In the example above, lock_guard’s destructor releases the mutex when the thread of control leaves f().

unique_ptr And shared_ptr

The standard library provides two interfaces to manage objects allocated using the free store. These are unique_ptr and shared_ptr.

  • unique_ptr: represents unique ownership
  • shared_ptr: represents shared ownership

The most basic use of one of these smart pointers is to prevent memory leaks caused by careless programming.

It is often the case that we overuse pointers initialized by new. In the next example, we could have simply used a variable as we don’t have the need for a pointer. In any case, we should use smart pointers instead of simply using new. They ensure that the object is properly destroyed whichever way we exit f().

void f(int i, int j){
  X* p = new X;           // Allocate a new X
  unique_ptr<X>sp{new X}; // Allocate a new X and give it's pointer to unique_ptr
  // ...
  if(i < 99) throw Z{};   // May throw an exception
  if(j < 77) return;      // May throw an exception
  // ...
  delete p;               // Destroy *p
}

unique_ptr further uses include passing free-store objects in and out of functions.

unique_ptr<X> make_x(int i){
  return unique_ptr<X>{new X{i}};
}

The shared_ptr is similar to unique_ptr except that shared_ptr’s are copied rather than moved. The shared_ptr’s of an object share ownership of that object and that object is destroyed when the last shared_ptr of that object is destroyed.

Smart pointers are still pointers, so they should be considered a second choice for resource management. After containers and other types that manage their resources at a higher conceptual level.

Therefore, where do we need smart pointers? Where we need pointer semantics:

  • When we share a object, we need pointers (or references) to refer to the shared object (shared_ptr)
  • When we refer to a polymorphic object, we need pointers (or references) because we don’t know the default type of the object, or even it’s size. (unique_ptr)
  • A shared polymorphic object typically requires shared_ptrs.

We do not need a pointer to return a collection of objects from a function; A container that is a resource handle will do that simply and efficiently.

Concurrency

The execution of several tasks simultaneously

The standard library support is aimed at supporting systems-level concurrency rather than directly providing sophisticated higher-level concurrency models; Those can be supplied as libraries built using the standard library facilities.

Tasks And *thread*s

  • Task: A computation that can potentially be executed concurrently with other computations.
  • Thread: A system-level representation of a task in a program.

To launch a task, we construct a std::thread with the task as it’s argument. A task is a function or a function argument.

void f(){}           // Function

struct F{            // Function Object
  void operator()()  // F's call operator 
}

void User(){
  thread t1 {f}      // f executes in a separate thread
  thread t2 {F()}    // F()() executes in a separate thread

  t1.join()          // wait for t1
  t2.join()          // wait for t2
}

The join() ensures we don’t exit user() until the threads have completed.

Threads of a program share a address space. Since they share a address space, they can communicate through shared objects. Such communications are typically controlled by locks or other mechanisms to prevent data races (Uncontrolled, concurrent access to a variable).

When tasks of a concurrent program, our aim is to keep tasks completely separate except when they communicate in simple and obvious ways.

Passing Arguments

We can pass data (or pointers or references to data) to a thread as arguments.

void f(vector<int>& v){}

struct F{
  vector<int>& v;
  F(vector<int>& vv):v{vv}{}
  void operator()();
};

int main(){
  vector<int> v1{0, 1, 2, 3, 4, 5, 6, 7};
  vector<int> v2{0, 1, 2, 3, 4, 5, 6, 7};

  thread t1 {f,v1};
  thread t2 {F{v2}};

  t1.join();
  t2.join();
}

Returning Results

If the arguments to the function have been passed through reference, the function can directly modify the original objects passed to it. If, in the other hand, the function arguments are passed as const objects, the result location can be also passed as a function parameter.

void f(const vector<int>& v, int* res); // Take input from v, place result in res

To call the thread: thread(f, some_vec, &res);

Sharing Data

Sometimes, tasks need to share data. If the tasks are reading a immutable object, there is no risk of corruption, but, we have to use a mutex if we want to ensure that at most one task at a time is going to write to a shared object. A mutex is Mutual Exclusion Object. A thread acquires a mutex using a lock() operation.

mutex m;  // Controlling mutex
int sh;   // Shared data

void f(){
  lock_guard<mutex> lck {m}; // Aquire mutex
  sh += 7;
} // Release mutex implicitly

Following RAII, the lock_guard’s constructor acquires the mutex through a call to m.lock(). If another thread has already acquired the mutex, the thread waits (blocks) until the other thread completes it’s access. Once the thread has completed it’s access to the shared data, the lock_guard releases the mutex with a call to m.unlock()

The correspondence between shared data and a mutex is conventional. The programmer simply has to know which mutex corresponds to which data.

The combination of defer_lock and lock() are used to combat a deadlock. (Use of two resources by two different processes in alternate order)

void f(){
  lock_guard<mutex> lck1 {m1, defer_lock}; // Dont aquire mutex just yet
  lock_guard<mutex> lck2 {m2, defer_lock};
  lock_guard<mutex> lck3 {m3, defer_lock};
  // ...
  lock(m1, m2, m3);   // Aquire all 3 locks
  // ... Manipulate shared data
} // Implicitly release all mutexes

Locking and unlocking are pretty expensive operations and modern machines are very good at copying data, especially compact data, such as vectors, so don’t choose shared data for communication because of ‘efficiency’ without thought and preferably not without measurement.

Waiting For Events

The simplest event is time passing. (this_thread::sleep_for(milliseconds{20})) The basic support for communication using external events is provided by condition_variables A condition_variable is a mechanisms for allowing one thread to wait for a condition to occur as a result of work done by another.

#include <iostream>
#include <condition_variable>
#include <mutex>
#include <ostream>
#include <string>
#include <thread>
#include <queue>

class Message{
  public:
    int message;
    Message(int msg) :message{msg} {};
};

std::ostream& operator<<(std::ostream& output, const Message& message){
  output << message.message;
  return output;
}

// Three global variables
std::queue<Message> messageQueue;
std::condition_variable mcond; // The variable communicating events
std::mutex mmutex;  // The locking mechanism

void emmiter(){
  for(int i{0}; true; ++i){
    Message m{i};
    //std::cout << m << std::endl;
    std::unique_lock<std::mutex> lck{mmutex}; // Protect operations
    messageQueue.push(m);
    mcond.notify_one();
  }
}

void reciever(){
  while (true) {
    std::unique_lock<std::mutex> lck{mmutex}; // Aquire mmutex
    mcond.wait(lck); // release lck and wait until event wake-up
    auto m = messageQueue.front();  // get the message
    std::cout << m << std::endl;
    messageQueue.pop();   // Remove the message from the queue
    lck.unlock();   // Release the lock
  }
}

int main(int argc, char *argv[])
{
  std::thread t1{ emmiter };
  std::thread t2{ reciever };

  t1.join();
  t2.join();
  return 0;
}

Communicating Tasks

The standard library provides a few facilities for programmers to operate at the conceptual level of a task (Work to potentially be done concurrently) rather than directly at the lower level (thread or lock).

  • future and promise: For returning a value from a task spawned on a separate thread.
  • packaged_task: Help launch tasks and connect the mechanisms for returning a result.
  • async() Launching a task in a manner very similar to calling a function.

future And promise

They enable the transfer of a value between two tasks without the explicit use of a lock. When a task wants to pass a value to another, it puts it into a promise. When the value is computed, it is stored in the future. The other task can access the value stored in future. To get() the future’s value:

future<X> fx; // Create future
// Assign a value to the future
X v = fx.get() // If necessary, wait until the value has finished being computed.

packaged_task

The packaged_task type is provided to simplify setting up tasks connected with futures and promises to be run on threads.

async()

async() provides a very simple way of executing a task asynchronously.

async() is a high level abstraction of a threaded implementation, therefore, we can’t use it with shared memory, where we would have to use locking mechanisms.

double comp4(vector<double>& v){
  if(v.size() < 10000) { return accum(v.start(), v.end(), 0.0); }

  auto v0 = &v[0];
  auto sz = v.size();

  auto f0 = async(accum, v0, v0+sz/4, 0.0);         // First quarter
  auto f1 = async(accum, v0+sz/4, v0+sz/2, 0.0);    // Second quarter
  auto f2 = async(accum, v0+sz/2, v0+3*sz/4, 0.0);  // Third quarter
  auto f3 = async(accum, v0+3*sz/4, v0+sz, 0.0);    // Fourth quarter

  return f0.get() + f1.get() + f2.get() + f3.get(); // Collect and return results
}

We can also use async() to spawn a task to get information from the user, while the program is doing another task

Small Utility Components

Time

Basic time facilities:

using namespace std::chrono;

auto t0 = high_resolution_clock::now(); 
// Do work
auto t1 = high_resolution_clock::now(); 
cout << duration_cast<milliseconds>(t1-t0).count() << "msec\n";

Type Functions

Is evaluated at compile time given a type as it’s argument or returning a type.

One example of a type function could be: numeric_limits. It presents useful information of the type passed: constexpr float min = numeric_limits<float>::min(); // Smallest possible float

Use of these features is often called metaprogramming

iterator_traits

The standard library provides iterator_traits to allow us to check which kind of iterator is supported. With this information, we can create a wrapper around sort() to support either vector or forward_list.

#include <iostream>
#include <forward_list>
#include <ostream>
#include <vector>
#include <iterator>
#include <algorithm>

template<typename C> using Iterator_type = typename C::iterator;
template<typename Iter> using Iterator_category = typename std::iterator_traits<Iter>::iterator_category;

template<class C> void my_sort(C&);
template <typename For> void sort_helper(For , For , std::forward_iterator_tag);
template <typename Ran> void sort_helper(Ran , Ran , std::random_access_iterator_tag);
void test(std::vector<int>&, std::forward_list<int>&);

int main(int argc, char *argv[])
{
  std::forward_list<int>l{5, -1, 9, 1, 8, 10, 4, 18, 6};
  std::vector<int>v{5, -1, 9, 1, 8, 10, 4, 18, 6};
  test(v, l);
  std::cout << "list in main: ";
  for(auto i : l) std::cout << i << ", ";
  std::cout << std::endl;
  std::cout << "Vector in main: ";
  for(auto i : v) std::cout << i << ", ";
  std::cout << std::endl;
  return 0;
}

template <typename For> // for forward iterators
void sort_helper(For begin, For end, std::forward_iterator_tag){
  // Copy forward iterator container into random access container
  std::vector<typename std::iterator_traits<For>::value_type>v{begin, end}; // Initialise a vector of the type of the contents of begin
  // sort random access container
  sort(v.begin(), v.end());
  // copy sorted container into forward iterator container
  copy(v.begin(), v.end(), begin);
}

template<class C>
void my_sort(C& c){
  using iter = Iterator_type<C>;
  sort_helper(c.begin(), c.end(), Iterator_category<iter>{});
}

template <typename Ran> // Random access iterators
void sort_helper(Ran begin, Ran end, std::random_access_iterator_tag){
  sort(begin, end);
}

void test(std::vector<int>& v, std::forward_list<int>& l){
  my_sort(v);
  my_sort(l);
}

Type Predicates

Simple type functions that answer fundamental questions about types.

They are especially useful with templates, because they can inform a user of misuse of the template class. Take into consideration a template that only provides support for arithmetic containers. One could check against this like so:

template<class Scalar>
class complex{
  Scalar re, im;
  public:
    static_assert(is_arithmetic<Scalar>(), "Sorry, I only support complex of arithmetic types");
}

pair And tuple

A pair is a container of two values, thay can be anything. It can be used to define a subset of a ordered range, providing iterators to the first element of the range and iterator to the last element of the range. The two elements of a pair are accessed using decorators .first and .second. A pair provides operators <, = and == if it’s elements support them. To create a pair, we can auto p = make_pair(v2.begin(), 2);. Creating a pair with the first element of the vector v2 and 2, which could be used to indicate the length of the subset desired.

If we want more than two values, we have to use a tuple. A tuple is a heterogeneous sequence of elements. For example: auto t1 = make_tuple(string("Hello"), 1.23, 123); It’s type is deduced.

To get it’s values, we have to use: auto v1 = get<1>(t1); for the second element in the tuple. They can also be assigned and compared if their elements support it.

A pair is commonly used in interfaces, whereas a tuple is commonly found in implementations of common algorithms.

Regular Expressions

Regular expressions are a powerful tool for text processing. They provide a way to simply and tersely describe and find patterns in text.

A simple regex example:

regex pat(R"(\W{2}\s*\d{5}(-\d{4})?)");
// This looks for: XXddddd-dddd

string line;

for(auto i : std::getline(std::cin, line)){
  smatch matches; // Matched strings go here
  if(regex_search(line, matches, pat))
    cout << matches[0];
}

smatch is a vector of matches. The first element includes the whole match, and the subsequent elements of the vector, the individual matches.