Member Access Operators

One of the most confusing operator families is the member access operators:

  • subscript []
  • indirection *
  • address-of &
  • member-of-object .
  • member-of-pointer ->

It is easy to confuse which operator applies in a given situation. Fortunately, compilers and IDEs are good at reporting the types involved, which eases much of the pain.
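As a quick reference, the sketch below exercises all five operators on a small struct and an array. Point and sum_via_member_access are hypothetical names chosen only for illustration.

```cpp
// Hypothetical aggregate used only to demonstrate the operators.
struct Point {
  int x;
  int y;
};

// Reads one value through each member access operator and sums them.
int sum_via_member_access() {
  int values[3]{ 1, 2, 3 };
  Point p{ 10, 20 };

  int* first = &values[0];  // address-of (applied to a subscripted element)
  Point* pp = &p;           // address-of

  int a = values[1];        // subscript: 2
  int b = *first;           // indirection: 1
  int c = p.x;              // member-of-object: 10
  int d = pp->y;            // member-of-pointer: 20
  int e = (*pp).y;          // same as pp->y, spelled with * and . : 20
  return a + b + c + d + e; // 2 + 1 + 10 + 20 + 20 = 53
}
```

The last two reads show that -> is shorthand for an indirection followed by a member-of-object access.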

Manual Memory Management by Operators

One way to manually manage the free store (also known as the heap in CPP, though I don’t really like this ambiguous name) is to overload operator new.

Motivation

Each time a program invokes new, CPP asks the runtime, and ultimately the OS, to allocate memory from the free store (for example via HeapAlloc on Windows or malloc on Unix). In some settings, e.g. high-frequency trading, free store allocations simply involve too much latency, so one may want to avoid them. A general way to do this is to allocate one big block when the program launches and then hand out pieces of it to the variables that need storage.

Memory Fragmentation and Buckets

Managing memory is a surprisingly complicated task. One of the issues is memory fragmentation: as blocks of memory are allocated and deallocated, the available memory ends up divided into small, disjoint segments that cannot be allocated to a large object even when their combined capacity would be enough.

Memory Fragmentation, From Wikipedia

One way to tackle it is to chop memory into buckets, which are fixed-size blocks of memory. The allocator never hands out less than one bucket. This scheme prevents memory fragmentation to some extent, at the cost of possible (and very likely) wasted memory.
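To put a number on that overhead, here is a minimal sketch. It assumes every request is rounded up to a whole number of buckets; bucket_waste is a hypothetical helper, not part of any library.

```cpp
#include <cstddef>

// Hypothetical helper: bytes left unused when a request of `requested` bytes
// is rounded up to a whole number of buckets of `bucket_size` bytes.
constexpr std::size_t bucket_waste(std::size_t requested, std::size_t bucket_size) {
  const std::size_t buckets_needed = (requested + bucket_size - 1) / bucket_size;
  return buckets_needed * bucket_size - requested;
}

// A 4-byte int stored in a 4096-byte bucket leaves 4092 bytes unused.
static_assert(bucket_waste(4, 4096) == 4092, "small request wastes most of the bucket");
// A request that exactly fills a bucket wastes nothing.
static_assert(bucket_waste(4096, 4096) == 0, "exact fit wastes nothing");
```

Small objects are the worst case, which is why the bucket size is usually tuned to the typical allocation size of the program.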

Free Store Operators

Four operators need to be overloaded to implement manual memory management:

void* operator new(size_t);
void operator delete(void*);
void* operator new[](size_t);
void operator delete[](void*);

Note that the return type of new and the parameter type of delete are both void*. This means that the free store operators deal in raw, uninitialized memory.

#include <cstddef>
#include <cstdio>
#include <new>

struct Bucket {
  const static size_t data_size{ 4096 };
  std::byte data[data_size];
};

struct Heap {
  void* allocate(size_t bytes) {
    if(bytes > Bucket::data_size)
      throw std::bad_alloc();
    for(size_t i{}; i < n_heap_buckets; i++) {
      if(!bucket_used[i]) {
        bucket_used[i] = true; // 6
        return buckets[i].data;
      }
    }
    throw std::bad_alloc();
  }

  void free(void* p) {
    for(size_t i{}; i < n_heap_buckets; i++) {
      if(buckets[i].data == p) {
        bucket_used[i] = false; // 7
        return;
      }
    }
  }
  static const size_t n_heap_buckets{ 10 }; // 4
  Bucket buckets[n_heap_buckets]{};
  bool bucket_used[n_heap_buckets]{}; // 5
};

Heap heap; // 1

void* operator new(size_t n_bytes) { // 2
  return heap.allocate(n_bytes);
}

void operator delete(void* p) noexcept { // 3
  heap.free(p);
}

The numbered comments in the code mark the lines discussed here. Line 1 creates the heap at namespace scope, so its lifetime begins when the program starts. Lines 2 and 3 overload the new and delete operators at namespace scope; from now on, dynamic memory management through new and delete uses heap instead. In heap, the memory is chopped into 10 buckets (line 4). Whether each bucket is allocated is recorded in an array of bool (lines 5, 6, and 7).

Other parts of the program can now use the pre-allocated memory of heap through the new and delete operators. When no free bucket is left, heap signals this to the rest of the program by throwing std::bad_alloc.

int main() {
  printf("Buckets:   %p\n", heap.buckets);
  auto breakfast = new unsigned int{ 0xC0FFEE };
  auto dinner = new unsigned int{ 0xDEADBEEF };
  printf("Breakfast: %p 0x%x\n", breakfast, *breakfast);
  printf("Dinner:    %p 0x%x\n", dinner, *dinner);
  delete breakfast;
  delete dinner;
  try {
    while(true) {
      new char;
      printf("Allocated a char.\n");
    }
  } catch(const std::bad_alloc&) {
    printf("std::bad_alloc caught.\n");
  }
}
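Replacing the global operators affects every allocation in the program, including those made by the standard library. A narrower variation, which is not the approach used above, is to overload the operators as members of a single class. The sketch below assumes C++17 (for inline static data members); Order, its slot sizes, and slots_in_use are hypothetical names for illustration.

```cpp
#include <cstddef>
#include <new>

// Hypothetical class whose instances live in a small, fixed pool of slots.
class Order {
public:
  explicit Order(int id) : id_{ id } {}
  int id() const { return id_; }

  // Class-specific allocation: used only for `new Order`.
  static void* operator new(std::size_t n_bytes) {
    if (n_bytes > slot_size)
      throw std::bad_alloc();
    for (std::size_t i{}; i < n_slots; i++) {
      if (!slot_used[i]) {
        slot_used[i] = true;
        return slots[i];
      }
    }
    throw std::bad_alloc(); // pool exhausted
  }

  // Class-specific deallocation: marks the matching slot as free again.
  static void operator delete(void* p) noexcept {
    for (std::size_t i{}; i < n_slots; i++) {
      if (slots[i] == p) {
        slot_used[i] = false;
        return;
      }
    }
  }

  // Number of slots currently handed out.
  static std::size_t slots_in_use() {
    std::size_t count{};
    for (bool used : slot_used)
      count += used ? 1 : 0;
    return count;
  }

private:
  static constexpr std::size_t n_slots{ 4 };
  static constexpr std::size_t slot_size{ 16 };
  alignas(std::max_align_t) static inline unsigned char slots[n_slots][slot_size]{};
  static inline bool slot_used[n_slots]{};
  int id_;
};
```

With this in place, new Order{ 1 } draws from the four-slot pool and a fifth live Order throws std::bad_alloc, while every other type keeps using the default free store.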

Placement Operators

Another interesting set of operators that can manage memory is the Placement Operators.

void* operator new(size_t, void*);
void operator delete(void*, void*);
void* operator new[](size_t, void*);
void operator delete[](void*, void*);

Although they are similar to free store operators, placement operators are used to construct objects in arbitrary memory.

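#include <cstring>
#include <iostream>
#include <new>  // Required for placement new

class MyClass {
public:
    MyClass(const char* tag) {
        // strcpy_s is not available on every toolchain, so copy portably:
        // at most sizeof(this->tag) - 1 characters. The member is
        // zero-initialized, so the last byte remains a null terminator.
        std::strncpy(this->tag, tag, sizeof(this->tag) - 1);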
        std::cout << this->tag << ":Constructor of MyClass called." << std::endl;
    }
    ~MyClass() {
        std::cout << this->tag << ":Destructor of MyClass called." << std::endl;
    }
private:
    char tag[3]{};
};

int main() {
    const auto classSize = sizeof(MyClass);
    // Allocate a buffer to store 3 objects, aligned suitably for MyClass
    alignas(MyClass) char buffer[3 * classSize];

    // Construct objects in the buffer
    MyClass* myObject1 = new (&buffer) MyClass("o1");
    MyClass* myObject2 = new (&buffer[classSize]) MyClass("o2");
    MyClass* myObject3 = new (&buffer[classSize * 2]) MyClass("o3");

    // Destroy the objects explicitly; the buffer itself is released when main returns
    myObject2->~MyClass();
    myObject3->~MyClass();
    myObject1->~MyClass();
    return 0;
}

In this example, we use placement new to construct objects in a buffer that main allocates on the stack. Placement new does not allocate memory, so there is nothing to delete; instead, the programmer must call each object’s destructor directly and exactly once. The output shows:

o1:Constructor of MyClass called.
o2:Constructor of MyClass called.
o3:Constructor of MyClass called.
o2:Destructor of MyClass called.
o3:Destructor of MyClass called.
o1:Destructor of MyClass called.

Precedence and Evaluation Order

The CPP language standard explicitly defines the precedence of operators (e.g. in the expression a + b*c, the product operator has higher precedence than the sum operator), but in general it does not define the order in which the operands are evaluated. This leaves compiler writers room to find clever optimization opportunities.

For example:

stop() + drop() * roll()

Precedence only determines that the product is computed before the sum. It is not guaranteed that the functions are called in the order drop(), roll(), stop().

Some exceptions are:

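  • a && b and a || b guarantee that a is evaluated before b, and b is skipped when a already decides the result.
  • a ? b : c guarantees that a is evaluated before b and c, and only one of b and c is evaluated.
  • a, b, c guarantees that the order is a, then b, then c.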
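These guarantees can be observed by logging each call. The sketch below uses a hypothetical helper, note, that appends a letter to a log and returns the given value; the built-in operators are assumed (overloaded versions behave differently before C++17).

```cpp
#include <string>

// Hypothetical helper: records that operand `name` was evaluated.
bool note(std::string& log, char name, bool value) {
  log += name;
  return value;
}

// Evaluates one expression per guaranteed-order operator and returns the log.
std::string guaranteed_order_log() {
  std::string log;
  // &&: a runs first; b is skipped because a is false.
  static_cast<void>(note(log, 'a', false) && note(log, 'b', true));
  // ||: c runs first; d is skipped because c is true.
  static_cast<void>(note(log, 'c', true) || note(log, 'd', true));
  // ?: e runs first, then only the selected branch f.
  static_cast<void>(note(log, 'e', true) ? note(log, 'f', true) : note(log, 'g', true));
  // comma: strictly left to right.
  static_cast<void>((note(log, 'h', true), note(log, 'i', true), note(log, 'j', true)));
  return log; // "acefhij"
}
```

No such log can be relied on for stop() + drop() * roll(), since the order of those three calls is left to the compiler.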

Type Conversions

Operators often involve type conversions in CPP. Unfortunately, CPP is overzealous about converting implicitly, which can silently change values, so programmers should pay attention to where conversions happen.

Integer Promotion

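Integer promotion is the process by which operands of integer types smaller than int (such as char, short, and bool) are converted to int, or to unsigned int if int cannot hold all their values, before an arithmetic operator is applied. The result of the operation therefore has the promoted type, not the original one.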

char a = 10;
char b = 20;
auto result = a * b;
std::cout << "Type of result: " << typeid(result).name() << std::endl; 
// Prints "Type of result: int" with MSVC; GCC and Clang print the mangled name "i"
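Because the string returned by typeid(...).name() depends on the compiler, a more portable way to check the promoted type is a compile-time assertion. The sketch below assumes C++17 and an 8-bit char; promoted_product and truncated_product are hypothetical names.

```cpp
#include <type_traits>

// Operands smaller than int are promoted, so the result type is int.
static_assert(std::is_same_v<decltype(char{} * char{}), int>, "char * char yields int");
static_assert(std::is_same_v<decltype(short{} + short{}), int>, "short + short yields int");
static_assert(std::is_same_v<decltype(bool{} + bool{}), int>, "bool + bool yields int");

// The multiplication happens in int, so 200 * 2 does not wrap around.
int promoted_product() {
  unsigned char x = 200;
  unsigned char y = 2;
  return x * y; // 400
}

// Converting the int result back to unsigned char keeps only the low 8 bits.
unsigned char truncated_product() {
  unsigned char x = 200;
  unsigned char y = 2;
  return static_cast<unsigned char>(x * y); // 400 mod 256 = 144
}
```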

Silent Truncation

When a number is assigned to a variable that cannot represent it, it will be silently truncated.

#include <cstdint>
#include <cstdio>

int main() {
  // 0b111111111 = 511
  uint8_t x = 0b111111111; // 255
  int8_t y = 0b111111111; // Implementation defined.
  printf("x: %u\ny: %d", x, y);
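}

  • If the destination is unsigned, the result keeps as many low-order bits as fit (the value modulo 2^N for an N-bit type).
  • If the destination is signed, the result is implementation-defined before C++20; since C++20 it is also the value modulo 2^N, so y is -1 here.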
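The unsigned case can be checked at compile time. This is a minimal sketch; to_u8 is a hypothetical helper that makes the conversion explicit.

```cpp
#include <cstdint>

// Hypothetical helper: converts to an 8-bit unsigned value, keeping the low 8 bits.
constexpr std::uint8_t to_u8(unsigned int value) {
  return static_cast<std::uint8_t>(value);
}

static_assert(to_u8(255) == 255, "values that fit are unchanged");
static_assert(to_u8(256) == 0, "256 mod 256 is 0");
static_assert(to_u8(511) == 255, "511 mod 256 is 255");
```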

Conversion to bool

Pointers, integers, and floating-point numbers can be implicitly converted to bool. The result is true if the value is nonzero (or, for a pointer, non-null) and false otherwise.
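A small sketch of these conversions in conditions follows; count_truthy is a hypothetical function that counts how many sample values convert to true.

```cpp
// Counts how many of the sample values are treated as true in a condition.
int count_truthy() {
  int zero = 0;
  int seven = 7;
  double zero_d = 0.0;
  double quarter = 0.25;
  const int* null_ptr = nullptr;
  const int* valid_ptr = &seven;

  int count = 0;
  if (zero) count++;      // false: integer zero
  if (seven) count++;     // true: nonzero integer
  if (zero_d) count++;    // false: floating-point zero
  if (quarter) count++;   // true: nonzero, even though it would truncate to 0 as an int
  if (null_ptr) count++;  // false: null pointer
  if (valid_ptr) count++; // true: non-null pointer
  return count;           // 3
}
```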

Pointers to void*

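Pointers to objects can always be implicitly converted to void* (keeping any const qualifier, so a const int* converts to const void*). Converting back requires an explicit cast.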
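The asymmetry looks like this in practice. read_through_void is a hypothetical function; the sketch assumes the void* really does point at an int.

```cpp
// Erases the pointer type and then restores it.
int read_through_void(int* p) {
  void* erased = p;                          // implicit conversion to void*
  int* restored = static_cast<int*>(erased); // the way back must be explicit
  return *restored;
}
```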

Explicit Type Conversion

Braced initialization ensures that only safe (non-narrowing) conversions are allowed. This is why modern CPP prefers braced initialization.

#include <cstdint>
#include <cstdio>

int main() {
  int32_t a = 100;
  int64_t b{ a };
  if(a == b)
    printf("Non-narrowing conversion!\n");
  int32_t c{ b }; // Bang! Narrowing from int64_t to int32_t is ill-formed.
}

In this situation, explicit type conversions are needed.
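When the narrowing is intentional, one option is to make it explicit and check the range first. The sketch below is one possible approach, not a standard facility; narrow_checked is a hypothetical helper.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper: converts to int32_t, throwing if the value does not fit.
std::int32_t narrow_checked(std::int64_t wide) {
  if (wide < std::numeric_limits<std::int32_t>::min() ||
      wide > std::numeric_limits<std::int32_t>::max()) {
    throw std::range_error("value does not fit in int32_t");
  }
  return static_cast<std::int32_t>(wide); // explicit, and known to be lossless here
}
```

With such a helper, int32_t c{ narrow_checked(b) }; compiles, and a value that is out of range is reported at run time instead of being silently truncated.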

One way is a C-style cast, which is dangerous and not recommended: it applies whatever conversion makes the expression compile, so you need to be sure what is actually in memory and how your specific compiler behaves.

void trainwreck(const char* read_only) {
  // The C-style cast silently discards const along with changing the type.
  auto as_unsigned = (unsigned char*)read_only;
}

A more civil way is to use reinterpret_cast or static_cast:

int* ptr = new int(10);
char* chPtr = reinterpret_cast<char*>(ptr);

double d = 10.5;
int i = static_cast<int>(d);

The reinterpret_cast is used for low-level reinterpreting of the bit pattern of an object, while static_cast performs conversions according to the language’s conversion rules and is safer in comparison.
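The difference shows up in what each cast does to the value. In this sketch, truncate_to_int and first_byte are hypothetical names; the test value for first_byte should have identical bytes so that the result does not depend on endianness.

```cpp
#include <cstdint>

// static_cast converts the value: the fractional part is discarded.
int truncate_to_int(double d) {
  return static_cast<int>(d);
}

// reinterpret_cast leaves the bits alone and only changes how they are viewed:
// here the object is read as raw bytes, which is permitted for unsigned char.
unsigned char first_byte(const std::uint32_t& value) {
  const auto* bytes = reinterpret_cast<const unsigned char*>(&value);
  return bytes[0];
}
```

For example, truncate_to_int(10.5) yields 10, while first_byte applied to 0x01010101 yields 1 on any byte order.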

To use static_cast with a user-defined class, one needs to implement a user-defined type conversion:

#include <cstdio>

class Celsius {
    float temp;

public:
    Celsius(float t) : temp(t) {}
    // Conversion operator to convert Celsius to Fahrenheit
    operator float() const {
        return temp * 1.8f + 32;
    }
};

int main() {
    Celsius degree_c{ 26.0 };
    auto degree_f = static_cast<float>(degree_c);
    printf("Degree Fahrenheit: %f", degree_f);
}

In this example, class Celsius defines a conversion to float that yields the temperature in Fahrenheit, and the program employs static_cast to perform the conversion.
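Since implicit conversions are the concern of this section, it may be worth marking the conversion operator explicit, so that it is only reachable through a cast. The following is a sketch of that variation, assuming C++17; ExplicitCelsius and to_fahrenheit are hypothetical names.

```cpp
#include <type_traits>

// Variation on Celsius: both the constructor and the conversion are explicit.
class ExplicitCelsius {
    float temp;

public:
    explicit ExplicitCelsius(float t) : temp(t) {}
    // Only usable through an explicit cast such as static_cast<float>.
    explicit operator float() const {
        return temp * 1.8f + 32;
    }
};

// No implicit conversion exists any more, e.g. `float f = degree_c;` is rejected.
static_assert(!std::is_convertible_v<ExplicitCelsius, float>,
              "explicit operator disables implicit conversion");

// The explicit cast still works.
float to_fahrenheit(const ExplicitCelsius& c) {
    return static_cast<float>(c);
}
```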

Summary

This article discussed 2 topics: memory management and type conversions, which are great examples of how operators are used in modern CPP.