By building an interface in C first, you will see that classes in object-oriented languages do nothing magical: they provide syntactic sugar for patterns that are tedious to write in C.

term Syntactic Sugar
Is syntax within a programming language that is designed to make things easier to read or to express. It makes the language “sweeter” for human use: things can be expressed more clearly, more concisely, or in an alternative style that some may prefer. Syntactic sugar is usually a shorthand for a common operation that could also be expressed in an alternate, more verbose, form. [Wikipedia]
Some examples of syntactic sugar that you are all familiar with: x += y; (x = x + y;) and foo->bar ((*foo).bar)

Under the Hood of OOP

In this introductory module, the goal is to simulate an interface and polymorphism using the C programming language.

We shall start by explaining what polymorphism is.

What is Polymorphism?

Polymorphism - The “Many Forms” Principle

In C, functions are usually bound at compile-time. If you call calculate_circle_area(), the compiler knows exactly which memory address to jump to. Polymorphism breaks this static link.

unnecessary_info Polymorphism
derived from Greek for “many shapes”

Polymorphism allows a single interface to represent different underlying implementations. E.g. in your C code, this means you can hold a pointer to a generic shape struct, but when you call area(shape), the program decides at runtime whether to execute the circle logic or the square logic.
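The shape example can be sketched as follows. This is a minimal illustration, not code from the solutions: struct shape, its field a, and the make_circle() / make_square() helpers are names assumed here for the sake of the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic shape: the stored function pointer selects the
 * implementation that area() will run. */
struct shape {
    double (*area)(const struct shape *self);
    double a;  /* radius for a circle, side length for a square */
};

static double circle_area(const struct shape *self) {
    return 3.141592653589793 * self->a * self->a;
}

static double square_area(const struct shape *self) {
    return self->a * self->a;
}

/* One call site for every shape; the target is read from the struct at
 * runtime instead of being fixed by the compiler. */
double area(const struct shape *s) {
    assert(s != NULL && s->area != NULL);
    return s->area(s);
}

struct shape make_circle(double radius) {
    return (struct shape){ .area = circle_area, .a = radius };
}

struct shape make_square(double side) {
    return (struct shape){ .area = square_area, .a = side };
}
```

Calling area() on the result of make_square(2.0) runs the square logic, and calling it on the result of make_circle(1.0) runs the circle logic, although the call itself looks identical in both cases.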

Function Pointers

Pillar of Dynamic Dispatch

In C, a function pointer is a variable that stores the memory address of a function. This allows you to decide at runtime which function to call. Its syntax is a bit messy but you will get it after writing a few function pointers.

Syntax Reference:

// Declaring a function pointer
double (*operation)(double, double);

// Assigning a function to it
double add(double a, double b) { return a + b; }
operation = add;

// Calling through the pointer
double result = operation(5.0, 3.0);  // Calls add(5.0, 3.0)

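The parentheses around the pointer name are what make it a pointer to a function. Without them, the * would bind to the return type and the line would declare a function that returns a pointer. Both uses of * can appear in one declaration: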

// concatenate_arrays variable is a pointer to a function that returns an int
// pointer
int *(*concatenate_arrays)(int *arr1, int *arr2, int size1, int size2);
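To see the effect of the parentheses in isolation, here is a small sketch. All names in it (answer, get_answer_address, pointer_getter, and so on) are made up for this illustration.

```c
#include <assert.h>

int answer = 42;

/* A function that returns a pointer to int. */
int *get_answer_address(void) { return &answer; }

/* A function that returns a plain int. */
int get_answer_copy(void) { return answer; }

/* Without the inner parentheses, the line below would DECLARE A FUNCTION
 * named returns_pointer that returns int *, not a variable:
 *
 *     int *returns_pointer(void);
 */

/* With the parentheses: a VARIABLE that stores the address of a function
 * returning int *. */
int *(*pointer_getter)(void) = get_answer_address;

/* With the parentheses: a VARIABLE that stores the address of a function
 * returning int. */
int (*copy_getter)(void) = get_answer_copy;
```

Here *pointer_getter() dereferences the returned pointer, while copy_getter() yields the value directly.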

The key insight: function pointers let you treat functions as variables. You can pass them around, store them in structs, and decide which one to call based on runtime conditions.
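The three uses named above (passing, storing in structs, choosing at runtime) can be sketched in one place. The names binary_op, op_entry, find_op, and fold are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>

typedef double (*binary_op)(double, double);

static double add(double a, double b) { return a + b; }
static double subtract(double a, double b) { return a - b; }
static double multiply(double a, double b) { return a * b; }

/* Stored in a struct: each entry pairs a symbol with a function. */
struct op_entry {
    char symbol;
    binary_op fn;
};

static const struct op_entry op_table[] = {
    { '+', add }, { '-', subtract }, { '*', multiply },
};

/* Chosen at runtime: returns the function for `symbol`, or NULL if the
 * symbol is not in the table. */
binary_op find_op(char symbol) {
    for (size_t i = 0; i < sizeof op_table / sizeof op_table[0]; i++) {
        if (op_table[i].symbol == symbol)
            return op_table[i].fn;
    }
    return NULL;
}

/* Passed as an argument: combines all values with `op`, starting from
 * `initial`. */
double fold(const double *values, size_t count, double initial, binary_op op) {
    assert(op != NULL);
    double acc = initial;
    for (size_t i = 0; i < count; i++)
        acc = op(acc, values[i]);
    return acc;
}
```

With the array { 1.0, 2.0, 3.0, 4.0 }, fold(v, 4, 0.0, find_op('+')) sums the values and fold(v, 4, 1.0, find_op('*')) multiplies them, without fold() knowing which operation it received.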

Here is a quick reference for function pointers.


Enough introduction. Now you will implement dynamic dispatch yourself to practice OOP logic, using only what you already know.

Traditional Approach

You will build a simple message relay in this part that reads messages from a file and processes them based on their type. This milestone will show you the traditional approach to handling different message types using if/else logic.

Later, you will see how function pointers can make this code cleaner and more extensible.

Objective

Write a C program that reads a file line by line and processes different types of messages.

Payload Types:

  1. Commands - Lines starting with /
    • These are system commands. Commands may continue with arguments, ignore them and just print the command name.
    • Print "Command: <command_name>"
    • Example: /quit prints "Command: quit"
  2. Direct Messages - Lines starting with @
    • Private messages to a specific user
    • Print "Direct message to <username>: <content>"
    • Example: @alice Hey there! prints "Direct message to alice: Hey there!"
  3. Group Messages - Lines starting with #
    • Messages to a group channel
    • Print "Group message to <channel>: <content>"
    • Example: #general Hello everyone prints "Group message to general: Hello everyone"
  4. Global Messages - All other lines
    • Broadcast messages to everyone
    • Print "Global message: <content>"
    • Example: Hello world prints "Global message: Hello world"

Your program should:

  1. Accept a filename as a command-line argument
  2. Read the file line by line
  3. Identify the payload type based on the first character
  4. Process and print each payload according to its type
  5. Handle empty lines gracefully (skip them)
  6. Close the file and exit cleanly

You do not need to write production-grade code. You are allowed to leak memory, skip bounds checks, etc.
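If the file handling is what slows you down, the reading loop alone could look like the sketch below. It deliberately leaves the payload dispatch to you. MAX_LINE, strip_newline(), and count_payload_lines() are hypothetical names, and the fixed buffer assumes no line is longer than MAX_LINE - 1 characters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE 1024  /* assumed upper bound on line length */

/* Removes trailing '\n' and '\r' characters in place and returns the new
 * length of the line. */
size_t strip_newline(char *line) {
    size_t len = strlen(line);
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
        line[--len] = '\0';
    return len;
}

/* Reads `in` line by line and returns how many non-empty lines it saw.
 * The body of the loop is where the payload handling would go. */
size_t count_payload_lines(FILE *in) {
    char line[MAX_LINE];
    size_t count = 0;
    assert(in != NULL);
    while (fgets(line, sizeof line, in) != NULL) {
        if (strip_newline(line) == 0)
            continue;  /* skip empty lines */
        /* line[0] is the character that decides the payload type */
        count++;
    }
    return count;
}
```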

Example

Input file payloads.txt

/login metw SuperSecretP4%%w0rd
Hello, world!
@bob How are you doing?
#announcements Server maintenance tonight
This is a global broadcast
/logout
@alice Thanks for your help
#general See you all later

Expected Output

Command: login
Global message: Hello, world!
Direct message to bob: How are you doing?
Group message to announcements: Server maintenance tonight
Global message: This is a global broadcast
Command: logout
Direct message to alice: Thanks for your help
Group message to general: See you all later

What You will Learn

This exercise demonstrates the traditional approach to handling different types:

  • Lots of if/else or switch statements
  • Duplicated logic for similar operations
  • Hard to extend when adding new payload types

After completing this, you’ll appreciate how function pointers can simplify this pattern!


Implement the logic for this task yourself, and then inspect solution 01. After that, you can start working on the second milestone.

Extending a Non-OOP Project

In Ch01, you built a basic message relay. Now you need to extend it with new features. This exercise will demonstrate how difficult it becomes to maintain and extend code that relies on traditional if/else dispatching.

Objective

Your program needs to handle two new capabilities:

1. Command Arguments: Commands now need to parse and store their arguments:

  1. Login - /login <username> <password>
    • Store username and password
  2. Join - /join <channel>
    • Store channel name
  3. Logout - /logout
    • No arguments

Output format:

Command: login
  Arguments: [username: metw, password: SuperSecretP4%%w0rd]
Command: join
  Arguments: [channel: general]
Command: logout
  Arguments: []

2. Payload Buffering: Instead of printing payloads immediately, you need to

  1. Read and parse all payloads from the file into a buffer (array/list)
  2. After reading everything, print (process) payloads one by one from the buffer
  3. Print “Processing payload X of Y” before each message/command

This simulates how real servers queue payloads before processing. As we do not have any networking capabilities yet, we simply print a payload out to represent how it may be processed (e.g. print formatting).

Example

(using payloads.txt from previous example)
Expected Output:

--- Reading payloads ---
Read 8 payloads

--- Processing payloads ---
Processing payload 1 of 8
Command: login
  Arguments: [username: metw, password: SuperSecretP4%%w0rd]

Processing payload 2 of 8
Global message: Hello, world!

Processing payload 3 of 8
Direct message to bob: How are you doing?

...

Requirements

Your program should:

  1. Create payload and buffer structures:
    • Store the payload type (login / join / logout / direct message / group message / global message)
    • For commands, store parsed arguments
  2. Do two-pass processing:
    • First pass: Read file and populate buffer. You should allocate new char * for each string (receiver name, command arguments etc.)
    • Second pass: Process each buffered message, print them out
  3. Command argument parsing:
    • Extract command and all arguments
    • Store arguments in a union structure
  4. Maintain compatibility with previous version:
    • Your code should still handle direct messages, group messages, and global messages from Ch01

Hints

  • Recap: union
    union payload_data {
        struct { char *username; char *password; } login;
        struct { char *content; } message;
    };
    
  • Recap: enum
    enum payload_kind {
        COMMAND_LOGIN, COMMAND_LOGOUT /* ... */
    };
    
  • Consider using a struct to represent a buffered payload and enum for type of it:
    struct payload {
        enum payload_kind kind;  /* Type discriminator */
        union payload_data data;  /* Type-specific payload data */
    };
    
  • Dynamic allocation (malloc) will be necessary for the buffer
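For the last hint, one common way to build a growable buffer is to double its capacity with realloc whenever it fills up. The sketch below uses a stand-in element type, struct item, where the exercise would use struct payload. The names item_buffer, buffer_push(), and buffer_free(), and the starting capacity of 4, are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in element type; in the exercise this would be struct payload. */
struct item { int value; };

struct item_buffer {
    struct item *items;  /* heap-allocated array */
    size_t length;       /* slots in use */
    size_t capacity;     /* slots allocated */
};

/* Appends a copy of `it`, doubling the capacity when the array is full.
 * Returns 0 on success and -1 if the allocation fails. */
int buffer_push(struct item_buffer *buf, struct item it) {
    assert(buf != NULL);
    if (buf->length == buf->capacity) {
        size_t new_capacity = buf->capacity ? buf->capacity * 2 : 4;
        struct item *grown = realloc(buf->items, new_capacity * sizeof *grown);
        if (grown == NULL)
            return -1;  /* the old array is still valid */
        buf->items = grown;
        buf->capacity = new_capacity;
    }
    buf->items[buf->length++] = it;
    return 0;
}

/* Releases the array and resets the buffer to its empty state. */
void buffer_free(struct item_buffer *buf) {
    free(buf->items);
    buf->items = NULL;
    buf->length = buf->capacity = 0;
}
```

A zero-initialized struct item_buffer is a valid empty buffer here, because realloc(NULL, n) behaves like malloc(n).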

The problem this reveals is that the traditional approach makes every extension disproportionately harder:

  • Ch01: Handle 4 message types -> ~50 lines
  • Ch02: Add arguments + buffering + separate processing -> ~300+ lines
Spoiler: Do not inspect the requirements for Ch03 before completing this chapter if you value your mental health!

The agony of extending your code will be unbearable. You will eventually be forced to either perform a complete rewrite to accommodate new features or fundamentally change your design principles to escape the impending misery.

Inspect the solutions for this chapter. Notice how the current design becomes increasingly unmaintainable as new requirements arrive.

term Static and Dynamic Dispatch
We called the enum-based approach “traditional dispatch” because it was the familiar starting point, not because it is a standard term. It is usually described as static dispatch: the switch still inspects the enum value at runtime, but every possible target is spelled out in the source and bound at compile time. With function pointers, you are using dynamic dispatch: the call site names no target, and the function to call is read at runtime from the pointer stored in the object. Both achieve the same goal (executing type-specific behavior), but dynamic dispatch provides more flexibility at the cost of a small runtime overhead.
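The two styles can be put side by side in a small sketch. The greeting example and every name in it are invented for this comparison and do not appear in the solutions.

```c
#include <assert.h>
#include <string.h>

/* Enum + switch: every possible target is written out at the call site. */
enum greeting_kind { GREET_FORMAL, GREET_CASUAL };

const char *greet_by_switch(enum greeting_kind kind) {
    switch (kind) {
        case GREET_FORMAL: return "Good evening";
        case GREET_CASUAL: return "Hey";
    }
    return "";  /* unknown kind */
}

/* Function pointer: the call site names no target at all. */
struct greeter {
    const char *(*greet)(void);
};

static const char *formal(void) { return "Good evening"; }
static const char *casual(void) { return "Hey"; }

const struct greeter formal_greeter = { .greet = formal };
const struct greeter casual_greeter = { .greet = casual };

const char *greet_by_pointer(const struct greeter *g) {
    return g->greet();  /* indirect call through the stored address */
}
```

Adding a third greeting means editing greet_by_switch() in the first style, but only defining one more function and one more struct greeter in the second.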

Refactoring with Function Pointers

New Requirements

Multiple receivers: Messages can now target multiple users/channels

  • Example: @alice @bob #general Hello everyone!
  • A single message goes to alice, bob, AND the general channel

Stop! Before You Continue…

Question: How would you implement these in your Ch02 code?

Think about it for a moment… The painful reality:

  • You would need to rewrite push_payload() parsing logic, completely.
  • The payload_data union would need a new receiver list structure.
  • Multiple switch cases would change to handle arrays instead of single values.
  • Processing logic would need loops for each receiver.
  • Command parsing would need string splitting on semicolons.

In other words: you would need to touch almost every function, risking bugs and spending hours debugging edge cases! This is the moment to ask: “Is there a better way to organize my code?”

The answer is yes: function pointers.

The Current Problem

Look at your code from Ch02. Notice how every time you want to add a new payload type, you have to modify code in multiple places:

  1. Add a new enum value to payload_kind
  2. Add a new struct to the payload_data union
  3. Add a parsing case in push_payload()
  4. Add a processing case in process_next()
  5. Add a cleanup case (if you are properly freeing memory)

This is called tight coupling. Everything is intertwined, one change ripples through the entire codebase.

Imagine more future requirements, what if we need to add:

  • Timestamps for each payload?
  • Priority levels (urgent, normal, low)?
  • Validation before processing?
  • Logging when payloads are processed?
  • JSON export of payloads?
  • Filtering payloads by type?

With the current design, each feature requires changes in almost all functions. This does not scale! Real applications have dozens or hundreds of types, which requires continuous maintenance and extensions.

The Function Pointer Solution

Instead of using switch statements everywhere, we can store behavior together with data. Recall function pointers.

Core Idea: What if each payload could “know how to process itself”?

// Traditional approach: separate data and behavior
switch (payload->kind) {
    case COMMAND_LOGIN:
        printf("Command: login\n");
        printf("  Arguments: [username: %s, password: %s]\n", ...);
        break;
    // ... more cases
}

// Function pointer approach: behavior stored with data
payload->process(payload);  // Each payload knows how to process itself!

Your Mission

Refactor your Ch02 code to use function pointers instead of switch statements. After that, you will be asked to implement the features in the “New Requirements” section.

New Design Principle “Dynamic Dispatch”

  1. Remove the gigantic payload_kind enum:
    • No more type enumeration!
    • Each payload type is self-contained.
  2. Add a function pointer to each payload:
    struct payload {
        void (*process)(const struct payload *self);
        // ... data
    }; 
    
    We use the self reference because the process function is a standalone function that exists outside of the struct payload instance. We must explicitly pass the memory address of the struct so the function knows which specific object’s data it should operate on.
  3. Each payload type sets its own processing function:
    • Login command -> uses process_login() function
    • Direct message -> uses process_direct_message() function
    • … etc.
  4. Simplify process_next():
    void process_next(struct payload_buffer *buf) {
        struct payload *p = &buf->payloads[buf->process_base];
        p->process(p);  // That's it! No switch needed.
        buf->process_base++;
    }
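Put together, the four steps might look like the sketch below. It is a reduced illustration under stated assumptions: the payload data is simplified to a single text field instead of union payload_data, the length field and the bounds check in process_next() are additions for safety, and only two payload types are shown.

```c
#include <assert.h>
#include <stdio.h>

struct payload {
    void (*process)(const struct payload *self);
    const char *text;  /* simplified stand-in for union payload_data */
};

struct payload_buffer {
    struct payload *payloads;
    size_t length;        /* number of buffered payloads (assumed field) */
    size_t process_base;  /* index of the next payload to process */
};

void process_logout(const struct payload *self) {
    (void)self;  /* logout carries no data */
    printf("Command: logout\n");
}

void process_global_message(const struct payload *self) {
    printf("Global message: %s\n", self->text);
}

/* Same shape as process_next() above, plus a bounds check.
 * Returns 1 if a payload was processed and 0 if none were left. */
int process_next(struct payload_buffer *buf) {
    assert(buf != NULL);
    if (buf->process_base >= buf->length)
        return 0;
    struct payload *p = &buf->payloads[buf->process_base];
    p->process(p);  /* no switch: the payload carries its own behavior */
    buf->process_base++;
    return 1;
}
```

A buffer holding a global message followed by a logout command is drained by calling process_next() until it returns 0. The orchestrator never learns which concrete types it handled.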
    

Benefits You will See

  • No more repetitive switch statements in processing code.
  • Behavior and data are together: Each payload is self-contained.
  • Extensibility: Adding new operations (like validate or to_json) is straightforward.
  • Separation of Concerns: Adding new types is easier, as it does not require many changes in the process orchestrator. Just implement the business logic in a function and assign it.
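As an illustration of the extensibility point, a validate operation could be added as one more function pointer. The validation rule (a login needs a non-empty username), make_login(), and count_valid() are assumptions made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct payload {
    void (*process)(const struct payload *self);
    bool (*validate)(const struct payload *self);  /* the new operation */
    const char *username;  /* simplified stand-in for union payload_data */
};

static void process_login(const struct payload *self) {
    (void)self;  /* printing omitted to keep the sketch short */
}

/* Hypothetical rule: a login needs a non-empty username. */
static bool validate_login(const struct payload *self) {
    return self->username != NULL && self->username[0] != '\0';
}

struct payload make_login(const char *username) {
    return (struct payload){
        .process = process_login,
        .validate = validate_login,
        .username = username,
    };
}

/* Orchestrator code: counts valid payloads of any type without a switch. */
size_t count_valid(const struct payload *payloads, size_t count) {
    size_t valid = 0;
    for (size_t i = 0; i < count; i++) {
        assert(payloads[i].validate != NULL);
        if (payloads[i].validate(&payloads[i]))
            valid++;
    }
    return valid;
}
```

The cost is that every payload type must now supply a validate function, which is part of the repetition discussed later in this chapter.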

What You will Learn

This exercise demonstrates the core principle of object-oriented programming:

“Objects bundle data and behavior together, allowing them to be treated uniformly.”

In C++/Java/Python, this is called polymorphism with virtual methods. In C, we achieve the same thing with function pointers.

Important Notes

  • We are keeping this simple, just a few function pointers per payload.
  • We are not building a full “virtual table” system yet. You will learn details of that in upcoming milestones.
  • Focus on understanding how function pointers enable polymorphic behavior.
  • Later exercises will show more sophisticated patterns.

Hints

  • Keep your existing payload_data union structure, simplify it by eliminating duplicate logic
  • Add a void (*process)(const struct payload *self) function pointer to struct payload
  • Create separate process_login(), process_direct_message(), etc. functions
  • In push_payload(), after parsing, assign the appropriate function pointer
  • Your process_next() becomes much simpler!

Expected Outcome

Your code should:

  • Handle all the same payloads as Ch02.
  • Have no switch statements in process_next().
  • Be easier to extend with new payload types.
  • Demonstrate how function pointers enable polymorphism.

After completing this, you will understand why object-oriented languages use virtual methods. They solve the exact same problem you just solved with function pointers!

After completing the exercise on your own, inspect the notes on the two solutions for this chapter: the refactored one and the one extended with the new requirements.


We complained about switch/case being repetitive, but now that we have defined so many functions, isn’t this even more repetitive?

  • Is it reasonable to store a pointer for each function inside the struct?
  • No type-safety. What prevents conflicting behaviors (e.g., a command’s process function but a message’s destructor)?
  • We still had problems in the old approach…

Up Next: A detailed discussion on dynamic dispatch.

Refactored Solution: From Static to Dynamic Dispatch

This directory contains the refactored version of Ch02, demonstrating how function pointers eliminate switch statements and improve code organization.

What Changed from Ch02?

1. Removed the Type Enum

Ch02 (Static Dispatch):

enum payload_kind {
    COMMAND_LOGIN, COMMAND_JOIN, COMMAND_LOGOUT,
    MESSAGE_DIRECT, MESSAGE_GROUP, MESSAGE_GLOBAL
};

struct payload {
    enum payload_kind kind;  // Type tag
    union payload_data data;
};

Ch03 (Dynamic Dispatch):

struct payload {
    void (*process)(const struct payload *self);  // Behavior
    void (*destroy)(const struct payload *self);  // Cleanup
    union payload_data data;
};

Impact: No more type enumeration needed. Each payload carries its own behavior through function pointers.

2. Eliminated Switch Statements for Destroy and Process

Ch02:

void process_next(struct payload_buffer *buf) {
    struct payload *p = &buf->payloads[buf->process_base];

    switch (p->kind) {
        case COMMAND_LOGIN:
            printf("Command: login\n");
            printf("  Arguments: [username: %s, password: %s]\n",
                   p->data.command_login.username,
                   p->data.command_login.password);
            break;
        case COMMAND_JOIN:
        // ... more cases
        // ... 6 total cases
    }

    buf->process_base++;
}

Ch03:

void process_next(struct payload_buffer *buf) {
    struct payload *p = &buf->payloads[buf->process_base];

    p->process(p);  // Dynamic dispatch!

    buf->process_base++;
}

Impact: From 30 lines with 6 cases to 3 lines. Each payload knows how to process itself.

3. File Organization for Isolation

Ch02 Structure:

  • traditional_dispatch.h: Everything (types, buffer, functions)
  • traditional_dispatch.c: All logic mixed together

Ch03 Structure:

  • payload.h / payload.c: Payload types and behaviors (process/destroy functions)
  • dynamic_dispatch.h / dynamic_dispatch.c: Buffer management and orchestration

Impact: Better isolation between concerns. Payload behavior is separated from buffer orchestration. We will elaborate on isolation in later chapters, but here is a quick explanation:

What is Isolation?

Isolation means separating different concerns so changes in one area do not ripple through unrelated areas.

In Ch02:

  • Adding a new payload type required changes in multiple functions
  • Buffer logic and payload logic were intertwined
  • Everything depended on the shared enum

In Ch03:

  • Payload behaviors are self-contained (each has its own process/destroy functions)
  • Buffer management is isolated in dynamic_dispatch.c
  • Adding a new payload type only requires:
    1. Create new process/destroy functions in payload.c
    2. Modify push_payload() to assign those functions
    3. Done! No changes to process_next() or destroy()

Benefits of isolation:

  • Easier to understand: Each file has a single, clear purpose
  • Safer to modify: Changes don’t accidentally break unrelated code
  • Better for teams: Different developers can work on payloads vs. buffer logic without conflicts

Side Quest: Further Isolation

Currently, push_payload() does two things:

  1. Parses the raw string and constructs a payload
  2. Adds it to the buffer

Challenge: Extract parsing logic into a separate function:

bool parse_payload(struct payload *out, const char *raw);

This function should:

  • Parse the input string
  • Assign appropriate function pointers
  • Store the constructed payload in *out and return whether parsing succeeded

Then push_payload() becomes simpler:

void push_payload(struct payload_buffer *buf, const char *raw) {
    struct payload parsed;

    bool is_parsing_successful = parse_payload(&parsed, raw);  // Construct

    // ... add to buffer (if is_parsing_successful)
}

Why? Parsing logic is now isolated from buffer management. You can test payload construction independently, and changing the buffer structure does not affect parsing.
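A reduced sketch of such a parser is shown below. It assumes the out-parameter form used in the push_payload() snippet above, handles only global messages, and simplifies the payload data to a single content string. The process_global() and destroy_global() names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct payload {
    void (*process)(const struct payload *self);
    void (*destroy)(const struct payload *self);
    char *content;  /* simplified stand-in for union payload_data */
};

void process_global(const struct payload *self) {
    printf("Global message: %s\n", self->content);
}

void destroy_global(const struct payload *self) {
    free(self->content);
}

/* Constructs a payload from `raw` without touching any buffer.
 * Only global messages are handled in this sketch. Returns false for
 * empty input or when the allocation fails. */
bool parse_payload(struct payload *out, const char *raw) {
    assert(out != NULL && raw != NULL);
    if (raw[0] == '\0')
        return false;
    char *copy = malloc(strlen(raw) + 1);
    if (copy == NULL)
        return false;
    strcpy(copy, raw);
    *out = (struct payload){
        .process = process_global,
        .destroy = destroy_global,
        .content = copy,
    };
    return true;
}
```

Because parse_payload() has no dependency on struct payload_buffer, it can be exercised with a single stack-allocated struct payload and no buffer at all.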

Remaining Limitations

Even with function pointers, we still have problems:

  1. Construction is still static: push_payload() still uses if/else to assign function pointers
  2. Repetitive function pointers: Each payload struct stores two pointers (process + destroy)
  3. No type safety: Nothing prevents mixing incompatible function pointers

Next: The extended version implements the new requirements (multiple receivers) to further demonstrate the benefits of dynamic dispatch.

Hold on! Lots of Repetition, Again!

Our code base has once again become impossible to maintain! This solution only implements the first part of the task, multiple receivers. For function batching, try it yourself!

What Changed in This Solution?

This solution extends 01 with support for multiple receivers:

Example Input:

@alice @bob #general Hello everyone!

Output:

Direct message to alice: Hello everyone!
Direct message to bob: Hello everyone!
Group message to general: Hello everyone!

1. Nested Polymorphism

We introduced two levels of polymorphic behavior:

Level 1 - Payloads:

struct payload {
    void (*process)(const struct payload *self);
    void (*destroy)(const struct payload *self);
    union payload_data data;
};

Level 2 - Message Receivers:

struct message_receiving_entity {
    void (*transmit_message)(const struct message_receiving_entity *self,
                             const char *content);
    void (*destroy)(const struct message_receiving_entity *self);
    char *additional_info;
};

Each receiver is polymorphic! Direct messages, group messages, and global messages all use the same interface but behave differently.

2. Architecture

File separation:

  • payload_constructor.c - Parsing and construction
  • payload_behaviors.c - Behavioral implementations
  • dynamic_dispatch.c - Orchestration (clean!)

This demonstrates composition: payloads contain arrays of polymorphic receivers. We will discuss composition later on.
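The composition can be sketched as follows. This is a simplification under stated assumptions: struct message stands in for the message variant of union payload_data, the destroy pointers are left out, process_message() takes the message directly, and its return value is an addition for illustration.

```c
#include <assert.h>
#include <stdio.h>

struct message_receiving_entity {
    void (*transmit_message)(const struct message_receiving_entity *self,
                             const char *content);
    const char *additional_info;  /* user or channel name */
};

void transmit_direct_message(const struct message_receiving_entity *self,
                             const char *content) {
    printf("Direct message to %s: %s\n", self->additional_info, content);
}

void transmit_group_message(const struct message_receiving_entity *self,
                            const char *content) {
    printf("Group message to %s: %s\n", self->additional_info, content);
}

/* Simplified message payload: it contains an array of receivers. */
struct message {
    const struct message_receiving_entity *receivers;
    size_t receiver_count;
    const char *content;
};

/* Level 1 behavior delegates to level 2: each receiver decides how the
 * content is transmitted. Returns the number of receivers reached. */
size_t process_message(const struct message *m) {
    assert(m != NULL);
    for (size_t i = 0; i < m->receiver_count; i++) {
        const struct message_receiving_entity *r = &m->receivers[i];
        r->transmit_message(r, m->content);
    }
    return m->receiver_count;
}
```

For a message with the receivers alice (direct), bob (direct), and general (group) and the content "Hello everyone!", process_message() prints the three output lines of the example at the top of this section.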

The Problem: Too Much Repetition!

Let’s count the repetition we have encountered:

1. Manual Function Pointer Assignment

For every payload type, we manually assign function pointers:

if (strcmp("login", command_name) == 0) {
    *p = (struct payload) {
        .process = process_command_login,  // manual assignment
        .destroy = destroy_command_login,  // manual assignment
        // ...
    };
} else if (strcmp("join", command_name) == 0) {
    *p = (struct payload) {
        .process = process_command_join,   // manual assignment again!
        .destroy = destroy_command_join,   // manual assignment again!
        // ...
    };
}
// ... repeat for EVERY type

2. Repetitive Function Declarations

For every message receiver type:

void transmit_direct_message(const struct message_receiving_entity *self, ...);
void transmit_group_message(const struct message_receiving_entity *self, ...);
void transmit_global_message(const struct message_receiving_entity *self, ...);

void destroy_group_or_direct_message(const struct message_receiving_entity *self);
void destroy_global_message(const struct message_receiving_entity *self);

Every receiver needs its own function declarations!

3. No Type Safety

Nothing prevents this disaster:

receivers[0] = (struct message_receiving_entity) {
    .transmit_message = transmit_direct_message,
    .destroy = destroy_global_message  // WRONG! Mixing behaviors!
};

The compiler will not complain, but this creates inconsistent objects.

4. Verbose Struct Initialization

Every construction site repeats the pattern:

receivers[receiver_count] = (struct message_receiving_entity) {
    .additional_info = receiver_name,
    .transmit_message = /* pick one */,
    .destroy = /* pick one */
};
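One way to reduce this repetition by hand is to hide each pairing behind a small constructor function, so the pointers for one receiver type are written in exactly one place. This is only a sketch: make_direct_receiver(), make_group_receiver(), and copy_string() are hypothetical helpers, and the transmit bodies are stubbed out. It does not give compile-time safety either, since nothing stops other code from building the struct directly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct message_receiving_entity {
    void (*transmit_message)(const struct message_receiving_entity *self,
                             const char *content);
    void (*destroy)(const struct message_receiving_entity *self);
    char *additional_info;
};

/* Transmit bodies are stubbed out; only the wiring matters here. */
void transmit_direct_message(const struct message_receiving_entity *self,
                             const char *content) { (void)self; (void)content; }
void transmit_group_message(const struct message_receiving_entity *self,
                            const char *content) { (void)self; (void)content; }

void destroy_group_or_direct_message(const struct message_receiving_entity *self) {
    free(self->additional_info);
}

/* Returns a heap-allocated copy of `s`, or NULL if allocation fails. */
static char *copy_string(const char *s) {
    assert(s != NULL);
    char *copy = malloc(strlen(s) + 1);
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

/* The only place where the direct-message pointers are paired. */
struct message_receiving_entity make_direct_receiver(const char *name) {
    return (struct message_receiving_entity){
        .transmit_message = transmit_direct_message,
        .destroy = destroy_group_or_direct_message,
        .additional_info = copy_string(name),
    };
}

/* The only place where the group-message pointers are paired. */
struct message_receiving_entity make_group_receiver(const char *name) {
    return (struct message_receiving_entity){
        .transmit_message = transmit_group_message,
        .destroy = destroy_group_or_direct_message,
        .additional_info = copy_string(name),
    };
}
```

This is, in miniature, the job that a C++ constructor performs automatically.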

Enter: Syntactic Sugar

As mentioned in the introduction, syntactic sugar is a language feature that makes code easier to write without changing what it actually does.

What we have been doing in C is writing OOP patterns by hand:

  • Manually assigning function pointers
  • Manually managing polymorphic behavior
  • Manually ensuring consistency between related functions

What C++ provides is syntactic sugar for these patterns:

  • Classes: Group data and functions together automatically
  • Constructors: Initialize objects with correct function pointers automatically
  • Virtual functions: Polymorphism without manual function pointer management
  • Type safety: Compiler ensures you do not mix incompatible behaviors

The Limitation We Have Hit

Function pointers solve the processing problem (no more giant switch statements!), but they create new problems:

  1. Construction is verbose and error-prone
  2. No compiler help to ensure consistency
  3. Every new type requires manual wiring
  4. Repetitive patterns everywhere

These are not fundamental limitations of OOP. They are limitations of implementing OOP manually in C.

Time to Switch to C++!

You have now experienced why OOP languages exist. They did not invent new concepts - they automated the patterns we’ve been writing by hand.

Next steps:

  1. Translate this exact solution to C++
  2. See how classes eliminate the repetition
  3. Understand that C++ is just syntactic sugar over what you have been doing

The core concepts remain the same. C++ just handles the plumbing for you.


But before switching to C++, we need to do a detailed discussion on dynamic dispatch.

Questions Arise

In Ch03, we eliminated giant switch statements using function pointers. But we have created new problems and reached the limits of manual polymorphism.

The Paradox of Repetition

We complained about switch/case being repetitive. But look at what we wrote:

Before (Switch Statement):

switch (p->kind) {
    case COMMAND_LOGIN: /* ... */ break;
    case COMMAND_JOIN:  /* ... */ break;
    case MESSAGE_DIRECT: /* ... */ break;
    // ... 6 cases total
}

After (Function Pointers):

// Declaration repetition
void process_command_login(const struct payload *self);
void process_command_join(const struct payload *self);
void process_command_logout(const struct payload *self);
void process_message(const struct payload *self);
void destroy_command_login(const struct payload *self);
void destroy_command_join(const struct payload *self);
void destroy_command_logout(const struct payload *self);
void destroy_message(const struct payload *self);

// Assignment repetition
if (strcmp("login", command_name) == 0) {
    *p = (struct payload) {
        .process = process_command_login,
        .destroy = destroy_command_login,
        // ...
    };
} else if (strcmp("join", command_name) == 0) {
    *p = (struct payload) {
        .process = process_command_join,
        .destroy = destroy_command_join,
        // ...
    };
}
// ... endless if/else

Question: Did we just trade one type of repetition for another?

The Type Safety Problem

Nothing prevents this disaster:

struct payload broken = {
    .process = process_command_login,  // Command behavior
    .destroy = destroy_message,        // Message behavior!
};

The compiler won’t complain. At runtime:

  • process expects data.command_login.username
  • destroy expects data.message.receivers
  • Result: Memory corruption, segfault, undefined behavior

Question: How reasonable is it to store multiple function pointers in a struct when there is no mechanism to ensure they are consistent?

The Structural Repetition Problem

Every payload carries two function pointers:

struct payload {
    void (*process)(const struct payload *self);  // 8 bytes
    void (*destroy)(const struct payload *self);  // 8 bytes
    union payload_data data;
};

For 1000 messages, we store 16,000 bytes of function pointers (assuming 8-byte pointers, as on typical 64-bit platforms). But here is the thing: all messages of the same type have THE SAME function pointers!

payload[0] = { .process = process_message, .destroy = destroy_message, ... };
payload[1] = { .process = process_message, .destroy = destroy_message, ... };
payload[2] = { .process = process_message, .destroy = destroy_message, ... };
// ... all 1000 messages store identical pointers

Question: Why are we duplicating the same pointers in every instance?


Enter: The Virtual Function Table (vtable)

The solution is to share function pointers across instances of the same type.

Concept: Separate Type Information from Instance Data

Before (current approach):

Each payload = [process*, destroy*, data]
                ^^^^^^^^^^^^^^^  stored per-instance

After (vtable approach):

Each payload = [vtable*, data]
                ^^^^^^^  single pointer to shared table

vtable_message = [process_message*, destroy_message*]
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^  shared by ALL messages

C Implementation of vtables:

// Step 1: Define the vtable structure
struct payload;  // Forward declaration, so the signatures below can refer to it
struct payload_vtable {
    void (*process)(const struct payload *self);
    void (*destroy)(const struct payload *self);
};

// Step 2: Create static vtables (one per type)
const struct payload_vtable vtable_command_login = {
    .process = process_command_login,
    .destroy = destroy_command_login
};

const struct payload_vtable vtable_message = {
    .process = process_message,
    .destroy = destroy_message
};

// Step 3: Payload now stores a vtable pointer
struct payload {
    const struct payload_vtable *vtable;  // Single pointer!
    union payload_data data;
};

// Step 4: Usage
payload->vtable->process(payload);  // Dynamic dispatch!

// ... same steps for receivers

Using vtables, each instance now stores one pointer instead of N function pointers. We also gain partial type safety, because all functions for a type live together in one vtable and can no longer be mixed individually.
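The sketch below assembles the steps into one compilable unit and keeps the Ch03 layout next to it so the two sizes can be compared. It is an illustration under assumptions: the payload data is reduced to a single text pointer, and struct payload_ch03, make_message(), and payload_process() are names introduced here.

```c
#include <assert.h>
#include <stdio.h>

struct payload;  /* forward declaration for the vtable signatures */

struct payload_vtable {
    void (*process)(const struct payload *self);
    void (*destroy)(const struct payload *self);
};

struct payload {
    const struct payload_vtable *vtable;  /* one pointer per instance */
    const char *text;  /* stand-in for union payload_data */
};

/* The Ch03 layout, kept only for the size comparison. */
struct payload_ch03 {
    void (*process)(const struct payload_ch03 *self);
    void (*destroy)(const struct payload_ch03 *self);
    const char *text;
};

void process_message(const struct payload *self) {
    printf("Global message: %s\n", self->text);
}

void destroy_message(const struct payload *self) {
    (void)self;  /* nothing is heap-allocated in this sketch */
}

/* One table per type, shared by every instance of that type. */
const struct payload_vtable vtable_message = {
    .process = process_message,
    .destroy = destroy_message,
};

struct payload make_message(const char *text) {
    return (struct payload){ .vtable = &vtable_message, .text = text };
}

/* Dynamic dispatch with one extra indirection through the table. */
void payload_process(const struct payload *p) {
    assert(p != NULL && p->vtable != NULL);
    p->vtable->process(p);
}
```

Two payloads created with make_message() hold the same vtable address, and on common platforms sizeof(struct payload) is smaller than sizeof(struct payload_ch03). The saving grows with every operation added to the table.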

Remaining Problems

  1. Still manual: We still write if/else to assign vtables
  2. No enforcement: Can still assign wrong vtable to wrong data
  3. Constructor repetition: Every type needs init code

What Have We Learned?

Function pointers solve the processing problem but create new problems:

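Problem                               | Solution                     | New Problem Created
--------------------------------------|------------------------------|----------------------------------------------
Giant switch statements               | Function pointers in structs | Repetitive declarations
Adding new types modifies many places | Each type self-contained     | Verbose construction
Type/behavior coupling                | Dynamic dispatch             | No type safety
-                                     | -                            | Memory overhead (partially solved by vtables)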

The pattern: Each solution introduces new complexity.

The Fundamental Limitation

We have been manually implementing OOP patterns in C. This works, but it is verbose - requires lots of boilerplate.

C++ doesn’t invent new concepts. It automates what we have been doing. Recall syntactic sugar!

What We Did Manually                       | C++ Keyword
-------------------------------------------|------------------------------------------
struct payload with function pointers      | class
Assigning function pointers in constructor | Constructor automatically sets up vtable
Our vtable struct                          | Compiler-generated vtable (hidden)
payload->vtable->process(payload)          | payload->process() (syntactic sugar)
Manual type safety checks                  | Compiler enforces at compile-time

Objective

Implement vtables in C. Refactor Ch03 to use vtables. You should:

  1. Define struct payload_vtable & struct message_receiving_entity_vtable
  2. Create vtables for each payload & receiver type
  3. Change struct payload & struct message_receiving_entity to use a single vtable pointer
  4. Update construction logic to assign the correct vtable
  5. Change process_next() to call through vtable

After completing this, you will understand exactly what C++ virtual functions do under the hood.


You now have implemented:

  1. Polymorphism with enums + switch (Ch01-02)
  2. Polymorphism with function pointers (Ch03)
  3. Polymorphism with vtables (Ch04)

We will translate Ch03 directly to C++ and see:

  • How class eliminates manual vtable management
  • How constructors automate function pointer assignment
  • How virtual provides type-safe polymorphism
  • How inheritance shares behavior without code duplication

OOP languages exist because manual polymorphism is tedious and error-prone. Every limitation we have hit is a feature C++ automates. You are ready to appreciate what C++ actually does.

A brief detour from polymorphism: Before discussing C++ polymorphism features, we should first cover how resources are managed using RAII.

term Resource Acquisition is Initialization (RAII)
Describe a particular language behavior for managing resources. In RAII, holding a resource is a class invariant, and is tied to object lifetime. Resource allocation (or acquisition) is done during object creation (specifically initialization), by the constructor, while resource deallocation (release) is done during object destruction (specifically finalization), by the destructor. In other words, resource acquisition must succeed for initialization to succeed. Thus, the resource is guaranteed to be held between when initialization finishes and finalization starts (holding the resources is a class invariant), and to be held only when the object is alive. Thus, if there are no object leaks, there are no resource leaks. [Wikipedia]

We have been implementing the RAII principle manually:

  1. parse_payload() constructs a payload object (data + behavior) and allocates its resources.
  2. Resources are held during the object’s lifetime (until the buffer is destroyed via buffer_destroy).
  3. Destructor methods deallocate the resources used by payload objects.

In this chapter, we will inspect how C++ syntactic sugar automates these steps.

From Manual Memory Management to RAII

You have built manual polymorphism in C. You have seen how verbose it is to initialize and clean up resources. Now let’s see how C++ automates this with constructors and destructors.

The Problem We Had in C

Initialization:

struct payload *p = malloc(sizeof(struct payload));
assert(p);
p->vtable = &vtable_message;
p->data.message.content = malloc(strlen(raw) + 1);
strcpy(p->data.message.content, raw);
// ... more initialization

Cleanup:

p->vtable->destroy(p);  // Must remember to call!
free(p);

What could go wrong?

  • Forget to initialize a field - undefined behavior
  • Forget to call destroy() - memory leak
  • Call destroy() twice - double-free
  • Use after free() - use-after-free

Before Switching to C++…

To familiarize yourself with C++ syntax, take a quick glance at these links:

  1. Output (Print Text)
  2. New and Delete
  3. Classes/Objects
  4. Class Methods
  5. Access Specifiers

Implementing RAII: String Class

Let’s build a RAII class from scratch using char *. Here is a string class with manual memory management:

class String {
private:
    char *data;

public:
    // Constructor: Acquire resource (allocate memory)
    String(const char *str) {
        data = new char[strlen(str) + 1];  // Allocate
        strcpy(data, str);
        std::cout << "String created: " << data << "\n";
    }

    // Destructor: Release resource (free memory)
    ~String() {
        std::cout << "String destroyed: " << data << "\n";
        delete[] data;  // Free automatically!
    }

    const char* c_str() const { return data; }
};

Usage

{
    String username("alice");
    String message("Hello!");
    std::cout << "Processing: " << username.c_str() << "\n";
}  // Destructors called automatically here!

Output:

String created: alice
String created: Hello!
Processing: alice
String destroyed: Hello!     ← Reverse order!
String destroyed: alice      ← Reverse order!

What C++ does behind the scenes:

// Equivalent C code:
{
    struct String username;
    String_constructor(&username, "alice");  // Constructor

    struct String message;
    String_constructor(&message, "Hello!");  // Constructor

    printf("Processing: %s\n", String_c_str(&username));

    String_destructor(&message);   // Automatic!
    String_destructor(&username);  // Automatic, reverse order!
}

syntactic sugar: RAII = Constructor allocates, Destructor frees. No manual free!

Construction and Destruction Order

Rule 1: Local Variables

Objects are destroyed in reverse order of their construction:

{
    String a("first");   // Constructed 1st
    String b("second");  // Constructed 2nd
    String c("third");   // Constructed 3rd
}
// Destroyed: c (3rd), b (2nd), a (1st) - REVERSE!

Why? Later objects might depend on earlier ones. Reverse order ensures dependencies are destroyed last.
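To observe Rule 1 without reading console output, the events can be recorded in a list instead. The following is a minimal sketch, not part of the course code: Tracer and trace_local_order are hypothetical names, and std::vector is used here only as a convenient growable list.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: records its own construction and destruction in a log.
class Tracer {
public:
    Tracer(std::vector<std::string> *log_, const char *name_)
        : log(log_), name(name_) {
        log->push_back("ctor " + name);
    }

    ~Tracer() { log->push_back("dtor " + name); }

private:
    std::vector<std::string> *log;  // not owned, only written to
    std::string name;
};

// Builds three locals in an inner scope and returns the recorded events.
std::vector<std::string> trace_local_order() {
    std::vector<std::string> log;
    {
        Tracer a(&log, "a");
        Tracer b(&log, "b");
        Tracer c(&log, "c");
    }  // c, b, a are destroyed here, in that order
    return log;
}
```

Calling trace_local_order() yields the events ctor a, ctor b, ctor c, dtor c, dtor b, dtor a.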

Rule 2: Member Variables

Members are destroyed in reverse declaration order:

class Payload {
private:
    String username;  // Declared 1st
    String content;   // Declared 2nd

public:
    Payload(const char *user, const char *msg)
        : username(user), content(msg) {}
    // Destructor automatically destroys: content, then username
};

syntactic sugar:

~Payload() {
    content.~String();   // Destroy 2nd-declared member first
    username.~String();  // Destroy 1st-declared member last
}

Dependency Chains: RAII Composition

RAII objects can contain other RAII objects. Cleanup happens automatically in the correct order.

Example: Command with Multiple Strings

In upcoming exercises, we will implement commands with inheritance. But since we have not covered inheritance yet, this example does not use it:

class LoginCommand {
private:
    String username;   // RAII object 1
    String password;   // RAII object 2

public:
    LoginCommand(const char *user, const char *pass)
        : username(user), password(pass) {
        std::cout << "LoginCommand constructed\n";
    }

    ~LoginCommand() {
        std::cout << "LoginCommand destructed\n";
        // String destructors called automatically after this!
    }

    void process() {
        // ... business logic for login
    }
};

Full Execution:

int main() {
    LoginCommand p("alice", "s3cr3t");
    p.process();
}

Output:

String created: alice       ← username constructor
String created: s3cr3t      ← password constructor
LoginCommand constructed
Processing: login
LoginCommand destructed
String destroyed: s3cr3t    ← password destructor (reverse!)
String destroyed: alice     ← username destructor

syntactic sugar: You never called delete or wrote cleanup code. The compiler generated the entire destruction chain!

The Dependency Chain in Detail

Construction order (top-to-bottom):

  1. username member constructed
  2. password member constructed
  3. LoginCommand constructor body runs

Destruction order (bottom-to-top, reverse!):

  1. LoginCommand destructor body runs
  2. password member destroyed
  3. username member destroyed

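Why reverse? The LoginCommand constructor and destructor bodies may use username and password. The members must therefore be fully constructed before the constructor body runs, and they must outlive the destructor body. Constructing members first and destroying them last guarantees this.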
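The chain above can be checked with a small experiment. This is a sketch under assumptions: Part and Whole are hypothetical stand-ins for String and LoginCommand, and events go into a log instead of std::cout so that the order can be inspected programmatically.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for String: logs its construction and destruction.
class Part {
public:
    Part(std::vector<std::string> *log_, const char *name_)
        : log(log_), name(name_) {
        log->push_back(name + " constructed");
    }
    ~Part() { log->push_back(name + " destroyed"); }

private:
    std::vector<std::string> *log;
    std::string name;
};

// Hypothetical stand-in for LoginCommand: two members plus its own bodies.
class Whole {
public:
    explicit Whole(std::vector<std::string> *log_)
        : log(log_), username(log_, "username"), password(log_, "password") {
        log->push_back("Whole body constructed");  // members already exist here
    }
    ~Whole() { log->push_back("Whole body destroyed"); }  // members still alive

private:
    std::vector<std::string> *log;
    Part username;  // declared first: constructed first, destroyed last
    Part password;  // declared second: constructed second, destroyed first
};

// Creates and destroys one Whole, then returns the recorded events.
std::vector<std::string> trace_member_order() {
    std::vector<std::string> log;
    {
        Whole w(&log);
    }
    return log;
}
```

The returned log lists the two member constructions, then the constructor body, then the destructor body, then the member destructions in reverse declaration order.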

Comparison: C vs C++

Here is a simplified version of what we have been doing in C (Manual dependency tracking):

struct login_command {
    char *username;
    char *password;
};

void login_command_init(struct login_command *cmd,
                        const char *user, const char *pass) {
    cmd->username = malloc(strlen(user) + 1);
    strcpy(cmd->username, user);

    cmd->password = malloc(strlen(pass) + 1);
    strcpy(cmd->password, pass);
}

void login_command_destroy(struct login_command *cmd) {
    free(cmd->password);   // Manual order!
    free(cmd->username);   // Manual order!
}

C++ (Automatic dependency tracking):

class LoginCommand {
private:
    String username;
    String password;

public:
    LoginCommand(const char *user, const char *pass)
                : username(user), password(pass) {}
    // Destructor generated automatically with correct order!
};

What About Standard Library?

std::string and std::vector are RAII classes, just like our custom String.

class Message {
private:
    std::string content;  // RAII, manages char * internally
    std::vector<MessageReceivingEntity> receivers;  // RAII manages list and
                                                    // receivers internally

public:
    Message(const std::string& str) : content(str) {}

    // Members are destroyed automatically, in reverse declaration order:
    // first the std::vector destructor runs (it calls the destructor of
    // every MessageReceivingEntity it holds), then the std::string destructor.
};

Conceptually, std::string does something like this (heavily simplified; real implementations are more elaborate):

class string {
private:
    char *data;
public:
    string(const char *str) {
        data = new char[strlen(str) + 1];
        strcpy(data, str);
    }
    ~string() { delete[] data; }
};

This is the same pattern we just built. The standard library is a collection of well-tested RAII classes.
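One way to see a standard container clean up after itself is to count live elements. The sketch below is illustrative only: Receiver and count_alive_inside_scope are hypothetical names, and the static counter exists purely for the experiment.

```cpp
#include <string>
#include <vector>

// Hypothetical element type that counts how many instances are alive.
struct Receiver {
    static int alive;
    std::string name;

    explicit Receiver(const std::string &name_) : name(name_) { ++alive; }
    Receiver(const Receiver &other) : name(other.name) { ++alive; }
    ~Receiver() { --alive; }
};

int Receiver::alive = 0;

// Returns the number of live Receiver objects while the vector still exists.
int count_alive_inside_scope() {
    std::vector<Receiver> receivers;
    receivers.push_back(Receiver("alice"));
    receivers.push_back(Receiver("bob"));
    return Receiver::alive;  // temporaries are gone; the vector holds 2
}   // the vector's destructor destroys both elements here
```

count_alive_inside_scope() returns 2, and after it returns Receiver::alive is back to 0 even though no delete or cleanup call appears anywhere in the code.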


You have now seen RAII with manual char *. Next:

  1. Implement your payload classes using custom String (do not implement all of them, just a few to get used to defining classes)
  2. See how construction/destruction order prevents bugs by adding print statements to constructors/destructors. (In the provided solution, this is not implemented, to keep the code easier to read.)
  3. Replace custom String with std::string (same behavior, more features)
  4. Understand: Standard library is built on RAII principles

RAII is not a C++ feature or magic; it is a design pattern. C++ makes it the default through automatic constructor/destructor calls. This is syntactic sugar that prevents entire categories of memory bugs.

We will continue our discussion with C++ virtual tables in the next chapter.

Virtual Methods and Inheritance

In Ch04, we discussed vtables and how they solve the problem of storing duplicate function pointers in every object. We also saw that manual vtable assignment is error-prone.

In Ch05, we learned how C++ automates resource management with constructors and destructors.

Now let’s see how C++ automates dynamic dispatch using virtual methods and code reuse using inheritance.

Recall: Our Manual vtable in C

// Step 1: Define the vtable structure
struct payload_vtable {
    void (*process)(const struct payload *self);
    void (*destroy)(const struct payload *self);
};

// Step 2: Create static vtables (one per type)
const struct payload_vtable vtable_command_login = {
    .process = process_command_login,
    .destroy = destroy_command_login
};

const struct payload_vtable vtable_message = {
    .process = process_message,
    .destroy = destroy_message
};

// Step 3: Payload stores a vtable pointer
struct payload {
    const struct payload_vtable *vtable;
    union payload_data data;
};

// Step 4: Manual assignment in constructor
void parse_payload(struct payload *p, const char *raw) {
    if (raw[0] == '/') {
        // ... parsing logic
        p->vtable = &vtable_command_login;  // Manual!
    } else {
        p->vtable = &vtable_message;  // Manual!
    }
}

// Step 5: Dynamic dispatch through vtable
void process_next(struct payload_buffer *buf) {
    struct payload *p = &buf->payloads[buf->process_base];
    p->vtable->process(p);  // Dynamic dispatch
}

Recall: Problems of manual vtables

  1. Manual vtable assignment. Easy to get wrong!
  2. No compile-time type-safety checks. Can assign wrong vtable!
  3. Repetitive vtable creation for every type.
  4. No shared code between similar types.

C++ Syntactic Sugar: Virtual Methods!

Basic Virtual Method Syntax:

class Payload {
public:
    // Virtual method = automatically uses vtable
    virtual void process() {
        cout << "Processing generic payload" << endl;
    }

    virtual ~Payload() = default;
};

class Message : public Payload {
public:
    // Override base class method
    void process() override {
        cout << "Processing message" << endl;
    }
};

What C++ Does Behind the Scenes:

// Compiler generates vtable automatically
struct Payload_vtable {
    void (*process)(Payload *self);
    void (*destructor)(Payload *self);
};

const struct Payload_vtable vtable_Message = {
    .process = Message_process,  // Automatically assigned!
    .destructor = Message_destructor
};

struct Payload {
    const struct Payload_vtable *vptr;  // Hidden! Added by compiler
    // ... your data members
};

// Constructor automatically sets vptr
void Message_constructor(struct Message *m) {
    m->vptr = &vtable_Message;  // Automatic assignment!
}

Unlike manual vtables:

  • Compiler generates vtable structs
  • Compiler populates vtable entries
  • Compiler inserts vptr into every object
  • Constructor automatically assigns correct vtable
  • override keyword provides compile-time checking
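The effect of the hidden vptr can be observed from the outside: a function that only knows the base type still reaches the derived implementation. A minimal sketch, assuming a hypothetical describe() method that returns text instead of printing it:

```cpp
#include <string>

class Payload {
public:
    virtual std::string describe() const { return "generic payload"; }
    virtual ~Payload() = default;
};

class Message : public Payload {
public:
    std::string describe() const override { return "message"; }

    // std::string describe() override { return "oops"; }
    // ^ would not compile: without const it overrides nothing, and the
    //   override keyword turns that mistake into a compile error.
};

// Only sees a base-class pointer; the object's vptr selects the function.
std::string describe_through_base(const Payload *p) {
    return p->describe();
}
```

Passing the address of a Payload gives "generic payload", while passing the address of a Message gives "message", although describe_through_base() itself never mentions Message.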

Inheritance: Sharing Code

The Problem with Repetition

In our C implementation, we had lots of repeated code:

void process_command_login(const struct payload *p) {
    // Command-specific processing
    printf("Command: login\n");
    // ... print arguments
}

void process_command_join(const struct payload *p) {
    // Command-specific processing
    printf("Command: join\n");
    // ... print arguments
}

void process_command_logout(const struct payload *p) {
    // Command-specific processing
    printf("Command: logout\n");
    // No arguments
}

Every command has similar structure but different details.

Inheritance Solution

Base class contains shared behavior:

class Command : public Payload {
public:
    Command(const char *command_name_)
        : command_name { command_name_ } {}

    // Shared implementation
    void process() override {
        cout << "Command: " << command_name << endl;
        process_arguments();  // Call derived class method
    }

    virtual ~Command() {}

protected:
    // Derived classes must implement this
    virtual void process_arguments() = 0;  // Pure virtual, check end of this
                                           // README for details
    // Syntax is not so important. When you are not sure about syntax, Google
    // it.

    string command_name;
};

Derived classes implement specific behavior:

class LoginCommand : public Command {
public:
    LoginCommand(const char *username_, const char *password_)
        : Command { "login" }, username { username_ }, password { password_ } {}

private:
    // Implement required method
    void process_arguments() override {
        cout << "  Arguments: [username: " << username
             << ", password: " << password << "]" << endl;
    }

    string username;
    string password;
};

class JoinCommand : public Command {
public:
    JoinCommand(const char *channel_)
        : Command { "join" }, channel { channel_ } {}

private:
    void process_arguments() override {
        cout << "  Arguments: [channel: " << channel << "]" << endl;
    }

    string channel;
};

class LogoutCommand : public Command {
public:
    LogoutCommand() : Command { "logout" } {}

private:
    void process_arguments() override {
        cout << "  Arguments: []" << endl;
    }
};

Usage

// Polymorphism: store different types in same pointer type
Payload *payloads[3];
payloads[0] = new LoginCommand { "alice", "pass123" };
payloads[1] = new JoinCommand { "general" };
payloads[2] = new LogoutCommand {};

// Dynamic dispatch - each calls correct method
for (int i = 0; i < 3; i++) {
    payloads[i]->process();  // Calls through vtable!
}

// Cleanup
for (int i = 0; i < 3; i++) {
    delete payloads[i];  // Virtual destructor ensures correct cleanup!
}

Output:

Command: login
  Arguments: [username: alice, password: pass123]
Command: join
  Arguments: [channel: general]
Command: logout
  Arguments: []

Syntax Reference

  1. Base and Derived Classes

class Base {
    // Shared functionality
};

class Derived : public Base {
    // Specific functionality + inherits from Base
};

  2. Virtual Functions

virtual return_type method_name();  // Can be overridden

  3. Pure Virtual Functions (Abstract Classes)

virtual void method_name() = 0;  // MUST be overridden

  4. Override Keyword

void process() override;  // Compile error if not overriding

  5. Virtual Destructor

virtual ~ClassName() {}  // Critical for polymorphic classes!

Why critical? Without virtual destructor:

Payload *p = new Message { ... };
delete p;  // Undefined behavior: typically only ~Payload() runs, leaking Message members!

With virtual destructor:

Payload *p = new Message { ... };
delete p;  // Calls ~Message() then ~Payload() - correct order!

  6. Access Specifiers

class Example {
public:
    // Accessible from anywhere

protected:
    // Accessible from this class and derived classes

private:
    // Accessible only from this class
};

You are now familiar with how C++ automates virtual table construction. Next:

  1. Implement command hierarchy. Create base Command class and derived classes: LoginCommand, JoinCommand, and LogoutCommand.
  2. Implement message hierarchy. Create base Message class with: DirectMessage, GroupMessage, and GlobalMessage. Use inheritance to share message content storage and common processing logic.
  3. Create polymorphic buffer that stores Payload * pointers (base class pointers) and can hold any derived type (commands or messages).
  4. Test virtual destruction by adding print statements in destructors to verify that the derived destructor runs first and the base destructor runs second. (This task is not implemented in the provided solution, to keep the code easier to read.)

virtual is syntactic sugar for the vtable pattern you built manually. Inheritance is syntactic sugar for sharing vtable entries and data. C++ did not invent polymorphism, it automated the plumbing.

In the next chapter, we will continue our journey with references and copy management.

Pure Virtual Methods

In our Payload, Command, and Message examples, we used a specific syntax: virtual void <method_name>() = 0;. This is called a pure virtual method.

A pure virtual method is a declaration that a function must exist, but the base class provides no implementation for it. It acts as a mandatory contract: any class that inherits from Payload or Command is required to implement this method to be considered complete.

When a class contains at least one pure virtual method, it becomes an Abstract Class.

  • No Instantiation: You cannot create an object of an abstract class (e.g., new Command("login") will result in a compiler error).
  • Incomplete Blueprint: The class exists solely to provide a common interface and shared data for its children.

Under the Hood: The C Parallel

In your manual C vtable implementation, a pure virtual method is like defining a function pointer in your vtable struct but purposefully leaving it unassigned in the base “type”.

In C: If you accidentally called a NULL function pointer, your program would typically crash at runtime.

In C++: The compiler prevents this crash by refusing to compile any code that tries to create an object that hasn’t fulfilled its pure virtual requirements.

As you work through Ch06, notice that if you forget to implement process_arguments() in classes that inherit Command, or process_recipient() in classes that inherit Message, your code will not compile. This is the “syntactic sugar” of compile-time safety replacing manual runtime checks.
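The abstract/concrete distinction can also be stated in code. The sketch below is a simplified variant of the Command hierarchy (methods return strings rather than printing, which is an assumption made for this example) and uses std::is_abstract from <type_traits> to let the compiler confirm which classes can be instantiated.

```cpp
#include <string>
#include <type_traits>

class Command {
public:
    explicit Command(const std::string &name_) : name(name_) {}
    virtual ~Command() = default;

    // Shared logic that relies on a step only derived classes can provide.
    std::string process() const {
        return "Command: " + name + process_arguments();
    }

protected:
    virtual std::string process_arguments() const = 0;  // pure virtual

private:
    std::string name;
};

class LogoutCommand : public Command {
public:
    LogoutCommand() : Command("logout") {}

protected:
    std::string process_arguments() const override { return " []"; }
};

// Checked at compile time: Command is abstract, LogoutCommand is not.
static_assert(std::is_abstract<Command>::value,
              "Command cannot be instantiated");
static_assert(!std::is_abstract<LogoutCommand>::value,
              "LogoutCommand fulfils the contract");
```

Removing process_arguments() from LogoutCommand would make the second static_assert fail, and any attempt to create a LogoutCommand object would be rejected by the compiler.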

References and Copy Management

In Ch05, we built a custom String class with RAII. In Ch06, we used it with polymorphism. But there is a hidden danger: the default copy behavior can cause double-free crashes.

This chapter covers two essential topics:

  1. References to avoid expensive copies
  2. Copy control for making RAII classes safe to copy

The Hidden Bug in Our Code

Recall the custom String class from Ch05:

class String {
public:
    String(const char *data_) {
        data = new char[strlen(data_) + 1];
        strcpy(data, data_);
    }

    ~String() { delete[] data; }

    const char* c_str() const { return data; }

private:
    char *data;
};

This code has a critical bug: The Double-Free Disaster

void print_string(String s) {  // !bug: passes by value (creates copy)
    cout << s.c_str() << endl;
}  // s destroyed here, delete[] data called

int main() {
    String name { "alice" };
    print_string(name);  // Creates copy with same data pointer
    // name destroyed here
    // delete[] data called again on the same pointer. Double-free, CRASH!
}

Step-by-step:

  1. name created, allocates memory for “alice”
  2. print_string(name) copies name:
    • Default copy constructor copies data pointer
    • Both name and s point to same memory
  3. s destroyed at end of function: deletes memory
  4. name destroyed at end of main: deletes already-freed memory

This is the shallow copy problem: copying the pointer, not the data.

Solution 1: References - Stop Copying

Instead of creating a copy, pass a reference (alias) to the original:

void print_string(const String &s) {  // Reference, not copy
    cout << s.c_str() << endl;
}  // Nothing destroyed, just alias goes away

int main() {
    String name { "alice" };
    print_string(name);  // No copy created!
}  // name destroyed once, everything is fine

const String &s means “s is just another name for the original String, and I promise not to modify it.”

Reference Syntax

int x = 42;
int &ref = x;     // ref is an alias for x
ref = 100;        // x is now 100

cout << x;        // Prints 100
cout << ref;      // Prints 100 (same variable!)

References vs Pointers

| Feature | Pointer | Reference |
|---|---|---|
| Can be null? | Yes (nullptr) | No (must refer to something) |
| Can change target? | Yes | No (always refers to same object) |
| Syntax | *ptr, ptr->member | Just use like a variable, no field dereference operator (->) |
| Use case | Optional parameters, arrays | Function parameters (avoid copies) |
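The first two rows of the comparison are easy to misread, so here is a short sketch with plain ints. The function name pointer_vs_reference_demo is hypothetical; the point to notice is that assigning to a reference never re-targets it.

```cpp
// Returns true when every claim in the comments below holds.
bool pointer_vs_reference_demo() {
    int a = 1;
    int b = 2;

    int *ptr = nullptr;  // a pointer may start out pointing at nothing
    ptr = &a;            // ...and may be re-targeted later
    ptr = &b;
    *ptr = 20;           // writes b through the pointer

    int &ref = a;        // a reference must be bound when it is declared
    ref = b;             // NOT a re-bind: copies b's value (20) into a
    ref = 7;             // writes a again; b is untouched

    return a == 7 && b == 20 && &ref == &a && ptr == &b;
}
```

Taking the address of ref gives the address of a, which is why a reference is often described as an alias rather than as a separate object.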

When to Use References

Pass by const reference, the default for objects:

void process(const Message &msg);  // Read-only, no copy
void send(const String &content);

Pass by reference, only when you need to modify:

void update_content(Message &msg, const String &newContent) {
    msg.set_content(newContent);  // Modifies original
}

Pass by value, small types only:

void set_port(int port);      // int is cheap to copy
void set_flag(bool enabled);  // bool is cheap to copy

Rule of thumb: If it has a destructor (manages resources), pass it by reference: const & by default, plain & only when the function must modify it.

const Correctness

Mark methods that do not modify the object as const:

class String {
public:
    const char *c_str() const { return data; }  // Does not modify

    void clear() { delete[] data; data = nullptr; }  // Not const, modifies
};

Why? const methods can be called on const references:

void print_string(const String &s) {
    cout << s.c_str();  // OK, c_str() is const
    s.clear();          // ERROR: clear() is not const
}
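The same rule applies to any class, not only String. A small sketch, assuming a hypothetical Message class with get_content() and set_content() and a hypothetical content_length() helper:

```cpp
#include <cstddef>
#include <string>

class Message {
public:
    explicit Message(const std::string &content_) : content(content_) {}

    // Read-only: callable on const and non-const objects alike.
    const std::string &get_content() const { return content; }

    // Mutating: callable only on non-const objects.
    void set_content(const std::string &new_content) { content = new_content; }

private:
    std::string content;
};

// Promises not to change msg, so only const methods are usable here.
std::size_t content_length(const Message &msg) {
    // msg.set_content("x");  // would not compile: msg is const
    return msg.get_content().size();
}
```

Because get_content() is marked const, content_length() accepts both ordinary and const Message objects, while set_content() stays available only where modification is actually allowed.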

Exercise 01: Fix the References

Refactor this code to use references:

// Current code (inefficient and buggy):
void display_message(GlobalMessage msg) {  // Copies entire message!
    cout << "Message: " << msg << endl;
}

// Copies both!
void append_content(GlobalMessage msg, /* string from std */ string suffix) {
    msg.setContent(msg + suffix);  // Modifies copy, not original!
    // std::string overloaded + operator for string concatenation.
    // We will discuss operator overloading later on.
}

int main() {
    GlobalMessage m { "Hello" };
    display_message(m);  // Wasteful copy
    append_content(m, "!");  // Does not modify m!
    display_message(m);  // Still prints "Hello", not "Hello!"
}

Tasks:

  1. Fix display_message to avoid copying
  2. Fix append_content to modify the original message
  3. Add const to methods that don’t modify objects

Solution 2: Deep Copy - Rule of Three

References solve the “passing to functions” problem. But what about this?

String name1 { "alice" };
String name2 = name1;  // What should happen here?

Two options:

Option 1: Prevent Copying

class String {
public:
    // Delete copy operations:
    String(const String &) = delete;
    String &operator=(const String &) = delete;

    // Other members...
};

String name1 { "alice" };
String name2 = name1;  // COMPILE ERROR: copy deleted

Use when: The resource cannot be meaningfully shared (e.g. file handles, synchronization primitives).

Option 2: Deep Copy

class String {
public:
    // Copy constructor: creates independent copy
    String(const String &other) {
        data = new char[strlen(other.data) + 1];
        strcpy(data, other.data);  // Copy the DATA, not the pointer
        cout << "String copied: " << data << "\n";
    }

    // Copy assignment: replaces content with copy
    String &operator=(const String &other) {
        if (this != &other) {  // Self-assignment check
            delete[] data;     // Free old data
            data = new char[strlen(other.data) + 1];
            strcpy(data, other.data);  // Copy new data
        }
        return *this;
    }

    // Destructor (already had this)
    ~String() {
        delete[] data;
    }

private:
    char *data;

};

Now copying is safe:

String *name1 = new String { "alice" };
String name2 = *name1;  // Deep copy, independent memory
delete name1;           // Destroys name1 and deletes its memory
// name2 still valid, has its own copy!

The Rule of Three

If you define any of these, you should define all three:

  1. Destructor ~T()
  2. Copy constructor T(const T&)
  3. Copy assignment T& operator=(const T&)

Why? If your class needs a custom destructor (manages resources), the default copy operations are almost certainly wrong.

Shallow copy (default, WRONG for our String):

Before copy:
  name1 -> [data ptr] -> "alice"

After: String name2 = name1;
  name1 -> [data ptr] --
                       |-> "alice" SHARED! BAD!
  name2 -> [data ptr] --

After name1 destroyed:
  name2 -> [data ptr] -> <freed memory>  CRASH!

Deep copy (correct):

Before copy:
  name1 -> [data ptr] -> "alice"

After: String name2 = name1;
  name1 -> [data ptr] -> "alice"
  name2 -> [data ptr] -> "alice"  Independent copy, GOOD!

After name1 destroyed:
  name2 -> [data ptr] -> "alice"  Still valid!
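The deep-copy picture can be verified by comparing the buffer addresses. The following is a self-contained sketch of the String class with all three operations. One deliberate difference from the version above, stated as a design choice rather than a requirement: copy assignment allocates the new buffer before releasing the old one, so the object stays intact if allocation fails. The function copies_are_independent is a hypothetical test helper.

```cpp
#include <cstring>

class String {
public:
    String(const char *str) : data(new char[std::strlen(str) + 1]) {
        std::strcpy(data, str);
    }

    // Copy constructor: allocate a separate buffer and copy the characters.
    String(const String &other)
        : data(new char[std::strlen(other.data) + 1]) {
        std::strcpy(data, other.data);
    }

    // Copy assignment: build the new buffer first, then release the old one.
    String &operator=(const String &other) {
        if (this != &other) {
            char *fresh = new char[std::strlen(other.data) + 1];
            std::strcpy(fresh, other.data);
            delete[] data;
            data = fresh;
        }
        return *this;
    }

    ~String() { delete[] data; }

    const char *c_str() const { return data; }

private:
    char *data;
};

// True when each copy has equal text stored in its own buffer.
bool copies_are_independent() {
    String name1("alice");
    String name2 = name1;  // copy constructor
    String name3("bob");
    name3 = name1;         // copy assignment

    return name1.c_str() != name2.c_str() &&
           name1.c_str() != name3.c_str() &&
           std::strcmp(name2.c_str(), "alice") == 0 &&
           std::strcmp(name3.c_str(), "alice") == 0;
}
```

All three objects are destroyed at the end of copies_are_independent(), and each destructor frees a different buffer, so no double-free occurs.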

The Rule of Five (C++11)

Modern C++ adds move operations:

  1. Move constructor T(T &&)
  2. Move assignment T &operator=(T &&)

For now, we will stick with rule of three. Move semantics is an optimization (transfer instead of copy), but not essential for correctness. Here is a quick reference for rule of five, if you are into it.
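For the curious, this is roughly what the two move operations look like on the custom String. It is a sketch under assumptions: the private clone() helper and the choice to leave a moved-from String holding nullptr are decisions made for this example, not something the course code prescribes.

```cpp
#include <cstring>
#include <utility>

class String {
public:
    String(const char *str) : data(clone(str)) {}

    // Copy operations (Rule of Three): duplicate the buffer.
    String(const String &other) : data(clone(other.data)) {}
    String &operator=(const String &other) {
        if (this != &other) {
            char *fresh = clone(other.data);
            delete[] data;
            data = fresh;
        }
        return *this;
    }

    // Move constructor: take over the buffer and leave the source empty.
    String(String &&other) noexcept : data(other.data) {
        other.data = nullptr;
    }

    // Move assignment: release our buffer, then take over the source's.
    String &operator=(String &&other) noexcept {
        if (this != &other) {
            delete[] data;
            data = other.data;
            other.data = nullptr;
        }
        return *this;
    }

    ~String() { delete[] data; }  // delete[] on nullptr does nothing

    // Returns nullptr for a moved-from String in this sketch.
    const char *c_str() const { return data; }

private:
    // Hypothetical helper: deep-copies a C string, passing nullptr through.
    static char *clone(const char *src) {
        if (src == nullptr) return nullptr;
        char *copy = new char[std::strlen(src) + 1];
        std::strcpy(copy, src);
        return copy;
    }

    char *data;
};

// True when the move handed over the original buffer without copying it.
bool move_transfers_buffer() {
    String a("alice");
    const char *buffer = a.c_str();
    String b = std::move(a);  // move constructor, no new allocation
    return b.c_str() == buffer && a.c_str() == nullptr;
}
```

std::move does not move anything by itself; it only marks a as something that may be emptied, which lets the compiler pick the move constructor instead of the copy constructor.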

The Rule of Zero

Best practice: Avoid manual resource management entirely.

// Instead of custom String with rule of three:
class Message {
public:
    Message(const char *content_);
    Message(const Message&);  // Copy constructor
    Message& operator=(const Message&);  // Copy assignment
    ~Message();  // Destructor

private:
    char *content;  // Manual memory. Rule of three needed
};

// Use std::string (already implements rule of three):
class Message {
public:
    Message(const char *content_) : content { content_ } {}
    // No custom destructor/copy needed! Compiler defaults work!

private:
    string content;  // RAII type, handles copying automatically
};

Rule of Zero: If all your members are RAII types (std::string, std::vector, std::unique_ptr), you don’t need custom copy/destructor.
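A quick way to convince yourself that the compiler-generated copy is a deep one is to copy an object and then change the copy. This sketch assumes a hypothetical Message with a receivers list and a hypothetical helper copies_do_not_share_state; no destructor or copy operation is written by hand.

```cpp
#include <cstddef>
#include <string>
#include <vector>

class Message {
public:
    explicit Message(const std::string &content_) : content(content_) {}

    void set_content(const std::string &new_content) { content = new_content; }
    void add_receiver(const std::string &name) { receivers.push_back(name); }

    const std::string &get_content() const { return content; }
    std::size_t receiver_count() const { return receivers.size(); }

    // No destructor, copy constructor, or copy assignment declared:
    // the compiler-generated ones copy each member with its own copy logic.

private:
    std::string content;
    std::vector<std::string> receivers;
};

// True when changing the copy leaves the original untouched.
bool copies_do_not_share_state() {
    Message original("Hello");
    original.add_receiver("alice");

    Message copy = original;  // compiler-generated copy constructor
    copy.set_content("Bye");
    copy.add_receiver("bob");

    return original.get_content() == "Hello" &&
           original.receiver_count() == 1 &&
           copy.get_content() == "Bye" &&
           copy.receiver_count() == 2;
}
```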

Exercise 02: Implement Rule of Three

class Buffer {
public:
    Buffer(size_t size_) : size { size_ } {
        data = new char[size];
        memset(data, 0, size);
    }

    ~Buffer() {
        delete[] data;
    }

    // TODO: Implement copy constructor
    // TODO: Implement copy assignment

    char *get_data() { return data; }
    size_t get_size() const { return size; }

private:
    char *data;
    size_t size;
};

// This should work after you implement copy operations:
void test_buffer() {
    Buffer buf1 { 100 };
    buf1.get_data()[0] = 'X';

    Buffer buf2 = buf1;  // Copy constructor
    buf2.get_data()[0] = 'Y';

    cout << buf1.get_data()[0];  // Should print 'X' (not 'Y'!)
    cout << buf2.get_data()[0];  // Should print 'Y'
}

Tasks:

  1. Implement copy constructor that performs deep copy
  2. Implement copy assignment operator (don’t forget self-assignment check!)
  3. Verify no memory leaks with valgrind

Bonus: Rewrite using std::vector<char> (Rule of Zero).


For references, avoid expensive copies by passing const T &. Use T & only when you need to modify. Small types (int, bool) can still pass by value.

Rule of Three:

  • If you write a destructor, write copy constructor and copy assignment
  • Prevents shallow copy bugs (double-free)
  • Implement deep copy: allocate new memory, copy data

Rule of Zero:

  • Prefer RAII types (std::string, std::vector)
  • Let compiler generate correct copy operations (syntactic sugar!)
  • Only write rule of three when you must manage raw resources

Pattern Reference

// 1. Read-only parameter:
void process(const Message &msg);

// 2. Modifiable parameter:
void update(Message &msg);

// 3. RAII class that owns memory:
class MyClass {
public:
    MyClass(const MyClass &);  // Deep copy
    MyClass &operator=(const MyClass &);  // Deep copy
    ~MyClass();  // Free memory

private:
    char *data;
};

// 4. RAII class that CANNOT be copied
class Socket {
public:
    Socket(const Socket &) = delete;
    Socket &operator=(const Socket &) = delete;
    ~Socket();

private:
    int fd;
};

// 5. Rule of Zero (preferred):
class Message {
    std::string content;  // Handles its own copying
    // No custom copy/destructor needed!
};

These are fundamental C++ idioms you will use in every project. Combined with RAII (Ch05) and polymorphism (Ch06), you can now begin writing safe, modern C++ code.

Future topics (you will learn during your hands-on project):

  • Move semantics (optimization for transfers)
  • Smart pointers (std::unique_ptr, std::shared_ptr)
  • Templates and generic programming
  • Exception safety

But with references and proper copy control, you have the foundation for safe C++ programming! In the next chapter, we will reinforce what we have learned so far with a more theoretical explanation, and then we will begin developing practical applications.

The Four Pillars of OOP

You have spent the previous chapters manually building the mechanics of object-oriented languages using C. You have seen that modern languages do not invent new logic; they simply automate the tedious plumbing of pointers and memory management. This chapter formally defines the four conceptual pillars that support every OOP design.

Before diving into practical applications, let’s consolidate what you have learned by mapping your manual implementations to the standard OOP terminology.

Encapsulation

term Encapsulation
The bundling of data and the functions that operate on that data into a single unit (a class or struct), while restricting direct access to some of the object’s components.

You implemented encapsulation throughout this module, even before switching to C++.

In C (Ch03): You bundled data and behavior by storing function pointers alongside data in the same struct:

struct payload {
    void (*process)(const struct payload *self);
    void (*destroy)(const struct payload *self);
    union payload_data data;
};

The payload struct encapsulated both state (data) and behavior (process, destroy). This prevented scattering related logic across multiple files.

In C++ (Ch05): You formalized encapsulation with access specifiers:

class String {
public:
    String(const char *str);  // Controlled interface
    ~String();
    const char *c_str() const;

private:
    char *data;  // Hidden from external access
};

By marking data as private, you enforced a critical rule: only the String class itself can manage its memory. External code cannot accidentally corrupt the pointer, preventing entire categories of bugs.

Why? Without encapsulation, your char *data would be public, and any code could call free(data) or reassign the pointer. Encapsulation creates a protective barrier around your resources.

Abstraction

term Abstraction
The process of hiding implementation details and exposing only a simple, high-level interface to the user.

While encapsulation bundles data with behavior, abstraction focuses on what the user sees versus what the object does internally. With abstraction, we interact with ideas rather than with implementations.

In C (Ch03): Your process_next() function demonstrated abstraction:

void process_next(struct payload_buffer *buf) {
    struct payload *p = &buf->payloads[buf->process_base];
    p->process(p);  // Abstract interface
    buf->process_base++;
}

The caller of process_next() does not know or care whether the payload is a login command or a direct message. The function pointer process abstracts away the implementation details. The interface is simple: “process this payload.”

In C++ (Ch06): Virtual methods provide the same abstraction with cleaner syntax:

for (Payload *p : payloads) {
    p->process();  // Abstract interface
}

Each payload knows how to process itself, but the calling code is abstracted from those details. You can add new payload types without changing the loop.

The key difference:

  • Encapsulation answers: “How do I protect my data?”
  • Abstraction answers: “How do I hide my complexity?”

Your String class encapsulates its char *data while abstracting the memory management details behind c_str() and the destructor.

Inheritance

term Inheritance
A mechanism where a class (derived/child) acquires properties and behaviors from another class (base/parent), enabling code reuse and establishing “Is-A” relationships.

In C (Ch03): Inheritance was manual and error-prone. You had to duplicate the vtable structure across every payload type or use complex casting tricks to simulate base types.

In C++ (Ch06): Inheritance became declarative, i.e. you are no longer required to implement the underlying polymorphism manually:

class Command : public Payload {
public:
    void process() override {
        cout << "Command: " << command_name << endl;
        process_arguments();  // implemented in derived classes
    }

    virtual void process_arguments() = 0;

protected:
    string command_name;
};

class LoginCommand : public Command {
    // Inherits: command_name, process()
    // Implements: process_arguments()
};

LoginCommand is a Command, which is a Payload. This hierarchy eliminates repetition: all commands share the command_name field and the common process() logic.

Type substitution: Because LoginCommand inherits from Payload (through Command), you can store it in a Payload * pointer:

Payload *p = new LoginCommand("alice", "secret");
p->process();  // Calls LoginCommand's version

This is the “Is-A” relationship in action.
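A compilable version of the hierarchy might look like the sketch below. It makes several assumptions for illustration: process() returns a std::string rather than printing, the command name is passed through a Command constructor, and the text produced by process_arguments() is invented.

```cpp
#include <string>
#include <utility>

class Payload {
public:
    virtual ~Payload() = default;
    virtual std::string process() = 0;
};

class Command : public Payload {
public:
    explicit Command(std::string name) : command_name(std::move(name)) {}

    // Shared logic, written once for every command.
    std::string process() override {
        return "Command: " + command_name + " " + process_arguments();
    }

    virtual std::string process_arguments() = 0;  // Filled in by derived classes

protected:
    std::string command_name;
};

class LoginCommand : public Command {
public:
    LoginCommand(std::string user, std::string pass)
        : Command("login"),
          username(std::move(user)),
          password(std::move(pass)) {}

    // Hypothetical output format; the password itself is never echoed.
    std::string process_arguments() override {
        return "user=" + username + " (password: " +
               std::to_string(password.size()) + " chars)";
    }

private:
    std::string username;
    std::string password;
};
```

A LoginCommand can be handed to any code that expects a Payload * or a Command *, and the inherited process() still reaches the derived process_arguments().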

Composition

term Composition
A design principle where complex objects are built by combining simpler ones, establishing “Has-A” relationships.

While inheritance models “Is-A,” composition models “Has-A.”

Example from Ch05 (RAII):

class LoginCommand {
private:
    String username;  // LoginCommand *has a* String
    String password;  // LoginCommand *has a* String
};

LoginCommand is composed of String objects. It does not inherit from String; it contains them.

Composition in practice (Ch03):

struct payload_buffer {
    struct payload *payloads;  // Buffer *has* payloads
    size_t capacity;
    size_t size;
};

The buffer has payloads and has capacity tracking. These are distinct concerns managed through composition.

Guideline: Prefer composition over inheritance when:

  • The relationship is “Has-A” rather than “Is-A”
  • You want to avoid tight coupling between classes
  • You need flexibility to change parts independently
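As a rough sketch of the guideline, the example below builds a buffer out of two independent parts instead of deriving it from either one. The names Stats and PayloadBuffer, and the use of plain strings in place of real payloads, are assumptions made for brevity.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical part: counts processed payloads and knows nothing else.
class Stats {
public:
    void record() { ++processed_; }
    std::size_t processed() const { return processed_; }

private:
    std::size_t processed_ = 0;
};

// Hypothetical whole: *has a* storage vector and *has a* Stats object.
class PayloadBuffer {
public:
    void push(std::string name) { pending_.push_back(std::move(name)); }

    // Returns false once every stored payload has been handled.
    bool process_next() {
        if (next_ >= pending_.size()) {
            return false;
        }
        ++next_;
        stats_.record();  // Delegate bookkeeping to the contained part
        return true;
    }

    const Stats &stats() const { return stats_; }

private:
    std::vector<std::string> pending_;  // Storage concern
    std::size_t next_ = 0;              // Position of the next payload
    Stats stats_;                       // Statistics concern
};
```

Because PayloadBuffer only contains a Stats object, the counting logic can be replaced or extended without touching the buffer's public interface.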

Polymorphism

term Polymorphism
The ability to treat different types uniformly through a common interface while maintaining their unique behaviors.

You implemented polymorphism in three evolutionary stages:

1. Static Dispatch (Ch01-02): The set of call targets is fixed at compile-time, and an enum tag selects among them:

switch (payload.kind) {
    case COMMAND_LOGIN:
        process_login();
        break;
    case MESSAGE_DIRECT:
        process_direct_message();
        break;
}

The compiler generates a separate code path for each case. There is no runtime flexibility: adding a new type requires modifying the switch statement and recompiling.

2. Manual Dynamic Dispatch (Ch03-04): Runtime decision using function pointers:

struct payload {
    void (*process)(const struct payload *self);
};

p->process(p);  // Calls different function depending on runtime type

You manually assigned function pointers during construction. This was flexible but error-prone: nothing stopped you from assigning the wrong function pointer.

3. Automatic Dynamic Dispatch (Ch06): The C++ virtual keyword automates vtable management:

class Payload {
public:
    virtual void process() = 0;
};

class LoginCommand : public Payload {
public:
    void process() override { /* ... */ }
};

Payload *p = new LoginCommand(...);
p->process();  // Compiler generates vtable lookup

The compiler guarantees the correct function is called. Vtable assignment happens automatically in constructors. The override keyword catches typos at compile-time.
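To see what override protects against, consider the sketch below. It assumes a base class whose process() has a default body (unlike the pure virtual version above), because that is the case where a misspelled method name goes unnoticed. TypoCommand, CheckedCommand, and run are hypothetical names.

```cpp
#include <string>

class Payload {
public:
    virtual ~Payload() = default;
    // Assumption: a default body, so derived classes are not forced to override.
    virtual std::string process() const { return "generic payload"; }
};

class TypoCommand : public Payload {
public:
    // Misspelled and not marked override: this is a brand-new, unrelated
    // method, so the base version of process() stays in effect.
    std::string proces() const { return "typo command"; }
};

class CheckedCommand : public Payload {
public:
    std::string process() const override { return "checked command"; }
    // Writing `std::string proces() const override` here would be rejected
    // at compile time, because no base-class virtual has that name.
};

// Hypothetical caller that only knows the base type.
std::string run(const Payload &p) { return p.process(); }
```

Passing a TypoCommand to run() silently yields the base behavior, while CheckedCommand yields its own. Marking every intended override with the keyword turns the first situation into a compile error.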

The progression: From explicit type checks (Ch01) -> manual vtables (Ch03) -> automated vtables (Ch06). Same underlying mechanism, progressively less manual work.

Bringing It All Together

These four pillars are not independent features. They work together to enable extensible designs.

Example: Your Chapter 06 command hierarchy

  • Encapsulation: The Command class restricts command_name to protected access
  • Abstraction: process() provides a simple interface that hides the argument parsing complexity
  • Inheritance: LoginCommand inherits common structure from Command
  • Polymorphism: A Payload * can point to any command, and process() calls the correct implementation

From manual to automatic:

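| Pillar        | Manual (C)               | Automatic (C++)             |
|---------------|--------------------------|-----------------------------|
| Encapsulation | Struct grouping          | private/protected/public    |
| Abstraction   | Function pointers        | Virtual methods             |
| Inheritance   | Manual struct layout     | class Derived : public Base |
| Polymorphism  | Manual vtable assignment | Compiler-generated vtables  |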

What You Have Accomplished

You did not just learn C++ syntax. You built the fundamental patterns that object-oriented languages implement. These patterns exist in Java, Python, C#, and most other OOP languages. The syntax changes, and runtimes differ in the details, but the underlying idea of dispatching through a per-type table of functions, the mechanics you implemented by hand, remains the same.


With these theoretical foundations solidified, you are ready to transition from “how objects work” to “how objects solve problems.” In the next part, we will apply these pillars to build a robust, real-world networking application.

You now understand what your compiler is doing behind the scenes. Let’s build something real.