By building an interface in C first, you will realize that classes in object-oriented languages are not doing anything magical; they just provide syntactic sugar for patterns that are tedious to write in C.
- term Syntactic Sugar
- Is syntax within a programming language that is designed to make things
easier to read or to express. It makes the language “sweeter” for human
use: things can be expressed more clearly, more concisely, or in an
alternative style that some may prefer. Syntactic sugar is usually a
shorthand for a common operation that could also be expressed in an
alternate, more verbose, form. [Wikipedia]
Some examples of syntactic sugar that all of you are familiar with: x += y; (instead of x = x + y;) and foo->bar (instead of (*foo).bar).
Under the Hood of OOP
In this introductory module, the goal is to simulate an interface and polymorphism using the C programming language.
We shall start by explaining what polymorphism is.
What is Polymorphism?
Polymorphism - The “Many Forms” Principle
In C, functions are usually bound at compile-time. If you call
calculate_circle_area(), the compiler knows exactly which memory address to
jump to. Polymorphism breaks this static link.
- unnecessary_info Polymorphism
- derived from Greek for “many shapes”
Polymorphism allows a single interface to represent different underlying
implementations. E.g. in your C code, this means you can hold a pointer to a
generic shape struct, but when you call area(shape), the program decides at
runtime whether to execute the circle logic or the square logic.
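The shape example above can be sketched in a few lines. This is only an illustration: struct shape, its size field, and the circle_area / square_area helpers are hypothetical names, and the sketch relies on a function pointer, which the next section explains.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic shape: one struct, behavior chosen per instance. */
struct shape {
    double (*area)(const struct shape *self); /* which logic to run */
    double size;                              /* radius or side length */
};

static double circle_area(const struct shape *self) {
    return 3.14159265358979 * self->size * self->size;
}

static double square_area(const struct shape *self) {
    return self->size * self->size;
}

/* The caller does not know (or care) which concrete shape it holds. */
double area(const struct shape *s) {
    assert(s != NULL && s->area != NULL);
    return s->area(s); /* decided at runtime */
}
```

Calling area() on a shape initialized with square_area runs the square logic, and the same call on a shape initialized with circle_area runs the circle logic.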
Function Pointers
Pillar of Dynamic Dispatch
In C, a function pointer is a variable that stores the memory address of a function. This allows you to decide at runtime which function to call. Its syntax is a bit messy but you will get it after writing a few function pointers.
Syntax Reference:
// Declaring a function pointer
double (*operation)(double, double);
// Assigning a function to it
double add(double a, double b) { return a + b; }
operation = add;
// Calling through the pointer
double result = operation(5.0, 3.0); // Calls add(5.0, 3.0)
The parentheses around *name are what make the declaration a pointer to a function; without them, it would declare a function that returns a pointer:
// concatenate_arrays variable is a pointer to a function that returns an int
// pointer
int *(*concatenate_arrays)(int *arr1, int *arr2, int size1, int size2);
The key insight: function pointers let you treat functions as values. You can pass them around, store them in structs, and decide which one to call based on runtime conditions.
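As a small sketch of all three uses (passing, storing in a struct, choosing at runtime), consider the following. The names binary_op, named_op, find_op, and apply are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef double (*binary_op)(double, double);

static double add(double a, double b) { return a + b; }
static double multiply(double a, double b) { return a * b; }

/* Stored in a struct: a name paired with a behavior. */
struct named_op {
    const char *name;
    binary_op fn;
};

static const struct named_op ops[] = { { "add", add }, { "mul", multiply } };

/* Chosen at runtime: returns NULL when no operation matches. */
binary_op find_op(const char *name) {
    for (size_t i = 0; i < sizeof ops / sizeof ops[0]; i++) {
        if (strcmp(ops[i].name, name) == 0)
            return ops[i].fn;
    }
    return NULL;
}

/* Passed around: apply any operation to the same operands. */
double apply(binary_op op, double a, double b) {
    assert(op != NULL);
    return op(a, b);
}
```

For example, apply(find_op("add"), 5.0, 3.0) calls add(5.0, 3.0), while the same call with "mul" calls multiply instead.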
Here is a quick reference for function pointers.
Enough introduction. Now you need to implement dispatch logic yourself to practice OOP concepts, using only what you already know.
Traditional Approach
You will build a simple message relay in this part that reads messages from a file and processes them based on their type. This milestone will show you the traditional approach to handling different message types using if/else logic.
Later, you will see how function pointers can make this code cleaner and more extensible.
Objective
Write a C program that reads a file line by line and processes different types of messages.
Payload Types:
- Commands - Lines starting with /
  - These are system commands. Commands may continue with arguments; ignore them and just print the command name.
  - Print "Command: <command_name>"
  - Example: /quit prints "Command: quit"
- Direct Messages - Lines starting with @
  - Private messages to a specific user
  - Print "Direct message to <username>: <content>"
  - Example: @alice Hey there! prints "Direct message to alice: Hey there!"
- Group Messages - Lines starting with #
  - Messages to a group channel
  - Print "Group message to <channel>: <content>"
  - Example: #general Hello everyone prints "Group message to general: Hello everyone"
- Global Messages - All other lines
  - Broadcast messages to everyone
  - Print "Global message: <content>"
  - Example: Hello world prints "Global message: Hello world"
Your program should:
- Accept a filename as a command-line argument
- Read the file line by line
- Identify the payload type based on the first character
- Process and print each payload according to its type
- Handle empty lines gracefully (skip them)
- Close the file and exit cleanly
You do not need to write production-grade code. You are allowed to leak memory, skip bounds checks, and so on.
Example
Input file payloads.txt
/login metw SuperSecretP4%%w0rd
Hello, world!
@bob How are you doing?
#announcements Server maintenance tonight
This is a global broadcast
/logout
@alice Thanks for your help
#general See you all later
Expected Output
Command: login
Global message: Hello, world!
Direct message to bob: How are you doing?
Group message to announcements: Server maintenance tonight
Global message: This is a global broadcast
Command: logout
Direct message to alice: Thanks for your help
Group message to general: See you all later
What You will Learn
This exercise demonstrates the traditional approach to handling different types:
- Lots of if/else or switch statements
- Duplicated logic for similar operations
- Hard to extend when adding new payload types
After completing this, you’ll appreciate how function pointers can simplify this pattern!
Implement the logic for this task yourself, and then inspect solution 01. After that, you can start working on the second milestone.
Extending a Non-OOP Project
In Ch01, you built a basic message relay. Now you need to extend it with new features. This exercise will demonstrate how difficult it becomes to maintain and extend code that relies on traditional if/else dispatching.
Objective
Your program needs to handle two new capabilities:
1. Command Arguments: Commands now need to parse and store their arguments:
- Login - /login <username> <password> - Store username and password
- Join - /join <channel> - Store channel name
- Logout - /logout - No arguments
Output format:
Command: login
Arguments: [username: metw, password: SuperSecretP4%%w0rd]
Command: join
Arguments: [channel: general]
Command: logout
Arguments: []
2. Payload Buffering: Instead of printing payloads immediately, you need to
- Read and parse all payloads from the file into a buffer (array/list)
- After reading everything, print (process) payloads one by one from the buffer
- Print “Processing payload X of Y” before each message/command
This simulates how real servers queue payloads before processing. As we do not have any networking capabilities yet, we simply print a payload out to represent how it may be processed (e.g. print formatting).
Example
(using payloads.txt from previous example)
Expected Output:
--- Reading payloads ---
Read 8 payloads
--- Processing payloads ---
Processing payload 1 of 8
Command: login
Arguments: [username: metw, password: SuperSecretP4%%w0rd]
Processing payload 2 of 8
Global message: Hello, world!
Processing payload 3 of 8
Direct message to bob: How are you doing?
...
Requirements
Your program should:
- Create payload and buffer structures:
  - Store the payload type (login / join / logout / direct message / group message / global message)
  - For commands, store parsed arguments
- Do two-pass processing:
  - First pass: Read file and populate buffer. You should allocate a new char * for each string (receiver name, command arguments etc.)
  - Second pass: Process each buffered message, print them out
- Command argument parsing:
  - Extract command and all arguments
  - Store arguments in a union structure
- Maintain compatibility with previous version:
  - Your code should still handle direct messages, group messages, and global messages from Ch01
Hints
- Recap: union
    union payload_data {
        struct { char *username; char *password; } login;
        struct { char *content; } message;
    };
- Recap: enum
    enum payload_kind { COMMAND_LOGIN, COMMAND_LOGOUT /* ... */ };
- Consider using a struct to represent a buffered payload and an enum for its type:
    struct payload {
        enum payload_kind kind;   /* Type discriminator */
        union payload_data data;  /* Type-specific payload data */
    };
- Dynamic allocation (malloc) will be necessary for the buffer
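For the last hint, one possible shape of a growable buffer is sketched below. It is not the reference solution: the placeholder payload, the count and capacity fields, the doubling strategy, and the buffer_push name are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Placeholder: the real struct payload holds a kind and a union. */
struct payload {
    int placeholder;
};

struct payload_buffer {
    struct payload *payloads; /* heap array, grows on demand */
    size_t count;             /* payloads stored so far */
    size_t capacity;          /* slots currently allocated */
};

/* Appends a payload by value; returns 0 on success, -1 if allocation fails. */
int buffer_push(struct payload_buffer *buf, struct payload p) {
    assert(buf != NULL);
    if (buf->count == buf->capacity) {
        size_t new_capacity = buf->capacity ? buf->capacity * 2 : 4;
        struct payload *grown =
            realloc(buf->payloads, new_capacity * sizeof *grown);
        if (grown == NULL)
            return -1; /* old array is still valid and owned by buf */
        buf->payloads = grown;
        buf->capacity = new_capacity;
    }
    buf->payloads[buf->count++] = p;
    return 0;
}

/* Releases the array itself; strings owned by payloads are not handled here. */
void buffer_destroy(struct payload_buffer *buf) {
    free(buf->payloads);
    buf->payloads = NULL;
    buf->count = 0;
    buf->capacity = 0;
}
```

A zero-initialized struct payload_buffer is a valid empty buffer in this sketch, because realloc with a NULL pointer behaves like malloc.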
The problem this reveals is that the traditional approach makes extensions disproportionately harder:
- Ch01: Handle 4 message types -> ~50 lines
- Ch02: Add arguments + buffering + separate processing -> ~300+ lines
Spoiler: Do not inspect the requirements for Ch03 before completing this chapter if you value your mental health!
The agony of extending your code will be unbearable. You will eventually be forced to either perform a complete rewrite to accommodate new features or fundamentally change your design principles to escape the impending misery.
Inspect the solutions for this chapter. The current design becomes increasingly unmaintainable with each new requirement.
- term Static and Dynamic Dispatch
- We called the enum-based approach “traditional dispatch” because it was the familiar starting point, not because it is a standard term. In this material we refer to it as static dispatch: every possible branch is written into the switch and fixed at compile time, and the enum value only selects among those branches. With function pointers, you are using dynamic dispatch: the decision of which function to call is made at runtime based on the function pointer stored in the object, so new behaviors can be added without touching the call site. Both achieve the same goal (executing type-specific behavior), but dynamic dispatch provides more flexibility at the cost of a small runtime overhead (an indirect call).
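The two styles can be put side by side in a minimal sketch. The mini_kind enum, the mini_payload struct, and the label functions are hypothetical and much smaller than the payloads in the exercises.

```c
#include <assert.h>
#include <stddef.h>

enum mini_kind { MINI_COMMAND, MINI_MESSAGE };

/* Enum-based dispatch: every kind must be listed at the call site. */
const char *label_by_switch(enum mini_kind kind) {
    switch (kind) {
    case MINI_COMMAND: return "Command";
    case MINI_MESSAGE: return "Global message";
    }
    return "Unknown";
}

/* Function-pointer dispatch: the object carries its own behavior. */
struct mini_payload {
    const char *(*label)(void);
};

static const char *label_command(void) { return "Command"; }
static const char *label_message(void) { return "Global message"; }

const char *label_by_pointer(const struct mini_payload *p) {
    assert(p != NULL && p->label != NULL);
    return p->label(); /* indirect call, no switch */
}
```

Adding a third kind means editing label_by_switch, whereas label_by_pointer stays unchanged and only a new label function is needed.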
Refactoring with Function Pointers
New Requirements
Multiple receivers: Messages can now target multiple users/channels
- Example: @alice @bob #general Hello everyone!
- A single message goes to alice, bob, AND the general channel
Stop! Before You Continue…
Question: How would you implement these in your Ch02 code?
Think about it for a moment… The painful reality:
- You would need to rewrite the push_payload() parsing logic completely.
- The payload_data union would need a new receiver list structure.
- Multiple switch cases would change to handle arrays instead of single values.
- Processing logic would need loops for each receiver.
- Command parsing would need string splitting on semicolons.
In other words: you would need to touch almost every function, risking bugs and spending hours debugging edge cases! This is the moment to ask: “Is there a better way to organize my code?”
The answer is yes: function pointers.
The Current Problem
Look at your code from Ch02. Notice how every time you want to add a new payload type, you have to modify code in multiple places:
- Add a new enum value to payload_kind
- Add a new struct to the payload_data union
- Add a parsing case in push_payload()
- Add a processing case in process_next()
- Add a cleanup case (if you are properly freeing memory)
This is called tight coupling. Everything is intertwined, so one change ripples through the entire codebase.
Imagine more future requirements. What if we need to add:
- Timestamps for each payload?
- Priority levels (urgent, normal, low)?
- Validation before processing?
- Logging when payloads are processed?
- JSON export of payloads?
- Filtering payloads by type?
With the current design, each feature requires changes in almost all functions. This does not scale! Real applications have dozens or hundreds of types, which requires continuous maintenance and extensions.
The Function Pointer Solution
Instead of using switch statements everywhere, we can store behavior together with data. Recall function pointers.
Core Idea: What if each payload could “know how to process itself”?
// Traditional approach: separate data and behavior
switch (payload->kind) {
case COMMAND_LOGIN:
printf("Command: login\n");
printf(" Arguments: [username: %s, password: %s]\n", ...);
break;
// ... more cases
}
// Function pointer approach: behavior stored with data
payload->process(payload); // Each payload knows how to process itself!
Your Mission
Refactor your Ch02 code to use function pointers instead of switch statements. After that, you will be asked to implement the features in the “New Requirements” section.
New Design Principle “Dynamic Dispatch”
- Remove the gigantic payload_kind enum:
  - No more type enumeration!
  - Each payload type is self-contained.
- Add a function pointer to each payload:
    struct payload {
        void (*process)(const struct payload *self);
        // ... data
    };
  We use the self reference because the process function is a standalone function that exists outside of the struct payload instance. We must explicitly pass the memory address of the struct so the function knows which specific object’s data it should operate on.
- Each payload type sets its own processing function:
  - Login command -> uses process_login() function
  - Direct message -> uses process_direct_message() function
  - … etc.
- Simplify process_next():
    void process_next(struct payload_buffer *buf) {
        struct payload *p = &buf->payloads[buf->process_base];
        p->process(p); // That's it! No switch needed.
        buf->process_base++;
    }
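To see the pieces of this design working together, here is a compact sketch. It makes simplifying assumptions that are not part of the exercise: a fixed-size payloads array, a count field, and a plain text field standing in for the payload_data union.

```c
#include <assert.h>
#include <stdio.h>

struct payload {
    void (*process)(const struct payload *self); /* behavior */
    const char *text; /* simplified stand-in for union payload_data */
};

struct payload_buffer {
    struct payload payloads[8]; /* fixed size to keep the sketch short */
    int count;                  /* number of stored payloads */
    int process_base;           /* index of the next payload to process */
};

static void process_global_message(const struct payload *self) {
    printf("Global message: %s\n", self->text);
}

static void process_command_logout(const struct payload *self) {
    (void)self; /* logout has no arguments */
    printf("Command: logout\nArguments: []\n");
}

void process_next(struct payload_buffer *buf) {
    assert(buf->process_base < buf->count);
    const struct payload *p = &buf->payloads[buf->process_base];
    p->process(p); /* the payload decides what happens */
    buf->process_base++;
}
```

Filling the buffer with one global message and one logout command and calling process_next() twice runs two different functions through the same call site.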
Benefits You will See
- No more repetitive switch statements in processing code.
- Behavior and data are together: Each payload is self-contained.
- Extensibility: Adding new operations (like validate or to_json) is straightforward.
- Separation of Concerns: Adding new types is easier, as it does not require many changes in the process orchestrator. Just implement the business logic in a function and assign it.
What You will Learn
This exercise demonstrates the core principle of object-oriented programming:
“Objects bundle data and behavior together, allowing them to be treated uniformly.”
In C++/Java/Python, this is called polymorphism with virtual methods. In C, we achieve the same thing with function pointers.
Important Notes
- We are keeping this simple, just a few function pointers per payload.
- We are not building a full “virtual table” system yet. You will learn details of that in upcoming milestones.
- Focus on understanding how function pointers enable polymorphic behavior.
- Later exercises will show more sophisticated patterns.
Hints
- Keep your existing payload_data union structure; simplify it by eliminating duplicate logic
- Add a void (*process)(const struct payload *self) function pointer to struct payload
- Create separate process_login(), process_direct_message(), etc. functions
- In push_payload(), after parsing, assign the appropriate function pointer
- Your process_next() becomes much simpler!
Expected Outcome
Your code should:
- Handle all the same payloads as Ch01.
- Have no switch statements in process_next().
- Be easier to extend with new payload types.
- Demonstrate how function pointers enable polymorphism.
After completing this, you will understand why object-oriented languages use virtual methods. They solve the exact same problem you just solved with function pointers!
After completing the exercise on your own, inspect the notes on the refactored solution and on the solution extended with the new requirements for this chapter.
We complained about switch/case being repetitive, but now that we have defined so many functions, isn’t this even more repetitive?
- Is it reasonable to store a pointer for each function inside the struct?
- No type-safety. What prevents conflicting behaviors (e.g., a command’s process function but a message’s destructor)?
- We still had problems in the old approach…
Up Next: A detailed discussion on dynamic dispatch.
Refactored Solution: From Static to Dynamic Dispatch
This directory contains the refactored version of Ch02, demonstrating how function pointers eliminate switch statements and improve code organization.
What Changed from Ch02?
1. Removed the Type Enum
Ch02 (Static Dispatch):
enum payload_kind {
COMMAND_LOGIN, COMMAND_JOIN, COMMAND_LOGOUT,
MESSAGE_DIRECT, MESSAGE_GROUP, MESSAGE_GLOBAL
};
struct payload {
enum payload_kind kind; // Type tag
union payload_data data;
};
Ch03 (Dynamic Dispatch):
struct payload {
void (*process)(const struct payload *self); // Behavior
void (*destroy)(const struct payload *self); // Cleanup
union payload_data data;
};
Impact: No more type enumeration needed. Each payload carries its own behavior through function pointers.
2. Eliminated Switch Statements for Destroy and Process
Ch02:
void process_next(struct payload_buffer *buf) {
struct payload *p = &buf->payloads[buf->process_base];
switch (p->kind) {
case COMMAND_LOGIN:
printf("Command: login\n");
printf(" Arguments: [username: %s, password: %s]\n",
p->data.command_login.username,
p->data.command_login.password);
break;
case COMMAND_JOIN:
// ... more cases
// ... 6 total cases
}
buf->process_base++;
}
Ch03:
void process_next(struct payload_buffer *buf) {
struct payload *p = &buf->payloads[buf->process_base];
p->process(p); // Dynamic dispatch!
buf->process_base++;
}
Impact: From 30 lines with 6 cases to 3 lines. Each payload knows how to process itself.
3. File Organization for Isolation
Ch02 Structure:
- traditional_dispatch.h: Everything (types, buffer, functions)
- traditional_dispatch.c: All logic mixed together
Ch03 Structure:
- payload.h / payload.c: Payload types and behaviors (process/destroy functions)
- dynamic_dispatch.h / dynamic_dispatch.c: Buffer management and orchestration
Impact: Better isolation between concerns. Payload behavior is separated from buffer orchestration. We will elaborate on isolation in later chapters, but here is a quick explanation:
What is Isolation?
Isolation means separating different concerns so changes in one area do not ripple through unrelated areas.
In Ch02:
- Adding a new payload type required changes in multiple functions
- Buffer logic and payload logic were intertwined
- Everything depended on the shared enum
In Ch03:
- Payload behaviors are self-contained (each has its own process/destroy functions)
- Buffer management is isolated in dynamic_dispatch.c
- Adding a new payload type only requires:
  - Create new process/destroy functions in payload.c
  - Modify push_payload() to assign those functions
  - Done! No changes to process_next() or destroy()
Benefits of isolation:
- Easier to understand: Each file has a single, clear purpose
- Safer to modify: Changes don’t accidentally break unrelated code
- Better for teams: Different developers can work on payloads vs. buffer logic without conflicts
Side Quest: Further Isolation
Currently, push_payload() does two things:
- Parses the raw string and constructs a payload
- Adds it to the buffer
Challenge: Extract parsing logic into a separate function:
bool parse_payload(struct payload *out, const char *raw);
This function should:
- Parse the input string
- Assign appropriate function pointers
- Fill in the payload through the out pointer and return whether parsing succeeded
Then push_payload() becomes simpler:
void push_payload(struct payload_buffer *buf, const char *raw) {
struct payload parsed;
bool is_parsing_successful = parse_payload(&parsed, raw); // Construct
// ... add to buffer (if is_parsing_successful)
}
Why? Parsing logic is now isolated from buffer management. You can test payload construction independently, and changing the buffer structure does not affect parsing.
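A possible starting point for this side quest is sketched below. It only covers global messages and rejects everything else, and it uses a simplified struct payload with a single content field instead of the payload_data union; both choices are assumptions made to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct payload {
    void (*process)(const struct payload *self);
    void (*destroy)(const struct payload *self);
    char *content; /* simplified stand-in for union payload_data */
};

static void process_global(const struct payload *self) {
    printf("Global message: %s\n", self->content);
}

static void destroy_global(const struct payload *self) {
    free(self->content);
}

/* Constructs a payload from a raw line without touching any buffer.
 * Returns false for empty lines and for kinds this sketch does not cover. */
bool parse_payload(struct payload *out, const char *raw) {
    assert(out != NULL && raw != NULL);
    if (raw[0] == '\0' || raw[0] == '/' || raw[0] == '@' || raw[0] == '#')
        return false;
    char *copy = malloc(strlen(raw) + 1);
    if (copy == NULL)
        return false;
    strcpy(copy, raw);
    *out = (struct payload){
        .process = process_global,
        .destroy = destroy_global,
        .content = copy,
    };
    return true;
}
```

Because parse_payload() never sees the buffer, it can be exercised directly with a stack-allocated struct payload, which is exactly the independent testing mentioned above.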
Remaining Limitations
Even with function pointers, we still have problems:
- Construction is still static: push_payload() still uses if/else to assign function pointers
- Repetitive function pointers: Each payload struct stores two pointers (process + destroy)
- No type safety: Nothing prevents mixing incompatible function pointers
Next: The extended version implements the new requirements (multiple receivers) to further demonstrate the benefits of dynamic dispatch.
Hold on! Lots of Repetition, Again!
Our code base has once again become impossible to maintain! This solution only implements the first part of the task, multiple receivers. For function batching, try it yourself!
What Changed in This Solution?
This solution extends 01 with support for multiple receivers:
Example Input:
@alice @bob #general Hello everyone!
Output:
Direct message to alice: Hello everyone!
Direct message to bob: Hello everyone!
Group message to general: Hello everyone!
1. Nested Polymorphism
We introduced two levels of polymorphic behavior:
Level 1 - Payloads:
struct payload {
void (*process)(const struct payload *self);
void (*destroy)(const struct payload *self);
union payload_data data;
};
Level 2 - Message Receivers:
struct message_receiving_entity {
void (*transmit_message)(const struct message_receiving_entity *self,
const char *content);
void (*destroy)(const struct message_receiving_entity *self);
char *additional_info;
};
Each receiver is polymorphic! Direct messages, group messages, and global messages all use the same interface but behave differently.
2. Architecture
File separation:
- payload_constructor.c - Parsing and construction
- payload_behaviors.c - Behavioral implementations
- dynamic_dispatch.c - Orchestration (clean!)
This demonstrates composition: payloads contain arrays of polymorphic receivers. We will discuss composition later on.
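The idea of a payload that contains polymorphic receivers can be sketched as follows. The struct message wrapper, its field names, and the returned count are assumptions for illustration; the destroy pointer is omitted and additional_info is not owned here, unlike in the solution.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct message_receiving_entity {
    void (*transmit_message)(const struct message_receiving_entity *self,
                             const char *content);
    const char *additional_info; /* user or channel name */
};

static void transmit_direct_message(const struct message_receiving_entity *self,
                                    const char *content) {
    printf("Direct message to %s: %s\n", self->additional_info, content);
}

static void transmit_group_message(const struct message_receiving_entity *self,
                                   const char *content) {
    printf("Group message to %s: %s\n", self->additional_info, content);
}

/* Hypothetical composite: one content string, many receivers. */
struct message {
    const struct message_receiving_entity *receivers;
    size_t receiver_count;
    const char *content;
};

/* Sends the content to every receiver; returns how many were notified. */
size_t process_message(const struct message *m) {
    assert(m != NULL);
    for (size_t i = 0; i < m->receiver_count; i++)
        m->receivers[i].transmit_message(&m->receivers[i], m->content);
    return m->receiver_count;
}
```

The loop never asks what kind of receiver it is talking to; each element of the array brings its own transmit function.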
The Problem: Too Much Repetition!
Let’s count the repetition we have encountered:
1. Manual Function Pointer Assignment
For every payload type, we manually assign function pointers:
if (strcmp("login", command_name) == 0) {
*p = (struct payload) {
.process = process_command_login, // manual assignment
.destroy = destroy_command_login, // manual assignment
// ...
};
} else if (strcmp("join", command_name) == 0) {
*p = (struct payload) {
.process = process_command_join, // manual assignment again!
.destroy = destroy_command_join, // manual assignment again!
// ...
};
}
// ... repeat for EVERY type
2. Repetitive Function Declarations
For every message receiver type:
void transmit_direct_message(const struct message_receiving_entity *self, ...);
void transmit_group_message(const struct message_receiving_entity *self, ...);
void transmit_global_message(const struct message_receiving_entity *self, ...);
void destroy_group_or_direct_message(const struct message_receiving_entity *self);
void destroy_global_message(const struct message_receiving_entity *self);
Every receiver needs its own function declarations!
3. No Type Safety
Nothing prevents this disaster:
receivers[0] = (struct message_receiving_entity) {
.transmit_message = transmit_direct_message,
.destroy = destroy_global_message // WRONG! Mixing behaviors!
};
The compiler will not complain, but this creates inconsistent objects.
4. Verbose Struct Initialization
Every construction site repeats the pattern:
receivers[receiver_count] = (struct message_receiving_entity) {
.additional_info = receiver_name,
.transmit_message = /* pick one */,
.destroy = /* pick one */
};
Enter: Syntactic Sugar
As mentioned in the introduction, syntactic sugar is a language feature that makes code easier to write without changing what it actually does.
What we have been doing in C is writing OOP patterns by hand:
- Manually assigning function pointers
- Manually managing polymorphic behavior
- Manually ensuring consistency between related functions
What C++ provides is syntactic sugar for these patterns:
- Classes: Group data and functions together automatically
- Constructors: Initialize objects with correct function pointers automatically
- Virtual functions: Polymorphism without manual function pointer management
- Type safety: Compiler ensures you do not mix incompatible behaviors
The Limitation We Have Hit
Function pointers solve the processing problem (no more giant switch statements!), but they create new problems:
- Construction is verbose and error-prone
- No compiler help to ensure consistency
- Every new type requires manual wiring
- Repetitive patterns everywhere
These are not fundamental limitations of OOP; they are limitations of implementing OOP manually in C.
Time to Switch to C++!
You have now experienced why OOP languages exist. They did not invent new concepts - they automated the patterns we’ve been writing by hand.
Next steps:
- Translate this exact solution to C++
- See how classes eliminate the repetition
- Understand that C++ is just syntactic sugar over what you have been doing
The core concepts remain the same. C++ just handles the plumbing for you.
But before switching to C++, we need a detailed discussion of dynamic dispatch.
Questions Arise
In Ch03, we eliminated giant switch statements using function pointers. But we have created new problems and reached the limits of manual polymorphism.
The Paradox of Repetition
We complained about switch/case being repetitive. But look at what we wrote:
Before (Switch Statement):
switch (p->kind) {
case COMMAND_LOGIN: /* ... */ break;
case COMMAND_JOIN: /* ... */ break;
case MESSAGE_DIRECT: /* ... */ break;
// ... 6 cases total
}
After (Function Pointers):
// Declaration repetition
void process_command_login(const struct payload *self);
void process_command_join(const struct payload *self);
void process_command_logout(const struct payload *self);
void process_message(const struct payload *self);
void destroy_command_login(const struct payload *self);
void destroy_command_join(const struct payload *self);
void destroy_command_logout(const struct payload *self);
void destroy_message(const struct payload *self);
// Assignment repetition
if (strcmp("login", command_name) == 0) {
*p = (struct payload) {
.process = process_command_login,
.destroy = destroy_command_login,
// ...
};
} else if (strcmp("join", command_name) == 0) {
*p = (struct payload) {
.process = process_command_join,
.destroy = destroy_command_join,
// ...
};
}
// ... endless if/else
Question: Did we just trade one type of repetition for another?
The Type Safety Problem
Nothing prevents this disaster:
struct payload broken = {
.process = process_command_login, // Command behavior
.destroy = destroy_message, // Message behavior!
};
The compiler won’t complain. At runtime:
- process expects data.command_login.username
- destroy expects data.message.receivers
- Result: Memory corruption, segfault, undefined behavior
Question: How reasonable is it to store multiple function pointers in a struct when there is no mechanism to ensure they are consistent?
The Structural Repetition Problem
Every payload carries two function pointers:
struct payload {
void (*process)(const struct payload *self); // 8 bytes
void (*destroy)(const struct payload *self); // 8 bytes
union payload_data data;
};
For 1000 messages, we store 16,000 bytes of function pointers (assuming 8-byte pointers, as on typical 64-bit platforms). But here is the thing: all messages of the same type have THE SAME function pointers!
payload[0] = { .process = process_message, .destroy = destroy_message, ... };
payload[1] = { .process = process_message, .destroy = destroy_message, ... };
payload[2] = { .process = process_message, .destroy = destroy_message, ... };
// ... all 1000 messages store identical pointers
Question: Why are we duplicating the same pointers in every instance?
Enter: The Virtual Function Table (vtable)
The solution is to share function pointers across instances of the same type.
Concept: Separate Type Information from Instance Data
Before (current approach):
Each payload = [process*, destroy*, data]
^^^^^^^^^^^^^^^ stored per-instance
After (vtable approach):
Each payload = [vtable*, data]
^^^^^^^ single pointer to shared table
vtable_message = [process_message*, destroy_message*]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ shared by ALL messages
C Implementation of vtables:
// Step 1: Define the vtable structure
struct payload; // Forward declaration, so the pointers below refer to the real struct payload
struct payload_vtable {
void (*process)(const struct payload *self);
void (*destroy)(const struct payload *self);
};
// Step 2: Create static vtables (one per type)
const struct payload_vtable vtable_command_login = {
.process = process_command_login,
.destroy = destroy_command_login
};
const struct payload_vtable vtable_message = {
.process = process_message,
.destroy = destroy_message
};
// Step 3: Payload now stores a vtable pointer
struct payload {
const struct payload_vtable *vtable; // Single pointer!
union payload_data data;
};
// Step 4: Usage
payload->vtable->process(payload); // Dynamic dispatch!
// ... same steps for receivers
Using vtables, each instance stores one pointer instead of N. We also get partial type safety, because all functions for a type live together in one vtable.
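The sharing can be checked directly: two instances of the same type point at the same table. The sketch below is a reduced version of the steps above, assuming a single content field instead of the union and two hypothetical helpers, make_message and payload_process.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct payload; /* forward declaration for the vtable signatures */

struct payload_vtable {
    void (*process)(const struct payload *self);
    void (*destroy)(const struct payload *self);
};

struct payload {
    const struct payload_vtable *vtable; /* one pointer per instance */
    const char *content; /* simplified stand-in for union payload_data */
};

static void process_message(const struct payload *self) {
    printf("Global message: %s\n", self->content);
}

static void destroy_message(const struct payload *self) {
    (void)self; /* nothing is owned in this sketch */
}

/* One table per type, shared by every message instance. */
static const struct payload_vtable vtable_message = {
    .process = process_message,
    .destroy = destroy_message,
};

struct payload make_message(const char *content) {
    return (struct payload){ .vtable = &vtable_message, .content = content };
}

void payload_process(const struct payload *p) {
    assert(p != NULL && p->vtable != NULL);
    p->vtable->process(p); /* two loads, then an indirect call */
}
```

Every payload built by make_message() carries the same vtable address, so the per-instance cost stays at one pointer no matter how many operations the table holds.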
Remaining Problems
- Still manual: We still write if/else to assign vtables
- No enforcement: Can still assign wrong vtable to wrong data
- Constructor repetition: Every type needs init code
What Have We Learned?
Function pointers solve the processing problem but create new problems:
| Problem | Solution | New Problem Created |
|---|---|---|
| Giant switch statements | Function pointers in structs | Repetitive declarations |
| Adding new types modifies many places | Each type self-contained | Verbose construction |
| Type/behavior coupling | Dynamic dispatch | No type safety |
| - | - | Memory overhead (partially solved by vtables) |
The pattern: Each solution introduces new complexity.
The Fundamental Limitation
We have been manually implementing OOP patterns in C. This works, but it is verbose and requires lots of boilerplate.
C++ doesn’t invent new concepts. It automates what we have been doing. Recall syntactic sugar!
| What We Did Manually | C++ Equivalent |
|---|---|
| struct payload with function pointers | class |
| Assigning function pointers in constructor | Constructor automatically sets up vtable |
| Our vtable struct | Compiler-generated vtable (hidden) |
| payload->vtable->process(payload) | payload->process() (syntactic sugar) |
| Manual type safety checks | Compiler enforces at compile-time |
Objective
Implement vtables in C. Refactor Ch03 to use vtables. You should:
- Define struct payload_vtable & struct message_receiving_entity_vtable
- Create vtables for each payload & receiver type
- Change struct payload & struct message_receiving_entity to use a single vtable pointer
- Update construction logic to assign the correct vtable
- Change process_next() to call through vtable
After completing this, you will understand exactly what C++ virtual functions
do under the hood.
You have now implemented:
- Polymorphism with enums + switch (Ch01-02)
- Polymorphism with function pointers (Ch03)
- Polymorphism with vtables (Ch04)
We will translate Ch03 directly to C++ and see:
- How class eliminates manual vtable management
- How constructors automate function pointer assignment
- How virtual provides type-safe polymorphism
- How inheritance shares behavior without code duplication
OOP languages exist because manual polymorphism is tedious and error-prone. Every limitation we have hit is a feature C++ automates. You are ready to appreciate what C++ actually does.
A brief detour from polymorphism: Before discussing C++ polymorphism features, we should first cover how resources are managed using RAII.
- term Resource Acquisition is Initialization (RAII)
- Describe a particular language behavior for managing resources. In RAII, holding a resource is a class invariant, and is tied to object lifetime. Resource allocation (or acquisition) is done during object creation (specifically initialization), by the constructor, while resource deallocation (release) is done during object destruction (specifically finalization), by the destructor. In other words, resource acquisition must succeed for initialization to succeed. Thus, the resource is guaranteed to be held between when initialization finishes and finalization starts (holding the resources is a class invariant), and to be held only when the object is alive. Thus, if there are no object leaks, there are no resource leaks. [Wikipedia]
We have been implementing the RAII principle manually:
- parse_payload() constructs a payload object (data + behavior) and allocates its resources.
- Resources are held during the object’s lifetime (until the buffer is destroyed via buffer_destroy).
- Destructor functions deallocate the resources used by payload objects.
In this chapter, we will inspect how C++ syntactic sugar automates these steps.
From Manual Memory Management to RAII
You have built manual polymorphism in C. You have seen how verbose it is to initialize and clean up resources. Now let’s see how C++ automates this with constructors and destructors.
The Problem We Had in C
Initialization:
struct payload *p = malloc(sizeof(struct payload));
assert(p);
p->vtable = &vtable_message;
p->data.message.content = malloc(strlen(raw) + 1);
strcpy(p->data.message.content, raw);
// ... more initialization
Cleanup:
p->vtable->destroy(p); // Must remember to call!
free(p);
What could go wrong?
- Forget to initialize a field - undefined behavior
- Forget to call destroy() - memory leak
- Call destroy() twice - double-free
- Use after free() - use-after-free
Before Switching to C++…
To familiarize yourself with C++ syntax, take a quick look at these links:
Implementing RAII: String Class
Let’s build a RAII class from scratch using char *. Here is a string class
with manual memory management:
#include <cstring>  // strlen, strcpy
#include <iostream> // std::cout
class String {
private:
char *data;
public:
// Constructor: Acquire resource (allocate memory)
String(const char *str) {
data = new char[strlen(str) + 1]; // Allocate
strcpy(data, str);
std::cout << "String created: " << data << "\n";
}
// Destructor: Release resource (free memory)
~String() {
std::cout << "String destroyed: " << data << "\n";
delete[] data; // Free automatically!
}
const char* c_str() const { return data; }
};
Usage
{
String username("alice");
String message("Hello!");
std::cout << "Processing: " << username.c_str() << "\n";
} // Destructors called automatically here!
Output:
String created: alice
String created: Hello!
Processing: alice
String destroyed: Hello! ← Reverse order!
String destroyed: alice ← Reverse order!
What C++ does behind the scenes:
// Equivalent C code:
{
struct String username;
String_constructor(&username, "alice"); // Constructor
struct String message;
String_constructor(&message, "Hello!"); // Constructor
printf("Processing: %s\n", String_c_str(&username));
String_destructor(&message); // Automatic!
String_destructor(&username); // Automatic, reverse order!
}
syntactic sugar: RAII = Constructor allocates, Destructor frees. No manual
free!
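To see why "no manual free" matters, here is a minimal sketch of a function with an early return. The names Tracer, handle_request, and run_raii_demo are made up for this illustration and are not part of the exercises. The destructor runs on every exit path, so there is no cleanup call to forget:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: records its own construction and destruction in a log.
class Tracer {
public:
    Tracer(std::vector<std::string> &log_, const std::string &name_)
        : log(log_), name(name_) {
        log.push_back("acquire " + name);
    }
    ~Tracer() { log.push_back("release " + name); }

private:
    std::vector<std::string> &log;
    std::string name;
};

// Returns early when `fail` is true; the Tracer is released on both paths.
bool handle_request(std::vector<std::string> &log, bool fail) {
    Tracer connection(log, "connection");
    if (fail) {
        return false; // ~Tracer() still runs here
    }
    log.push_back("work done");
    return true;
}

// Runs the failing path, then the successful path, and returns the log.
std::vector<std::string> run_raii_demo() {
    std::vector<std::string> log;
    handle_request(log, true);
    handle_request(log, false);
    assert(log.back() == "release connection");
    return log;
}
```

In the C version, the early return would need its own destroy() call, and every new exit path would be another chance to leak.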
Construction and Destruction Order
Rule 1: Local Variables
Objects are destroyed in reverse order of their construction:
{
String a("first"); // Constructed 1st
String b("second"); // Constructed 2nd
String c("third"); // Constructed 3rd
}
// Destroyed: c (3rd), b (2nd), a (1st) - REVERSE!
Why? Later objects might depend on earlier ones. Reverse order ensures dependencies are destroyed last.
Rule 2: Member Variables
Members are destroyed in reverse declaration order:
class Payload {
private:
String username; // Declared 1st
String content; // Declared 2nd
public:
Payload(const char *user, const char *msg)
: username(user), content(msg) {}
// Destructor automatically destroys: content, then username
};
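syntactic sugar: conceptually, the compiler-generated destructor behaves as if it were written like this (you never call member destructors yourself):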
~Payload() {
content.~String(); // Destroy 2nd-declared member first
username.~String(); // Destroy 1st-declared member last
}
Dependency Chains: RAII Composition
RAII objects can contain other RAII objects. Cleanup happens automatically in the correct order.
Example: Command with Multiple Strings
In upcoming exercises, we will implement commands with inheritance. Since we have not covered inheritance yet, this example does not use it:
class LoginCommand {
private:
String username; // RAII object 1
String password; // RAII object 2
public:
LoginCommand(const char *user, const char *pass)
: username(user), password(pass) {
std::cout << "LoginCommand constructed\n";
}
~LoginCommand() {
std::cout << "LoginCommand destructed\n";
// String destructors called automatically after this!
}
void process() {
std::cout << "Processing: login\n";
// ... business logic for login
}
};
Full Execution:
int main() {
LoginCommand p("alice", "s3cr3t");
p.process();
}
Output:
String created: alice ← username constructor
String created: s3cr3t ← password constructor
LoginCommand constructed
Processing: login
LoginCommand destructed
String destroyed: s3cr3t ← password destructor (reverse!)
String destroyed: alice ← username destructor
syntactic sugar: You never called delete or wrote cleanup code. The
compiler generated the entire destruction chain!
The Dependency Chain in Detail
Construction order (top-to-bottom):
1. username member constructed
2. password member constructed
3. LoginCommand constructor body runs
Destruction order (bottom-to-top, reverse!):
1. LoginCommand destructor body runs
2. password member destroyed
3. username member destroyed
Why reverse? The LoginCommand constructor and destructor bodies may both use
username and password, so the members must be constructed before the
constructor body runs and must stay alive until the destructor body has
finished. Reverse order guarantees this.
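The order above can be checked with a small sketch. Part, Whole, and order_log are hypothetical names used only here; the log records each constructor and destructor as it runs:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Global log used only by this sketch to record the order of events.
static std::vector<std::string> order_log;

class Part {
public:
    explicit Part(const std::string &name_) : name(name_) {
        order_log.push_back("ctor " + name);
    }
    ~Part() { order_log.push_back("dtor " + name); }

private:
    std::string name;
};

class Whole {
public:
    Whole() : first("first"), second("second") {
        order_log.push_back("ctor Whole body");
    }
    ~Whole() { order_log.push_back("dtor Whole body"); }

private:
    Part first;  // declared 1st: constructed 1st, destroyed last
    Part second; // declared 2nd: constructed 2nd, destroyed 1st
};

// Creates and destroys one Whole, then returns the recorded order.
std::vector<std::string> run_order_demo() {
    order_log.clear();
    {
        Whole w;
    } // w leaves scope here
    assert(order_log.size() == 6);
    return order_log;
}
```

The returned log reads: ctor first, ctor second, ctor Whole body, dtor Whole body, dtor second, dtor first.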
Comparison: C vs C++
Here is a simplified version of what we have been doing in C (Manual dependency tracking):
struct login_command {
char *username;
char *password;
};
void login_command_init(struct login_command *cmd,
const char *user, const char *pass) {
cmd->username = malloc(strlen(user) + 1);
strcpy(cmd->username, user);
cmd->password = malloc(strlen(pass) + 1);
strcpy(cmd->password, pass);
}
void login_command_destroy(struct login_command *cmd) {
free(cmd->password); // Manual order!
free(cmd->username); // Manual order!
}
C++ (Automatic dependency tracking):
class LoginCommand {
private:
String username;
String password;
public:
LoginCommand(const char *user, const char *pass)
: username(user), password(pass) {}
// Destructor generated automatically with correct order!
};
What About the Standard Library?
std::string and std::vector are RAII classes, just like our custom
String.
class Message {
private:
std::string content; // RAII, manages char * internally
std::vector<MessageReceivingEntity> receivers; // RAII, manages the list and
// its elements internally
public:
Message(const std::string& str) : content(str) {}
// No destructor written. The compiler-generated one runs:
// 1. std::vector destructor, which calls each MessageReceivingEntity's
// destructor
// 2. std::string destructor
};
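Conceptually, std::string does something like this (real implementations are more elaborate, e.g. they often store short strings inline):
class string {
private:
char *data;
public:
string(const char *str) {
data = new char[strlen(str) + 1];
strcpy(data, str);
}
~string() { delete[] data; }
};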
Same pattern we just built, standard library = collection of well-tested RAII classes.
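As a rough check of that claim, the sketch below counts live elements of a hypothetical Receiver type (standing in for MessageReceivingEntity, whose definition is not shown here). The Message class declares no destructor at all, yet every element is destroyed when the message goes out of scope:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical element type that counts how many instances are alive.
class Receiver {
public:
    explicit Receiver(const std::string &name_) : name(name_) { ++alive; }
    Receiver(const Receiver &other) : name(other.name) { ++alive; }
    ~Receiver() { --alive; }
    static int alive;

private:
    std::string name;
};
int Receiver::alive = 0;

class Message {
public:
    explicit Message(const std::string &content_) : content(content_) {}
    void add_receiver(const std::string &name) {
        receivers.push_back(Receiver(name));
    }
    // No destructor written: the compiler-generated one destroys
    // `receivers` (and every Receiver inside it), then `content`.

private:
    std::string content;
    std::vector<Receiver> receivers;
};

// Returns the number of Receiver objects still alive after the scope ends.
int live_receivers_after_scope() {
    {
        Message m("hello");
        m.add_receiver("alice");
        m.add_receiver("bob");
        assert(Receiver::alive == 2);
    } // m destroyed here; the vector destroys both receivers
    return Receiver::alive;
}
```

The function returns 0: nothing leaked, and no cleanup code was written by hand.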
You have now seen RAII with manual char *. Next:
- Implement your payload classes using the custom String (do not implement all of them, just a few to get used to defining classes)
- See how construction/destruction order prevents bugs by adding print statements to constructors/destructors. (In the provided solution, this is not implemented, to keep the code easier to read.)
- Replace the custom String with std::string (same behavior, more features)
- Understand: The standard library is built on RAII principles
RAII is not a C++ feature or magic; it is a design pattern. C++ makes it the default through automatic constructor/destructor calls. This is syntactic sugar that prevents entire categories of memory bugs.
We will continue our discussion with C++ virtual tables in the next chapter.
Virtual Methods and Inheritance
In Ch04, we discussed vtables and how they solve the problem of storing duplicate function pointers in every object. We also saw that manual vtable assignment is error-prone.
In Ch05, we learned how C++ automates resource management with constructors and destructors.
Now let’s see how C++ automates dynamic dispatch using virtual methods and
code reuse using inheritance.
Recall: Our Manual vtable in C
// Step 1: Define the vtable structure
struct payload_vtable {
void (*process)(const struct payload *self);
void (*destroy)(const struct payload *self);
};
// Step 2: Create static vtables (one per type)
const struct payload_vtable vtable_command_login = {
.process = process_command_login,
.destroy = destroy_command_login
};
const struct payload_vtable vtable_message = {
.process = process_message,
.destroy = destroy_message
};
// Step 3: Payload stores a vtable pointer
struct payload {
const struct payload_vtable *vtable;
union payload_data data;
};
// Step 4: Manual assignment in constructor
void parse_payload(struct payload *p, const char *raw) {
if (raw[0] == '/') {
// ... parsing logic
p->vtable = &vtable_command_login; // Manual!
} else {
p->vtable = &vtable_message; // Manual!
}
}
// Step 5: Dynamic dispatch through vtable
void process_next(struct payload_buffer *buf) {
struct payload *p = &buf->payloads[buf->process_base];
p->vtable->process(p); // Dynamic dispatch
}
Recall: Problems of manual vtables
- Manual vtable assignment. Easy to get wrong!
- No compile-time checks or type safety. Can assign wrong vtable!
- Repetitive vtable creation for every type.
- No shared code between similar types.
C++ Syntactic Sugar: Virtual Methods!
Basic Virtual Method Syntax:
class Payload {
public:
// Virtual method = automatically uses vtable
virtual void process() {
cout << "Processing generic payload" << endl;
}
virtual ~Payload() = default;
};
class Message : public Payload {
public:
// Override base class method
void process() override {
cout << "Processing message" << endl;
}
};
What C++ Does Behind the Scenes:
// Compiler generates vtable automatically
struct Payload_vtable {
void (*process)(Payload *self);
void (*destructor)(Payload *self);
};
const struct Payload_vtable vtable_Message = {
.process = Message_process, // Automatically assigned!
.destructor = Message_destructor
};
struct Payload {
const struct Payload_vtable *vptr; // Hidden! Added by compiler
// ... your data members
};
// Constructor automatically sets vptr
void Message_constructor(struct Message *m) {
m->vptr = &vtable_Message; // Automatic assignment!
}
Unlike manual vtables:
- Compiler generates vtable structs
- Compiler populates vtable entries
- Compiler inserts vptr into every object
- Constructor automatically assigns correct vtable
- override keyword provides compile-time checking
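A compact way to watch the compiler-managed vtable at work is to call through a base-class reference. This is only a sketch: the classes reuse names from this chapter, but here process() returns a string (an assumption made so the result is easy to check), and dispatch and run_dispatch_demo are made-up helpers:

```cpp
#include <cassert>
#include <string>

class Payload {
public:
    virtual std::string process() const { return "generic payload"; }
    virtual ~Payload() = default;
};

class Message : public Payload {
public:
    std::string process() const override { return "message"; }
};

class Command : public Payload {
public:
    std::string process() const override { return "command"; }
};

// The parameter's static type is Payload; the call is resolved at runtime
// from the object's actual (dynamic) type.
std::string dispatch(const Payload &p) {
    return p.process();
}

std::string run_dispatch_demo() {
    Payload generic;
    Message message;
    Command command;
    std::string result =
        dispatch(generic) + "," + dispatch(message) + "," + dispatch(command);
    assert(result == "generic payload,message,command");
    return result;
}
```

Nowhere in this code is a vtable declared, filled in, or assigned; compare that with steps 1 to 4 of the manual C version.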
Inheritance: Sharing Code
The Problem with Repetition
In our C implementation, we had lots of repeated code:
void process_command_login(const struct payload *p) {
// Command-specific processing
printf("Command: login\n");
// ... print arguments
}
void process_command_join(const struct payload *p) {
// Command-specific processing
printf("Command: join\n");
// ... print arguments
}
void process_command_logout(const struct payload *p) {
// Command-specific processing
printf("Command: logout\n");
// No arguments
}
Every command has similar structure but different details.
Inheritance Solution
Base class contains shared behavior:
class Command : public Payload {
public:
Command(const char *command_name_)
: command_name { command_name_ } {}
// Shared implementation
void process() override {
cout << "Command: " << command_name << endl;
process_arguments(); // Call derived class method
}
virtual ~Command() {}
protected:
// Derived classes must implement this
virtual void process_arguments() = 0; // Pure virtual, check end of this
// README for details
// Syntax is not so important. When you are not sure about syntax, Google
// it.
string command_name;
};
Derived classes implement specific behavior:
class LoginCommand : public Command {
public:
LoginCommand(const char *username_, const char *password_)
: Command { "login" }, username { username_ }, password { password_ } {}
private:
// Implement required method
void process_arguments() override {
cout << " Arguments: [username: " << username
<< ", password: " << password << "]" << endl;
}
string username;
string password;
};
class JoinCommand : public Command {
public:
JoinCommand(const char *channel_)
: Command { "join" }, channel { channel_ } {}
private:
void process_arguments() override {
cout << " Arguments: [channel: " << channel << "]" << endl;
}
string channel;
};
class LogoutCommand : public Command {
public:
LogoutCommand() : Command { "logout" } {}
private:
void process_arguments() override {
cout << " Arguments: []" << endl;
}
};
// Polymorphism: store different types in same pointer type
Payload *payloads[3];
payloads[0] = new LoginCommand { "alice", "pass123" };
payloads[1] = new JoinCommand { "general" };
payloads[2] = new LogoutCommand {};
// Dynamic dispatch - each calls correct method
for (int i = 0; i < 3; i++) {
payloads[i]->process(); // Calls through vtable!
}
// Cleanup
for (int i = 0; i < 3; i++) {
delete payloads[i]; // Virtual destructor ensures correct cleanup!
}
Output:
Command: login
Arguments: [username: alice, password: pass123]
Command: join
Arguments: [channel: general]
Command: logout
Arguments: []
Syntax Reference
- Inheritance
class Base {
// Shared functionality
};
class Derived : public Base {
// Specific functionality + inherits from Base
};
- Virtual Functions
virtual return_type method_name(); // Can be overridden
- Pure Virtual Functions (Abstract Classes)
virtual void method_name() = 0; // MUST be overridden
- Override Keyword
void process() override; // Compile error if not overriding
- Virtual Destructor
virtual ~ClassName() {} // Critical for polymorphic classes!
Why critical? Without virtual destructor:
Payload *p = new Message { ... };
delete p; // Undefined behavior! In practice only ~Payload() runs, leaking Message members
With virtual destructor:
Payload *p = new Message { ... };
delete p; // Calls ~Message() then ~Payload() - correct order!
- Access Specifiers
class Example {
public:
// Accessible from anywhere
protected:
// Accessible from this class and derived classes
private:
// Accessible only from this class
};
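The virtual destructor entry in the reference above can be verified with a short sketch. Base, Derived, and dtor_log are placeholder names for this illustration; the log shows which destructors run when an object is deleted through a base pointer:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Log used only by this sketch to record which destructors ran.
static std::vector<std::string> dtor_log;

class Base {
public:
    virtual ~Base() { dtor_log.push_back("~Base"); }
};

class Derived : public Base {
public:
    ~Derived() override { dtor_log.push_back("~Derived"); }
};

// Deletes a Derived through a Base pointer and returns the recorded order.
std::vector<std::string> run_virtual_dtor_demo() {
    dtor_log.clear();
    Base *p = new Derived;
    delete p; // virtual destructor: ~Derived() runs first, then ~Base()
    assert(dtor_log.size() == 2);
    return dtor_log;
}
```

Removing the virtual keyword from ~Base() would make the delete undefined behavior, so the sketch deliberately does not demonstrate that variant.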
You are now familiar with how C++ automates virtual table construction. Next:
- Implement command hierarchy. Create base Command class and derived classes: LoginCommand, JoinCommand, and LogoutCommand.
- Implement message hierarchy. Create base Message class with: DirectMessage, GroupMessage, and GlobalMessage. Use inheritance to share message content storage and common processing logic.
- Create polymorphic buffer that stores Payload * pointers (base class pointers) and can hold any derived type (commands or messages).
- Test virtual destruction by adding print statements in destructors to verify that the derived destructor runs first and the base destructor runs second. (This task is not implemented in the provided solution, to keep the code easier to read.)
virtual is syntactic sugar for the vtable pattern you built manually.
Inheritance is syntactic sugar for sharing vtable entries and data. C++ did not
invent polymorphism, it automated the plumbing.
In the next chapter, we will continue our journey with references and copy management.
Pure Virtual Methods
In our Payload, Command, and Message examples, we used a specific syntax:
virtual void <method_name>() = 0;. This is called a pure virtual method.
A pure virtual method is a declaration that a function must exist, but the base class provides no implementation for it. It acts as a mandatory contract: any class that inherits from Payload or Command is required to implement this method to be considered complete.
When a class contains at least one pure virtual method, it becomes an Abstract Class.
- No Instantiation: You cannot create an object of an abstract class (e.g., new Command("login") will result in a compiler error).
- Incomplete Blueprint: The class exists solely to provide a common interface and shared data for its children.
Under the Hood: The C Parallel
In your manual C vtable implementation, a pure virtual method is like defining
a function pointer in your vtable struct but purposefully leaving it
unassigned in the base “type”.
In C: If you accidentally called a NULL function pointer, your program would
crash at runtime.
In C++: The compiler prevents this crash by refusing to compile any code that tries to create an object that hasn’t fulfilled its pure virtual requirements.
As you work through Ch06, notice that if you forget to implement
process_arguments() in classes that inherit Command, or process_recipient()
in classes that inherit Message, your code will not compile. This is the
“syntactic sugar” of compile-time safety replacing manual runtime checks.
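The compile-time contract can be made visible without triggering a compile error by asking the compiler directly, using std::is_abstract from the standard type_traits header. The classes below are a simplified sketch: arguments() is a stand-in for process_arguments() that returns a string, and run_abstract_demo is a made-up helper:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

class Command {
public:
    virtual ~Command() = default;
    std::string process() const { return "Command: " + arguments(); }

protected:
    virtual std::string arguments() const = 0; // pure virtual
};

class LogoutCommand : public Command {
protected:
    std::string arguments() const override { return "[]"; }
};

// Checked by the compiler, not at runtime.
static_assert(std::is_abstract<Command>::value,
              "Command has an unimplemented pure virtual method");
static_assert(!std::is_abstract<LogoutCommand>::value,
              "LogoutCommand fulfils the contract");

std::string run_abstract_demo() {
    // Command c;  // would not compile: Command is abstract
    LogoutCommand logout;
    const Command &as_base = logout;
    std::string out = as_base.process();
    assert(out == "Command: []");
    return out;
}
```

If LogoutCommand forgot to implement arguments(), the second static_assert and the `LogoutCommand logout;` line would both be rejected by the compiler.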
References and Copy Management
In Ch05, we built a custom String class with RAII. In Ch06, we used it with
polymorphism. But there is a hidden danger: the default copy behavior can cause
double-free crashes.
This chapter covers two essential topics:
- References to avoid expensive copies
- Copy control for making RAII classes safe to copy
The Hidden Bug in Our Code
Recall the custom String class from Ch05:
class String {
public:
String(const char *data_) {
data = new char[strlen(data_) + 1];
strcpy(data, data_);
}
~String() { delete[] data; }
const char* c_str() const { return data; }
private:
char *data;
};
This code has a critical bug: The Double-Free Disaster
void print_string(String s) { // !bug: passes by value (creates copy)
cout << s.c_str() << endl;
} // s destroyed here, delete[] data called
int main() {
String name { "alice" };
print_string(name); // Creates copy with same data pointer
// name destroyed here
// delete[] data called again on the same pointer. Double-free, CRASH!
}
Step-by-step:
1. name created, allocates memory for “alice”
2. print_string(name) copies name:
   - Default copy constructor copies data pointer
   - Both name and s point to same memory
3. s destroyed at end of function: deletes memory
4. name destroyed at end of main: deletes already-freed memory
This is the shallow copy problem: copying the pointer, not the data.
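The sharing itself can be observed safely if the struct has no destructor, so nothing is freed twice. ShallowString and copy_shares_buffer are hypothetical names for this sketch only:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical struct with a raw pointer and NO destructor, so that the
// sharing can be observed without triggering a double-free.
struct ShallowString {
    char *data;
};

// Returns true when the default copy shares the original's buffer.
bool copy_shares_buffer() {
    ShallowString original;
    original.data = new char[6];
    std::strcpy(original.data, "alice");

    ShallowString copy = original; // default copy: copies the pointer only

    bool same_buffer = (copy.data == original.data);
    copy.data[0] = 'A'; // writing through the copy...
    // ...changes what the original sees as well
    bool original_changed = (std::strcmp(original.data, "Alice") == 0);

    delete[] original.data; // freed exactly once, by hand
    assert(same_buffer && original_changed);
    return same_buffer && original_changed;
}
```

Add a destructor that calls delete[] to ShallowString and this exact code would free the same buffer twice, which is the bug described above.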
Solution 1: References - Stop Copying
Instead of creating a copy, pass a reference (alias) to the original:
void print_string(String &s) { // Reference, not copy
cout << s.c_str() << endl;
} // Nothing destroyed, just alias goes away
int main() {
String name { "alice" };
print_string(name); // No copy created!
} // name destroyed once, everything is fine
String &s means “s is just another name for the original String”: no copy is
made, so no extra destructor runs. To also promise not to modify the original,
write const String &s.
Reference Syntax
int x = 42;
int &ref = x; // ref is an alias for x
ref = 100; // x is now 100
cout << x; // Prints 100
cout << ref; // Prints 100 (same variable!)
References vs Pointers
| Feature | Pointer | Reference |
|---|---|---|
| Can be null? | Yes (nullptr) | No (must refer to something) |
| Can change target? | Yes | No (always refers to same object) |
| Syntax | *ptr, ptr->member | Just use like variable, no field dereference operator (->) |
| Use case | Optional parameters, arrays | Function parameters (avoid copies) |
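The rows of the table can be exercised in a few lines. This is a sketch with made-up function names (increment_ref, increment_ptr, run_reference_demo):

```cpp
#include <cassert>

// Takes a reference: the caller's variable is modified directly.
void increment_ref(int &value) { value += 1; }

// Takes a pointer: it may be null, so it has to be checked.
void increment_ptr(int *value) {
    if (value != nullptr) {
        *value += 1;
    }
}

int run_reference_demo() {
    int x = 40;
    int &alias = x;       // alias is another name for x
    assert(&alias == &x); // same address, same object

    increment_ref(x);       // x is now 41
    increment_ptr(&x);      // x is now 42
    increment_ptr(nullptr); // legal for a pointer; ignored by the null check
    int after_calls = x;

    int y = 7;
    alias = y; // does not rebind alias; assigns 7 to x
    assert(x == 7 && &alias == &x);
    return after_calls;
}
```

Note the last step: assigning to a reference never changes what it refers to, it changes the referred-to object.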
When to Use References
Pass by const reference: Default for objects:
void process(const Message &msg); // Read-only, no copy
void send(const String &content);
Pass by reference, only when you need to modify:
void update_content(Message &msg, const String &newContent) {
msg.set_content(newContent); // Modifies original
}
Pass by value, small types only:
void set_port(int port); // int is cheap to copy
void set_flag(bool enabled); // bool is cheap to copy
Rule of thumb: If it has a destructor (manages resources), use &.
const Correctness
Mark methods that do not modify the object as const:
class String {
public:
const char *c_str() const { return data; } // Does not modify
void clear() { delete[] data; data = nullptr; } // Not const, modifies
};
Why? const methods can be called on const references:
void print_string(const String &s) {
cout << s.c_str(); // OK, c_str() is const
s.clear(); // ERROR: clear() is not const
}
Exercise 01: Fix the References
Refactor this code to use references:
// Current code (inefficient and buggy):
void display_message(GlobalMessage msg) { // Copies entire message!
cout << "Message: " << msg << endl;
}
// Copies both!
void append_content(GlobalMessage msg, /* string from std */ string suffix) {
msg.setContent(msg + suffix); // Modifies copy, not original!
// std::string overloaded + operator for string concatenation.
// We will discuss operator overloading later on.
}
int main() {
GlobalMessage m { "Hello" };
display_message(m); // Wasteful copy
append_content(m, "!"); // Does not modify m!
display_message(m); // Still prints "Hello", not "Hello!"
}
Tasks:
- Fix display_message to avoid copying
- Fix append_content to modify the original message
- Add const to methods that don’t modify objects
Solution 2: Deep Copy - Rule of Three
References solve the “passing to functions” problem. But what about this?
String name1 { "alice" };
String name2 = name1; // What should happen here?
Two options:
Option 1: Prevent Copying
class String {
public:
// Delete copy operations:
String(const String &) = delete;
String &operator=(const String &) = delete;
// Other members...
};
String name1 { "alice" };
String name2 = name1; // COMPILE ERROR: copy deleted
Use when: The resource cannot be meaningfully shared (e.g. file handles, synchronization primitives).
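Deleted copy operations are checked entirely at compile time, which the standard type traits can confirm. Handle, read_fd, and run_noncopyable_demo are hypothetical names; the integer stands in for a real file descriptor and no file is opened:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical wrapper around a file-descriptor-like handle.
class Handle {
public:
    explicit Handle(int fd_) : fd(fd_) {}
    Handle(const Handle &) = delete;
    Handle &operator=(const Handle &) = delete;
    int get() const { return fd; }

private:
    int fd;
};

static_assert(!std::is_copy_constructible<Handle>::value,
              "Handle must not be copy-constructible");
static_assert(!std::is_copy_assignable<Handle>::value,
              "Handle must not be copy-assignable");

// Non-copyable objects can still be passed around by reference.
int read_fd(const Handle &h) { return h.get(); }

int run_noncopyable_demo() {
    Handle h(3);
    // Handle copy = h;  // would not compile: copy constructor is deleted
    int fd = read_fd(h);
    assert(fd == 3);
    return fd;
}
```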
Option 2: Deep Copy
class String {
public:
// Copy constructor: creates independent copy
String(const String &other) {
data = new char[strlen(other.data) + 1];
strcpy(data, other.data); // Copy the DATA, not the pointer
cout << "String copied: " << data << "\n";
}
// Copy assignment: replaces content with copy
String &operator=(const String &other) {
if (this != &other) { // Self-assignment check
delete[] data; // Free old data
data = new char[strlen(other.data) + 1];
strcpy(data, other.data); // Copy new data
}
return *this;
}
// Destructor (already had this)
~String() {
delete[] data;
}
private:
char *data;
};
Now copying is safe:
String *name1 = new String { "alice" };
String name2 = *name1; // Deep copy, independent memory
delete name1; // Destroys name1 and frees its memory
// name2 still valid, has its own copy!
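Here is a self-contained sketch of the deep-copying String that checks the independence of the copies. It differs from the class above in one detail, which is a choice made for this sketch: copy assignment allocates the new buffer before freeing the old one, so the object keeps its old content if the allocation fails. run_deep_copy_demo is a made-up helper:

```cpp
#include <cassert>
#include <cstring>

class String {
public:
    String(const char *str) : data(new char[std::strlen(str) + 1]) {
        std::strcpy(data, str);
    }
    // Copy constructor: allocate a separate buffer and copy the characters.
    String(const String &other)
        : data(new char[std::strlen(other.data) + 1]) {
        std::strcpy(data, other.data);
    }
    // Copy assignment: build the new buffer first, then release the old one.
    String &operator=(const String &other) {
        if (this != &other) {
            char *fresh = new char[std::strlen(other.data) + 1];
            std::strcpy(fresh, other.data);
            delete[] data;
            data = fresh;
        }
        return *this;
    }
    ~String() { delete[] data; }
    const char *c_str() const { return data; }

private:
    char *data;
};

// Returns true when copies have equal content but separate buffers.
bool run_deep_copy_demo() {
    String a("alice");
    String b = a; // copy constructor
    String c("temp");
    c = a;        // copy assignment
    String &same = c;
    c = same;     // self-assignment: must leave c intact

    bool independent = a.c_str() != b.c_str() && a.c_str() != c.c_str();
    bool equal = std::strcmp(a.c_str(), b.c_str()) == 0 &&
                 std::strcmp(a.c_str(), c.c_str()) == 0;
    assert(independent && equal);
    return independent && equal;
}
```

All three objects are destroyed at the end of the function, and each frees only its own buffer.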
The Rule of Three
If you define any of these, you should define all three:
- Destructor ~T()
- Copy constructor T(const T&)
- Copy assignment T& operator=(const T&)
Why? If your class needs a custom destructor (manages resources), the default copy operations are almost certainly wrong.
Shallow copy (default, WRONG for our String):
Before copy:
name1 -> [data ptr] -> "alice"
After: String name2 = name1;
name1 -> [data ptr] --
|-> "alice" SHARED! BAD!
name2 -> [data ptr] --
After name1 destroyed:
name2 -> [data ptr] -> <freed memory> CRASH!
Deep copy (correct):
Before copy:
name1 -> [data ptr] -> "alice"
After: String name2 = name1;
name1 -> [data ptr] -> "alice"
name2 -> [data ptr] -> "alice" Independent copy, GOOD!
After name1 destroyed:
name2 -> [data ptr] -> "alice" Still valid!
The Rule of Five (C++11)
Modern C++ adds move operations:
- Move constructor T(T &&)
- Move assignment T &operator=(T &&)
For now, we will stick with rule of three. Move semantics is an optimization (transfer instead of copy), but not essential for correctness. Here is a quick reference for rule of five, if you are into it.
The Rule of Zero
Best practice: Avoid manual resource management entirely.
// Instead of custom String with rule of three:
class Message {
public:
Message(const char *content_);
Message(const Message&); // Copy constructor
Message& operator=(const Message&); // Copy assignment
~Message(); // Destructor
private:
char *content; // Manual memory. Rule of three needed
};
// Use std::string (already implements rule of three):
class Message {
public:
Message(const char *content_) : content { content_ } {}
// No custom destructor/copy needed! Compiler defaults work!
private:
string content; // RAII type, handles copying automatically
};
Rule of Zero: If all your members are RAII types (std::string, std::vector,
std::unique_ptr), you don’t need custom copy/destructor.
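A quick way to convince yourself is to copy such a class and modify only the copy. The Message below is a sketch with assumed helper methods (append, add_receiver, get_content, receiver_count) that are not part of the exercises:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class Message {
public:
    explicit Message(const std::string &content_) : content(content_) {}
    void append(const std::string &suffix) { content += suffix; }
    void add_receiver(const std::string &name) { receivers.push_back(name); }
    const std::string &get_content() const { return content; }
    std::size_t receiver_count() const { return receivers.size(); }
    // No destructor, copy constructor, or copy assignment declared.

private:
    std::string content;
    std::vector<std::string> receivers;
};

// Returns true when modifying the copy leaves the original untouched.
bool run_rule_of_zero_demo() {
    Message original("Hello");
    original.add_receiver("alice");

    Message copy = original; // compiler-generated memberwise copy
    copy.append("!");
    copy.add_receiver("bob");

    bool ok = original.get_content() == "Hello" &&
              copy.get_content() == "Hello!" &&
              original.receiver_count() == 1 &&
              copy.receiver_count() == 2;
    assert(ok);
    return ok;
}
```

The generated copy constructor copies each member using that member's own copy constructor, and std::string and std::vector both copy deeply.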
Exercise 02: Implement Rule of Three
class Buffer {
public:
Buffer(size_t size_) : size { size_ } {
data = new char[size];
memset(data, 0, size);
}
~Buffer() {
delete[] data;
}
// TODO: Implement copy constructor
// TODO: Implement copy assignment
char *get_data() { return data; }
size_t get_size() const { return size; }
private:
char *data;
size_t size;
};
// This should work after you implement copy operations:
void test_buffer() {
Buffer buf1 { 100 };
buf1.get_data()[0] = 'X';
Buffer buf2 = buf1; // Copy constructor
buf2.get_data()[0] = 'Y';
cout << buf1.get_data()[0]; // Should print 'X' (not 'Y'!)
cout << buf2.get_data()[0]; // Should print 'Y'
}
Tasks:
- Implement copy constructor that performs deep copy
- Implement copy assignment operator (don’t forget self-assignment check!)
- Verify no memory leaks with valgrind
Bonus: Rewrite using std::vector<char> (Rule of Zero).
For references, avoid expensive copies by passing const T &. Use T & only
when you need to modify. Small types (int, bool) can still pass by value.
Rule of Three:
- If you write a destructor, write copy constructor and copy assignment
- Prevents shallow copy bugs (double-free)
- Implement deep copy: allocate new memory, copy data
Rule of Zero:
- Prefer RAII types (std::string, std::vector)
- Let compiler generate correct copy operations (syntactic sugar!)
- Only write rule of three when you must manage raw resources
Pattern Reference
// 1. Read-only parameter:
void process(const Message &msg);
// 2. Modifiable parameter:
void update(Message &msg);
// 3. RAII class that owns memory:
class MyClass {
public:
MyClass(const MyClass &); // Deep copy
MyClass &operator=(const MyClass &); // Deep copy
~MyClass(); // Free memory
private:
char *data;
};
// 4. RAII class that CANNOT be copied
class Socket {
public:
Socket(const Socket &) = delete;
Socket &operator=(const Socket &) = delete;
~Socket();
private:
int fd;
};
// 5. Rule of Zero (preferred):
class Message {
std::string content; // Handles its own copying
// No custom copy/destructor needed!
};
These are fundamental C++ idioms you will use in every project. Combined with RAII (Ch05) and polymorphism (Ch06), you can now begin writing safe, modern C++ code.
Future topics (you will learn during your hands-on project):
- Move semantics (optimization for transfers)
- Smart pointers (std::unique_ptr, std::shared_ptr)
- Templates and generic programming
- Exception safety
But with references and proper copy control, you have the foundation for safe C++ programming! In the next chapter, we will reinforce what we have learned so far with a more theoretical explanation, and then we will begin developing practical applications.
The Four Pillars of OOP
You have spent the previous chapters manually building the mechanics of object-oriented languages using C. You have seen that modern languages do not invent new logic; they simply automate the tedious plumbing of pointers and memory management. This chapter formally defines the four conceptual pillars that support every OOP design.
Before diving into practical applications, let’s consolidate what you have learned by mapping your manual implementations to the standard OOP terminology.
Encapsulation
- term Encapsulation
- The bundling of data and the functions that operate on that data into a single unit (a class or struct), while restricting direct access to some of the object’s components.
You implemented encapsulation throughout this module, even before switching to C++.
In C (Ch03): You bundled data and behavior by storing function pointers alongside data in the same struct:
struct payload {
void (*process)(const struct payload *self);
void (*destroy)(const struct payload *self);
union payload_data data;
};
The payload struct encapsulated both state (data) and behavior (process,
destroy). This prevented scattering related logic across multiple files.
In C++ (Ch05): You formalized encapsulation with access specifiers:
class String {
public:
String(const char *str); // Controlled interface
~String();
const char *c_str() const;
private:
char *data; // Hidden from external access
};
By marking data as private, you enforced a critical rule: only the String
class itself can manage its memory. External code cannot accidentally corrupt
the pointer, preventing entire categories of bugs.
Why? Without encapsulation, your char *data would be public, and any code
could call free(data) or reassign the pointer. Encapsulation creates a
protective barrier around your resources.
Abstraction
- term Abstraction
- The process of hiding implementation details and exposing only a simple, high-level interface to the user.
While encapsulation bundles data with behavior, abstraction focuses on what the user sees versus what the object does internally. With abstraction, we interact with ideas rather than with implementations.
In C (Ch03): Your process_next() function demonstrated abstraction:
void process_next(struct payload_buffer *buf) {
struct payload *p = &buf->payloads[buf->process_base];
p->process(p); // Abstract interface
buf->process_base++;
}
The caller of process_next() does not know or care whether the payload is a
login command or a direct message. The function pointer process abstracts
away the implementation details. The interface is simple: “process this
payload.”
In C++ (Ch06): Virtual methods provide the same abstraction with cleaner syntax:
for (Payload *p : payloads) {
p->process(); // Abstract interface
}
Each payload knows how to process itself, but the calling code is abstracted from those details. You can add new payload types without changing the loop.
The key difference:
- Encapsulation answers: “How do I protect my data?”
- Abstraction answers: “How do I hide my complexity?”
Your String class encapsulates its char *data while abstracting the memory
management details behind c_str() and the destructor.
Inheritance
- term Inheritance
- A mechanism where a class (derived/child) acquires properties and behaviors from another class (base/parent), enabling code reuse and establishing “Is-A” relationships.
In C (Ch03): Inheritance was manual and error-prone. You had to duplicate the vtable structure across every payload type or use complex casting tricks to simulate base types.
In C++ (Ch06): Inheritance became declarative, i.e. we are no longer required to implement the underlying polymorphism manually:
class Command : public Payload {
public:
void process() override {
cout << "Command: " << command_name << endl;
process_arguments(); // implemented in derived classes
}
virtual void process_arguments() = 0;
protected:
string command_name;
};
class LoginCommand : public Command {
// Inherits: command_name, process()
// Implements: process_arguments()
};
LoginCommand is a Command, which is a Payload. This hierarchy
eliminates repetition: all commands share the command_name field and the
common process() logic.
Type substitution: Because LoginCommand inherits from Payload, you can
store it in a Payload * pointer:
Payload *p = new LoginCommand("alice", "secret");
p->process(); // Runs Command::process(), which calls LoginCommand::process_arguments()
This is the “Is-A” relationship in action.
Composition
- term Composition
- A design principle where complex objects are built by combining simpler ones, establishing “Has-A” relationships.
While inheritance models “Is-A,” composition models “Has-A.”
Example from Ch05 (RAII):
class LoginCommand {
private:
String username; // LoginCommand *has a* String
String password; // LoginCommand *has a* String
};
LoginCommand is composed of String objects. It does not inherit from
String; it contains them.
Composition in practice (Ch03):
struct payload_buffer {
struct payload *payloads; // Buffer *has* payloads
size_t capacity;
size_t size;
};
The buffer has payloads and has capacity tracking. These are distinct concerns managed through composition.
Guideline: Prefer composition over inheritance when:
- The relationship is “Has-A” rather than “Is-A”
- You want to avoid tight coupling between classes
- You need flexibility to change parts independently
Polymorphism
- term Polymorphism
- The ability to treat different types uniformly through a common interface while maintaining their unique behaviors.
You implemented polymorphism in three evolutionary stages:
1. Static Dispatch (Ch01-02): All possible targets are fixed at compile-time
in a switch over enum tags:
switch (payload.kind) {
case COMMAND_LOGIN:
process_login();
break;
case MESSAGE_DIRECT:
process_direct_message();
break;
}
Every code path is hard-coded in the switch. There is no runtime extensibility: adding a new type requires modifying the switch statement and recompiling.
2. Manual Dynamic Dispatch (Ch03-04): Runtime decision using function pointers:
struct payload {
void (*process)(const struct payload *self);
};
p->process(p); // Calls different function depending on runtime type
You manually assigned function pointers during construction. This was flexible but error-prone: nothing stopped you from assigning the wrong function pointer.
3. Automatic Dynamic Dispatch (Ch06): C++ virtual keyword automates
vtable management:
class Payload {
public:
virtual void process() = 0;
};
class LoginCommand : public Payload {
public:
void process() override { /* ... */ }
};
Payload *p = new LoginCommand(...);
p->process(); // Compiler generates vtable lookup
The compiler guarantees the correct function is called. Vtable assignment
happens automatically in constructors. The override keyword catches typos at
compile-time.
The progression: From explicit type checks (Ch01) -> manual vtables (Ch03) -> automated vtables (Ch06). Same underlying mechanism, progressively less manual work.
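To make "same underlying mechanism" concrete, the sketch below puts stage 2 and stage 3 side by side in one file. All names (ManualPayload, process_login, Login, run_stages_demo) are invented for this comparison, and process returns a string so the two results can be compared:

```cpp
#include <cassert>
#include <string>

// Stage 2: manual dispatch with a function pointer stored in the object.
struct ManualPayload {
    const char *(*process)(const ManualPayload *self);
};
const char *process_login(const ManualPayload *) { return "login"; }
const char *process_message(const ManualPayload *) { return "message"; }

// Stage 3: the same dispatch, with the compiler managing the table.
class Payload {
public:
    virtual const char *process() const = 0;
    virtual ~Payload() = default;
};
class Login : public Payload {
public:
    const char *process() const override { return "login"; }
};
class Message : public Payload {
public:
    const char *process() const override { return "message"; }
};

// Calls each payload both ways and records "manual=automatic;" per payload.
std::string run_stages_demo() {
    ManualPayload manual[2] = { { process_login }, { process_message } };
    Login login;
    Message message;
    const Payload *automatic[2] = { &login, &message };

    std::string out;
    for (int i = 0; i < 2; i++) {
        out += manual[i].process(&manual[i]); // hand-written indirection
        out += "=";
        out += automatic[i]->process();       // compiler-generated indirection
        out += ";";
    }
    assert(out == "login=login;message=message;");
    return out;
}
```

Both halves produce the same answers; the difference is who is responsible for wiring the function pointers.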
Bringing It All Together
These four pillars are not independent features. They work together to enable extensible designs.
Example: Your Chapter 06 command hierarchy
- Encapsulation: Command class hides command_name as protected
- Abstraction: process() provides a simple interface that hides the argument parsing complexity
- Inheritance: LoginCommand inherits common structure from Command
- Polymorphism: A Payload * can point to any command, and process() calls the correct implementation
From manual to automatic:
| Pillar | Manual (C) | Automatic (C++) |
|---|---|---|
| Encapsulation | Struct grouping | private/protected/public |
| Abstraction | Function pointers | Virtual methods |
| Inheritance | Manual struct layout | class Derived : public Base |
| Polymorphism | Manual vtable assignment | Compiler-generated vtables |
What You Have Accomplished
You did not just learn C++ syntax. You built the fundamental patterns that all object-oriented languages implement. These patterns exist in Java, Python, C#, and every other OOP language. The syntax changes, but the underlying mechanics, the mechanics you implemented by hand, remain the same.
With these theoretical foundations solidified, you are ready to transition from “how objects work” to “how objects solve problems.” In the next part, we will apply these pillars to build a robust, real-world networking application.
You now understand what your compiler is doing behind the scenes. Let’s build something real.