Memory and Resource Management

Keywords

memory management, garbage collection, raii, ownership, borrowing, smart pointers, reference counting, stack, heap, lifetimes, deterministic destruction, resource management, drop, defer

Introduction

The pager went off at 02:14 with a single line: p99 checkout latency 1.8s, SLA 200ms. The service was a JVM trading engine that did almost nothing per request — look up a price, validate a quantity, write a record — and it did it in two milliseconds, on average. The average was a lie. Once or twice a minute a request would freeze for over a second while every thread in the process stopped dead, did no work, returned no responses, and then resumed as if nothing had happened. The threads weren’t blocked on a lock or a slow query. They were blocked on the garbage collector, which had decided that this was the moment to walk the heap and reclaim the millions of short-lived objects the request path had been quietly allocating. Nobody had written a line of code that said “pause here.” The pause was a property of how the language reclaims memory, and it arrived on the collector’s schedule, not the application’s.

A week earlier and a building away, a C++ team had the opposite problem from the same root. Their pricing daemon never paused — it had no collector to pause it — but it crashed once a day with a corrupted heap, because somewhere in a hundred thousand lines a Session was being deleted twice: one code path freed it when a socket closed, another freed it again when its work finished, and neither knew about the other. Two teams, two incidents, one question underneath both: who is responsible for freeing this memory, and when does it happen? The JVM team had handed that question to a runtime that answered it safely but at an unpredictable moment. The C++ team had kept the question for themselves and gotten the answer wrong. Every memory-management strategy ever designed is an answer to that one question, and each answer trades away something to buy something else.

The Core Insight

A program’s data lives in two places. The stack is a scratchpad that grows and shrinks with function calls: allocating is one instruction (bump a pointer), freeing is automatic and free (the pointer un-bumps when the function returns), and the lifetime of everything on it is dictated rigidly by scope. The heap is the open warehouse for everything whose lifetime doesn’t fit that discipline — data that must outlive the function that made it, data whose size isn’t known until runtime, data shared between parts of the program that come and go on different schedules. The stack manages itself. The heap is where the hard question lives, because something has to decide when each heap object is no longer needed and reclaim it. Reclaim too early and a live pointer dangles into freed memory (use-after-free). Reclaim twice and you corrupt the allocator (double-free). Never reclaim and you leak until the process dies.

There are exactly four strategies for answering “who frees the heap, and when,” and the entire chapter is the trade-offs between them:

Manual. The programmer calls free/delete by hand (C, raw C++). Maximum control, zero safety net — the question is yours, and so are all three bugs above.
RAII and smart pointers. Tie each heap allocation to a stack object whose destructor frees it; freeing then happens deterministically at scope exit (C++ unique_ptr/shared_ptr/weak_ptr). You still state ownership, but the compiler inserts the delete.
Garbage collection. A runtime periodically finds objects nothing points to and reclaims them (Java, Go, Python, JavaScript). The question vanishes from your code entirely — at the cost of a collector that runs on its schedule, not yours.
Compile-time ownership and borrowing. The compiler tracks who owns each value, proves no reference outlives its data, and inserts the free at the one correct point (Rust). Safe like a GC, deterministic like RAII, with no runtime collector.

The four differ in when the decision is made. Manual and GC decide at run time; RAII and ownership decide at compile time. The single axis that organizes all four is this: the earlier you push the decision, the more the cost moves from the running program (pauses, crashes) onto the type system and the programmer.

A mental model

Picture a library and the question of who reshelves the returned books. Manual management is a library with no staff: every patron must personally reshelve what they borrow, in the exact right spot, exactly once. It works beautifully when everyone is disciplined and collapses the instant someone reshelves a book twice (it’s now in two places, the catalog is corrupt) or walks out still holding one (it’s lost forever).

RAII hires a clerk who shadows each patron and reshelves a book the moment that specific patron is done with it — predictable, immediate, no roaming. You still have to tell the clerk which patron owns which book, but once you have, reshelving is exact and instant. Garbage collection fires the clerks and instead sends a sweeper through the whole building at intervals: it finds every book no patron is still reading and reshelves the lot at once. No patron ever has to think about it — but when the sweep runs, the library closes until it finishes, and you cannot predict when that will be or how long it takes. Compile-time ownership is the strangest and most powerful: a librarian at the door who, before anyone enters, traces every patron’s entire route through the stacks and works out in advance exactly when each book will be done with — then has it reshelved at precisely that moment, with no clerk shadowing anyone and no sweep ever closing the building. All the safety, none of the runtime staff. The price is that the door-librarian is strict, and will turn away any patron whose route can’t be proven safe.

When to use which strategy

The choice is rarely yours alone — it largely comes with the language — but understanding the trade lets you pick the right language for a workload and reason correctly within the one you’re handed. Figure 4.1 sorts the four strategies by when the decision is made and what it costs.

Reach for a garbage-collected language when developer velocity and safety matter more than worst-case latency: business logic, web backends, data pipelines, anything where a handful of millisecond pauses are invisible against network and database time. You give up control over when memory is freed and accept the collector’s overhead and occasional pauses; in return nobody on the team ever writes a use-after-free, and you ship faster. This is the correct default for the overwhelming majority of software.

Reach for RAII (C++) when you need deterministic destruction and bare-metal performance and are willing to state ownership yourself: game engines, audio and video pipelines, high-frequency trading, embedded systems where there is no room for a collector and no tolerance for a pause. Freeing happens at a line you can point to, every time. The price is that the compiler trusts you — a stray raw pointer or a missed ownership decision is a runtime bug, not a compile error.

Reach for compile-time ownership (Rust) when you want C++’s determinism and performance and a GC’s safety, and can pay the learning curve: systems software where a data race or a use-after-free is unacceptable but a pause is equally unacceptable — browsers, OS components, network infrastructure, latency-critical services. You trade a demanding compiler for the elimination of an entire class of bugs at zero runtime cost.

The single most common mistake is using a GC language and then fighting its collector — allocating millions of short-lived objects in a hot loop, then blaming the pauses on “the GC being slow.” The collector isn’t slow; you handed it a flood of garbage to clean up. The flip side is reaching for C++ or Rust for a CRUD web service whose latency budget is dominated by a database round-trip, paying a steep complexity tax to optimize a cost that doesn’t matter. Match the strategy to whether worst-case latency — not average — is on your critical path.
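To see why the average hides the problem, here is a minimal sketch in Python. The trace is hypothetical (10,000 requests at 2 ms, 150 of them stuck behind a 1,800 ms pause), and percentile is a helper written for this illustration using the nearest-rank method; real monitoring systems may interpolate differently.

```python
def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% at or below it."""
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100   # integer ceiling of p% of the sample count
    return ordered[max(rank, 1) - 1]


# Hypothetical trace: most requests take 2 ms, a few wait out a long pause.
latencies_ms = [2.0] * 9850 + [1800.0] * 150

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.2f} ms  p50={p50_ms} ms  p99={p99_ms} ms")
```

With these made-up numbers the mean moves to about 29 ms and the median stays at 2 ms; only the p99 shows the 1.8-second stall that a user actually feels.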

What you’ll learn

Why the stack manages itself for free and the heap doesn’t, and how that split decides where every allocation should live
The four strategies for reclaiming the heap — manual, RAII, garbage collection, compile-time ownership — and the precise trade each one makes
How tracing GCs (Java’s generational G1/ZGC, Go’s concurrent low-pause collector) and reference-counting GCs (Python, with a cycle detector) actually work, and why each pauses differently
Why RAII gives deterministic destruction with no collector, and how unique_ptr, shared_ptr, and weak_ptr encode ownership in the type
How Rust’s move/borrow/lifetime model gets GC-grade safety at compile time with zero runtime cost — and what it costs the programmer instead
The cross-language idiom for resources that aren’t memory (files, sockets, locks), and why with, defer, try-with-resources, using, RAII, and Drop are the same idea wearing six different hats
How to read a latency profile and decide whether your memory strategy is the bottleneck — and whether you fix it by tuning, by allocating less, or by changing languages

Prerequisites

Software Engineering Overview — what a process, a runtime, and a build step are; why reproducibility and performance are engineering concerns, not afterthoughts
C++: Fundamentals — classes, constructors, destructors, and the RAII lifecycle; several core examples here are in C++ and assume you can read a destructor
Comfort with pointers and references in at least one language: what an address is, what it means for a pointer to dangle, and why a leak is a problem

The stack and the heap

Every strategy in this chapter is about the heap, so it pays to be precise about why the stack is the easy case and the heap is the hard one. The stack is a region of memory managed as a literal stack: each function call pushes a frame holding its local variables and bookkeeping, and returning pops it. Allocation is a single instruction that moves the stack pointer; deallocation is the same instruction in reverse, run automatically when the frame pops. There is no fragmentation, the data is contiguous and hot in cache, and — crucially — the lifetime of every stack object is exactly its enclosing scope, enforced by the machine. Nothing decides when to free a stack variable; the call structure decides for you.

That rigidity is also the stack’s limit. A stack object cannot outlive the function that created it (return a pointer to one and it dangles the moment the function returns — the dangling-reference bug Rust’s greeting() example catches at compile time). Its size must be known when the frame is laid out. And it can’t be shared with another part of the program running on a different schedule. The heap exists for exactly these cases: data that must outlive its creating scope, data sized at runtime, data shared across independent lifetimes. The heap buys you that flexibility and hands you the bill — because nothing about the call structure tells the runtime when a heap object is finished, someone or something must decide.

A subtle point that trips up newcomers: a std::vector or a Go slice is a small fixed-size handle (a pointer, a length, a capacity) that lives on the stack or inside another object, while its backing buffer lives on the heap. The handle rides the stack and frees itself; the variable-size buffer it points to is the heap allocation someone must manage. A Python list or a Java ArrayList adds one more hop: the variable on the stack is only a reference, and both the list object and its backing array live on the heap. The handle-and-buffer split is why moving a vector is cheap: you copy the three-word handle, not the buffer (the deed-transfer idea developed in C++: Modern C++ and Rust: Ownership & Borrowing).

The first rule of memory management, true in every language, follows directly: prefer the stack; reach for the heap only when lifetime or size forces it. A local value, a small struct passed by value, a fixed-size array — keep them on the stack, where management is free and bug-free. Heap-allocate only when the data genuinely must outlive its scope or its size is unknown. Half of “memory management” is just not putting things on the heap that didn’t need to be there.

Strategy 1: manual management

The oldest answer is to make the programmer do it. In C you malloc a block and free it; in raw C++ you new an object and delete it. The model is maximally simple and maximally unforgiving: you have total control over exactly when every byte is allocated and freed — no collector, no overhead, no surprises — and total responsibility for getting it right, with the compiler offering no help at all.

The trouble is that a raw pointer carries an address and nothing else. A Session* tells you where the session is; it says nothing about whose job it is to free it. That missing information is where the three classic bugs breed. The double-free is two parties each believing they own the block, each calling free — the introduction’s crash. The use-after-free is one party freeing while another still holds the pointer and dereferences it later. The leak is everyone assuming someone else will free it, so nobody does. All three are the same disease: ownership is a fact that lives in a human’s head or a comment, never in the type, so the compiler cannot check it.

Here is manual C, with the contract that the compiler will not enforce written out as a comment — which is to say, not enforced at all:

#include <stdio.h>
#include <stdlib.h>

// Caller OWNS the returned buffer and MUST free() it exactly once.
// Nothing in the type says so; this comment is the only contract.
char *read_line(FILE *f) {
    char *buf = malloc(256);          // who frees this? the caller, by convention
    if (!buf) return NULL;            // ... and on this error path, nobody allocated, so OK
    if (!fgets(buf, 256, f)) {
        free(buf);                    // must remember to free on THIS path too
        return NULL;
    }
    return buf;                       // ownership transferred to caller — by comment only
}

Every early return is a place to leak; every shared pointer is a place to double-free; every stored pointer is a place to use-after-free once the owner frees it. Manual management is not wrong — it is the foundation everything else is built on, and the only option when there is no runtime to lean on (kernels, bootloaders, tiny embedded targets). But as the default strategy for application code it has been comprehensively retired, because the bug it invites is the worst kind: silent, intermittent, and discovered far from its cause. The next three strategies are three different ways to take the question off the programmer’s hands.

Strategy 2: RAII and deterministic destruction

C++’s answer was not to add a garbage collector but to make the stack manage the heap. The mechanism is RAII — Resource Acquisition Is Initialization, the pattern from C++: Fundamentals — and the idea is exact: wrap every heap allocation in a stack object whose constructor acquires it and whose destructor releases it. Because the stack frees its objects automatically and deterministically at scope exit, the heap allocation inside rides along, freed at a moment you can point to in the source — on every exit path, including an exception unwinding the stack. No collector, no pause, no manual delete.

The standard library packages this as smart pointers, and their real contribution is encoding ownership in the type so the compiler reads and enforces it. The full smart-pointer and move-semantics walkthrough is the subject of C++: Modern C++; here is the model you need to compare across languages. There are three kinds, and the comparison is the whole story:

Smart pointer	Ownership	Cost	Use when
`unique_ptr<T>`	Exactly one owner	Zero overhead (just the pointer)	The default — one clear owner
`shared_ptr<T>`	Several co-owners	An atomic refcount per copy/destroy	Ownership is genuinely shared
`weak_ptr<T>`	Observes, owns nothing	Cheap; must `lock()` to use	Break cycles; watch without pinning alive

A unique_ptr is single ownership: when it dies, the object dies, at no cost beyond the pointer itself. It cannot be copied — that would mint a second owner and the double-free returns — so you move it to transfer ownership, handing over the deed and leaving the old holder empty. Reach for it first and stay there:

// unique_ptr: one owner, freed deterministically when the owner's scope exits.
std::unique_ptr<Session> open_session(int id) {
    return std::make_unique<Session>(id);   // no raw new, no matching delete to forget
}                                            // caller's unique_ptr frees it — exactly once

A shared_ptr is for genuinely plural ownership — a cache touched by several threads, a document open in several views. Co-owners share a reference count; the object is freed when the last owner drops and the count hits zero. The cost is real: every copy and destruction is an atomic increment or decrement (the count may be touched concurrently), so defaulting to shared_ptr “to be safe” sprinkles atomic traffic across your program. A weak_ptr observes without owning — it adds nothing to the strong count — which is how you watch an object without keeping it alive and, critically, how you break the reference cycle that two mutual shared_ptrs create and leak forever. C++: Modern C++ tells the war story of the gateway whose connections and registry held each other alive for days until one edge became weak.
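The same cycle-breaking move exists outside C++. As a rough Python analog (not the gateway from that war story; the Connection and Registry classes are hypothetical), the sketch below keeps the connection-to-registry edge strong and makes the registry-to-connection edge weak with weakref.WeakValueDictionary, so the registry observes connections without keeping them alive. It assumes CPython, whose reference counting reclaims the connection as soon as the last strong reference goes away.

```python
import weakref


class Registry:
    """Tracks open connections without owning them."""

    def __init__(self):
        # Weak edge: an entry disappears once its connection is reclaimed.
        self._connections = weakref.WeakValueDictionary()

    def add(self, conn):
        self._connections[conn.name] = conn

    def names(self):
        return sorted(self._connections.keys())


class Connection:
    def __init__(self, name, registry):
        self.name = name
        self.registry = registry   # strong edge: connection -> registry
        registry.add(self)


registry = Registry()
conn = Connection("client-1", registry)
names_while_open = registry.names()    # the registry can see the connection

del conn                               # drop the only strong reference
names_after_close = registry.names()   # on CPython the entry is already gone
```

Had the registry stored connections in an ordinary dict, the two objects would hold each other alive until something removed the entry by hand.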

The punchline, and the reason RAII is more than a local trick: hold your resources in smart members and you write no destructor, copy, or move at all — the compiler-generated ones just forward to members that already free themselves correctly (the rule of zero). RAII isn’t a thing you do per allocation; it’s a property that propagates through every type built from self-managing parts. The trade against a GC is stark and deliberate: deterministic, pause-free freeing at a known line, paid for by stating ownership yourself — with a stray raw pointer or a missed ownership call still able to slip past the compiler as a runtime bug.

Strategy 3: garbage collection

The third strategy removes the question from your code entirely: let a runtime find and free unreachable objects automatically. You allocate and never free; the garbage collector periodically determines what is still reachable and reclaims the rest. The guarantee is enormous (no use-after-free, no double-free, no leak of unreachable objects, no ownership to reason about), and it is why GC languages dominate application development. One kind of leak remains possible: an object you no longer need but still reference, such as an entry in a cache that is never evicted, stays reachable and is never collected. The cost is equally real and is the subject of the introduction’s 02:14 page: you lose control over when memory is freed, and the collector does work, sometimes pausing your program to do it. There are two broad families.

Tracing collectors (Java, Go, JavaScript, and Python’s backstop) start from a set of roots — stack variables, globals, registers — and walk every pointer reachable from them, marking each object they reach as live. Anything unmarked at the end is unreachable and its memory is reclaimed. The genius and the curse are the same: reachability is the exact definition of “still needed,” so tracing never frees something in use and never leaks something dead — but computing it means touching the live object graph, which competes with your program for the CPU and, in the simplest designs, requires stopping the program while it runs (a stop-the-world pause). Modern collectors fight that pause with two ideas. The generational hypothesis observes that most objects die young, so the heap is split into a small young generation collected often and cheaply and an old generation collected rarely; the request path’s flood of short-lived temporaries is swept in fast minor collections that never touch the long-lived data. And concurrency lets the collector do most of its marking while the program runs, shrinking the stop-the-world window to the few phases that genuinely need exclusivity.
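The mark-and-sweep idea fits in a few lines. The sketch below is a toy model in Python, not how any production collector is implemented: the heap is a plain dict mapping hypothetical object names to the names they point at, and mark and sweep are illustrative helpers. It is stop-the-world by construction and ignores generations and concurrency.

```python
def mark(heap, roots):
    """Return the set of objects reachable from the roots."""
    live = set()
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if obj in live:
            continue                # already visited; also keeps cycles from looping
        live.add(obj)
        pending.extend(heap[obj])   # follow every outgoing pointer
    return live


def sweep(heap, live):
    """Remove everything that was not marked and report what was freed."""
    dead = sorted(obj for obj in heap if obj not in live)
    for obj in dead:
        del heap[obj]
    return dead


# Toy heap: object name -> names it references.
heap = {
    "request": ["order"],
    "order": ["price"],
    "price": [],
    "cache": [],
    "tmp_a": ["tmp_b"],   # tmp_a and tmp_b point at each other,
    "tmp_b": ["tmp_a"],   # but no root can reach either of them
}
roots = ["request", "cache"]

live = mark(heap, roots)
freed = sweep(heap, live)
```

The unreachable pair is reclaimed like any other garbage, because the mark phase never arrives at it from a root.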

Reference counting (Python’s primary mechanism, and shared_ptr under the hood) takes the opposite approach: each object carries a count of how many references point to it, incremented on every new reference and decremented when one goes away, and the object is freed the instant its count hits zero. This is pleasantly prompt — memory is reclaimed the moment the last reference drops, not at some future sweep — but it has two costs. Maintaining the count adds work to every assignment, and it cannot reclaim a cycle: two objects referring to each other keep each other’s count above zero forever even when nothing outside reaches them. Python therefore pairs its refcount with a periodic cycle detector (a tracing collector in miniature) to catch exactly the case refcounting misses — the same cycle problem C++’s weak_ptr solves by hand.
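Both behaviors are observable from Python itself. The sketch below assumes CPython (other Python implementations need not use reference counting) and uses a hypothetical Node class; weakref.ref serves only to observe whether an object is still alive without adding to its count. The automatic cycle detector is switched off for the duration so that only the explicit gc.collect() call can reclaim the cycle.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.other = None


was_enabled = gc.isenabled()
gc.disable()  # keep the cycle detector from running on its own during the demo
try:
    # Case 1: no cycle. The last reference goes away and the object dies at once.
    single = Node()
    single_probe = weakref.ref(single)
    del single
    freed_promptly = single_probe() is None

    # Case 2: two objects that point at each other.
    left, right = Node(), Node()
    left.other, right.other = right, left
    cycle_probe = weakref.ref(left)
    del left, right                   # unreachable now, but each count is still 1
    survived_refcounting = cycle_probe() is not None

    gc.collect()                      # run the cycle detector explicitly
    freed_by_cycle_detector = cycle_probe() is None
finally:
    if was_enabled:
        gc.enable()
```

On CPython all three flags come out True: the lone object is freed immediately, the cycle outlives its last external reference, and the cycle detector is what finally reclaims it.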

The languages differ in where they sit on the throughput-versus-latency axis, and it is worth seeing them side by side:

Runtime	Collector	Optimized for	Characteristic cost
Java	Generational tracing; G1 default, ZGC/Shenandoah low-pause	Throughput on huge heaps	Tunable; ZGC holds pauses to single-digit ms on multi-TB heaps
Go	Concurrent tri-color mark-sweep, non-generational	Low, predictable pause	Sub-millisecond pauses; more total CPU spent collecting
Python	Reference counting + cycle detector	Prompt reclamation, simplicity	Per-op refcount work; the GIL serializes it
JavaScript (V8)	Generational tracing (scavenge + mark-compact)	Browser/server responsiveness	Incremental/concurrent to keep the UI thread free

The design philosophies diverge sharply. Java’s collectors are throughput champions tuned over decades: a long-running JVM with a well-sized heap can sustain enormous allocation rates, and ZGC pushes worst-case pauses below ten milliseconds even on terabyte heaps — at the cost of dozens of tuning knobs and a memory footprint that runs larger than the live set. Go made the opposite bet: a deliberately simple, non-generational concurrent collector tuned for consistently tiny pauses, accepting that it spends more total CPU collecting and offers fewer knobs, because Go’s target — network services where tail latency is king — wants predictability over peak throughput. Python’s refcounting reclaims promptly and is easy to reason about, but the count maintenance and the global interpreter lock that serializes it are part of why CPython is not the tool for CPU-bound parallel work.

The illustrative point across all of them: you allocate, you never free, and the cost shows up not in your code but in your latency profile.

# Python: you never free. The refcount drops to zero when `data` goes out of
# scope and the list is reclaimed promptly — UNLESS it's part of a reference
# cycle, which the periodic cycle detector cleans up later.
def process(records: list[dict]) -> int:
    data = [transform(r) for r in records]  # allocates; no free() anywhere
    return len(data)
    # `data` becomes unreachable on return; refcount → 0; memory reclaimed.

// Go: same deal — allocate, never free. The escape analyzer decides whether
// `buf` can live on the stack (freed for free at return) or must escape to the
// heap (reclaimed later by the concurrent collector). You influence which by
// how you use it; you never call free.
func render(n int) []byte {
    buf := make([]byte, n) // may escape to the heap because it's returned
    return buf             // collector reclaims it once nothing references it
}

War story: the 02:14 page and the allocation that wasn’t free

The trading engine from the introduction was breaching its 200ms p99 SLA with second-long stalls a couple of times a minute. Every stall lined up with a major GC event. The team’s first instinct was to blame the collector and tune it — bigger heap, different collector, more parallel GC threads — and each change moved the pauses around without removing them. The actual fix came from a flame graph of allocations, not GC: the validation step parsed each request into a tree of small short-lived objects, allocating tens of thousands of them per request and discarding them all immediately. The collector wasn’t slow; it was drowning. The young generation filled in seconds, forcing constant minor collections, and the churn occasionally promoted junk into the old generation and triggered the expensive major collection that caused the visible stall. The real fix was to allocate less: reuse a parser buffer across requests, replace the object tree with a flat parse into primitives, and pool the few objects that did need to persist. Allocation rate dropped by 30x, the young generation stopped thrashing, major collections became rare, and the p99 fell under budget — without touching a single GC flag. The lesson the team wrote down: in a GC language, the lever is usually how much garbage you make, not how the collector cleans it. “The GC is pausing” almost always means “you are allocating too much.”
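Lining stalls up with collections is a measurement, and most runtimes expose a hook for it. As an illustrative sketch in Python rather than on the JVM, the standard gc.callbacks list lets you time each run of CPython’s cycle collector. GcTimer and the allocation burst below are hypothetical; a JVM investigation would rely on that platform’s GC logs or a profiler instead.

```python
import gc
import time


class GcTimer:
    """Records how long each collection takes, via the gc.callbacks hook."""

    def __init__(self):
        self.pauses = []        # list of (generation, seconds)
        self._started = None

    def __call__(self, phase, info):
        if phase == "start":
            self._started = time.perf_counter()
        elif phase == "stop" and self._started is not None:
            elapsed = time.perf_counter() - self._started
            self.pauses.append((info["generation"], elapsed))
            self._started = None


timer = GcTimer()
gc.callbacks.append(timer)
try:
    burst = [[i] for i in range(100_000)]   # many small container objects
    del burst
    gc.collect()                            # force at least one full collection
finally:
    gc.callbacks.remove(timer)

worst_pause = max(seconds for _, seconds in timer.pauses)
```

If the recorded pauses are small while the stalls are large, the collector is not the culprit; if they line up, the next question is what is producing the garbage.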

Strategy 4: compile-time ownership and borrowing

The fourth strategy is the youngest and the most ambitious: get a GC’s safety with RAII’s determinism and no runtime collector at all, by having the compiler prove memory safety before the program runs. This is Rust’s model, and the deep mechanics — the borrow checker, non-lexical lifetimes, interior mutability — are the subject of Rust: Ownership & Borrowing. The comparative essence is three rules the compiler enforces:

Every value has exactly one owner, and when the owner leaves scope the value is dropped (its Drop runs, its memory is freed) automatically and exactly once. This is RAII, except the compiler enforces it rather than trusting you. One owner means never two (no double-free) and never zero (no accidental leak in ordinary code, although Rc cycles or an explicit std::mem::forget can still leak, because Rust treats leaking as memory-safe).
You either move ownership or borrow it. Moving hands over the single deed and invalidates the old binding — touch it afterward and the program won’t compile, so use-after-move is a build error. Borrowing lends temporary access without transferring ownership.
Borrows obey shared-XOR-mutable: any number of shared read-only references, or exactly one mutable reference, never both at once. Aliasing-plus-mutation is the root of data races and use-after-free, and forbidding the combination eliminates both families of bug at compile time.

The result is the most striking trade in the chapter. The same C++ ownership concepts — unique_ptr as a move-only single owner, shared_ptr as reference counting, RAII tying cleanup to scope — exist in Rust, but the compiler checks them instead of trusting you. A stray copy that would create a second owner, a reference held past its data’s death, a share across threads without synchronization: all compile errors, not runtime hazards.

// The binding `s` owns the String; whoever owns it when its scope ends drops (frees) it.
// No free(), no collector: the compiler inserts the drop at exactly one point.
fn consume(s: String) {
    println!("{s}");
}                                    // `s` is dropped here, inside consume

fn process() {
    let s = String::from("hello");   // owns a heap buffer
    let n = s.len();                 // borrow `s` (shared, read-only); no ownership moves
    consume(s);                      // ownership MOVES into consume; `s` is now dead here
    // println!("{s}");              // error[E0382]: borrow of moved value: `s`
    println!("{n}");                 // fine: n is a plain Copy value
}                                    // nothing left to drop: `s` was moved out above

What it costs is not runtime (the ownership checks evaporate after compilation and add no runtime overhead) but the programmer’s time and the compiler’s strictness. Code that aliases and mutates, or holds a reference too long, won’t build, and learning to restructure for the borrow checker is the famous Rust learning curve. When a design genuinely needs shared mutation, Rust provides escape valves with explicit costs: Rc/Arc for shared ownership (reference counting, much like shared_ptr) and RefCell/Mutex for runtime-checked mutation, each trading a compile-time guarantee for a runtime cost you now manage, as Rust: Ownership & Borrowing details. The bet is that paying the cost once, at compile time, in the programmer’s head, is cheaper over a system’s life than paying it forever in runtime pauses (the GC trade) or in 3 a.m. crashes (the manual trade).

The determinism versus throughput trade-off

Lay the four strategies on a single axis and the trade-offs snap into focus. The axis is when the decision to free is made, and everything else follows from it.

Manual and RAII free at a deterministic, known point: manual when you call delete, RAII when the scope exits. You can point at the line. This is what latency-critical systems need: cleanup whose cost is set by the work the destructor visibly does (nanoseconds for one small object, longer when it tears down a large structure), every time, with no possibility of a collector deciding to walk the heap mid-request. The price of determinism is that someone decided ownership: you, in manual; you-but-checked, in RAII and ownership.

Garbage collection frees at a nondeterministic, runtime-chosen point, and that is both its gift and its tax. The gift is that you reason about nothing — no ownership, no lifetimes, no destructors. The tax is the pause and the throughput overhead: the collector needs spare CPU and spare memory (a heap noticeably larger than the live set) to run efficiently, and even the best low-pause collectors steal some cycles and impose some worst-case stall. For throughput-oriented batch work this is a great deal — a few percent overhead for total safety and faster development. For a hard real-time audio callback that must finish in 5ms or the speaker clicks, any unpredictable pause is disqualifying.

Compile-time ownership is the one that appears to cheat the trade: deterministic and safe and zero runtime cost. The catch is that the cost didn’t vanish, it moved — from runtime to compile time, from the machine to the programmer. You pay it in a stricter compiler and a steeper learning curve, up front, once. Whether that’s a bargain depends on the system’s lifetime: amortized over years of a long-lived, latency-critical service it is cheap; spent on a script that runs once it is absurd. The table makes the four-way comparison concrete:

Property	Manual (C/C++)	RAII (C++)	GC (Java/Go/Py/JS)	Ownership (Rust)
When freed	When you call `free`	Deterministically, at scope exit	Nondeterministically, at runtime	Deterministically, at scope exit
Who decides	Programmer	Programmer (compiler inserts)	Runtime	Compiler (you state ownership)
Runtime cost	None	None	Collector CPU + pauses + memory	None
Memory safety	None — all bugs possible	Mostly (raw pointers leak through)	Full	Full
Worst-case latency	Predictable	Predictable	Pause-prone	Predictable
Cost paid by	Programmer (and prod)	Programmer	The running program	Programmer (compile time)

There is no universally best column. The 02:14 page was a GC system whose workload didn’t fit GC’s worst case; the daily-crash daemon was a manual system whose complexity exceeded what manual discipline can sustain. The skill is reading which trade your workload can afford.

Resource management beyond memory

Memory is the loudest resource, but it is not the only one — and the most elegant insight in this whole area is that the same mechanism that frees memory frees everything else. A file handle, a network socket, a mutex lock, a database connection: each must be released exactly once, on every exit path including the error paths, or you leak file descriptors, deadlock on a lock you forgot to release, or exhaust a connection pool. This is the ownership question again, wearing a different resource, and every language answers it with the same idiom it uses for memory — generalized to “do this cleanup when control leaves this region.”

C++ does it with RAII directly: the lock is a stack object, its destructor releases on scope exit. Rust does it with Drop, the same trait that frees memory: implement Drop and your cleanup runs at the owner’s scope exit, even when a panic unwinds the stack. These two are the purest form: cleanup is welded to the object’s lifetime, with nothing for the programmer to remember. The GC languages can’t use that mechanism, because they have no deterministic destructor. The collector runs whenever it likes, so you cannot tie a socket’s close to “when this object is collected” (a finalizer, the GC equivalent, is notoriously unreliable precisely because its timing is undefined). So they bolt on an explicit scope construct: Python’s with, Java’s try-with-resources, C#’s using, and Go’s defer, which schedules cleanup to run when the function returns. They are syntactically different and semantically identical: register cleanup, guarantee it runs when this region ends. Here is the same “acquire, use, guarantee release” pattern in five of the six (C#’s using is not shown):

// C++ — RAII: the lock_guard's destructor releases the mutex at scope exit,
// even if use_shared_state() throws. Nothing to remember; the type does it.
#include <mutex>

std::mutex mtx;              // guards the shared state
void use_shared_state();     // defined elsewhere

void critical() {
    std::lock_guard<std::mutex> guard(mtx);  // acquires here
    use_shared_state();
}                                            // releases here, on every path

// Rust — Drop: the MutexGuard releases when it leaves scope, even on panic.
// Same mechanism that frees memory, applied to a lock.
// `State` and `use_shared_state(&State)` are assumed to be defined elsewhere.
use std::sync::Mutex;

fn critical(m: &Mutex<State>) {
    let guard = m.lock().unwrap();   // acquires; guard owns the lock
    use_shared_state(&guard);        // &guard deref-coerces to &State
}                                    // guard dropped here → lock released

# Python — `with`: __exit__ runs on block exit, exception or not. The context
# manager is the GC-language stand-in for a deterministic destructor.
def critical(path: str) -> str:
    with open(path) as f:        # __enter__ acquires
        return f.read()          # __exit__ closes f, even on this early return

// Java — try-with-resources: close() is called automatically at block exit,
// in reverse order of acquisition, on every path including exceptions.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

String critical(Path path) throws IOException {
    try (var reader = Files.newBufferedReader(path)) {  // acquires
        return reader.readLine();
    }                                                   // close() called here
}

// Go — defer: schedules f.Close() to run when the function returns, no matter
// which path it takes. The cleanup sits right next to the acquisition.
import (
    "io"
    "os"
)

func critical(path string) (string, error) {
    f, err := os.Open(path)
    if err != nil {
        return "", err
    }
    defer f.Close()              // runs at function return, on every path
    data, err := io.ReadAll(f)   // read the whole file into memory
    if err != nil {
        return "", err
    }
    return string(data), nil
}

The comparison table makes the family resemblance explicit:

Language	Idiom	When cleanup runs	Tied to
C++	RAII destructor	Scope exit (deterministic)	Object lifetime
Rust	`Drop`	Scope exit (deterministic)	Object lifetime
Python	`with` (context manager)	Block exit	Lexical block
Java	try-with-resources	Block exit	Lexical block
C#	`using`	Block exit	Lexical block
Go	`defer`	Function return	Function call

Two differences are worth internalizing. First, C++ and Rust tie cleanup to the object, so it composes automatically — a struct of ten resources cleans up all ten with no code, because each member’s destructor/Drop runs. The with/defer family ties cleanup to a syntactic region, so you must remember to write the construct for each resource; forget the with and the file leaks. Second, Go’s defer runs at function return rather than block exit, which is why a defer inside a loop is a classic bug — the cleanups stack up and don’t run until the whole function ends, exhausting handles mid-loop. The deep lesson is the unifying one: deterministic cleanup is the same problem as deterministic memory freeing, and a language’s answer to one is its answer to the other. That is why GC languages, which gave up deterministic memory freeing, all had to add a separate explicit construct for non-memory resources — the one thing their collector cannot do for them.
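The second difference can be reproduced in a few lines. Python has no defer, so the sketch below approximates function-scoped cleanup with a contextlib.ExitStack held open across the whole loop; that is a stand-in chosen for illustration, not a description of how Go implements defer. The Tracker class and its handle() method are hypothetical: they replace a real file or socket and only count how many handles are open at once.

```python
from contextlib import ExitStack, contextmanager


class Tracker:
    """Hypothetical resource factory that counts handles open at once."""

    def __init__(self):
        self.open_now = 0
        self.peak = 0

    @contextmanager
    def handle(self):
        self.open_now += 1                      # acquire
        self.peak = max(self.peak, self.open_now)
        try:
            yield
        finally:
            self.open_now -= 1                  # release, exception or not


def block_scoped(tracker, n):
    """`with` inside the loop: each handle is released at block exit."""
    for _ in range(n):
        with tracker.handle():
            pass                                # use the resource here
    return tracker.peak


def function_scoped(tracker, n):
    """One ExitStack around the whole loop: releases pile up until the end,
    roughly what `defer` inside a Go loop does."""
    with ExitStack() as stack:
        for _ in range(n):
            stack.enter_context(tracker.handle())
    return tracker.peak
```

With n = 100, the block-scoped version never holds more than one handle, while the function-scoped version holds all 100 until the stack unwinds. Against a real descriptor limit, that is the difference between finishing the loop and failing partway through it.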

Build it → Deterministic resource management under real load: the Rust Project 03: High-Performance Cache leans on Drop, Arc, and pooled buffers to manage memory and connections without a GC in a latency-critical hot path.

Knowing when memory is your bottleneck

A closing piece of judgment that ties the strategies to practice: most of the time, memory management is not your bottleneck, and the worst thing you can do is optimize it on a hunch. The discipline is to measure first (the Performance and Profiling material), then act on what the profile says.

In a GC language, watch two numbers: allocation rate and pause time/frequency. If your p99 latency has spikes that line up with GC events, the lever is almost always allocate less — pool and reuse objects, avoid per-request object trees, prefer value types and slices over boxed objects — exactly the 02:14 war story. Tuning the collector moves pauses around; reducing allocation removes them. Only after you’ve cut allocation does collector tuning (heap size, choosing ZGC over G1) earn its keep.
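Pause frequency is straightforward to observe from inside a program. The sketch below assumes CPython, where gc.callbacks lets you register a function that is called at the start and stop of each cyclic-collector run. GcPauseLog, churn, reuse, and measure are hypothetical names introduced for this example. In CPython most objects are reclaimed by reference counting, so only the cycle detector's runs show up here; a JVM or Go service would read the equivalent numbers from its own GC logs instead.

```python
import gc
import time


class GcPauseLog:
    """Record the wall-clock duration of each cyclic-GC run (CPython)."""

    def __init__(self):
        self.pauses = []
        self._start = None

    def _on_gc(self, phase, info):
        if phase == "start":
            self._start = time.perf_counter()
        elif phase == "stop" and self._start is not None:
            self.pauses.append(time.perf_counter() - self._start)
            self._start = None

    def __enter__(self):
        gc.callbacks.append(self._on_gc)
        return self

    def __exit__(self, *exc):
        gc.callbacks.remove(self._on_gc)
        return False


def churn(n):
    """Allocate a small self-referencing node per iteration (cyclic garbage)."""
    for _ in range(n):
        node = {"payload": [0] * 8}
        node["self"] = node          # cycle: refcounting alone cannot free it


def reuse(n):
    """Write into one preallocated buffer: no new container objects."""
    buf = bytearray(8)
    for i in range(n):
        buf[i % 8] = i & 0xFF


def measure(work, n):
    """Run work(n) and return the list of GC pause durations it triggered."""
    with GcPauseLog() as log:
        work(n)
    return log.pauses
```

Comparing len(measure(churn, 50_000)) with len(measure(reuse, 50_000)) shows the lever described above: the churning loop triggers collector runs, while the loop that reuses a buffer gives the collector nothing to do. Exact counts and durations depend on the interpreter version and its GC thresholds.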

In a manual or RAII language, the questions are different: are you heap-allocating things that could live on the stack? Is shared_ptr’s atomic refcount showing up hot because you defaulted to it instead of unique_ptr? Is malloc itself the bottleneck — a simulation churning millions of same-sized objects — in which case an arena (bump-allocate, free everything at once by resetting a pointer) or a pool (fixed-size slots on a free list) can beat the general-purpose allocator? These are sharp tools for narrow problems; reach for them when a profiler, not intuition, says allocation is hot. The through-line, whatever the language: the strategy is a property of the workload’s latency budget, and you discover whether it’s the bottleneck by measuring, never by guessing.
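The bookkeeping behind an arena and a pool is small enough to sketch. The version below uses Python only to show the data structures; the performance argument applies to C, C++, or Rust, where these replace calls into the general-purpose allocator. Arena and Pool are hypothetical classes written for this example, and alignment, thread safety, and double-free detection are all left out.

```python
class Arena:
    """Bump allocator: hand out consecutive slices, free all by resetting."""

    def __init__(self, size):
        self._buf = bytearray(size)
        self._top = 0                # offset of the next free byte

    def alloc(self, n):
        if self._top + n > len(self._buf):
            raise MemoryError("arena exhausted")
        view = memoryview(self._buf)[self._top:self._top + n]
        self._top += n               # the "bump"
        return view

    def reset(self):
        self._top = 0                # releases every allocation at once

    @property
    def used(self):
        return self._top


class Pool:
    """Fixed-size slots on a free list: alloc pops a slot, free pushes it."""

    def __init__(self, slot_size, count):
        self._slot_size = slot_size
        self._buf = bytearray(slot_size * count)
        self._free = list(range(count - 1, -1, -1))   # slot 0 ends up on top

    def alloc(self):
        if not self._free:
            raise MemoryError("pool exhausted")
        return self._free.pop()      # O(1): take the most recently freed slot

    def free(self, slot):
        self._free.append(slot)      # O(1): slot is immediately reusable

    def view(self, slot):
        start = slot * self._slot_size
        return memoryview(self._buf)[start:start + self._slot_size]
```

The two structures trade differently. The arena cannot free a single allocation, which is why it suits work with one shared lifetime such as a frame or a request. The pool frees individually but only serves one object size.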

Build it → When allocation itself is the hot path, custom allocators earn their keep: Project 39: GPU Memory Manager implements pooling and arena strategies for a resource where the general-purpose allocator is far too slow, and Project 14: Network Stack manages packet buffers with deterministic, zero-GC lifetimes under line-rate load.

Practical exercise

Difficulty: Level I · Level II · Level III

Level I — Answer the ownership question, four ways. Take a tiny program that manages one heap object across an early-return or error path — read a buffer, use it, release it. Write it four times: manual C with malloc/free, C++ with unique_ptr, a GC language (Python or Go) with no free at all, and Rust with ownership. For each version, write one sentence answering who frees this object and at exactly which line. The manual version is the only one where you can’t be sure; that uncertainty is the whole point of the other three.
Level II — Make and measure garbage. In a GC language, write a hot loop that allocates a flood of short-lived objects (an object tree per iteration), and turn on GC logging. Record allocation rate, pause frequency, and p99 latency. Now rewrite the loop to allocate far less — reuse a buffer, flatten the tree to primitives, pool the survivors — and record the same numbers. Report the deltas and explain, from the GC’s perspective, why allocating less moved the latency more than any collector flag would have.
Level III — Defend a strategy choice for a real workload. Pick a concrete system — a 5ms-deadline audio callback, a multi-terabyte-heap analytics service, a CRUD web backend, an OS network driver — and argue which of the four strategies (and which language) fits it, in terms of worst-case latency, throughput, safety, and team cost. Then argue the strongest case for the wrong choice and rebut it. The deliverable is the reasoning: a defensible mapping from a workload’s latency budget and risk tolerance to a memory-management strategy, with the trade you accepted named explicitly.

Summary

Every memory-management strategy answers one question — who frees the heap, and when? — and each answer trades something for something else. The stack is the easy case: it manages itself, so prefer it and reach for the heap only when lifetime or size forces you. For the heap there are four strategies. Manual management gives total control and zero safety, inviting double-free, use-after-free, and leaks because a raw pointer can’t say who owns what. RAII ties heap allocations to stack objects whose destructors free them deterministically at scope exit, encoding ownership in the type with unique_ptr, shared_ptr, and weak_ptr — pause-free determinism, paid for by stating ownership yourself. Garbage collection lets a runtime find unreachable objects, by tracing or by reference counting, and free them automatically — total safety and zero ownership reasoning, paid for with nondeterministic pauses and collector overhead. Compile-time ownership has the compiler prove safety and insert the free, getting determinism and safety at zero runtime cost, paid for in compiler strictness and a learning curve. The same mechanisms generalize beyond memory: RAII, Drop, with, try-with-resources, using, and defer are one idea — deterministic cleanup of files, sockets, and locks — and a language’s answer for memory is its answer for every resource.

Key takeaways

Prefer the stack; the heap is the only place the “who frees this, and when?” question exists, and every strategy is an answer to it.
The four strategies sort by when the decision is made: manual and GC decide at runtime, RAII and ownership at compile time — and the earlier the decision, the more the cost shifts from the running program to the programmer.
GC buys total safety and zero ownership reasoning at the price of nondeterministic pauses; in a GC language the lever for latency is almost always allocate less, not tune the collector.
RAII (C++) and ownership (Rust) give deterministic, pause-free freeing; Rust adds compile-time safety at zero runtime cost, moving the price to the compiler and the learning curve.
Resource management beyond memory is the same problem: RAII/Drop tie cleanup to object lifetime; with/defer/try-with-resources/using tie it to a scope — and GC languages need the explicit construct because their collector can’t free resources deterministically.

Connections to other chapters

Software Engineering Overview (prerequisite): frames performance, safety, and language choice as engineering trade-offs rather than matters of taste — this chapter is that framing applied to the single decision (how memory is reclaimed) that most distinguishes one systems language from another.
C++: Fundamentals and C++: Modern C++ (field guides): Fundamentals teaches the RAII lifecycle this chapter generalizes; Modern C++ holds the deep smart-pointer and move-semantics walkthrough — the unique_ptr/shared_ptr/weak_ptr mechanics, the rule of zero, and noexcept moves — that this chapter summarizes comparatively rather than duplicates.
Rust: Fundamentals and Rust: Ownership & Borrowing (field guides): Ownership & Borrowing is the deep version of strategy 4 here — the borrow checker, lifetimes, shared-XOR-mutable, interior mutability, and Rc/Arc — where this chapter teaches only the comparative model and defers the mechanics. Rust: Unsafe shows what you give up when you step outside the checked subset.
Go: Fundamentals (field guide): the concrete home of Go’s concurrent low-pause collector, escape analysis, and defer — the GC and resource-cleanup story this chapter contrasts against the others.
Concurrency and Parallelism Models (sibling): aliasing-plus-mutation is the root of both use-after-free and data races, so the ownership rules here are the same rules that make Arc<Mutex<T>> and Go’s “share by communicating” safe — memory management and concurrency safety are two faces of one discipline.
Performance and Profiling (sibling): you decide whether memory is your bottleneck by measuring allocation rate and pause time, not by guessing — the methodology for trusting those numbers, and for telling a GC pause from a slow query, comes from there.
GPU and CUDA and the GPU Memory Manager project (extension): GPU memory is a manually managed, non-collected resource where pooling and arenas are mandatory, not optional — the manual and arena strategies here, applied where there is no GC and allocation is the hot path. The High-Performance Cache project applies RAII and pooling to a latency-critical service.

Scott Meyers, Effective Modern C++ — Items 18–22 (smart pointers) and 23–25 (move semantics): the canonical practical treatment of RAII-based ownership.
The Rust Programming Language (Klabnik & Nichols), the ownership chapters — the example-first introduction to compile-time ownership, moves, borrowing, and lifetimes.

Deep dives

Richard Jones, Antony Hosking & Eliot Moss, The Garbage Collection Handbook — the definitive reference on tracing, generational, concurrent, and reference-counting collectors and the trade-offs between them.
Getting Started with the Z Garbage Collector and the Go runtime’s GC design docs — two modern, opposite bets on the latency-versus-throughput axis, documented by their designers.
Jung et al., RustBelt: Securing the Foundations of the Rust Programming Language (POPL 2018) — the formal proof that compile-time ownership actually delivers the safety it claims.

Historical context

John McCarthy, “Recursive Functions of Symbolic Expressions” (1960) — introduces garbage collection alongside Lisp, the origin of the whole tracing tradition.
Bjarne Stroustrup, The C++ Programming Language — RAII and deterministic destruction from the designer who built the language around them, the deliberate road not taken toward a collector.
Cyclone: A Safe Dialect of C (Jim et al., 2002) — the region-based and linear-type research lineage that compile-time ownership descends from, showing the idea predates Rust 1.0 by more than a decade.

--- title: "Memory and Resource Management" keywords: [memory management, garbage collection, raii, ownership, borrowing, smart pointers, reference counting, stack, heap, lifetimes, deterministic destruction, resource management, drop, defer] difficulty: advanced prerequisites: [software-engineering-overview, cpp-fundamentals] estimated_time: "4-5 hours" --- ## Introduction The pager went off at 02:14 with a single line: *p99 checkout latency 1.8s, SLA 200ms*. The service was a JVM trading engine that did almost nothing per request — look up a price, validate a quantity, write a record — and it did it in two milliseconds, on average. The average was a lie. Once or twice a minute a request would freeze for over a second while every thread in the process stopped dead, did no work, returned no responses, and then resumed as if nothing had happened. The threads weren't blocked on a lock or a slow query. They were blocked on the garbage collector, which had decided that *this* was the moment to walk the heap and reclaim the millions of short-lived objects the request path had been quietly allocating. Nobody had written a line of code that said "pause here." The pause was a property of how the language reclaims memory, and it arrived on the collector's schedule, not the application's. A week earlier and a building away, a C++ team had the opposite problem from the same root. Their pricing daemon never paused — it had no collector to pause it — but it crashed once a day with a corrupted heap, because somewhere in a hundred thousand lines a `Session` was being `delete`d twice: one code path freed it when a socket closed, another freed it again when its work finished, and neither knew about the other. Two teams, two incidents, one question underneath both: **who is responsible for freeing this memory, and when does it happen?** The JVM team had handed that question to a runtime that answered it safely but at an unpredictable moment. 
The C++ team had kept the question for themselves and gotten the answer wrong. Every memory-management strategy ever designed is an answer to that one question, and each answer trades away something to buy something else. ### The Core Insight A program's data lives in two places. The **stack** is a scratchpad that grows and shrinks with function calls: allocating is one instruction (bump a pointer), freeing is automatic and free (the pointer un-bumps when the function returns), and the lifetime of everything on it is dictated rigidly by scope. The **heap** is the open warehouse for everything whose lifetime *doesn't* fit that discipline — data that must outlive the function that made it, data whose size isn't known until runtime, data shared between parts of the program that come and go on different schedules. The stack manages itself. The heap is where the hard question lives, because something has to decide when each heap object is no longer needed and reclaim it. Reclaim too early and a live pointer dangles into freed memory (use-after-free). Reclaim twice and you corrupt the allocator (double-free). Never reclaim and you leak until the process dies. There are exactly four strategies for answering "who frees the heap, and when," and the entire chapter is the trade-offs between them: 1. **Manual.** The programmer calls `free`/`delete` by hand (C, raw C++). Maximum control, zero safety net — the question is yours, and so are all three bugs above. 2. **RAII and smart pointers.** Tie each heap allocation to a stack object whose destructor frees it; freeing then happens deterministically at scope exit (C++ `unique_ptr`/`shared_ptr`/`weak_ptr`). You still *state* ownership, but the compiler inserts the `delete`. 3. **Garbage collection.** A runtime periodically finds objects nothing points to and reclaims them (Java, Go, Python, JavaScript). The question vanishes from your code entirely — at the cost of a collector that runs on *its* schedule, not yours. 4. 
**Compile-time ownership and borrowing.** The compiler tracks who owns each value, proves no reference outlives its data, and inserts the free at the one correct point (Rust). Safe like a GC, deterministic like RAII, with no runtime collector. The order is roughly "how late the decision is made." Manual and GC decide at run time; RAII and ownership decide at compile time. And the single axis that organizes all four is this: **the earlier you push the decision, the more the cost moves from the running program — pauses, crashes — onto the type system and the programmer.** ### A mental model Picture a library and the question of who reshelves the returned books. **Manual management** is a library with no staff: every patron must personally reshelve what they borrow, in the exact right spot, exactly once. It works beautifully when everyone is disciplined and collapses the instant someone reshelves a book twice (it's now in two places, the catalog is corrupt) or walks out still holding one (it's lost forever). **RAII** hires a clerk who shadows each patron and reshelves a book the moment that specific patron is done with it — predictable, immediate, no roaming. You still have to tell the clerk *which* patron owns *which* book, but once you have, reshelving is exact and instant. **Garbage collection** fires the clerks and instead sends a sweeper through the whole building at intervals: it finds every book no patron is still reading and reshelves the lot at once. No patron ever has to think about it — but when the sweep runs, the library *closes* until it finishes, and you cannot predict when that will be or how long it takes. 
**Compile-time ownership** is the strangest and most powerful: a librarian at the door who, before anyone enters, traces every patron's entire route through the stacks and works out in advance exactly when each book will be done with — then has it reshelved at precisely that moment, with no clerk shadowing anyone and no sweep ever closing the building. All the safety, none of the runtime staff. The price is that the door-librarian is strict, and will turn away any patron whose route can't be proven safe. ### When to use which strategy The choice is rarely yours alone — it largely comes with the language — but understanding the trade lets you pick the *right language* for a workload and reason correctly within the one you're handed. @fig-memory-management sorts the four strategies by when the decision is made and what it costs. **Reach for a garbage-collected language** when developer velocity and safety matter more than worst-case latency: business logic, web backends, data pipelines, anything where a handful of millisecond pauses are invisible against network and database time. You give up control over *when* memory is freed and accept the collector's overhead and occasional pauses; in return nobody on the team ever writes a use-after-free, and you ship faster. This is the correct default for the overwhelming majority of software. **Reach for RAII (C++)** when you need deterministic destruction and bare-metal performance and are willing to state ownership yourself: game engines, audio and video pipelines, high-frequency trading, embedded systems where there is no room for a collector and no tolerance for a pause. Freeing happens at a line you can point to, every time. The price is that the compiler trusts you — a stray raw pointer or a missed ownership decision is a runtime bug, not a compile error. 
**Reach for compile-time ownership (Rust)** when you want C++'s determinism and performance *and* a GC's safety, and can pay the learning curve: systems software where a data race or a use-after-free is unacceptable but a pause is equally unacceptable — browsers, OS components, network infrastructure, latency-critical services. You trade a demanding compiler for the elimination of an entire class of bugs at zero runtime cost. **The single most common mistake** is using a GC language and then fighting its collector — allocating millions of short-lived objects in a hot loop, then blaming the pauses on "the GC being slow." The collector isn't slow; you handed it a flood of garbage to clean up. The flip side is reaching for C++ or Rust for a CRUD web service whose latency budget is dominated by a database round-trip, paying a steep complexity tax to optimize a cost that doesn't matter. Match the strategy to whether *worst-case* latency — not average — is on your critical path. ### What you'll learn - Why the stack manages itself for free and the heap doesn't, and how that split decides where every allocation should live - The four strategies for reclaiming the heap — manual, RAII, garbage collection, compile-time ownership — and the precise trade each one makes - How tracing GCs (Java's generational G1/ZGC, Go's concurrent low-pause collector) and reference-counting GCs (Python, with a cycle detector) actually work, and why each pauses differently - Why RAII gives deterministic destruction with no collector, and how `unique_ptr`, `shared_ptr`, and `weak_ptr` encode ownership in the type - How Rust's move/borrow/lifetime model gets GC-grade safety at compile time with zero runtime cost — and what it costs the programmer instead - The cross-language idiom for resources that *aren't* memory — files, sockets, locks — and why `with`, `defer`, try-with-resources, RAII, and `Drop` are the same idea wearing six different hats - How to read a latency profile and decide whether 
your memory strategy is the bottleneck — and whether you fix it by tuning, by allocating less, or by changing languages ### Prerequisites - **Software Engineering Overview** — what a process, a runtime, and a build step are; why reproducibility and performance are engineering concerns, not afterthoughts - **C++: Fundamentals** — classes, constructors, destructors, and the RAII lifecycle; several core examples here are in C++ and assume you can read a destructor - Comfort with pointers and references in at least one language: what an address is, what it means for a pointer to dangle, and why a leak is a problem --- ## The stack and the heap Every strategy in this chapter is about the heap, so it pays to be precise about why the stack is the easy case and the heap is the hard one. The **stack** is a region of memory managed as a literal stack: each function call pushes a *frame* holding its local variables and bookkeeping, and returning pops it. Allocation is a single instruction that moves the stack pointer; deallocation is the same instruction in reverse, run automatically when the frame pops. There is no fragmentation, the data is contiguous and hot in cache, and — crucially — the lifetime of every stack object is *exactly* its enclosing scope, enforced by the machine. Nothing decides when to free a stack variable; the call structure decides for you. That rigidity is also the stack's limit. A stack object cannot outlive the function that created it (return a pointer to one and it dangles the moment the function returns — the dangling-reference bug Rust's `greeting()` example catches at compile time). Its size must be known when the frame is laid out. And it can't be shared with another part of the program running on a different schedule. The **heap** exists for exactly these cases: data that must outlive its creating scope, data sized at runtime, data shared across independent lifetimes. 
The heap buys you that flexibility and hands you the bill — because nothing about the call structure tells the runtime when a heap object is finished, *someone or something* must decide. A subtle point that trips up newcomers: a `std::vector`, a Python `list`, a Java `ArrayList`, a Go slice — these are *handles* that live on the stack (or inside another object) while their backing buffer lives on the heap. The small fixed-size handle (a pointer, a length, a capacity) rides the stack and frees itself; the variable-size buffer it points to is the heap allocation someone must manage. This is why moving a vector is cheap: you copy the three-word handle, not the buffer (the deed-transfer idea developed in *C++: Modern C++* and *Rust: Ownership & Borrowing*). The first rule of memory management, true in every language, follows directly: **prefer the stack; reach for the heap only when lifetime or size forces it.** A local value, a small struct passed by value, a fixed-size array — keep them on the stack, where management is free and bug-free. Heap-allocate only when the data genuinely must outlive its scope or its size is unknown. Half of "memory management" is just not putting things on the heap that didn't need to be there. ## Strategy 1: manual management The oldest answer is to make the programmer do it. In C you `malloc` a block and `free` it; in raw C++ you `new` an object and `delete` it. The model is maximally simple and maximally unforgiving: you have total control over exactly when every byte is allocated and freed — no collector, no overhead, no surprises — and total responsibility for getting it right, with the compiler offering no help at all. The trouble is that a raw pointer carries an address and nothing else. A `Session*` tells you *where* the session is; it says nothing about *whose job it is to free it*. That missing information is where the three classic bugs breed. 
The **double-free** is two parties each believing they own the block, each calling `free` — the introduction's crash. The **use-after-free** is one party freeing while another still holds the pointer and dereferences it later. The **leak** is everyone assuming someone else will free it, so nobody does. All three are the same disease: ownership is a fact that lives in a human's head or a comment, never in the type, so the compiler cannot check it. Here is manual C, with the contract that the compiler will not enforce written out as a comment — which is to say, not enforced at all: ```c // Caller OWNS the returned buffer and MUST free() it exactly once. // Nothing in the type says so; this comment is the only contract. char *read_line(FILE *f) { char *buf = malloc(256); // who frees this? the caller, by convention if (!buf) return NULL; // ... and on this error path, nobody allocated, so OK if (!fgets(buf, 256, f)) { free(buf); // must remember to free on THIS path too return NULL; } return buf; // ownership transferred to caller — by comment only } ``` Every early return is a place to leak; every shared pointer is a place to double-free; every stored pointer is a place to use-after-free once the owner frees it. Manual management is not *wrong* — it is the foundation everything else is built on, and the only option when there is no runtime to lean on (kernels, bootloaders, tiny embedded targets). But as the default strategy for application code it has been comprehensively retired, because the bug it invites is the worst kind: silent, intermittent, and discovered far from its cause. The next three strategies are three different ways to take the question off the programmer's hands. ## Strategy 2: RAII and deterministic destruction C++'s answer was not to add a garbage collector but to make the *stack* manage the *heap*. 
The mechanism is **RAII** — Resource Acquisition Is Initialization, the pattern from *C++: Fundamentals* — and the idea is exact: wrap every heap allocation in a stack object whose constructor acquires it and whose destructor releases it. Because the stack frees its objects automatically and deterministically at scope exit, the heap allocation inside rides along, freed at a moment you can point to in the source — on every exit path, including an exception unwinding the stack. No collector, no pause, no manual `delete`. The standard library packages this as **smart pointers**, and their real contribution is encoding *ownership in the type* so the compiler reads and enforces it. The full smart-pointer and move-semantics walkthrough is the subject of *C++: Modern C++*; here is the model you need to compare across languages. There are three kinds, and the comparison is the whole story: | Smart pointer | Ownership | Cost | Use when | |---|---|---|---| | `unique_ptr<T>` | Exactly one owner | Zero overhead (just the pointer) | The default — one clear owner | | `shared_ptr<T>` | Several co-owners | An *atomic* refcount per copy/destroy | Ownership is *genuinely* shared | | `weak_ptr<T>` | Observes, owns nothing | Cheap; must `lock()` to use | Break cycles; watch without pinning alive | A `unique_ptr` is single ownership: when it dies, the object dies, at no cost beyond the pointer itself. It cannot be copied — that would mint a second owner and the double-free returns — so you *move* it to transfer ownership, handing over the deed and leaving the old holder empty. Reach for it first and stay there: ```cpp // unique_ptr: one owner, freed deterministically when the owner's scope exits. 
std::unique_ptr<Session> open_session(int id) { return std::make_unique<Session>(id); // no raw new, no matching delete to forget } // caller's unique_ptr frees it — exactly once ``` A `shared_ptr` is for genuinely plural ownership — a cache touched by several threads, a document open in several views. Co-owners share a reference count; the object is freed when the last owner drops and the count hits zero. The cost is real: every copy and destruction is an *atomic* increment or decrement (the count may be touched concurrently), so defaulting to `shared_ptr` "to be safe" sprinkles atomic traffic across your program. A `weak_ptr` observes without owning — it adds nothing to the strong count — which is how you watch an object without keeping it alive and, critically, how you break the **reference cycle** that two mutual `shared_ptr`s create and leak forever. *C++: Modern C++* tells the war story of the gateway whose connections and registry held each other alive for days until one edge became `weak`. The punchline, and the reason RAII is more than a local trick: hold your resources in smart members and you write *no* destructor, copy, or move at all — the compiler-generated ones just forward to members that already free themselves correctly (the *rule of zero*). RAII isn't a thing you do per allocation; it's a property that propagates through every type built from self-managing parts. The trade against a GC is stark and deliberate: **deterministic, pause-free freeing at a known line, paid for by stating ownership yourself** — with a stray raw pointer or a missed ownership call still able to slip past the compiler as a runtime bug. ## Strategy 3: garbage collection The third strategy removes the question from your code entirely: let a runtime find and free unreachable objects automatically. You allocate and never free; the **garbage collector** periodically determines what is still reachable and reclaims the rest. 
The guarantee is enormous — no use-after-free, no double-free, no leak of reachable objects, no ownership to reason about — and it is why GC languages dominate application development. The cost is equally real and is the subject of the introduction's 02:14 page: you lose control over *when* memory is freed, and the collector does work, sometimes pausing your program to do it. There are two broad families. **Tracing collectors** (Java, Go, JavaScript, and Python's backstop) start from a set of *roots* — stack variables, globals, registers — and walk every pointer reachable from them, marking each object they reach as live. Anything unmarked at the end is unreachable and its memory is reclaimed. The genius and the curse are the same: reachability is the exact definition of "still needed," so tracing never frees something in use and never leaks something dead — but computing it means touching the live object graph, which competes with your program for the CPU and, in the simplest designs, requires stopping the program while it runs (a *stop-the-world* pause). Modern collectors fight that pause with two ideas. The **generational hypothesis** observes that most objects die young, so the heap is split into a small *young* generation collected often and cheaply and an *old* generation collected rarely; the request path's flood of short-lived temporaries is swept in fast minor collections that never touch the long-lived data. And **concurrency** lets the collector do most of its marking *while the program runs*, shrinking the stop-the-world window to the few phases that genuinely need exclusivity. **Reference counting** (Python's primary mechanism, and `shared_ptr` under the hood) takes the opposite approach: each object carries a count of how many references point to it, incremented on every new reference and decremented when one goes away, and the object is freed the instant its count hits zero. 
This is pleasantly *prompt* — memory is reclaimed the moment the last reference drops, not at some future sweep — but it has two costs. Maintaining the count adds work to every assignment, and it cannot reclaim a **cycle**: two objects referring to each other keep each other's count above zero forever even when nothing outside reaches them. Python therefore pairs its refcount with a periodic cycle detector (a tracing collector in miniature) to catch exactly the case refcounting misses — the same cycle problem C++'s `weak_ptr` solves by hand.

The languages differ in where they sit on the throughput-versus-latency axis, and it is worth seeing them side by side:

| Runtime | Collector | Optimized for | Characteristic cost |
|---|---|---|---|
| **Java** | Generational tracing; G1 default, ZGC/Shenandoah low-pause | Throughput on huge heaps | Tunable; ZGC holds pauses to single-digit ms on multi-TB heaps |
| **Go** | Concurrent tri-color mark-sweep, non-generational | Low, predictable pause | Sub-millisecond pauses; more total CPU spent collecting |
| **Python** | Reference counting + cycle detector | Prompt reclamation, simplicity | Per-op refcount work; the GIL serializes it |
| **JavaScript (V8)** | Generational tracing (scavenge + mark-compact) | Browser/server responsiveness | Incremental/concurrent to keep the UI thread free |

The design philosophies diverge sharply. Java's collectors are throughput champions tuned over decades: a long-running JVM with a well-sized heap can sustain enormous allocation rates, and ZGC pushes worst-case pauses below ten milliseconds even on terabyte heaps — at the cost of dozens of tuning knobs and a memory footprint that runs larger than the live set.
Go made the opposite bet: a deliberately simple, *non-generational* concurrent collector tuned for consistently tiny pauses, accepting that it spends more total CPU collecting and offers fewer knobs, because Go's target — network services where tail latency is king — wants predictability over peak throughput. Python's refcounting reclaims promptly and is easy to reason about, but the count maintenance and the global interpreter lock that serializes it are part of why CPython is not the tool for CPU-bound parallel work. The illustrative point across all of them: you allocate, you never free, and the cost shows up not in your code but in your *latency profile*.

```python
# Python: you never free. The refcount drops to zero when `data` goes out of
# scope and the list is reclaimed promptly — UNLESS it's part of a reference
# cycle, which the periodic cycle detector cleans up later.
def process(records: list[dict]) -> int:
    data = [transform(r) for r in records]   # allocates; no free() anywhere
    return len(data)
# `data` becomes unreachable on return; refcount → 0; memory reclaimed.
```

```go
// Go: same deal — allocate, never free. The escape analyzer decides whether
// `buf` can live on the stack (freed for free at return) or must escape to the
// heap (reclaimed later by the concurrent collector). You influence which by
// how you use it; you never call free.
func render(n int) []byte {
	buf := make([]byte, n) // may escape to the heap because it's returned
	return buf             // collector reclaims it once nothing references it
}
```

::: {.callout-warning}
## War story: the 02:14 page and the allocation that wasn't free

The trading engine from the introduction was breaching its 200ms p99 SLA with second-long stalls a couple of times a minute. Every stall lined up with a major GC event. The team's first instinct was to blame the collector and tune it — bigger heap, different collector, more parallel GC threads — and each change moved the pauses around without removing them.
The actual fix came from a flame graph of *allocations*, not GC: the validation step parsed each request into a tree of small short-lived objects, allocating tens of thousands of them per request and discarding them all immediately. The collector wasn't slow; it was drowning. The young generation filled in seconds, forcing constant minor collections, and the churn occasionally promoted junk into the old generation and triggered the expensive major collection that caused the visible stall.

The real fix was to *allocate less*: reuse a parser buffer across requests, replace the object tree with a flat parse into primitives, and pool the few objects that did need to persist. Allocation rate dropped by 30x, the young generation stopped thrashing, major collections became rare, and the p99 fell under budget — without touching a single GC flag. The lesson the team wrote down: **in a GC language, the lever is usually how much garbage you make, not how the collector cleans it. "The GC is pausing" almost always means "you are allocating too much."**
:::

## Strategy 4: compile-time ownership and borrowing

The fourth strategy is the youngest and the most ambitious: get a GC's safety with RAII's determinism and *no runtime collector at all*, by having the compiler prove memory safety before the program runs. This is Rust's model, and the deep mechanics — the borrow checker, non-lexical lifetimes, interior mutability — are the subject of *Rust: Ownership & Borrowing*. The comparative essence is three rules the compiler enforces:

1. **Every value has exactly one owner**, and when the owner leaves scope the value is dropped — its `Drop` runs, its memory is freed — automatically and exactly once. This is RAII, except the compiler *enforces* it rather than trusting you. One owner means never two (no double-free) and never zero (no leak by construction).
2. **You either move ownership or borrow it.** Moving hands over the single deed and invalidates the old binding — touch it afterward and the program won't compile, so use-after-move is a build error. Borrowing lends temporary access without transferring ownership.
3. **Borrows obey shared-XOR-mutable**: any number of shared read-only references, *or* exactly one mutable reference, never both at once. Aliasing-plus-mutation is the root of data races and use-after-free, and forbidding the combination eliminates both families of bug at compile time.

The result is the most striking trade in the chapter. The same C++ ownership *concepts* — `unique_ptr` as a move-only single owner, `shared_ptr` as reference counting, RAII tying cleanup to scope — exist in Rust, but the compiler *checks* them instead of trusting you. A stray copy that would create a second owner, a reference held past its data's death, a share across threads without synchronization: all compile errors, not runtime hazards.

```rust
// The owner of `s` is the binding; a value is dropped (freed) when its owner
// leaves scope. No free(), no collector: the compiler inserts the drop at
// exactly one point.
fn consume(_s: String) {}              // assumed helper: takes ownership, drops `_s` on return

fn process() {
    let s = String::from("hello");     // owns a heap buffer
    let n = s.len();                   // borrow `s` (shared, read-only); no ownership moves
    consume(s);                        // ownership MOVES into consume; `s` is now dead here
    // println!("{s}");                // error[E0382]: borrow of moved value: `s`
    println!("{n}");                   // fine: n is a plain Copy value
}   // if `s` still owned a value, it would drop here
```

What it costs is not runtime — there is *zero* runtime overhead, the checks evaporate after compilation — but the programmer's time and the compiler's strictness. Code that aliases and mutates, or holds a reference too long, won't build, and learning to restructure for the borrow checker is the famous Rust learning curve.
When a design genuinely needs shared mutation, Rust provides escape valves with explicit costs — `Rc`/`Arc` for shared ownership (reference counting, exactly like `shared_ptr`), `RefCell`/`Mutex` for runtime-checked mutation — each trading a compile-time guarantee for a runtime cost you now manage, as *Rust: Ownership & Borrowing* details. The bet is that paying the cost once, at compile time, in the programmer's head, is cheaper over a system's life than paying it forever in runtime pauses (the GC trade) or in 3 a.m. crashes (the manual trade).

![Four answers to who frees memory: manual free, RAII destructors, garbage collection, and compile-time ownership — arranged by when the decision is made (compile time versus run time) and what it costs (determinism versus throughput versus safety).](../assets/diagrams/rendered/memory_management.svg){#fig-memory-management .lightbox}

## The determinism versus throughput trade-off

Lay the four strategies on a single axis and the trade-offs snap into focus. The axis is *when the decision to free is made*, and everything else follows from it.

Manual and RAII free at a **deterministic, known point** — manual when you call `delete`, RAII when the scope exits. You can point at the line. This is what latency-critical systems need: a destructor that runs in a bounded few nanoseconds, every time, with no possibility of a collector deciding to walk the heap mid-request. The price of determinism is that *someone decided ownership* — you, in manual; you-but-checked, in RAII and ownership.

Garbage collection frees at a **nondeterministic, runtime-chosen point**, and that is both its gift and its tax. The gift is that you reason about *nothing* — no ownership, no lifetimes, no destructors.
The tax is the pause and the throughput overhead: the collector needs spare CPU and spare memory (a heap noticeably larger than the live set) to run efficiently, and even the best low-pause collectors steal *some* cycles and impose *some* worst-case stall. For throughput-oriented batch work this is a great deal — a few percent overhead for total safety and faster development. For a hard real-time audio callback that must finish in 5ms or the speaker clicks, any unpredictable pause is disqualifying.

Compile-time ownership is the one that appears to cheat the trade: deterministic *and* safe *and* zero runtime cost. The catch is that the cost didn't vanish, it *moved* — from runtime to compile time, from the machine to the programmer. You pay it in a stricter compiler and a steeper learning curve, up front, once. Whether that's a bargain depends on the system's lifetime: amortized over years of a long-lived, latency-critical service it is cheap; spent on a script that runs once it is absurd.

The table makes the four-way comparison concrete:

| Property | Manual (C/C++) | RAII (C++) | GC (Java/Go/Py/JS) | Ownership (Rust) |
|---|---|---|---|---|
| **When freed** | When you call `free` | Deterministically, at scope exit | Nondeterministically, at runtime | Deterministically, at scope exit |
| **Who decides** | Programmer | Programmer (compiler inserts) | Runtime | Compiler (you state ownership) |
| **Runtime cost** | None | None | Collector CPU + pauses + memory | None |
| **Memory safety** | None — all bugs possible | Mostly (raw pointers leak through) | Full | Full |
| **Worst-case latency** | Predictable | Predictable | Pause-prone | Predictable |
| **Cost paid by** | Programmer (and prod) | Programmer | The running program | Programmer (compile time) |

There is no universally best column.
The 02:14 page was a GC system whose *workload* didn't fit GC's worst case; the daily-crash daemon was a manual system whose *complexity* exceeded what manual discipline can sustain. The skill is reading which trade your workload can afford.

## Resource management beyond memory

Memory is the loudest resource, but it is not the only one — and the most elegant insight in this whole area is that **the same mechanism that frees memory frees everything else**. A file handle, a network socket, a mutex lock, a database connection: each must be released exactly once, on every exit path including the error paths, or you leak file descriptors, deadlock on a lock you forgot to release, or exhaust a connection pool. This is the ownership question again, wearing a different resource, and every language answers it with the same idiom it uses for memory — generalized to "do this cleanup when control leaves this region."

C++ does it with **RAII** directly: the lock is a stack object, its destructor releases on scope exit. Rust does it with **`Drop`**, the same trait that frees memory: implement `Drop` and your cleanup runs at the owner's scope exit, even on a panic. These two are the purest form — cleanup is welded to the object's lifetime, with nothing for the programmer to remember.

The GC languages can't use that mechanism, because *they have no deterministic destructor* — the collector runs whenever it likes, so you cannot tie a socket's close to "when this object is collected" (a *finalizer*, the GC equivalent, is notoriously unreliable precisely because its timing is undefined). So they bolt on an explicit scope construct: Python's **`with`**, Java's **try-with-resources**, C#'s **`using`**, and Go's **`defer`**, which schedules cleanup to run when the function returns.
They are syntactically different and semantically identical: *register cleanup, guarantee it runs when this region ends.* Here is the same "acquire, use, guarantee release" shape in five of the six (C#'s `using` reads much like Java's try-with-resources):

```cpp
// C++ — RAII: the lock_guard's destructor releases the mutex at scope exit,
// even if use() throws. Nothing to remember; the type does it.
#include <mutex>

std::mutex mtx;   // assumed here: the mutex guarding the shared state

void critical() {
    std::lock_guard<std::mutex> guard(mtx);   // acquires here
    use_shared_state();
}   // releases here, on every path
```

```rust
// Rust — Drop: the MutexGuard releases when it leaves scope, even on panic.
// Same mechanism that frees memory, applied to a lock.
use std::sync::Mutex;

fn critical(m: &Mutex<State>) {
    let guard = m.lock().unwrap();   // acquires; guard owns the lock
    use_shared_state(&guard);
}   // guard dropped here → lock released
```

```python
# Python — `with`: __exit__ runs on block exit, exception or not. The context
# manager is the GC-language stand-in for a deterministic destructor.
def critical(path: str) -> str:
    with open(path) as f:    # __enter__ acquires
        return f.read()      # __exit__ closes f, even on this early return
```

```java
// Java — try-with-resources: close() is called automatically at block exit,
// in reverse order of acquisition, on every path including exceptions.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

String critical(Path path) throws IOException {
    try (var reader = Files.newBufferedReader(path)) {   // acquires
        return reader.readLine();
    }   // close() called here
}
```

```go
// Go — defer: schedules f.Close() to run when the function returns, no matter
// which path it takes. The cleanup sits right next to the acquisition.
func critical(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close() // runs at function return, on every path
	return readAll(f)
}
```

The comparison table makes the family resemblance explicit:

| Language | Idiom | When cleanup runs | Tied to |
|---|---|---|---|
| C++ | RAII destructor | Scope exit (deterministic) | Object lifetime |
| Rust | `Drop` | Scope exit (deterministic) | Object lifetime |
| Python | `with` (context manager) | Block exit | Lexical block |
| Java | try-with-resources | Block exit | Lexical block |
| C# | `using` | Block exit | Lexical block |
| Go | `defer` | Function return | Function call |

Two differences are worth internalizing. First, C++ and Rust tie cleanup to the *object*, so it composes automatically — a struct of ten resources cleans up all ten with no code, because each member's destructor/`Drop` runs. The `with`/`defer` family ties cleanup to a *syntactic region*, so you must remember to write the construct for each resource; forget the `with` and the file leaks. Second, Go's `defer` runs at *function* return rather than block exit, which is why a `defer` inside a loop is a classic bug — the cleanups stack up and don't run until the whole function ends, exhausting handles mid-loop.

The deep lesson is the unifying one: **deterministic cleanup is the same problem as deterministic memory freeing, and a language's answer to one is its answer to the other.** That is why GC languages, which gave up deterministic memory freeing, all had to add a *separate* explicit construct for non-memory resources — the one thing their collector cannot do for them.

> **Build it →** Deterministic resource management under real load: the Rust
> [Project 03: High-Performance Cache](https://github.com/jchu0/applied-cs-projects/tree/main/03-high-performance-cache)
> leans on `Drop`, `Arc`, and pooled buffers to manage memory and connections without a GC
> in a latency-critical hot path.


## Knowing when memory is your bottleneck

A closing piece of judgment that ties the strategies to practice: most of the time, memory management is *not* your bottleneck, and the worst thing you can do is optimize it on a hunch. The discipline is to measure first (the *Performance and Profiling* material), then act on what the profile says.

In a **GC language**, watch two numbers: allocation rate and pause time/frequency. If your p99 latency has spikes that line up with GC events, the lever is almost always *allocate less* — pool and reuse objects, avoid per-request object trees, prefer value types and slices over boxed objects — exactly the 02:14 war story. Tuning the collector moves pauses around; reducing allocation removes them. Only after you've cut allocation does collector tuning (heap size, choosing ZGC over G1) earn its keep.

In a **manual or RAII language**, the questions are different: are you heap-allocating things that could live on the stack? Is `shared_ptr`'s atomic refcount showing up hot because you defaulted to it instead of `unique_ptr`? Is `malloc` itself the bottleneck — a simulation churning millions of same-sized objects — in which case an **arena** (bump-allocate, free everything at once by resetting a pointer) or a **pool** (fixed-size slots on a free list) can beat the general-purpose allocator? These are sharp tools for narrow problems; reach for them when a profiler, not intuition, says allocation is hot.

The through-line, whatever the language: the strategy is a property of the workload's latency budget, and you discover whether it's the bottleneck by measuring, never by guessing.
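
A minimal sketch of the arena idea mentioned above, written in Go only to keep the added examples in one language. `Arena`, `NewArena`, `Alloc`, `Used`, and `Reset` are hypothetical names chosen for illustration, and the sketch deliberately ignores alignment, growth, and thread safety, all of which a production arena would have to handle.

```go
package main

import "fmt"

// Arena is a hypothetical bump allocator over one preallocated byte slab.
type Arena struct {
	buf []byte
	off int // next free byte; everything before it is handed out
}

// NewArena makes the single up-front allocation.
func NewArena(size int) *Arena { return &Arena{buf: make([]byte, size)} }

// Alloc hands out the next n bytes by advancing the offset. It returns nil
// when the slab is exhausted (a real arena might grow or chain slabs).
func (a *Arena) Alloc(n int) []byte {
	if n < 0 || a.off+n > len(a.buf) {
		return nil
	}
	// Cap the slice so an append on it cannot spill into a neighbor.
	p := a.buf[a.off : a.off+n : a.off+n]
	a.off += n
	return p
}

// Used reports how many bytes are currently handed out.
func (a *Arena) Used() int { return a.off }

// Reset frees every allocation at once by rewinding the offset.
// Slices handed out earlier must not be used after this call.
func (a *Arena) Reset() { a.off = 0 }

func main() {
	a := NewArena(1024)
	for req := 0; req < 3; req++ {
		hdr := a.Alloc(16)
		body := a.Alloc(256)
		copy(hdr, "request header")
		fmt.Println(req, len(hdr), len(body), a.Used())
		a.Reset() // end of request: one offset reset, no per-object free
	}
}
```

The design choice to notice is that `Alloc` is a bounds check plus an addition and `Reset` is a single store, which is why an arena can beat a general-purpose allocator when many objects share one lifetime. The price is that nothing can be freed individually, so it only fits workloads with a clear "everything dies together" point such as the end of a request or a frame.
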

> **Build it →** When allocation itself is the hot path, custom allocators earn their
> keep: [Project 39: GPU Memory Manager](https://github.com/jchu0/applied-cs-projects/tree/main/39-gpu-memory-manager)
> implements pooling and arena strategies for a resource where the general-purpose
> allocator is far too slow, and
> [Project 14: Network Stack](https://github.com/jchu0/applied-cs-projects/tree/main/14-network-stack)
> manages packet buffers with deterministic, zero-GC lifetimes under line-rate load.

---

## Practical exercise

**Difficulty:** Level I · Level II · Level III

1. **Level I — Answer the ownership question, four ways.** Take a tiny program that manages one heap object across an early-return or error path — read a buffer, use it, release it. Write it four times: manual C with `malloc`/`free`, C++ with `unique_ptr`, a GC language (Python or Go) with no free at all, and Rust with ownership. For each version, write one sentence answering *who frees this object and at exactly which line*. The manual version is the only one where you can't be sure; that uncertainty is the whole point of the other three.
2. **Level II — Make and measure garbage.** In a GC language, write a hot loop that allocates a flood of short-lived objects (an object tree per iteration), and turn on GC logging. Record allocation rate, pause frequency, and p99 latency. Now rewrite the loop to allocate far less — reuse a buffer, flatten the tree to primitives, pool the survivors — and record the same numbers. Report the deltas and explain, from the GC's perspective, *why* allocating less moved the latency more than any collector flag would have.
3. **Level III — Defend a strategy choice for a real workload.** Pick a concrete system — a 5ms-deadline audio callback, a multi-terabyte-heap analytics service, a CRUD web backend, an OS network driver — and argue which of the four strategies (and which language) fits it, in terms of *worst-case* latency, throughput, safety, and team cost.
   Then argue the strongest case for the *wrong* choice and rebut it. The deliverable is the reasoning: a defensible mapping from a workload's latency budget and risk tolerance to a memory-management strategy, with the trade you accepted named explicitly.

## Summary

Every memory-management strategy answers one question — *who frees the heap, and when?* — and each answer trades something for something else. The stack is the easy case: it manages itself, so prefer it and reach for the heap only when lifetime or size forces you. For the heap there are four strategies. **Manual** management gives total control and zero safety, inviting double-free, use-after-free, and leaks because a raw pointer can't say who owns what. **RAII** ties heap allocations to stack objects whose destructors free them deterministically at scope exit, encoding ownership in the type with `unique_ptr`, `shared_ptr`, and `weak_ptr` — pause-free determinism, paid for by stating ownership yourself. **Garbage collection** lets a runtime trace or refcount unreachable objects and free them automatically — total safety and zero ownership reasoning, paid for with nondeterministic pauses and collector overhead. **Compile-time ownership** has the compiler prove safety and insert the free, getting determinism *and* safety at zero runtime cost, paid for in compiler strictness and a learning curve. The same mechanisms generalize beyond memory: RAII, `Drop`, `with`, try-with-resources, `using`, and `defer` are one idea — deterministic cleanup of files, sockets, and locks — and a language's answer for memory is its answer for every resource.

### Key takeaways

- Prefer the stack; the heap is the only place the "who frees this, and when?" question exists, and every strategy is an answer to it.
- The four strategies sort by *when the decision is made*: manual and GC decide at runtime, RAII and ownership at compile time — and the earlier the decision, the more the cost shifts from the running program to the programmer.
- GC buys total safety and zero ownership reasoning at the price of nondeterministic pauses; in a GC language the lever for latency is almost always *allocate less*, not tune the collector.
- RAII (C++) and ownership (Rust) give deterministic, pause-free freeing; Rust adds compile-time safety at zero runtime cost, moving the price to the compiler and the learning curve.
- Resource management beyond memory is the same problem: RAII/`Drop` tie cleanup to object lifetime; `with`/`defer`/try-with-resources/`using` tie it to a scope — and GC languages *need* the explicit construct because their collector can't free resources deterministically.

### Connections to other chapters

- **Software Engineering Overview** (prerequisite): frames performance, safety, and language choice as engineering trade-offs rather than matters of taste — this chapter is that framing applied to the single decision (how memory is reclaimed) that most distinguishes one systems language from another.
- **C++: Fundamentals** and **C++: Modern C++** (field guides): Fundamentals teaches the RAII lifecycle this chapter generalizes; Modern C++ holds the deep smart-pointer and move-semantics walkthrough — the `unique_ptr`/`shared_ptr`/`weak_ptr` mechanics, the rule of zero, and `noexcept` moves — that this chapter summarizes comparatively rather than duplicates.
- **Rust: Fundamentals** and **Rust: Ownership & Borrowing** (field guides): Ownership & Borrowing is the deep version of strategy 4 here — the borrow checker, lifetimes, shared-XOR-mutable, interior mutability, and `Rc`/`Arc` — where this chapter teaches only the comparative model and defers the mechanics. **Rust: Unsafe** shows what you give up when you step outside the checked subset.
- **Go: Fundamentals** (field guide): the concrete home of Go's concurrent low-pause collector, escape analysis, and `defer` — the GC and resource-cleanup story this chapter contrasts against the others.
- **Concurrency and Parallelism Models** (sibling): aliasing-plus-mutation is the root of both use-after-free and data races, so the ownership rules here are the same rules that make `Arc<Mutex<T>>` and Go's "share by communicating" safe — memory management and concurrency safety are two faces of one discipline.
- **Performance and Profiling** (sibling): you decide whether memory is your bottleneck by measuring allocation rate and pause time, not by guessing — the methodology for trusting those numbers, and for telling a GC pause from a slow query, comes from there.
- **GPU and CUDA** and the **GPU Memory Manager** project (extension): GPU memory is a manually managed, non-collected resource where pooling and arenas are mandatory, not optional — the manual and arena strategies here, applied where there is no GC and allocation is the hot path. The **High-Performance Cache** project applies RAII and pooling to a latency-critical service.

## Further reading

### Essential

- Scott Meyers, *Effective Modern C++* — Items 18–22 (smart pointers) and 23–25 (move semantics): the canonical practical treatment of RAII-based ownership.
- *The Rust Programming Language* (Klabnik & Nichols), the ownership chapters — the example-first introduction to compile-time ownership, moves, borrowing, and lifetimes.

### Deep dives

- Richard Jones, Antony Hosking & Eliot Moss, *The Garbage Collection Handbook* — the definitive reference on tracing, generational, concurrent, and reference-counting collectors and the trade-offs between them.
- *Getting Started with the Z Garbage Collector* and the Go runtime's GC design docs — two modern, opposite bets on the latency-versus-throughput axis, documented by their designers.
- Jung et al., *RustBelt: Securing the Foundations of the Rust Programming Language* (POPL 2018) — the formal proof that compile-time ownership actually delivers the safety it claims.

### Historical context

- John McCarthy, *"Recursive Functions of Symbolic Expressions"* (1960) — introduces garbage collection alongside Lisp, the origin of the whole tracing tradition.
- Bjarne Stroustrup, *The C++ Programming Language* — RAII and deterministic destruction from the designer who built the language around them, the deliberate road *not* taken toward a collector.
- *Cyclone: A Safe Dialect of C* (Jim et al., 2002) — the region-based and linear-type research lineage that compile-time ownership descends from, showing the idea predates Rust by decades.