Go: Web Services & gRPC

Keywords

go, grpc, protobuf, gin, echo, net/http, microservices, http handlers, middleware, production go, service

Introduction

Two teams shared a JSON contract for a User object, the way teams always do: a Confluence page, a couple of example payloads, and a tacit agreement that the producer would keep its end of the bargain. For a year, it held. Then someone on the producer team renamed created_at to createdAt to match a new style guide, shipped it on a Tuesday, and went home. Nothing broke in their tests — their service emitted valid JSON, and their tests checked valid JSON. The consumer’s deserializer, reading a field that no longer existed, quietly filled in the zero value: every user in the downstream billing report was now created at the Unix epoch, January 1st 1970. Invoices went out dated fifty-four years in the past. The bug had been committed on Tuesday and discovered by an angry customer on Thursday, and in between, every test on both sides was green. Nothing in the toolchain had any way to know the two services no longer agreed on what a User was, because the contract lived in prose, not in code.

This is the failure mode hand-rolled JSON invites. A REST/JSON boundary between services is a contract the compiler cannot see: the producer’s struct and the consumer’s struct are two independent declarations that happen to line up, until one day they don’t. The mismatch then surfaces not at build time but in production, as silently corrupted data rather than a clean crash. The fix is not “test harder.” It is to make the contract a thing the build system can check: a single typed schema that generates both sides’ types, so renaming a field is a compile error in every service that touches it. That schema language is protobuf, the RPC framework that carries it between services is gRPC, and the language that made this pairing the default for backend infrastructure is Go.
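The silent zero-fill is easy to reproduce. The sketch below is illustrative only: the ConsumerUser type, the payload, and the two helper functions are assumptions, not code from the incident. It shows that encoding/json drops an unrecognized key without complaint, and that Decoder.DisallowUnknownFields can at least turn the drift into a runtime error (still not a build-time one).

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// ConsumerUser is a hypothetical consumer-side struct that still expects created_at.
type ConsumerUser struct {
	ID        string `json:"id"`
	CreatedAt int64  `json:"created_at"` // Unix seconds
}

// decodeLenient mirrors the default behavior: unknown keys are ignored
// and missing fields keep their zero value.
func decodeLenient(payload []byte) (ConsumerUser, error) {
	var u ConsumerUser
	err := json.Unmarshal(payload, &u)
	return u, err
}

// decodeStrict rejects payloads containing keys the struct does not declare.
func decodeStrict(payload []byte) (ConsumerUser, error) {
	var u ConsumerUser
	dec := json.NewDecoder(bytes.NewReader(payload))
	dec.DisallowUnknownFields()
	err := dec.Decode(&u)
	return u, err
}

func main() {
	// The producer has renamed created_at to createdAt.
	payload := []byte(`{"id":"u1","createdAt":1718000000}`)

	u, err := decodeLenient(payload)
	fmt.Println(u.CreatedAt, err) // 0 <nil>: the timestamp is silently the epoch

	_, err = decodeStrict(payload)
	fmt.Println(err != nil) // true: the rename is now at least a visible error
}
```

Strict decoding is a mitigation on the consumer side only; it does nothing to tell the producer that a rename will break someone.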

Go is the cloud-service language for three reasons that compound. Concurrency is cheap — a goroutine per request costs about two kilobytes, so the standard concurrency model is the server model, with no async/await ceremony. Deployment is trivial — the output is a single static binary that runs anywhere with no interpreter and no runtime to install. And for the service-to-service calls that make up the interior of any real system, gRPC plus protobuf give you a typed, versioned contract compiled into both ends, eliminating the entire class of drift bugs that the billing team learned about the hard way. This chapter is about building Go services that take all three seriously.

The Core Insight

The insight is that Go was built, from the language up, for network services, and the two things that make a service operable — handling many concurrent requests and deploying without ceremony — are defaults rather than achievements.

Consider what a web server must do: accept thousands of simultaneous connections, each mostly blocked on a database query or a downstream call, and keep the CPU busy serving others meanwhile. The classic answers are bad in opposite ways. A thread per request is simple but expensive — an OS thread costs a megabyte of stack and a real context switch, so a Java servlet container caps out around a couple hundred threads. An event loop is cheap but viral — it scales well and infects the codebase with callbacks and the async/await coloring problem, where one blocking call anywhere poisons the chain. Go refuses the trade. A goroutine is a function scheduled by the Go runtime, not the kernel; it starts with a ~2 KB stack that grows on demand, and the runtime multiplexes hundreds of thousands of them onto a small pool of OS threads. When a goroutine blocks on I/O, the runtime parks it and runs another on the same thread. So you write a handler in plain, blocking, top-to-bottom code — call the database, wait for the row, return it — and get event-loop concurrency for free, because net/http already runs every request in its own goroutine. The concurrency model is the server model. There is nothing to bolt on.

The second half of the insight is the contract. Inside a system, services call each other constantly, and every one of those calls is a place where two codebases must agree on a data shape. Hand-rolled JSON makes that agreement invisible to the build. Protobuf makes it a compiled artifact: you write the shape once in a .proto file, and protoc generates strongly-typed Go structs and a typed client and server for both sides. Change the schema, regenerate, and the mismatch is a red compiler error in every service before anything ships. The contract is the code.

A mental model

Picture a Go service as a fan of goroutines spreading out from a single listener. The listener accepts a connection, spawns a goroutine, and the goroutine runs your handler start to finish — decode, do the work, encode, done — then evaporates. Ten thousand requests in flight are ten thousand goroutines, each a cheap green thread the runtime juggles; there is no pool to size, no executor to tune, no callback to register. That fan-out is the whole runtime picture of an HTTP server in Go.

Now picture the contract. The .proto file is the single source of truth — one document that describes the service’s methods and message types. Run it through the generator and it splits into two halves that fit together like a key and a lock: a typed client stub that the caller compiles in, and a typed server stub that the callee implements. Neither side hand-writes the wire format; both are generated from the same file. When the contract changes, both halves change together, and the compiler enforces that they still match. Figure 18.1 shows both ideas at once — the goroutine fan-out on the edge, the one-contract-two-stubs symmetry in the interior.

REST vs gRPC

The two protocols are not competitors so much as tools for different edges of the same system, and choosing between them is mostly about who is calling.

Reach for REST over JSON at the public edge and for anything a browser or a third-party touches. JSON is human-readable, debuggable with curl, cacheable by ordinary HTTP infrastructure, and supported by every client on earth without code generation. A public API, a mobile backend, a webhook, a simple CRUD endpoint — these want REST. The loose, text-based contract that bites you between your own services is a feature when the caller is the whole internet and you can’t recompile them.

Reach for gRPC with protobuf for internal, service-to-service traffic, especially where throughput and latency matter. It rides HTTP/2, serializes to compact binary instead of verbose text, supports streaming in both directions, and (the headline) gives you a typed contract that catches drift at build time. The cost is that it is not browser-native (browsers need the gRPC-Web variant, usually behind a translating proxy such as Envoy) and it is harder to poke at by hand. That cost is irrelevant inside a cluster where both ends are services you control and recompile together.

The common production shape, then, is both: gRPC for the east-west calls between internal services, fronted by a gateway that speaks REST/JSON north-south to the outside world. This chapter is deliberately narrow on the general distributed-systems theory — service decomposition, resilience, circuit breakers, the gateway pattern — all of which is developed in Python: Microservices and applies here unchanged. Our focus is the Go-specific machinery: how you build the HTTP edge, how you define and generate the gRPC contract, and what makes the resulting binary operable in production.

What you’ll learn

  • Why net/http is production-grade on its own, and what Gin and Echo actually add on top of it
  • How the http.Handler interface makes routers, frameworks, and middleware all the same shape — composable wrappers around one method
  • How to compose a middleware chain (recovery, logging, auth) and why order matters
  • How to define a service contract in protobuf, generate typed client and server stubs, and implement unary and streaming RPCs
  • What a typed contract buys you that hand-rolled JSON cannot, and where each belongs
  • How to make a Go service operable: graceful shutdown, context deadlines propagated end-to-end, health checks, and the static-binary deploy
  • Where to hook in structured logging, metrics, and tracing without coupling them to your business logic

Prerequisites

  • Go: Fundamentals — structs, interfaces, methods, and idiomatic error handling. The http.Handler interface and the repository pattern lean hard on Go’s interface model.
  • Concurrency and Parallelism Models — goroutines, channels, and the context package (the Go material in that chapter). This is the load-bearing prerequisite: the whole server model rests on goroutine-per-request, and graceful shutdown, deadlines, and cancellation are context in disguise.
  • Working comfort with HTTP semantics (methods, status codes, headers) and JSON.

HTTP in Go: the standard library is the framework

Most ecosystems teach you the web framework first and the standard library never. Go inverts this, and for once the convention is right: net/http is not a toy you graduate from but a production HTTP server you can ship as-is. It handles keep-alive, HTTP/2, TLS, graceful shutdown, and — the part that matters most — it already runs every request in its own goroutine. You do not opt into concurrency; you get it by writing a handler.

The entire server abstraction is one interface, and internalizing it is most of understanding Go web code:

// The whole contract. A handler is anything that can serve one request.
type Handler interface {
    ServeHTTP(w http.ResponseWriter, r *http.Request)
}

Everything composes around that single method. A function with the right signature becomes a handler via http.HandlerFunc; a router (http.ServeMux) is a handler that dispatches to other handlers by path; a middleware is a handler that wraps another handler. Routers, frameworks, and middleware are all just Handler values wearing different hats — which is why you can mix and match them freely. A minimal but real handler reads the request and writes a typed response, and because it runs in its own goroutine, a blocking call inside it blocks only this request:

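// Each call to this handler runs on its own goroutine; a slow DB query here
// parks this goroutine and frees the thread for other requests.
// Assumes imports: encoding/json, errors, io, net/http.
func (h *UserHandler) getByID(w http.ResponseWriter, r *http.Request) {
    id := r.PathValue("id") // Go 1.22+ ServeMux supports path parameters natively
    user, err := h.store.GetByID(r.Context(), id)
    w.Header().Set("Content-Type", "application/json") // headers must precede the body
    switch {
    case errors.Is(err, ErrNotFound):
        w.WriteHeader(http.StatusNotFound)
        io.WriteString(w, `{"error":"user not found"}`)
        return
    case err != nil: // any other failure must not fall through to the encoder
        w.WriteHeader(http.StatusInternalServerError)
        io.WriteString(w, `{"error":"internal error"}`)
        return
    }
    json.NewEncoder(w).Encode(user)
}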

So where do Gin and Echo come in? Not for power, but for ergonomics. The standard ServeMux got real routing in Go 1.22 (method matching, /users/{id} path parameters), but it still leaves you to hand-roll request binding, validation, and the small repetitive things a CRUD handler does a hundred times. Gin and Echo are thin layers that add a Context object bundling the request and response, declarative JSON binding with validation tags, route groups, and a built-in middleware mechanism. Gin’s handler does in a few lines what takes noticeably more hand-written code with the standard library:

// Gin: ShouldBindJSON decodes AND validates against the struct's binding tags
// in one call; a bad payload becomes a 400 without hand-written checks.
func (h *UserHandler) Create(c *gin.Context) {
    var req CreateUserRequest // fields tagged `binding:"required,email"` etc.
    if err := c.ShouldBindJSON(&req); err != nil {
        c.JSON(http.StatusBadRequest, gin.H{"error": err.Error()})
        return
    }
    // Assumes the (hypothetical) service returns an error alongside the user.
    user, err := h.service.Create(c.Request.Context(), req)
    if err != nil {
        c.JSON(http.StatusInternalServerError, gin.H{"error": "could not create user"})
        return
    }
    c.JSON(http.StatusCreated, user)
}

The decision is unglamorous. For a handful of endpoints, a serverless function, or a service where you want zero dependencies and total control, net/http is genuinely enough and starts the fastest. For a real API with many endpoints, validation, and versioned route groups, reach for Gin (the default: biggest ecosystem, excellent docs) or Echo (very similar, slightly more flexible binding). Either way net/http is still underneath: the Gin engine and the Echo instance both implement ServeHTTP, so each is itself an http.Handler served by the standard server, and the framework is a convenience, not a different universe. Pick one and move on; the choice matters far less than the patterns layered on top.

Middleware: cross-cutting concerns as composed handlers

Every service needs the same handful of things on every request: recover from panics so one bad handler doesn’t crash the process, log the request and its latency, authenticate the caller, maybe rate-limit. These are cross-cutting concerns — they don’t belong to any one handler, and copy-pasting them into all of them is how you forget one. Middleware is the answer, and in Go it falls directly out of the http.Handler insight: a middleware is a function that takes a handler and returns a handler, doing something before and after it delegates.

// A logging middleware: a handler that wraps another handler.
func RequestLog(logger *slog.Logger) func(http.Handler) http.Handler {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            start := time.Now()
            next.ServeHTTP(w, r) // run the rest of the chain
            logger.Info("request",
                "method", r.Method, "path", r.URL.Path,
                "duration", time.Since(start))
        })
    }
}

Because each middleware wraps the next, you build the request pipeline by composing them, and the composition order is the execution order. The shape to internalize is an onion: the request travels inward through each layer to reach your handler, and the response travels back outward through the same layers in reverse. Recovery goes outermost so it can catch a panic from anything inside it; logging goes next so it times the whole chain; auth goes last among the cross-cutters so an unauthenticated request is rejected before it reaches expensive business logic.

// Outermost runs first on the way in, last on the way out.
// recover( log( auth( router ) ) )
handler := Recover(logger)(RequestLog(logger)(Auth(cfg)(router)))

Gin and Echo dress this up with a Use() method and a c.Next() call, but the mechanism is identical: an ordered chain of wrappers, each free to short-circuit (return a 401 and not call next) or to enrich the request context for handlers downstream (stash the authenticated user ID so the handler can read it). One discipline is non-negotiable: when a middleware rejects a request, it must stop the chain — call c.Abort() in Gin, or simply return without calling next in standard Go. Forget it, and an auth middleware that writes a 401 also runs the handler, serving protected data to an unauthenticated caller after telling them no.

gRPC and protobuf: the typed contract

Here is the centerpiece, and the thing that makes Go the default for service interiors. The billing disaster from the introduction happened because the contract between two services lived in prose. gRPC moves it into a compiled file. You describe the service once — its methods, its message types, the field numbers that make the wire format forward-compatible — in a .proto file:

// user.proto — the single source of truth for both client and server.
syntax = "proto3";
package user.v1;
option go_package = "github.com/yourorg/svc/gen/user/v1";

service UserService {
  rpc GetUser(GetUserRequest) returns (User);
  rpc ListUsers(ListUsersRequest) returns (stream User); // server streaming
}

message User {
  string id = 1;           // field numbers, not names, define the wire format
  string name = 2;
  string email = 3;
}
message GetUserRequest  { string id = 1; }
message ListUsersRequest { int32 page_size = 1; }

That file is not documentation; it is source. Run it through the generator and it produces, for both sides, the Go structs for every message, a typed client whose methods are the RPCs, and a server interface you implement:

# Generates typed Go: message structs, a UserServiceClient, and an
# UnimplementedUserServiceServer to embed. Both halves from one .proto.
# Requires the protoc-gen-go and protoc-gen-go-grpc plugins on your PATH.
protoc --go_out=. --go-grpc_out=. user.proto

The payoff is exactly the bug the billing team couldn’t catch. Rename email to email_address in the .proto, regenerate, and every call site that referenced the old field becomes a compile error, on the producer and on every consumer that regenerates, before a single byte ships. The field numbers (the = 1, = 2), not the names, are what actually travel on the wire, so you can add new fields safely and old clients ignore them. As long as you never change or reuse an existing field number, that gives you forward and backward compatibility by construction. The contract is typed, versioned, and enforced by the build. Implementing the server is then just satisfying the generated interface, returning typed messages and gRPC status codes instead of HTTP integers:

// The generated interface is the contract; this is the only place you write logic.
func (s *userServer) GetUser(ctx context.Context, req *pb.GetUserRequest) (*pb.User, error) {
    u, err := s.store.GetByID(ctx, req.Id)
    if errors.Is(err, ErrNotFound) {
        return nil, status.Errorf(codes.NotFound, "user %s not found", req.Id)
    }
    if err != nil { // any other failure: never dereference u after a failed lookup
        return nil, status.Errorf(codes.Internal, "get user: %v", err)
    }
    return &pb.User{Id: u.ID, Name: u.Name, Email: u.Email}, nil
}

gRPC also gives you streaming as a first-class citizen, which plain request/response HTTP makes awkward. The stream User return type above means the server can push a sequence of messages over one long-lived call (paginating a large result set, tailing a log, feeding a live dashboard), and gRPC supports client-side and bidirectional streaming too, all multiplexed over a single HTTP/2 connection. The wrapping pattern of HTTP middleware carries over as interceptors: a unary interceptor wraps every unary RPC for logging, auth, or metrics, exactly as middleware wraps every HTTP handler, and a stream interceptor does the same for streaming RPCs.

Build it → The repo’s Go system puts all of this together. Project 02: Microservice Platform runs auth, billing, users, and notifications as separate Go services talking over gRPC with protobuf IDLs, fronted by a Kong API gateway — the REST-at-the-edge, gRPC-in-the-interior shape, at production scale: 06-real-world-projects/02-microservice-platform.

Production patterns: making the service operable

A service that compiles is not a service that survives a deploy. The patterns below are what separate a demo from something you can put on-call rotation behind, and in Go they all turn out to be the same few primitives — context and a goroutine — applied with discipline.

Graceful shutdown is the one people skip and regret. When Kubernetes rolls out a new version, it sends your pod a SIGTERM and gives it a grace period to die. A Go server with no signal handling exits the moment SIGTERM arrives, because that is the runtime’s default action; a process that swallows the signal without draining is SIGKILLed when the grace period expires. Either way every in-flight request is dropped mid-response, and users see 502s on every deploy. The fix is to catch the signal and call Shutdown, which stops accepting new connections but lets in-flight requests finish, bounded by a context deadline:

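// Run the server on a goroutine; block the main goroutine on the signal.
go func() {
    // ErrServerClosed is the normal result of Shutdown; anything else is fatal.
    if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
        log.Fatalf("listen: %v", err)
    }
}()

stop := make(chan os.Signal, 1)
signal.Notify(stop, syscall.SIGINT, syscall.SIGTERM)
<-stop // block until the orchestrator tells us to die

// Drain in-flight requests, but no longer than 15s.
ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)
defer cancel()
if err := srv.Shutdown(ctx); err != nil { // stops accepting, waits for active requests
    log.Printf("shutdown did not complete cleanly: %v", err)
}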

Context deadlines are the same idea pushed end-to-end. Every handler receives a context.Context (r.Context() in HTTP, the first argument in gRPC), and the discipline is to thread that same context through every downstream call — the database query, the gRPC call to another service — never replacing it with context.Background(). A deadline set at the edge then propagates through the entire call tree: if the client gives up after two seconds, the cancellation flows all the way down and every service in the chain stops working on a request nobody is waiting for. This is what prevents a slow downstream from piling up wasted work across the whole system under load. gRPC propagates deadlines across the network automatically — but only if you pass the incoming context through; create a fresh one and you sever the chain.

Health checks are how the orchestrator knows whether to send you traffic. A liveness probe answers “is the process still alive and making progress?” (restart me if not); a readiness probe answers “can I serve right now?” (hold traffic until my database pool is connected). They are just cheap HTTP endpoints, but the distinction matters: failing liveness restarts the pod, failing readiness merely pulls it from the load balancer.

Structured configuration keeps the same binary deployable across dev, staging, and prod by changing only the environment: read settings from environment variables or a config file into a typed struct at startup, rather than scattering os.Getenv through the code.

And all of it ships as one static binary: build with CGO_ENABLED=0 and you get a single file with no interpreter, no shared libraries, no runtime to install. Drop it into a scratch or distroless container a few megabytes in size, and the deploy story is “copy a file and run it.” That is the static-binary advantage that makes Go the backbone language for cloud infrastructure.

War story: the contract that drifted, the deploy that dropped

Two failure modes from this chapter, both real, both preventable. The first is the billing drift from the introduction: a renamed JSON field, green tests on both sides, and invoices dated 1970 because the consumer silently zero-filled a field that no longer existed. A protobuf contract would have made the rename a compile error on every service that touched it — the bug could not have shipped, because the build would not have produced a binary. Untyped boundaries don’t fail loudly; they fail quietly, and quiet failures are the expensive ones.

The second is the deploy that dropped requests. A Go service had no Shutdown handler and no ReadTimeout/WriteTimeout on its http.Server. Under normal conditions nobody noticed. Then two things converged: a routine Kubernetes rollout sent SIGTERM, and the process, with nothing listening for the signal, was killed with requests still in flight; meanwhile a wave of slow clients, with no timeouts to cut them off, had been holding connections open. Every in-flight request died on the deploy, and the missing timeouts meant slow clients could tie up connections, goroutines, and file descriptors the rest of the time. The fix was four lines: catch SIGTERM, call Shutdown with a deadline, and set read/write timeouts on the server. Graceful shutdown and server timeouts are not advanced features; they are the price of admission for running behind an orchestrator.

Observability hooks

Operability needs visibility, and everything you need hooks into the two seams you’ve already built. Structured logging belongs in middleware — a request logger emitting one structured line per request (method, path, status, latency, trace ID) with log/slog, the standard library’s structured logger, so logs are machine-parseable rather than printf soup. Metrics belong in the same seam — a middleware that increments a Prometheus counter and observes a latency histogram per route, exposed on a /metrics endpoint. Distributed tracing rides the context — an interceptor extracts the incoming trace ID (the W3C traceparent header), attaches a span, and threads it through every downstream call, so a request’s path across five services becomes one connected trace in Jaeger or Tempo. The point is that observability lives in the middleware and interceptor layers, not bolted onto your handlers, leaving the business logic clean. The general theory of the three pillars — logs, metrics, traces — is developed in the observability chapter; here the lesson is only that Go’s middleware/interceptor seams are exactly where they attach.


Practical exercise

Difficulty: Level I · Level II · Level III

  1. Level I — A JSON endpoint with a middleware chain. Build a small service with either net/http (Go 1.22+ routing) or Gin that exposes a POST /users endpoint with request validation and a GET /users/{id} endpoint backed by an in-memory store. Add two middleware — a recovery middleware that turns a panic into a clean 500, and a logging middleware that records method, path, status, and latency. Trigger a panic in a handler and confirm the recovery middleware keeps the process alive and the logger still records the request. Write down the order your middleware execute in and why recovery must be outermost.

  2. Level II — A typed gRPC contract. Define a UserService in a .proto file with at least one unary RPC and one server-streaming RPC. Generate the Go stubs with protoc, implement the server, and write a client that calls both RPCs. Then break the contract on purpose: rename a field in the .proto, regenerate, and observe that the compiler now rejects the old call sites on both client and server. In a short paragraph, explain precisely what the typed contract bought you here that a hand-rolled JSON boundary would not have — and tie it to the field-number wire format that makes additive changes safe.

  3. Level III — Production-grade and deployed. Take a service and make it operable. Add graceful shutdown that drains in-flight requests on SIGTERM within a bounded deadline; thread a context deadline from the edge through every downstream call and prove (with a slow downstream and a tight client deadline) that cancellation propagates; add liveness and readiness probes that mean different things; and build it CGO_ENABLED=0 into a scratch or distroless container, recording the image size. Finally, for each edge of your service — the public entry point and each service-to-service call — argue REST-vs-gRPC explicitly, and justify the boundary you chose.

Summary

Go is the backbone language for network services because the two hardest operational problems of running a service, concurrency and deployment, are solved by defaults rather than by add-on features. Concurrency is a goroutine per request: cheap, runtime-scheduled, and already wired into net/http, so the concurrency model is the server model and you write plain blocking handlers that scale. Deployment is a single static binary with no interpreter or runtime to install. On top of that, the http.Handler interface makes routers, frameworks, and middleware one composable shape, so Gin and Echo are ergonomics over the standard library rather than replacements for it. And for the service interior, gRPC plus protobuf turn the contract between services into a compiled artifact (one .proto file generating typed client and server stubs), which catches the drift bugs that hand-rolled JSON lets ship silently. The patterns that make all of this operable (graceful shutdown, end-to-end context deadlines, health checks, observability in the middleware seam) are the difference between code that compiles and a service you can run.

Key takeaways

  • A goroutine per request is the server model; you get event-loop concurrency by writing plain blocking handlers, with no async coloring and no thread pool to size.
  • Everything is an http.Handler — routers, middleware, and frameworks are all composable wrappers around one method; Gin/Echo add ergonomics, not power.
  • Middleware composition order is execution order: recovery outermost, then logging, then auth, and a rejecting middleware must stop the chain.
  • gRPC + protobuf make the service contract a compiled artifact — one .proto generates both stubs, so a schema change is a build-time error on every side, not a 3 a.m. runtime surprise.
  • REST/JSON at the public edge, gRPC for the typed internal interior; the common shape is both, behind a gateway.
  • Graceful shutdown, propagated context deadlines, and read/write timeouts are the price of admission for running behind an orchestrator — not advanced extras.

Connections to other chapters

  • Concurrency and Parallelism Models (prerequisite): the goroutine-per-request server model, graceful shutdown, and end-to-end deadline propagation are all context and goroutines applied to the network. That chapter teaches the primitives — and compares Go’s goroutines and channels with the concurrency models of the other languages; this one is what you build with them.
  • Python: Microservices (sibling): the general distributed-systems theory this chapter deliberately assumes — service decomposition, resilience patterns, circuit breakers, the API-gateway role — is developed there in language-neutral form and applies to Go services unchanged. Read it for the why of the topology; read this for the Go how.
  • Python: Web Development and TypeScript: The Node Ecosystem (siblings): the same service concerns — handlers, middleware, validation, deployment — in other languages. Reading them alongside this chapter makes concrete why Go is so often chosen for the high-throughput, resource-constrained backbone while Python and Node hold the data-science and full-stack edges.
  • Orchestration with Kubernetes (extension, Part V): the static binary this chapter produces is the unit Kubernetes schedules. Graceful shutdown exists because of how Kubernetes rolls out new versions; liveness and readiness probes exist because of how it decides where to send traffic. The deploy story here is the input to that chapter.

Further reading

Essential

  • gRPC documentation (grpc.io) and the Protocol Buffers language guide — the canonical references for service definitions, code generation, streaming, and the field-number wire format that makes schemas evolvable.
  • Alex Edwards, Let’s Go Further — the deep, practical reference for building and operating production HTTP services in Go: middleware, graceful shutdown, structured JSON, and configuration.

Deep dives

  • The net/http package documentation and design (pkg.go.dev) — read the Server, Handler, and Shutdown docs directly; the standard library’s design is the thing every framework is built on.
  • Effective Go and Go Code Review Comments — idiomatic handler structure, error wrapping, and the interface conventions the whole ecosystem follows.

Historical context

  • The gRPC origin story — gRPC is the open-source descendant of Stubby, Google’s internal RPC system that carried service-to-service traffic at Google scale for years before being generalized. The typed-contract-as-source philosophy comes straight from that lineage.
  • Pike, “Go at Google: Language Design in the Service of Software Engineering” — why the language was shaped the way it was, with network services and large-scale deployment as the explicit design target.