Production Readiness Checklist

The gap between “it compiles and the tests pass” and “I trust this not to page me at 3 a.m.” is filled by a handful of unglamorous concerns: structured logging, honest error handling, timeouts on everything that can hang, limits on everything that can grow unbounded, observability you can query, and a security posture that does not leak. This chapter is the checklist a senior TypeScript/JavaScript developer should run through before a Rust service takes traffic, along with the idiomatic, current-stable way to satisfy each item.

Quick Overview

Going to production is not one feature; it is a set of cross-cutting properties your service must hold under load and under failure. The current stable toolchain is Rust 1.96.0 on the 2024 edition; cargo new selects it automatically. The web examples here use axum 0.8 with tower-http middleware and the tracing ecosystem — the same building blocks the other chapters in this section use.

The six pillars this file covers:

Logging — structured (JSON), level-controlled, with secrets redacted and a correlation ID per request.
Errors — one typed error per surface, the right HTTP status, the cause logged but never leaked.
Timeouts — a hard bound on every inbound request and every outbound call. An unbounded await is a latent outage.
Limits — body size, concurrency, and connection caps so one client cannot exhaust the box.
Observability — logs, metrics, traces, and health probes wired up before you need them.
Security — least privilege, no secrets in logs or images, dependency auditing, and a minimal runtime.

Note: The sibling files in this section go deep on individual pillars — metrics.md, distributed-tracing.md, health-checks.md, graceful-shutdown.md, rate-limiting.md, caching.md, configuration.md, and environment.md. This file is the integrating checklist that ties them together.

TypeScript/JavaScript Example

A production-minded Express service on Node v22 bolts these concerns on through middleware. It is the shape most TypeScript developers will recognize:

// server.ts: production-hardened Express on Node v22
import express, { NextFunction, Request, Response } from "express";
import pino from "pino";
import pinoHttp from "pino-http";
import { randomUUID } from "node:crypto";

const log = pino({
  level: process.env.LOG_LEVEL ?? "info",
  // Redact secrets so tokens never reach the log sink.
  redact: ["req.headers.authorization", "req.headers.cookie"],
});

const app = express();

// Correlation ID + structured request logging.
app.use(pinoHttp({
  logger: log,
  genReqId: (req) => (req.headers["x-request-id"] as string) ?? randomUUID(),
}));

// Body-size limit: reject oversized payloads before parsing.
app.use(express.json({ limit: "1mb" }));

// Per-request timeout has to be wired by hand; Express has no built-in.
app.use((req: Request, res: Response, next: NextFunction) => {
  res.setTimeout(5000, () => res.status(503).json({ error: "timeout" }));
  next();
});

app.post("/users", (req: Request, res: Response) => {
  const name = String(req.body?.name ?? "").trim();
  if (!name) {
    return res.status(400).json({ error: "name must not be empty" });
  }
  res.json({ id: 1, name });
});

// Central error handler: log the real cause, send a safe message.
app.use((err: unknown, _req: Request, res: Response, _next: NextFunction) => {
  log.error({ err }, "request failed");
  res.status(500).json({ error: "internal error" }); // never leak `err`
});

// Outbound calls must be bounded too; fetch has no default timeout.
// `globalThis.Response` is the fetch type; the bare `Response` imported above is Express's.
async function fetchUpstream(url: string): Promise<globalThis.Response> {
  return fetch(url, { signal: AbortSignal.timeout(2000) });
}

app.listen(3000, () => log.info("listening on :3000"));

Key points:

Logging, redaction, request IDs, body limits, and timeouts are all opt-in middleware you must remember to add.
The error handler must manually avoid leaking err to the client — nothing in the type system stops you.
fetch has no default timeout; you must pass AbortSignal.timeout. A forgotten one is the classic Node outage.

Rust Equivalent

The same hardened service in axum. Each numbered layer corresponds to a checklist item; the typed error makes “log the cause, return a safe message” the path of least resistance.

Set up the project:

cargo new user-service
cd user-service
cargo add axum
cargo add tokio --features full
cargo add tower --features util
cargo add tower-http --features timeout,trace,request-id,sensitive-headers
cargo add tracing
cargo add tracing-subscriber --features env-filter,json
cargo add serde --features derive
cargo add thiserror
cargo add anyhow

use std::time::Duration;

use axum::{
    extract::{DefaultBodyLimit, Json},
    http::{header, StatusCode},
    response::{IntoResponse, Response},
    routing::post,
    Router,
};
use serde::{Deserialize, Serialize};
use tower::ServiceBuilder;
use tower_http::{
    request_id::{MakeRequestUuid, PropagateRequestIdLayer, SetRequestIdLayer},
    sensitive_headers::SetSensitiveRequestHeadersLayer,
    timeout::TimeoutLayer,
    trace::TraceLayer,
};
use tracing::instrument;
use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt, EnvFilter};

// One error type for the whole API surface. Each variant maps to a status code.
#[derive(Debug, thiserror::Error)]
enum ApiError {
    #[error("invalid request: {0}")]
    Validation(String),
    #[error("user {0} not found")]
    NotFound(u64),
    #[error("internal error")]
    Internal(#[from] anyhow::Error),
}

// The body we actually send to clients. The internal cause is logged,
// never leaked to the wire.
#[derive(Serialize)]
struct ErrorBody {
    error: String,
}

impl IntoResponse for ApiError {
    fn into_response(self) -> Response {
        let status = match &self {
            ApiError::Validation(_) => StatusCode::BAD_REQUEST,
            ApiError::NotFound(_) => StatusCode::NOT_FOUND,
            ApiError::Internal(_) => StatusCode::INTERNAL_SERVER_ERROR,
        };
        // Log the full error server-side at the right level. For 5xx use Debug (`?`):
        // Display of `Internal` is only "internal error", so `%self` would drop the cause.
        if status.is_server_error() {
            tracing::error!(error = ?self, "request failed");
        } else {
            tracing::warn!(error = %self, "request rejected");
        }
        // Clients get a safe message; a 5xx never reveals internals.
        let public = if status.is_server_error() {
            "internal error".to_string()
        } else {
            self.to_string()
        };
        (status, Json(ErrorBody { error: public })).into_response()
    }
}

#[derive(Deserialize)]
struct CreateUser {
    name: String,
}

#[derive(Serialize)]
struct User {
    id: u64,
    name: String,
}

#[instrument(skip(payload), fields(user.name = %payload.name))]
async fn create_user(Json(payload): Json<CreateUser>) -> Result<Json<User>, ApiError> {
    if payload.name.trim().is_empty() {
        return Err(ApiError::Validation("name must not be empty".into()));
    }
    tracing::info!("user created");
    Ok(Json(User { id: 1, name: payload.name }))
}

fn app() -> Router {
    // Headers we never want to appear in logs.
    let sensitive = [header::AUTHORIZATION, header::COOKIE];

    Router::new()
        .route("/users", post(create_user))
        // Reject oversized bodies before allocating (1 MiB cap).
        .layer(DefaultBodyLimit::max(1024 * 1024))
        .layer(
            ServiceBuilder::new()
                // 1. Give every request a stable ID for correlating logs.
                .layer(SetRequestIdLayer::x_request_id(MakeRequestUuid))
                // 2. Redact secrets BEFORE the trace layer reads headers.
                .layer(SetSensitiveRequestHeadersLayer::new(sensitive))
                // 3. Structured per-request spans/events.
                .layer(TraceLayer::new_for_http())
                // 4. Hard request timeout: a slow handler returns 408, never hangs.
                .layer(TimeoutLayer::with_status_code(
                    StatusCode::REQUEST_TIMEOUT,
                    Duration::from_secs(5),
                ))
                // 5. Echo the request ID back to the caller.
                .layer(PropagateRequestIdLayer::x_request_id()),
        )
}

#[tokio::main]
async fn main() {
    // JSON logs, level from RUST_LOG (defaults to info). Machine-parseable in prod.
    tracing_subscriber::registry()
        .with(EnvFilter::try_from_default_env().unwrap_or_else(|_| EnvFilter::new("info")))
        .with(tracing_subscriber::fmt::layer().json())
        .init();

    let app = app();

    // In a real binary you would bind a listener and serve:
    //   let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    //   axum::serve(listener, app).await.unwrap();
    // Here we exercise the pipeline end-to-end without opening a port.
    use tower::ServiceExt;
    let req = axum::http::Request::builder()
        .method("POST")
        .uri("/users")
        .header("content-type", "application/json")
        .header("authorization", "Bearer super-secret-token")
        .body(axum::body::Body::from(r#"{"name":"Ada"}"#))
        .unwrap();
    let resp = app.oneshot(req).await.unwrap();
    let status = resp.status();
    let bytes = axum::body::to_bytes(resp.into_body(), usize::MAX)
        .await
        .unwrap();
    println!("status = {status}");
    println!("body   = {}", String::from_utf8_lossy(&bytes));
}

Running it with RUST_LOG=info,tower_http=debug cargo run produces real structured output (the trace layer emits request and response events inside a per-request span; your handler’s info! nests inside that span):

{"timestamp":"2026-06-02T06:51:10.937712Z","level":"DEBUG","fields":{"message":"started processing request"},"target":"tower_http::trace::on_request","span":{"method":"POST","uri":"/users","version":"HTTP/1.1","name":"request"},"spans":[{"method":"POST","uri":"/users","version":"HTTP/1.1","name":"request"}]}
{"timestamp":"2026-06-02T06:51:10.937895Z","level":"INFO","fields":{"message":"user created"},"target":"user_service","span":{"user.name":"Ada","name":"create_user"},"spans":[{"method":"POST","uri":"/users","version":"HTTP/1.1","name":"request"},{"user.name":"Ada","name":"create_user"}]}
{"timestamp":"2026-06-02T06:51:10.937974Z","level":"DEBUG","fields":{"message":"finished processing request","latency":"0 ms","status":200},"target":"tower_http::trace::on_response","span":{"method":"POST","uri":"/users","version":"HTTP/1.1","name":"request"},"spans":[{"method":"POST","uri":"/users","version":"HTTP/1.1","name":"request"}]}
status = 200 OK
body   = {"id":1,"name":"Ada"}

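Note that no header values appear in the logged fields: the default TraceLayer span records only the method, URI, and version. Marking authorization as sensitive is the second line of defense. If you later turn on header logging (for example with DefaultMakeSpan::new().include_headers(true)), the value is rendered as Sensitive rather than as the token.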

Detailed Explanation

Logging: structured, leveled, redacted

The tracing_subscriber::registry() builder composes two layers: an EnvFilter that reads RUST_LOG (falling back to info), and a fmt layer in .json() mode. JSON is the right default in production because your log shipper (Loki, CloudWatch, Datadog) parses fields, not free text. The #[instrument] attribute on create_user opens a span carrying user.name; every tracing::info! inside it inherits that context, so a single log line tells you which user without manual string interpolation. Contrast with console.log: in Node you concatenate context by hand and hope every call site remembers to.

Redaction is structural, not a regex over the final string. SetSensitiveRequestHeadersLayer marks authorization and cookie as sensitive before TraceLayer reads the headers, so the secret is never rendered. Layer order matters: redaction must come before tracing in the ServiceBuilder stack.

Errors: typed, mapped, never leaked

ApiError is a single thiserror enum for the whole surface. Its IntoResponse impl is the one place that maps a variant to an HTTP status, logs the real cause at the correct level (error! for 5xx, warn! for client errors), and — critically — returns a generic body for server errors. A 4xx echoes a useful message; a 5xx says only "internal error". The #[from] anyhow::Error arm lets any deep failure bubble up with ? and land as a 500 without you writing a conversion at every call site. See Section 08: Error Handling for the Result/?/thiserror/anyhow foundations.
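To make the `?` conversion concrete without pulling in any crates, here is a minimal std-only sketch of the same pattern. It is an illustration under assumptions, not the chapter's code: the hand-written `From` impl stands in for what `#[from]` generates, `Box<dyn Error>` stands in for `anyhow::Error`, and `page_size` and its `stored_max` argument are hypothetical names invented for this example.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Std-only stand-in for the thiserror/anyhow pair used in the chapter.
#[derive(Debug)]
enum ApiError {
    Validation(String),
    Internal(Box<dyn Error + Send + Sync>),
}

impl fmt::Display for ApiError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ApiError::Validation(msg) => write!(f, "invalid request: {msg}"),
            // Display stays generic; the cause is only visible through Debug.
            ApiError::Internal(_) => write!(f, "internal error"),
        }
    }
}

// Roughly what `#[from]` writes for you: a conversion that lets `?` do the wrapping.
impl From<ParseIntError> for ApiError {
    fn from(err: ParseIntError) -> Self {
        ApiError::Internal(Box::new(err))
    }
}

impl ApiError {
    fn status(&self) -> u16 {
        match self {
            ApiError::Validation(_) => 400,
            ApiError::Internal(_) => 500,
        }
    }

    // Safe for both variants because Display never renders the inner cause.
    fn public_message(&self) -> String {
        self.to_string()
    }
}

// Hypothetical helper: `stored_max` comes from our own storage, so a parse
// failure there is our bug (500), while a bad `requested` is the caller's (400).
fn page_size(requested: u32, stored_max: &str) -> Result<u32, ApiError> {
    let max: u32 = stored_max.trim().parse()?; // ParseIntError -> ApiError::Internal
    if requested == 0 || requested > max {
        return Err(ApiError::Validation(format!(
            "page size must be between 1 and {max}"
        )));
    }
    Ok(requested)
}

fn main() {
    for (requested, stored_max) in [(10, "50"), (0, "50"), (10, "fifty")] {
        match page_size(requested, stored_max) {
            Ok(n) => println!("ok: {n}"),
            // The client sees status + public message; the Debug form goes to the log.
            Err(e) => println!("{} {} (log: {e:?})", e.status(), e.public_message()),
        }
    }
}
```

The point of the split is that the Debug form carries the cause for the log while the Display form is always safe to serialize, so a server error cannot leak detail by accident.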

Timeouts: bound everything

TimeoutLayer::with_status_code(StatusCode::REQUEST_TIMEOUT, Duration::from_secs(5)) caps inbound request processing and answers 408 when the deadline passes. Outbound calls need their own bound: wrap them in tokio::time::timeout (shown in Best Practices). Rust does not save you here automatically: a future that .awaits a hung socket waits forever unless something cancels it. This is the same trap as a missing AbortSignal in Node, and it demands the same explicitness.

Limits: cap what can grow

DefaultBodyLimit::max(1024 * 1024) rejects bodies over 1 MiB with 413 Payload Too Large before buffering them — a cheap defense against memory-exhaustion. Production services also cap concurrency (tower::limit::ConcurrencyLimitLayer or tower::load_shed) and per-client request rate (see rate-limiting.md). axum’s DefaultBodyLimit is preferred over a raw tower-http body-limit layer because it integrates with extractors and returns the correct status cleanly.
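The idea behind a body cap can be sketched with the standard library alone: never read more than the limit plus one byte, and treat that extra byte as proof the body is too large. This is an illustration of the principle, not axum's implementation; `read_capped` and `BodyError` are hypothetical names.

```rust
use std::io::{self, Read};

#[derive(Debug, PartialEq)]
enum BodyError {
    TooLarge { limit: usize },
    Io(String),
}

/// Reads at most `limit` bytes and fails as soon as one extra byte shows up.
fn read_capped<R: Read>(body: R, limit: usize) -> Result<Vec<u8>, BodyError> {
    let mut buf = Vec::new();
    // `take(limit + 1)` bounds the read, so we never buffer more than limit + 1 bytes
    // even if the sender streams forever.
    body.take(limit as u64 + 1)
        .read_to_end(&mut buf)
        .map_err(|e| BodyError::Io(e.to_string()))?;
    if buf.len() > limit {
        return Err(BodyError::TooLarge { limit });
    }
    Ok(buf)
}

fn main() {
    let small = read_capped(io::Cursor::new(b"hello".to_vec()), 8);
    let endless = read_capped(io::repeat(b'x'), 8); // a body that never ends
    println!("{small:?}");
    println!("{endless:?}");
}
```

In a handler, the `TooLarge` case is what would map to 413; the important property is that the rejection happens after a bounded read rather than after buffering the whole payload.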

Observability and security

The request ID (SetRequestIdLayer + PropagateRequestIdLayer) is the thread that stitches logs, metrics, and traces together and is echoed back to the caller as x-request-id for support tickets. Metrics and distributed traces extend this — covered in metrics.md and distributed-tracing.md. Security shows up as redaction, generic 5xx bodies, body limits, and — at the deployment layer — a minimal image and dependency auditing (covered below and in Section 27: Security).

Key Differences

| Concern | TypeScript / Node (Express) | Rust (axum + tower) |
| --- | --- | --- |
| Structured logging | `pino`/`winston`, opt-in; context concatenated by hand | `tracing` spans propagate context automatically |
| Log redaction | `redact` path list over the object | Headers marked sensitive structurally before rendering |
| Inbound timeout | Manual `res.setTimeout`; no framework default | `TimeoutLayer` as a composable middleware |
| Outbound timeout | `AbortSignal.timeout` per `fetch`; easy to forget | `tokio::time::timeout`; equally explicit, type-checked |
| Body-size limit | `express.json({ limit })` | `DefaultBodyLimit::max` → real `413` |
| Error leakage | Must remember not to send `err` | Typed `IntoResponse` makes the safe path the default |
| Panic isolation | An unhandled throw can crash the process | `catch_panic` turns a panic into a `500`; worker survives |
| Concurrency model | Single-threaded event loop | Multi-threaded runtime, but opt-in and explicit |
| Config at startup | Reads `process.env` lazily; fails late | Validate into a typed struct; fail fast (see environment.md) |

Tip: Rust is not “multi-threaded by default.” #[tokio::main] starts a multi-thread runtime, but you choose that; #[tokio::main(flavor = "current_thread")] gives a single-threaded one. Concurrency is fearless and opt-in, not implicit.

Common Pitfalls

Forgetting `IntoResponse` on your error type

A handler must return something axum knows how to turn into a response. Return a bare error type and the bound fails — with a message that, while long, points you at the fix:

use axum::{routing::get, Router};

#[derive(Debug)]
struct MyError;

// does not compile (error[E0277]: the trait bound `... : Handler<_, _>` is not satisfied)
async fn handler() -> Result<String, MyError> {
    Err(MyError)
}

fn main() {
    let _app: Router = Router::new().route("/", get(handler));
}

The real error from cargo build:

error[E0277]: the trait bound `fn() -> impl Future<Output = Result<String, MyError>> {handler}: Handler<_, _>` is not satisfied
   --> src/main.rs:12:53
    |
 12 |     let _app: Router = Router::new().route("/", get(handler));
    |                                                 --- ^^^^^^^ the trait `Handler<_, _>` is not implemented for fn item `fn() -> impl Future<Output = Result<String, MyError>> {handler}`
    |
    = note: Consider using `#[axum::debug_handler]` to improve the error message

The fix is to implement IntoResponse for MyError (as in the main example). The note’s suggestion — annotate the handler with #[axum::debug_handler] — is the fastest way to get a precise diagnostic when this happens to a real handler.

`unwrap()` in a request path

unwrap() turns a recoverable error into a panic. In a handler, that aborts the request, and without catch_panic the client gets a dropped connection instead of a response. Clippy will not flag it by default, but the clippy::unwrap_used restriction lint will, so turn it on for production crates:

#![warn(clippy::unwrap_used)]

fn parse_port(raw: &str) -> u16 {
    raw.parse().unwrap()
}

fn main() {
    println!("{}", parse_port("8080"));
}

cargo clippy then reports:

warning: used `unwrap()` on a `Result` value
 --> src/main.rs:4:5
  |
4 |     raw.parse().unwrap()
  |     ^^^^^^^^^^^^^^^^^^^^
  |
  = note: if this value is an `Err`, it will panic
  = help: consider using `expect()` to provide a better panic message
  = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unwrap_used

Reserve unwrap/expect for startup invariants you want to crash on (a malformed config is better as a loud panic at boot than a silent default). In request handling, propagate with ?.
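As a small sketch of the alternative, the same `parse_port` can return a `Result` and let the caller decide what a bad value means. The `listen_addr` helper is a hypothetical name added only to show `?` propagating the error one level up.

```rust
use std::num::ParseIntError;

// Same parse as before, but the failure is a value instead of a panic.
fn parse_port(raw: &str) -> Result<u16, ParseIntError> {
    raw.trim().parse::<u16>()
}

// Hypothetical caller: `?` forwards the parse error without any unwrap.
fn listen_addr(raw_port: &str) -> Result<String, ParseIntError> {
    let port = parse_port(raw_port)?;
    Ok(format!("0.0.0.0:{port}"))
}

fn main() {
    println!("{:?}", listen_addr("8080"));
    println!("{:?}", listen_addr("http"));
}
```

In a request handler the error would flow into your API error type; at startup you might still choose `expect` on the final result, which keeps the crash deliberate and in one place.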

Leaking the error cause to the client

This one compiles — it is a logic and security bug, not a type error. If your IntoResponse does Json(ErrorBody { error: self.to_string() }) for every variant, a database error like connection refused to db-primary.internal:5432 ends up in the client’s response body, leaking topology to attackers. The fix in the main example is to gate on status.is_server_error() and emit only a generic string for 5xx. Always log the detail; never serialize it to an untrusted caller.

Unbounded outbound `await`

A reqwest/sqlx call with no timeout will wait as long as the upstream hangs, tying up a connection and a task. Nothing in the type system forces a bound — wrap every outbound call in tokio::time::timeout. See Best Practices.

Logging to plain text in production

tracing_subscriber::fmt() without .json() produces pretty, human-readable lines — perfect for cargo run locally, useless for a log aggregator. Gate the format on the environment: pretty in dev, .json() in prod (driven by config — see configuration.md).
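One way to gate the format is to decide it from a single environment value and default to JSON when the value is missing or unrecognized. The sketch below is std-only and covers just that decision; the `APP_ENV` variable name and the `LogFormat` type are assumptions made for illustration, and the actual subscriber setup would follow the branch chosen here.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum LogFormat {
    Pretty,
    Json,
}

/// Picks the log format from an environment name (hypothetical `APP_ENV` values).
fn log_format(app_env: Option<&str>) -> LogFormat {
    match app_env.map(str::trim) {
        Some(env)
            if env.eq_ignore_ascii_case("dev") || env.eq_ignore_ascii_case("development") =>
        {
            LogFormat::Pretty
        }
        // Unset or unknown: assume production so the aggregator always gets JSON.
        _ => LogFormat::Json,
    }
}

fn main() {
    let env = std::env::var("APP_ENV").ok();
    match log_format(env.as_deref()) {
        LogFormat::Json => println!("install the fmt layer in .json() mode"),
        LogFormat::Pretty => println!("install the human-readable fmt layer"),
    }
}
```

Defaulting to JSON means a forgotten variable degrades a developer's terminal output, not production log parsing.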

Best Practices

Bound every outbound call

use std::time::Duration;
use tokio::time::{sleep, timeout};

// Simulate an outbound dependency call (DB, HTTP, cache) that may hang.
async fn fetch_from_dependency(slow: bool) -> String {
    if slow {
        sleep(Duration::from_secs(10)).await; // a hung upstream
    }
    "ok".to_string()
}

#[tokio::main]
async fn main() {
    // ALWAYS bound an outbound call. An unbounded await is a latent outage.
    match timeout(Duration::from_millis(200), fetch_from_dependency(true)).await {
        Ok(value) => println!("got: {value}"),
        Err(_elapsed) => println!("dependency timed out after 200ms -> degrade gracefully"),
    }

    match timeout(Duration::from_millis(200), fetch_from_dependency(false)).await {
        Ok(value) => println!("got: {value}"),
        Err(_elapsed) => println!("timed out"),
    }
}

Real output:

dependency timed out after 200ms -> degrade gracefully
got: ok

Build a release profile that fails loud and ships small

In Cargo.toml, a production profile that aborts on panic (no unwinding, smaller binary) and strips symbols:

[profile.release]
opt-level = 3
lto = "thin"
codegen-units = 1
panic = "abort"   # no unwinding; a panic terminates the process (pair with an orchestrator restart)
strip = "symbols" # smaller binary, no symbol table in the image

Warning: panic = "abort" means a panic kills the whole process, not just the task. That is often desirable in a container (the orchestrator restarts a clean instance), but it makes catch_panic and unwind-based recovery unavailable. Decide deliberately. If you keep the default unwind, add tower_http::catch_panic::CatchPanicLayer so a single bad request returns a 500 instead of taking down a worker.
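The dependency on unwinding is easy to see with the standard library's `catch_unwind`, the std mechanism that unwind-based recovery builds on. The sketch below is an illustration only: `run_isolated` is a hypothetical helper, not part of tower-http, and the `u16` return value stands in for an HTTP status.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Runs `work` and converts a panic into a 500-style status.
/// Hypothetical helper for illustration; not a tower-http API.
fn run_isolated<F: FnOnce() -> u16>(work: F) -> u16 {
    match panic::catch_unwind(AssertUnwindSafe(work)) {
        Ok(status) => status,
        // Only reachable when panics unwind. Under panic = "abort" the process
        // terminates before this arm can run.
        Err(_payload) => 500,
    }
}

fn main() {
    let ok = run_isolated(|| 200);
    let failed = run_isolated(|| panic!("handler bug"));
    println!("ok = {ok}, failed = {failed}");
}
```

With the default unwind strategy the second call yields 500 after the panic hook prints to stderr. Build the same program with panic = "abort" and the process exits at the panic, which is why the profile choice and the recovery layer have to be decided together.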

The rest of the checklist

Configuration & environment: load config into a typed struct and validate it at startup so a bad value fails fast — see configuration.md and environment.md. Follow the 12-factor separation of config from code.
Graceful shutdown: catch SIGTERM, flip readiness to false, and drain in-flight requests — see graceful-shutdown.md.
Health probes: distinct liveness and readiness endpoints — see health-checks.md.
Metrics & tracing: RED/USE signals and request-scoped traces — see metrics.md and distributed-tracing.md.
Rate limiting & caching: protect and accelerate — see rate-limiting.md and caching.md.
Dependency hygiene: run cargo audit (RustSec advisories) and cargo deny (licenses, bans, duplicate versions) in CI. Pin a rust-toolchain.toml.
Minimal runtime image: build static or distroless. A from-scratch or distroless image has no shell and a tiny attack surface — see Section 27: Security.
Run as non-root, drop capabilities, read-only filesystem in the container.
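The first item on that list, validating configuration into a typed struct at startup, can be sketched without any crates. The example assumes two hypothetical variables, `PORT` (required) and `REQUEST_TIMEOUT_MS` (optional, defaulting to 5000); the lookup is passed in as a closure so the same code works against the real environment or a test fixture.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
struct Config {
    port: u16,
    request_timeout: Duration,
}

impl Config {
    /// Builds a validated config from any key -> value lookup.
    fn from_lookup<F>(lookup: F) -> Result<Self, String>
    where
        F: Fn(&str) -> Option<String>,
    {
        let port = lookup("PORT")
            .ok_or_else(|| "PORT is required".to_string())?
            .trim()
            .parse::<u16>()
            .map_err(|e| format!("PORT is not a valid port: {e}"))?;

        let timeout_ms = match lookup("REQUEST_TIMEOUT_MS") {
            Some(raw) => raw
                .trim()
                .parse::<u64>()
                .map_err(|e| format!("REQUEST_TIMEOUT_MS is invalid: {e}"))?,
            None => 5_000, // assumed default for this sketch
        };
        if timeout_ms == 0 {
            return Err("REQUEST_TIMEOUT_MS must be greater than zero".to_string());
        }

        Ok(Config {
            port,
            request_timeout: Duration::from_millis(timeout_ms),
        })
    }
}

fn main() {
    // In a real binary: Config::from_lookup(|key| std::env::var(key).ok())
    let good = Config::from_lookup(|key| match key {
        "PORT" => Some("3000".to_string()),
        _ => None,
    });
    let bad = Config::from_lookup(|key| match key {
        "PORT" => Some("http".to_string()),
        _ => None,
    });
    println!("{good:?}");
    println!("{bad:?}");
}
```

Calling this once at the top of main and exiting on `Err` is what turns a bad value into a failed deploy rather than a 3 a.m. surprise.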

Real-World Example

A panic in one handler should never take down the worker that serves every other request. With the default unwinding profile, CatchPanicLayer converts a handler panic into a clean 500:

cargo add tower-http --features catch-panic

use axum::{body::Body, http::Request, routing::get, Router};
use tower::ServiceExt; // for `oneshot`
use tower_http::catch_panic::CatchPanicLayer;

async fn boom() -> &'static str {
    panic!("handler bug"); // a latent bug in one endpoint
}

fn app() -> Router {
    Router::new()
        .route("/boom", get(boom))
        // Turn a panic in any handler into a 500 instead of killing the worker.
        .layer(CatchPanicLayer::new())
}

#[tokio::main]
async fn main() {
    let resp = app()
        .oneshot(Request::builder().uri("/boom").body(Body::empty()).unwrap())
        .await
        .unwrap();
    println!("status = {}", resp.status());
}

Running it (the default panic hook prints the location and message to stderr first, then CatchPanicLayer converts the unwind into a response):

thread 'main' panicked at src/main.rs:6:5:
handler bug
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
status = 500 Internal Server Error

The process keeps serving; the bad request gets a 500; the panic message lands in your logs for triage. In a real deployment you would pair this with metrics.md to alert on the 5xx rate and distributed-tracing.md to find the offending span. This is the defense-in-depth posture a production checklist exists to enforce: every layer assumes the one below it can fail.

Exercises

Exercise 1: Enforce a body-size limit

Difficulty: Beginner

Objective: Confirm that an oversized request body is rejected before your handler runs.

Instructions: Build a router with a single POST /echo handler that returns the request body as a string. Add a DefaultBodyLimit of 8 bytes (small, for the demo). Send a 100-byte body and assert the response status is 413 Payload Too Large.

use axum::{body::Body, extract::DefaultBodyLimit, http::Request, routing::post, Router};
use tower::ServiceExt;

async fn echo(body: String) -> String {
    body
}

fn app() -> Router {
    Router::new()
        .route("/echo", post(echo))
        // TODO: cap the body at 8 bytes
        /* ??? */
}

#[tokio::main]
async fn main() {
    let big = "x".repeat(100);
    let resp = app()
        .oneshot(
            Request::builder()
                .method("POST")
                .uri("/echo")
                .body(Body::from(big))
                .unwrap(),
        )
        .await
        .unwrap();
    println!("oversized body -> {}", resp.status());
}

Solution

use axum::{body::Body, extract::DefaultBodyLimit, http::Request, routing::post, Router};
use tower::ServiceExt;

async fn echo(body: String) -> String {
    body
}

fn app() -> Router {
    Router::new()
        .route("/echo", post(echo))
        .layer(DefaultBodyLimit::max(8)) // 8-byte cap for the demo
}

#[tokio::main]
async fn main() {
    let big = "x".repeat(100);
    let resp = app()
        .oneshot(
            Request::builder()
                .method("POST")
                .uri("/echo")
                .body(Body::from(big))
                .unwrap(),
        )
        .await
        .unwrap();
    println!("oversized body -> {}", resp.status());
}

Add the dependencies with cargo add axum tower and cargo add tokio --features full. Output:

oversized body -> 413 Payload Too Large

Exercise 2: A typed error that never leaks

Difficulty: Intermediate

Objective: Implement IntoResponse so that client errors return a useful message but server errors return only a generic one — and the real cause is always logged.

Instructions: Define an AppError enum with BadInput(String) (→ 400) and Database(String) (→ 500). Implement IntoResponse so the 400 body contains the input message, the 500 body contains only "internal error", and both log the real detail with tracing.

Solution

use axum::{
    http::StatusCode,
    response::{IntoResponse, Response},
    Json,
};
use serde::Serialize;

#[derive(Debug, thiserror::Error)]
enum AppError {
    #[error("bad input: {0}")]
    BadInput(String),
    #[error("database failure: {0}")]
    Database(String),
}

#[derive(Serialize)]
struct ErrorBody {
    error: String,
}

impl IntoResponse for AppError {
    fn into_response(self) -> Response {
        let status = match &self {
            AppError::BadInput(_) => StatusCode::BAD_REQUEST,
            AppError::Database(_) => StatusCode::INTERNAL_SERVER_ERROR,
        };
        if status.is_server_error() {
            tracing::error!(error = %self, "request failed");
        } else {
            tracing::warn!(error = %self, "request rejected");
        }
        let public = if status.is_server_error() {
            "internal error".to_string() // never leak the cause
        } else {
            self.to_string()
        };
        (status, Json(ErrorBody { error: public })).into_response()
    }
}

fn main() {
    // Confirm the mapping: a DB error becomes a generic 500 body.
    let resp = AppError::Database("connection refused to db:5432".into()).into_response();
    println!("status = {}", resp.status());
}

Dependencies: cargo add axum serde --features serde/derive, cargo add thiserror, and cargo add tracing. The Database variant carries "connection refused to db:5432", but the client only ever sees {"error":"internal error"}; the real string is logged. Output:

status = 500 Internal Server Error

Exercise 3: Survive a panicking handler

Difficulty: Advanced

Objective: Add panic isolation so a bug in one endpoint returns a 500 instead of crashing the worker, and verify the rest of the router still serves.

Instructions: Build a router with two routes: GET /ok returning "ok" and GET /boom that panic!s. Apply tower_http::catch_panic::CatchPanicLayer. Send a request to /boom, assert 500; then send a request to /ok on the same router and assert 200 — proving the worker survived.

Solution

use axum::{body::Body, http::Request, routing::get, Router};
use tower::ServiceExt;
use tower_http::catch_panic::CatchPanicLayer;

async fn ok() -> &'static str {
    "ok"
}

async fn boom() -> &'static str {
    panic!("handler bug");
}

fn app() -> Router {
    Router::new()
        .route("/ok", get(ok))
        .route("/boom", get(boom))
        .layer(CatchPanicLayer::new())
}

#[tokio::main]
async fn main() {
    // Build the router once and clone it per request, so both calls hit the same router.
    let app = app();

    let boom_resp = app
        .clone()
        .oneshot(Request::builder().uri("/boom").body(Body::empty()).unwrap())
        .await
        .unwrap();
    println!("/boom -> {}", boom_resp.status());

    // The same router still answers after the panic: the process never died.
    let ok_resp = app
        .oneshot(Request::builder().uri("/ok").body(Body::empty()).unwrap())
        .await
        .unwrap();
    println!("/ok   -> {}", ok_resp.status());
}

Dependencies: cargo add axum tower, cargo add tokio --features full, and cargo add tower-http --features catch-panic. The default panic hook prints to stderr first, then the layer converts the unwind to a 500, and the /ok route still answers:

thread 'main' panicked at src/main.rs:10:5:
handler bug
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
/boom -> 500 Internal Server Error
/ok   -> 200 OK

Note: CatchPanicLayer relies on unwinding, so it has no effect under panic = "abort". If your release profile aborts on panic, isolation comes from the orchestrator restarting the process instead.
