
Overview


Rust gives you C-class performance by default, but “by default” is not the same as “automatically optimal.” This section is about measuring before you optimize, then applying the techniques that matter: profiling and flame graphs to find the real hot spots, criterion for statistically honest benchmarks, and concrete optimizations around allocation, memory layout, and cache behavior. It closes with a clear-eyed comparison against Node.js — where Rust wins, by how much, and the caveats an experienced engineer should keep in mind.


What you will learn:

  • How to profile a Rust program (samply/perf/Instruments) and read a flame graph to find where time actually goes
  • How to write statistically rigorous benchmarks with criterion, and why black_box matters
  • The highest-leverage optimizations: avoiding needless clone/allocation, taking &str/&[T], and letting iterators fuse
  • How memory layout (field ordering, #[repr], niche optimization, enum size) affects both size and speed
  • Writing cache-friendly code: data-oriented design, struct-of-arrays vs array-of-structs, and contiguity
  • Why Rust’s zero-cost abstractions really are zero-cost — iterators and closures compiling to the same code as hand-written loops
  • How to cut compile time and binary size when they matter
  • An honest performance comparison with Node.js/TypeScript — and when not to optimize at all
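Two of the items above (taking &str/&[T] instead of owned values, and iterator chains versus manual loops) can be sketched with the standard library alone. The function names here are hypothetical and chosen only for illustration. The sketch shows that the two versions compute the same result; whether the compiler emits identical machine code for them depends on the compiler version and optimization level, and is best confirmed by inspecting the generated assembly.

```rust
/// Counts words of at least `min_len` bytes.
/// Taking `&str` lets callers pass a `String`, a literal, or a slice
/// without cloning or allocating. (Hypothetical helper for illustration.)
fn count_long_words(text: &str, min_len: usize) -> usize {
    text.split_whitespace().filter(|w| w.len() >= min_len).count()
}

/// Sum of the squares of the even values, written as an iterator chain.
/// Taking `&[u64]` accepts a `Vec`, an array, or any sub-slice.
fn sum_even_squares_iter(values: &[u64]) -> u64 {
    values.iter().filter(|&&v| v % 2 == 0).map(|&v| v * v).sum()
}

/// The same computation as a hand-written loop.
fn sum_even_squares_loop(values: &[u64]) -> u64 {
    let mut total = 0;
    for &v in values {
        if v % 2 == 0 {
            total += v * v;
        }
    }
    total
}

fn main() {
    let owned = String::from("measure first then optimize");
    // `&owned` coerces from `&String` to `&str`; no clone is needed.
    println!("long words: {}", count_long_words(&owned, 6));

    let data: Vec<u64> = (1..=10).collect();
    println!("iterator:   {}", sum_even_squares_iter(&data));
    println!("loop:       {}", sum_even_squares_loop(&data));
}
```

Neither function takes ownership of its input, so the caller keeps using `owned` and `data` afterwards without paying for a copy.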

Topic | Description
Profiling | Profiling Rust apps with samply/perf/Instruments; finding hot spots; release builds with debug info.
Flame Graphs | Generating and reading flame graphs with cargo-flamegraph.
Benchmarking | Statistically driven micro-benchmarks with criterion; groups, parameters, and black_box.
Optimization Techniques | Avoiding clones/allocations, borrowing over owning, and letting the iterator chain do less work.
Memory Layout | Struct field ordering, size/align, #[repr], niche optimization, and enum sizes.
Cache Efficiency | Cache-friendly, data-oriented code: SoA vs AoS and why contiguity wins.
Zero-Cost Abstractions | How iterators and closures compile down to the same machine code as manual loops.
Compilation Time | Reducing compile time: workspaces, generics, codegen-units, and caching.
Binary Size | Shrinking binaries: opt-level = "z", LTO, strip, panic = "abort", and cargo-bloat.
Performance vs Node.js | Where Rust beats Node.js (CPU, memory, no GC pauses), and the honest caveats.
When to Optimize | Measure first: premature optimization, and choosing readable-then-fast.
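As a small preview of the Memory Layout topic, the sketch below uses std::mem::size_of to show how field order changes struct size and how niche optimization makes Option of a pointer type free. The struct names are hypothetical. #[repr(C)] is used on purpose so that fields stay in declaration order; with the default Rust representation the compiler is allowed to reorder fields itself. The sizes in the comments assume a platform where u32 has 4-byte alignment, which holds for mainstream targets.

```rust
use std::mem::size_of;

// `#[repr(C)]` keeps fields in declaration order, so the padding the
// ordering causes is visible in the size.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    a: u8,  // 1 byte, then 3 bytes of padding so `b` is 4-byte aligned
    b: u32, // 4 bytes
    c: u8,  // 1 byte, then 3 bytes of trailing padding
}

// Same fields, largest first: the two `u8`s share one padded word.
#[allow(dead_code)]
#[repr(C)]
struct Reordered {
    b: u32, // 4 bytes
    a: u8,  // 1 byte
    c: u8,  // 1 byte, then 2 bytes of trailing padding
}

fn main() {
    // Assumes 4-byte alignment for u32: 12 bytes vs 8 bytes.
    println!("Padded:    {} bytes", size_of::<Padded>());
    println!("Reordered: {} bytes", size_of::<Reordered>());

    // Niche optimization: `Box<T>` and `&T` are never null, so `None`
    // is stored as the null pointer and the `Option` costs no extra space.
    println!("Box<u32>:         {} bytes", size_of::<Box<u32>>());
    println!("Option<Box<u32>>: {} bytes", size_of::<Option<Box<u32>>>());
}
```

Printing sizes like this is a cheap first check before reaching for a profiler: a struct stored millions of times in a Vec is where a few bytes of padding start to matter.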

By the end of this section, you will be able to:

  • Profile a Rust program and use a flame graph to locate the real bottleneck instead of guessing
  • Write criterion benchmarks that produce trustworthy numbers and resist compiler elision
  • Remove avoidable allocations and clones, and reason about when borrowing beats owning
  • Lay out data for size and cache-friendliness, and explain why an iterator chain is not slower than a loop
  • Tune compile time and binary size when a project needs it
  • Make a fair, defensible performance comparison with a Node.js implementation — and decide when optimization is not worth it
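The phrase "resist compiler elision" in the objectives above can be illustrated without any external crate. The following is a minimal, std-only sketch, not a substitute for criterion: it performs no warm-up, sampling, or statistical analysis. The helper `time_it` and the function `checksum` are hypothetical names. std::hint::black_box is a best-effort hint asking the optimizer to treat a value as opaque, which keeps the measured work from being constant-folded or removed as unused.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// The work being measured (hypothetical example function).
fn checksum(data: &[u32]) -> u64 {
    data.iter().map(|&x| u64::from(x)).sum()
}

/// Hypothetical helper: runs `f` `iters` times and returns the total
/// elapsed wall-clock time. A real harness such as criterion adds
/// warm-up, repeated sampling, and outlier detection on top of this.
fn time_it<F: FnMut()>(iters: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    start.elapsed()
}

fn main() {
    let data: Vec<u32> = (0..10_000).collect();

    let elapsed = time_it(1_000, || {
        // black_box on the input hides its contents from the optimizer;
        // black_box on the output marks the result as used, so the call
        // is not discarded as dead code.
        black_box(checksum(black_box(&data)));
    });

    println!("1000 runs took {:?}", elapsed);
}
```

Build with `cargo run --release` when trying this; timings from a debug build say little about optimized code.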

Prerequisites:

  • Section 13: Testing. Benchmarking with criterion builds directly on the testing/Cargo workflow, and you will run benches alongside tests.
  • Section 07: Collections. Most optimization work is about iterators, Vec/String allocation, and capacity, so be comfortable with those first.
  • Section 05: Ownership. “Avoid the clone” only makes sense once moves, borrows, and Clone are second nature.

Estimated time:

  • Reading: 6 hours
  • Hands-on Practice: 5 hours
  • Exercises: 3 hours
  • Total: 14 hours

Tip: Resist optimizing until you have a profile. Read When to Optimize and Profiling first, get a flame graph of real code, and only then reach for the techniques in Optimization Techniques, Memory Layout, and Cache Efficiency. The single most common mistake newcomers to a fast language make is optimizing the wrong 90%: the code that was never the bottleneck.


Next: Section 22: Common Patterns. Idiomatic Rust design patterns and how they differ from their object-oriented TypeScript counterparts.