Async vs Sync: Choosing the Right Concurrency Model


Quick Overview

In Node.js you rarely choose: almost everything is async, because the single-threaded event loop punishes any code that blocks it. Rust gives you a genuine menu — async tasks for I/O-bound concurrency, OS threads (or rayon) for CPU-bound parallelism, and plain synchronous code when there is nothing to overlap. This page is about making that choice deliberately, the I/O-bound-versus-CPU-bound distinction that drives it, and the “function coloring” problem that async introduces in both languages.

Note: This page assumes you have met Rust’s lazy futures and the Tokio runtime. If not, read promises-vs-futures.md and tokio-intro.md first. The mechanics of tokio::spawn, spawn_blocking, and OS threads live in spawning-tasks.md; this page is about when to reach for each.

TypeScript/JavaScript Example

A typical Node.js service mixes two kinds of work: waiting on the network or disk (I/O-bound) and crunching numbers (CPU-bound). The event loop handles the first beautifully and the second terribly:

import { createHash } from "node:crypto";

// I/O-bound: mostly waiting. async/await + the event loop excels here.
async function fetchUser(id: number): Promise<{ id: number; name: string }> {
  const res = await fetch(`https://api.example.com/users/${id}`);
  return res.json();
}

// I/O-bound work overlaps perfectly: three "requests" take as long as one.
async function loadUsers(): Promise<unknown[]> {
  return Promise.all([fetchUser(1), fetchUser(2), fetchUser(3)]);
}

// CPU-bound: a synchronous hash loop. This BLOCKS the single event-loop thread.
function hashPasswords(passwords: string[]): string[] {
  // While this runs, NO other callback, timer, or awaited promise can proceed.
  return passwords.map((p) => {
    const h = createHash("sha256");
    for (let i = 0; i < 200_000; i++) h.update(p); // deliberately heavy
    return h.digest("hex");
  });
}

The hidden trap: because Node runs your JavaScript on one thread, hashPasswords freezes the entire process — every pending request, timer, and await stalls until it returns. We can demonstrate the freeze precisely with a busy loop and a timer that should fire at 50 ms:

const start = Date.now();
setTimeout(() => {
  console.log(`timer fired at ${Date.now() - start} ms (wanted 50)`);
}, 50);

// Busy-loop the single thread for ~300 ms.
const spinUntil = start + 300;
while (Date.now() < spinUntil) {
  /* burn CPU */
}
console.log(`busy loop done at ${Date.now() - start} ms`);

Running it under Node v22:

busy loop done at 309 ms
timer fired at 316 ms (wanted 50)

The timer was due at 50 ms but did not fire until 316 ms: the CPU loop held the thread hostage. Node’s main escape hatch for CPU-heavy JavaScript is worker_threads, a separate, heavier mechanism that exchanges data between threads by message passing, with the serialization cost that implies. Rust faces the same physics but hands you cleaner, first-class tools for both halves of the problem.

Rust Equivalent

Rust makes you pick the tool that matches the workload. For I/O-bound work, async tasks on a runtime overlap waits exactly like the event loop — three 100 ms “requests” finish in about 100 ms total. This example uses two crates — the tokio runtime and futures (for join_all) — so add them first with cargo add tokio --features full and cargo add futures:

use std::time::{Duration, Instant};
use tokio::time::sleep;

/// I/O-bound: mostly waiting. async + a runtime is the right tool.
async fn fetch(url: &str) -> usize {
    sleep(Duration::from_millis(100)).await; // stands in for a network round-trip
    url.len()
}

#[tokio::main]
async fn main() {
    let urls = ["https://a.example", "https://b.example", "https://c.example"];
    let start = Instant::now();

    // Run all three "requests" concurrently. They overlap because each .await
    // yields the worker while it waits, so total time ~= one request, not three.
    let results = futures::future::join_all(urls.iter().map(|u| fetch(u))).await;

    println!("results = {results:?}");
    println!("elapsed = {} ms", start.elapsed().as_millis());
}

Real output:

results = [17, 17, 17]
elapsed = 101 ms

For CPU-bound work, async does nothing — there is no waiting to overlap, only computation to spread across cores. That is a job for threads. The data-parallel crate rayon turns a sequential iterator into a parallel one with a one-word change, and on a multi-core machine the speedup is real:

use std::time::Instant;
use rayon::prelude::*;

/// CPU-bound: a deliberately heavy, purely synchronous computation.
fn heavy(seed: u64) -> u64 {
    let mut acc = seed;
    for _ in 0..50_000_000u64 {
        acc = acc.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
    }
    acc
}

fn main() {
    let inputs: Vec<u64> = (0..8).collect();

    // Sequential baseline.
    let start = Instant::now();
    let seq: Vec<u64> = inputs.iter().map(|&s| heavy(s)).collect();
    println!("sequential: {} ms", start.elapsed().as_millis());

    // Rayon: a parallel iterator spreads the work across CPU cores.
    let start = Instant::now();
    let par: Vec<u64> = inputs.par_iter().map(|&s| heavy(s)).collect();
    println!("rayon par:  {} ms", start.elapsed().as_millis());

    assert_eq!(seq, par);
    println!("results match: {}", seq == par);
}

Real output (on an 8-core machine, --release):

sequential: 830 ms
rayon par:  83 ms
results match: true

That is roughly a 10× speedup in this run (the exact factor depends on core count and machine load). Notice that there is no async, no .await, and no Tokio anywhere. CPU-bound work wants cores, not an event loop. The art is knowing which world you are in.

Tip: rayon needs no runtime and no Cargo.toml features beyond rayon = "1". It manages its own thread pool sized to your CPU. Tokio is for concurrency over I/O; rayon is for parallelism over data. They compose — see the Real-World Example.

Detailed Explanation

Two axes: concurrency vs parallelism, I/O-bound vs CPU-bound

Two independent distinctions drive every decision here.

Concurrency vs parallelism. Concurrency is dealing with many things at once by interleaving them (one cook juggling several pans). Parallelism is doing many things at literally the same instant (several cooks). Async gives you concurrency cheaply; whether it also gives you parallelism depends on the runtime’s scheduler (the multi-thread vs current-thread choice in tokio-intro.md). JavaScript’s event loop gives you concurrency but no parallelism for your own JS code: it all runs on one thread unless you explicitly opt into worker_threads. The deeper treatment is in concurrency.md; here we care only about its consequence for choosing a tool.

I/O-bound vs CPU-bound. This is the question to ask first about any task:

I/O-bound: the task spends most of its time waiting — on a socket, a disk, a database, a timer. The CPU is idle during the wait. Overlapping the waits is the whole win.
CPU-bound: the task spends most of its time computing — hashing, parsing, compressing, rendering, number-crunching. There is nothing to overlap; you only win by using more cores.

Async is built to overlap waits. It does not — cannot — make computation faster. Pointing async at a CPU-bound problem is like hiring a faster waiter to cook the food.
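To put rough numbers on the distinction, here is a back-of-the-envelope model in Rust. It is only a sketch under idealized assumptions (waits overlap perfectly, compute divides evenly across cores, scheduling costs nothing), and the names Workload and estimate_ms are invented for illustration rather than taken from any library.

```rust
/// The two workload shapes discussed above (hypothetical illustration type).
#[derive(Clone, Copy)]
enum Workload {
    /// Each task mostly waits `wait_ms`; the waits can all overlap.
    IoBound { wait_ms: u64 },
    /// Each task computes for `compute_ms`; only extra cores help.
    CpuBound { compute_ms: u64 },
}

/// Idealized wall-clock estimate (ms) for `tasks` identical tasks.
/// Assumption: overlapped waits cost about one wait in total, while compute
/// runs in batches of `cores` tasks at a time.
fn estimate_ms(workload: Workload, tasks: u64, cores: u64) -> u64 {
    if tasks == 0 {
        return 0;
    }
    match workload {
        Workload::IoBound { wait_ms } => wait_ms,
        Workload::CpuBound { compute_ms } => {
            let cores = cores.max(1);
            let batches = (tasks + cores - 1) / cores; // ceiling division
            batches * compute_ms
        }
    }
}

fn main() {
    let io = Workload::IoBound { wait_ms: 100 };
    let cpu = Workload::CpuBound { compute_ms: 100 };
    println!("3 overlapped waits, 1 core: {} ms", estimate_ms(io, 3, 1));
    println!("8 compute tasks, 1 core:    {} ms", estimate_ms(cpu, 8, 1));
    println!("8 compute tasks, 8 cores:   {} ms", estimate_ms(cpu, 8, 8));
}
```

In this model, extra cores do nothing for the I/O-bound case and overlapping does nothing for the CPU-bound case, which is why the workload question comes first.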

Why async wins for I/O and loses for CPU

When an async task hits .await on something that is not ready (a socket with no data yet), it yields the worker thread back to the runtime, which runs other tasks meanwhile. Thousands of mostly-idle connections can therefore share a handful of threads. That is exactly the I/O-bound sweet spot.

A CPU-bound loop never hits an .await — it just computes. So it never yields. On the multi-thread runtime it merely pins one worker (wasting the runtime’s lightweight scheduling); on the single-thread runtime it freezes everything, just like Node. The yield points that make async efficient simply do not exist in a tight compute loop.
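The role of yield points can be seen without any runtime at all. The sketch below is a toy round-robin scheduler built only on the standard library; it is not how Tokio is implemented, and YieldNow, worker, run_all, and trace are hypothetical names. A task that yields after each step interleaves with its sibling, while a task that never yields runs to completion before the sibling gets a turn.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; enough for a scheduler that simply re-polls.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// A future that returns Pending exactly once, giving other tasks a turn.
struct YieldNow(bool);
impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

type Log = Rc<RefCell<Vec<String>>>;
type Task = Pin<Box<dyn Future<Output = ()>>>;

/// Three steps of "work"; yields after each step only when `polite` is true.
async fn worker(name: &'static str, polite: bool, log: Log) {
    for step in 1..=3 {
        log.borrow_mut().push(format!("{name}{step}"));
        if polite {
            YieldNow(false).await;
        }
    }
}

/// Toy round-robin scheduler: poll every unfinished task once per pass.
fn run_all(mut tasks: Vec<Task>) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    while !tasks.is_empty() {
        tasks.retain_mut(|task| task.as_mut().poll(&mut cx).is_pending());
    }
}

/// Runs tasks "a" and "b" and returns the order in which their steps ran.
fn trace(a_is_polite: bool) -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let a: Task = Box::pin(worker("a", a_is_polite, log.clone()));
    let b: Task = Box::pin(worker("b", true, log.clone()));
    run_all(vec![a, b]);
    let order = log.borrow().clone();
    order
}

fn main() {
    println!("a yields:       {:?}", trace(true));
    println!("a never yields: {:?}", trace(false));
}
```

When a yields, the steps alternate (a1, b1, a2, b2, a3, b3); when it does not, all of a's steps run before b's first. A tight compute loop behaves like the second case.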

The runtime starvation trap (the Rust mirror of blocking the event loop)

This is the single most important practical consequence. A blocking or CPU-heavy synchronous call inside an async task starves its sibling tasks, because cooperative scheduling only hands off control at .await. On a single-thread runtime it is dramatic and deterministic:

use std::time::Duration;
use tokio::time::{sleep, Instant};

// A blocking wait that NEVER yields the async worker: it sleeps the OS thread.
// (A long CPU loop would behave the same way: neither hits an .await.)
fn blocking_work() {
    std::thread::sleep(Duration::from_millis(300));
}

// Single-thread runtime so the starvation is deterministic and easy to see.
#[tokio::main(flavor = "current_thread")]
async fn main() {
    let start = Instant::now();

    // A "heartbeat" task that SHOULD tick every 50 ms.
    let heartbeat = tokio::spawn(async move {
        for n in 1..=3 {
            sleep(Duration::from_millis(50)).await;
            println!("heartbeat {n} at {} ms", start.elapsed().as_millis());
        }
    });

    // This blocking call hogs the single worker thread for 300 ms. The heartbeat
    // cannot run until this returns, so every tick arrives about 300 ms late.
    blocking_work();
    println!("blocking work done at {} ms", start.elapsed().as_millis());

    heartbeat.await.unwrap();
}

Real output:

blocking work done at 305 ms
heartbeat 1 at 356 ms
heartbeat 2 at 409 ms
heartbeat 3 at 461 ms

The heartbeat was supposed to tick at 50, 100, and 150 ms. Instead, the first tick arrives at 356 ms: on a current-thread runtime the spawned task cannot even start until main reaches an .await, and that happens only after the 300 ms call returns. Every tick is therefore about 300 ms late. This is the same failure as the Node busy loop above; cooperative scheduling has the same Achilles’ heel everywhere.

The fix is to move the blocking work off the async workers with tokio::task::spawn_blocking, which runs it on a dedicated blocking-thread pool:

use std::time::Duration;
use tokio::time::{sleep, Instant};

fn blocking_work() {
    std::thread::sleep(Duration::from_millis(300));
}

#[tokio::main(flavor = "current_thread")]
async fn main() {
    let start = Instant::now();

    let heartbeat = tokio::spawn(async move {
        for n in 1..=3 {
            sleep(Duration::from_millis(50)).await;
            println!("heartbeat {n} at {} ms", start.elapsed().as_millis());
        }
    });

    // spawn_blocking moves the blocking call to a dedicated thread pool, so the
    // async worker stays free and the heartbeat ticks on time.
    let work = tokio::task::spawn_blocking(blocking_work);

    work.await.unwrap();
    println!("blocking work done at {} ms", start.elapsed().as_millis());

    heartbeat.await.unwrap();
}

Real output:

heartbeat 1 at 54 ms
heartbeat 2 at 108 ms
heartbeat 3 at 161 ms
blocking work done at 302 ms

Now the heartbeat ticks on time (54 / 108 / 161 ms) while the blocking work runs concurrently on its own thread. spawn_blocking is the moral equivalent of Node’s worker_threads, but far lighter to use. The full mechanics are in spawning-tasks.md.

When you do not need async at all

A point that surprises Node developers: lots of excellent Rust programs use no async whatsoever. A CLI that reads a file, transforms it, and writes it out has nothing to overlap — synchronous std::fs is simpler, faster to compile, and easier to reason about. A CPU-bound batch job wants threads, not a runtime. Reaching for #[tokio::main] reflexively (because that is what Node trained you to do) often adds a dependency and a layer of complexity you will never use.

For CPU-bound parallelism with no I/O, plain OS threads need no runtime at all:

use std::thread;
use std::time::Instant;

/// CPU-bound work: count primes below n with a naive trial-division loop.
fn count_primes(n: u64) -> u64 {
    (2..n).filter(|&x| (2..x).all(|d| x % d != 0)).count() as u64
}

fn main() {
    let ranges = [50_000u64, 50_000, 50_000, 50_000];
    let start = Instant::now();

    // Plain OS threads: no async, no runtime. The OS can schedule each thread
    // on its own core.
    let handles: Vec<_> = ranges
        .into_iter()
        .map(|n| thread::spawn(move || count_primes(n)))
        .collect();

    let total: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();

    println!("total primes = {total}");
    println!("elapsed = {} ms", start.elapsed().as_millis());
}

Real output (--release):

total primes = 20532
elapsed = 682 ms

No async, no Tokio, no .await — just threads doing CPU work in parallel. For real data-parallel pipelines, prefer rayon (it handles work-stealing and pool sizing for you); use raw std::thread for a handful of long-lived, distinct jobs.

Function coloring: the cost async imposes

There is a famous essay, “What Color is Your Function?”, describing how async splits a language’s functions into two colors: async functions and sync functions. The rules are asymmetric and infectious:

An async function can call a sync function freely.
A sync function cannot simply call an async function and get its value — it must drive the future through a runtime.
Calling an async function gives you a future; you must .await it (only legal inside another async function), so async-ness propagates up the call stack.

JavaScript has exactly this problem: await is only legal inside an async function (or at the top level of an ES module), so one await deep in your code tends to turn every caller async. Rust has it too, but with a sharper edge: the boundary is enforced by the type system, and a bare future does nothing until polled.
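Both halves of that sharper edge can be shown with the standard library alone. The following is a minimal sketch, not a real runtime: tiny_block_on is a hypothetical busy-polling driver (a production executor parks the thread and relies on wakers instead), and laziness_demo and summarize are illustrative names. It shows that an async block runs nothing until it is polled, and that a sync function needs some driver to get a value out of a future.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; the driver below simply polls again.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical minimal driver: poll the future in a loop until it is ready.
/// Adequate for futures that never really wait; not a substitute for a runtime.
fn tiny_block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}

async fn fetch_count() -> u32 {
    42
}

/// A sync function cannot `.await`; it has to drive the future explicitly.
fn summarize() -> u32 {
    tiny_block_on(fetch_count()) * 2
}

/// Returns (started before polling?, value, started after polling?).
fn laziness_demo() -> (bool, u32, bool) {
    let started = Cell::new(false);
    let fut = async {
        started.set(true);
        42u32
    };
    let before = started.get(); // still false: creating a future runs nothing
    let value = tiny_block_on(fut);
    (before, value, started.get())
}

fn main() {
    println!("summarize() = {}", summarize());
    println!("laziness_demo() = {:?}", laziness_demo());
}
```

In JavaScript the equivalent promise body would already have started by the time the promise object exists; here the flag flips only once the driver polls.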

In Rust, calling .await outside an async context is a hard compile error:

async fn fetch_count() -> u32 {
    42
}

// A plain synchronous function trying to call an async one.
fn summarize() -> u32 {
    // does not compile (error[E0728]): `.await` is only allowed inside
    // async fn / async block.
    let count = fetch_count().await;
    count * 2
}

fn main() {
    println!("{}", summarize());
}

Real compiler output:

error[E0728]: `await` is only allowed inside `async` functions and blocks
 --> src/main.rs:8:31
  |
6 | fn summarize() -> u32 {
  | --------------------- this is not `async`
7 |     // does not compile (error[E0728]): `.await` is only allowed inside
8 |     let count = fetch_count().await;
  |                               ^^^^^ only allowed inside `async` functions and blocks

The error names the cure: make summarize async too (and the coloring spreads), or bridge into the async world explicitly. Bridging from a synchronous function uses a runtime’s block_on, which runs a future to completion on the current thread and returns its value:

use std::time::Duration;
use tokio::runtime::Runtime;
use tokio::time::sleep;

async fn fetch_count() -> u32 {
    sleep(Duration::from_millis(10)).await;
    42
}

// A plain synchronous main, with no #[tokio::main]. We build a runtime by hand
// and use block_on as the bridge from the sync world into the async world.
fn main() {
    let rt = Runtime::new().expect("failed to build runtime");

    // block_on runs the future to completion on this thread and returns its value.
    let count = rt.block_on(fetch_count());

    println!("count = {count}");
}

Real output:

count = 42

#[tokio::main] is just sugar that builds a runtime and calls block_on(main()) for you. Knowing the desugaring matters when you must call async code from a context you do not control — a Drop impl, a synchronous trait method, an FFI callback — where block_on is your bridge.

Warning: Never call block_on (or any blocking call) from inside an async task — it blocks the worker and can deadlock the runtime. block_on is for crossing into async from genuinely synchronous code, not for nesting. If you are already async and just need to wait, use .await.

Key Differences

Question	JavaScript / Node.js	Rust
Default model	Async everything (one event loop)	You choose: sync, threads, or async
I/O-bound concurrency	`async`/`await` on the event loop	async tasks on a runtime (Tokio)
CPU-bound parallelism	`worker_threads` (heavy, serialized messages)	OS threads / `rayon` (shared memory, cheap)
“No concurrency needed”	Still usually async out of habit	Plain synchronous code; no runtime
Blocking the worker	Freezes the whole event loop	Freezes the runtime’s worker(s); use `spawn_blocking`
Offloading CPU work	`worker_threads`	`tokio::task::spawn_blocking`, threads, or `rayon`
Function coloring	Yes (`await` only in `async`)	Yes, type-enforced; futures are lazy
Sync→async bridge	Top-level `await` / an async IIFE	`Runtime::block_on` / `#[tokio::main]`
Cost of choosing wrong	Event-loop stalls; jank	Same stall, plus you may have pulled in a runtime you never needed

The mental shift for a TypeScript developer is this: async is not the default in Rust, it is a tool for I/O concurrency. In Node you make everything async because the platform gives you no real alternative. In Rust, slapping async on a CPU-bound or do-one-thing program is often a mistake: it adds a runtime and the Send + 'static constraints described in spawning-tasks.md without buying you anything.

Note: A handy decision rule. Are you mostly waiting on many things at once? → async (Tokio). Are you mostly computing, and want more cores? → threads / rayon. Just doing one thing, or computing in sequence? → plain synchronous code. A bit of blocking inside an otherwise-async program? → spawn_blocking.
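The decision rule above is small enough to write down as a lookup. The sketch below simply restates it as a Rust match; Workload, Tool, and choose_tool are hypothetical names made up for this illustration, and real programs often mix several of these shapes.

```rust
/// The four situations named in the decision rule (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Workload {
    /// Mostly waiting on many things at once.
    ManyConcurrentWaits,
    /// Mostly computing, and more cores would help.
    HeavyCompute,
    /// One thing at a time, or computing in sequence.
    SingleSequentialJob,
    /// A bit of blocking inside an otherwise-async program.
    BlockingCallInAsyncProgram,
}

/// The tool the rule points to for each situation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tool {
    AsyncRuntime,
    ThreadsOrRayon,
    PlainSync,
    SpawnBlocking,
}

/// Maps a workload to the tool suggested by the rule of thumb.
fn choose_tool(workload: Workload) -> Tool {
    match workload {
        Workload::ManyConcurrentWaits => Tool::AsyncRuntime,
        Workload::HeavyCompute => Tool::ThreadsOrRayon,
        Workload::SingleSequentialJob => Tool::PlainSync,
        Workload::BlockingCallInAsyncProgram => Tool::SpawnBlocking,
    }
}

fn main() {
    for workload in [
        Workload::ManyConcurrentWaits,
        Workload::HeavyCompute,
        Workload::SingleSequentialJob,
        Workload::BlockingCallInAsyncProgram,
    ] {
        println!("{workload:?} -> {:?}", choose_tool(workload));
    }
}
```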

Common Pitfalls

Pitfall 1: Using async for CPU-bound work and expecting a speedup

The most common reflex from Node: wrapping a heavy computation in async fn and tokio::spawn, expecting it to “run in the background faster.” It does not. Async adds yield points for waiting; a compute loop has none, so it just pins a worker. Worse, on a single-thread runtime it starves everything (shown above). Async never makes computation faster — only more cores do.

Fix: for CPU-bound work use rayon (data parallelism) or std::thread / spawn_blocking (to offload from the async workers). Reserve async for I/O.

Pitfall 2: Calling a blocking API inside an async task

This compiles and runs, so the compiler will not save you — which makes it especially dangerous:

// Anti-pattern (compiles, but misbehaves):
// std::thread::sleep, std::fs, reqwest::blocking, a synchronous DB driver, or a
// long CPU loop inside an async task all block the worker thread, with no yield.
tokio::spawn(async {
    std::thread::sleep(std::time::Duration::from_secs(5)); // blocks the worker!
    // Every other task on this worker stalls for 5 seconds.
});

Fix: use the async-aware equivalent (tokio::time::sleep(...).await, tokio::fs, an async DB driver like sqlx), or offload the genuinely-blocking call with tokio::task::spawn_blocking. The earlier heartbeat experiment shows both the failure and the fix. Diagnostic tooling such as tokio-console can help you spot tasks that hog a worker, but nothing will fix them for you.

Pitfall 3: Calling `block_on` from inside the runtime

Bridging is one-directional. block_on enters the async world from sync code; calling it while you are already on a runtime thread blocks that worker and can panic or deadlock:

// Anti-pattern: block_on inside an async context.
#[tokio::main]
async fn main() {
    let rt = tokio::runtime::Handle::current();
    // Calling block_on on the current runtime from within it panics:
    // "Cannot start a runtime from within a runtime."
    rt.block_on(async { 1 + 1 }); // panics at runtime
}

Running it produces a real panic (Cannot start a runtime from within a runtime. This happens because a function (like 'block_on') attempted to block the current thread while the thread is being used to drive asynchronous tasks.).

Fix: if you are already async, just .await. Use block_on only from genuinely synchronous entry points.

Pitfall 4: Adding `#[tokio::main]` to a program with no I/O concurrency

A CLI that processes one file, or a batch job that crunches numbers, gains nothing from a runtime — but pays for it in a dependency, slower compiles, and the Send + 'static rules that async-ness forces on spawned work. New Rustaceans coming from Node often async-ify everything by habit.

Fix: start synchronous. Add Tokio only when you have concurrent I/O to overlap. For CPU parallelism, reach for rayon, which needs no runtime at all.

Pitfall 5: Forgetting `.await` and getting a `Future` instead of a value

A coloring side effect: a bare async call returns a lazy future, so forgetting .await is a type error as soon as you use the result as a value (and an unused-future warning if you discard it), not a silent no-op. This is unlike JS, where a forgotten await gives you a Promise that may still run:

async fn fetch_count() -> u32 {
    42
}

#[tokio::main]
async fn main() {
    // does not compile (error[E0308]): forgot `.await`, so this is a Future.
    let count: u32 = fetch_count();
    println!("{count}");
}

Real compiler output (trimmed):

error[E0308]: mismatched types
 --> src/main.rs:8:22
  |
8 |     let count: u32 = fetch_count();
  |                ---   ^^^^^^^^^^^^^ expected `u32`, found future
  |                |
  |                expected due to this
  |
note: calling an async function returns a future
help: consider `await`ing on the `Future`
  |
8 |     let count: u32 = fetch_count().await;
  |                                   ++++++

Fix: add .await. The compiler even suggests it. Because Rust futures are lazy (see promises-vs-futures.md), a forgotten .await means the work never even starts — but the type system catches it long before that becomes a runtime mystery.

Best Practices

Classify the workload before choosing a tool

Ask “I/O-bound or CPU-bound?” first, every time. I/O-bound and many-at-once → async. CPU-bound → threads / rayon. One sequential thing → plain sync. This single question prevents the majority of mismatched-tool mistakes.

Keep async functions free of blocking and heavy CPU work

Treat an async task like the Node event loop: anything that does not yield is a liability. Use async-aware I/O (tokio::fs, tokio::net, sqlx, reqwest’s async client), and push blocking calls and CPU loops to spawn_blocking or a thread pool. A good rule: every code path in an async fn should reach an .await “soon.”

Use `rayon` for data parallelism, Tokio for I/O concurrency — and compose them

These are not competitors. A server can accept connections with Tokio (I/O concurrency) and, inside a spawn_blocking closure, use rayon to parallelize a CPU-heavy transform across cores. Keep the two pools distinct: Tokio’s workers for I/O, the blocking/rayon pool for computation.

Do not reach for a runtime you do not need

Synchronous Rust is a feature, not a limitation. Libraries especially should think hard before becoming async-only — it colors every caller. Where practical, expose a sync core and let callers choose; or offer both, gated behind a feature flag.

Bridge sync↔async at the edges, deliberately

Use block_on (or #[tokio::main]) at the boundary where synchronous code must enter async — main, a sync trait impl, an FFI callback. Never nest block_on inside async. Within async, propagate with .await and ? (see async-await.md).

Right-size the parallelism to the cores

Spawning a million async tasks for I/O is fine — they are cheap and mostly idle. Spawning a million threads for CPU work is not; it thrashes the scheduler and exhausts memory on stacks. For CPU work, parallelism should track core count, which is exactly what rayon’s pool does by default.
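If you are not using rayon, the same sizing idea can be approximated by hand. This is a sketch using only the standard library, with parallel_sum_of_squares as a hypothetical helper name: it asks the OS how much parallelism is available, splits the input into at most that many chunks, and runs one scoped thread per chunk. rayon's work-stealing pool does this more cleverly; the point here is only that the thread count follows the core count, not the item count.

```rust
use std::thread;

/// Sums the squares of `data` using at most one thread per available core.
/// Assumes the individual squares and their total fit in a u64.
fn parallel_sum_of_squares(data: &[u64]) -> u64 {
    // Fall back to a single worker if the OS cannot report its parallelism.
    let workers = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);

    // Ceiling division gives at most `workers` chunks; max(1) keeps the chunk
    // length valid when `data` is empty.
    let chunk_len = ((data.len() + workers - 1) / workers).max(1);

    // Scoped threads may borrow `data` directly, so no Arc or 'static is needed.
    thread::scope(|scope| {
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().map(|&x| x * x).sum::<u64>()))
            .collect();

        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker panicked"))
            .sum::<u64>()
    })
}

fn main() {
    let data: Vec<u64> = (1..=1000).collect();
    println!("sum of squares = {}", parallel_sum_of_squares(&data));
}
```

However many items the slice holds, the number of threads stays bounded by the machine's reported parallelism.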

Real-World Example

A production-flavored pipeline that mixes both worlds: a batch image service that downloads images (I/O-bound → async, overlapped) and then processes each one (CPU-bound → offloaded with spawn_blocking so it never stalls the downloads). This is the canonical “concurrent I/O feeding parallel compute” shape.

use std::time::{Duration, Instant};
use tokio::task::JoinSet;
use tokio::time::sleep;

/// I/O-bound: download an image. Mostly waiting on the network → async.
async fn download(id: u32) -> Vec<u8> {
    sleep(Duration::from_millis(80)).await; // network round-trip
    vec![(id % 256) as u8; 1_000_000]       // pretend this is a 1 MB image
}

/// CPU-bound: a synchronous transform (resize + checksum). No .await here.
fn process(bytes: &[u8]) -> u64 {
    // Stand-in for real image work: a heavy fold over every byte.
    bytes
        .iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(1099511628211).wrapping_add(b as u64))
}

#[tokio::main]
async fn main() {
    let start = Instant::now();
    let mut set = JoinSet::new();

    for id in 0..4u32 {
        set.spawn(async move {
            // 1. Await the I/O concurrently with the other tasks.
            let bytes = download(id).await;

            // 2. Offload the CPU-bound transform to the blocking pool so it does
            //    not stall the async workers driving the other downloads.
            let checksum = tokio::task::spawn_blocking(move || process(&bytes))
                .await
                .expect("processing task panicked");

            (id, checksum)
        });
    }

    let mut results = Vec::new();
    while let Some(joined) = set.join_next().await {
        results.push(joined.expect("worker panicked"));
    }
    results.sort();

    for (id, checksum) in results {
        println!("image {id}: checksum {checksum}");
    }
    println!("elapsed = {} ms", start.elapsed().as_millis());
}

Real output (--release):

image 0: checksum 0
image 1: checksum 15279771427360356480
image 2: checksum 12112798781011161344
image 3: checksum 8945826134661966208
elapsed = 91 ms

Four 80 ms downloads overlap (so they cost ~80 ms together, not 320 ms), and each CPU-bound process runs on the blocking pool without freezing the async workers — total ~91 ms. The same pipeline written as “async everything” would block a worker during each process; written as “threads everything” it would waste threads sitting idle during each download. Matching the tool to the workload-half is the whole point.

Note: In production you would download with reqwest, and if process were itself data-parallel you could use rayon inside the spawn_blocking closure. Sharing state across the tasks (a counter, a cache) uses the Arc<Mutex<_>> pattern in arc-mutex-pattern.md. Error handling with ? across async boundaries is covered in async-await.md.

Exercises

Exercise 1: Recognize and parallelize CPU-bound work

Difficulty: Easy

Objective: Identify a workload as CPU-bound and reach for data parallelism instead of async.

Instructions:

Write a synchronous fn collatz_steps(n: u64) -> u64 that counts how many steps the Collatz sequence takes to reach 1 (even → n/2, odd → 3n+1).
Over the range 1..100_000, find the number with the longest chain.
Use rayon’s parallel iterator (into_par_iter) — not async — to spread the work across cores. Print the winning number and its step count.
In a comment, state why async would not help here.

Solution

use rayon::prelude::*;

/// CPU-bound: count Collatz steps to reach 1. Pure computation, no waiting.
fn collatz_steps(mut n: u64) -> u64 {
    let mut steps = 0;
    while n != 1 {
        n = if n % 2 == 0 { n / 2 } else { 3 * n + 1 };
        steps += 1;
    }
    steps
}

fn main() {
    // CPU-bound: there is no I/O to overlap, so async buys nothing. Only more
    // cores help, and that is exactly what rayon's parallel iterator gives us.
    let (best_n, best_steps) = (1..100_000u64)
        .into_par_iter()
        .map(|n| (n, collatz_steps(n)))
        .max_by_key(|&(_, steps)| steps)
        .unwrap();

    println!("n = {best_n} has {best_steps} steps");
}

Output (--release):

n = 77031 has 350 steps

The computation is pure CPU work with no waiting, so async/Tokio would add overhead without speeding anything up. rayon parallelizes across cores with a one-word change from into_iter to into_par_iter.

Exercise 2: Keep the runtime responsive by offloading a blocking call

Difficulty: Medium

Objective: Fix a blocking call inside an async program so other tasks stay responsive.

Instructions:

Write a synchronous fn slow_hash(password: &str) -> u64 that calls std::thread::sleep for 150 ms (standing in for a deliberately slow password hash) and then folds the bytes into a u64.
In a current_thread Tokio runtime, spawn a “heartbeat” task that prints twice, 50 ms apart.
Compute the hash without starving the heartbeat — offload it with spawn_blocking — then await both.
Verify from the timing that the heartbeat ticked on schedule.

Solution

use std::time::Duration;
use tokio::time::{sleep, Instant};

/// A synchronous, blocking "hash" of a password (stands in for bcrypt/argon2).
fn slow_hash(password: &str) -> u64 {
    std::thread::sleep(Duration::from_millis(150)); // stands in for slow, blocking hash work
    password.bytes().fold(0u64, |a, b| a.wrapping_mul(31).wrapping_add(b as u64))
}

#[tokio::main(flavor = "current_thread")]
async fn main() {
    let start = Instant::now();

    // A heartbeat proving the runtime stays responsive.
    let heartbeat = tokio::spawn(async move {
        for n in 1..=2 {
            sleep(Duration::from_millis(50)).await;
            println!("heartbeat {n} at {} ms", start.elapsed().as_millis());
        }
    });

    let password = String::from("hunter2");
    // Offload the blocking hash so it does not stall the single async worker.
    let hash = tokio::task::spawn_blocking(move || slow_hash(&password))
        .await
        .expect("hashing task panicked");

    println!("hash = {hash} at {} ms", start.elapsed().as_millis());
    heartbeat.await.unwrap();
}

Output:

heartbeat 1 at 52 ms
heartbeat 2 at 104 ms
hash = 95755137202 at 152 ms

The heartbeat ticks at 52 and 104 ms — right on time — because spawn_blocking moved the 150 ms blocking call to a separate thread pool. Calling slow_hash(...) directly in main (without spawn_blocking) would have frozen the single worker and pushed the first heartbeat past 150 ms.

Exercise 3: A mixed pipeline — concurrent I/O feeding parallel compute

Difficulty: Medium–Hard

Objective: Combine async I/O concurrency with rayon CPU parallelism in one program, putting each tool where it belongs.

Instructions:

Write async fn fetch_shard(id: u64) -> Vec<u64> that sleeps 50 ms (I/O) then returns 250,000 numbers.
Write a synchronous fn sum_of_squares(data: &[u64]) -> u64 that uses rayon’s par_iter to sum the squares (CPU-bound).
In async main, fetch four shards concurrently, flatten them, then offload the CPU-bound reduction to the blocking pool (where rayon parallelizes it). Print the total and elapsed time.
The downloads should overlap (≈50 ms, not 200 ms) and the reduction should not stall the runtime.

Solution

use std::time::{Duration, Instant};
use rayon::prelude::*;
use tokio::time::sleep;

/// I/O-bound: fetch a "shard" of numbers (async).
async fn fetch_shard(id: u64) -> Vec<u64> {
    sleep(Duration::from_millis(50)).await; // network wait
    (0..250_000).map(|x| x + id * 250_000).collect()
}

/// CPU-bound: sum the squares of a slice (synchronous, parallelizable).
fn sum_of_squares(data: &[u64]) -> u64 {
    data.par_iter().map(|&x| x.wrapping_mul(x)).sum()
}

#[tokio::main]
async fn main() {
    let start = Instant::now();

    // 1. Fetch four shards concurrently (I/O overlaps → ~50 ms, not 200).
    let shards = futures::future::join_all((0..4u64).map(fetch_shard)).await;

    // 2. Flatten, then offload the CPU-bound reduction to the blocking pool,
    //    where rayon spreads it across cores.
    let all: Vec<u64> = shards.into_iter().flatten().collect();
    let total = tokio::task::spawn_blocking(move || sum_of_squares(&all))
        .await
        .expect("compute task panicked");

    println!("sum of squares = {total}");
    println!("elapsed = {} ms", start.elapsed().as_millis());
}

Output (--release):

sum of squares = 333332833333500000
elapsed = 62 ms

The four 50 ms fetches overlap via join_all (I/O concurrency, ~50 ms total), and the CPU-bound reduction runs on the blocking pool with rayon spreading it across cores — so the async runtime never stalls. This is the production shape: Tokio for the waiting, rayon/threads for the computing, spawn_blocking as the seam between them. join_all comes from the futures crate (futures = "0.3"); for a fixed, small set you could equally use tokio::join!.
