Part I — The Language·Chapter 4

Async Rust —
Futures, Executors,
and the Poll Model

async/await in Rust is a compile-time transformation, not a runtime feature. The compiler converts async functions into state machines. An executor drives those machines. Understanding this mechanism is not optional — it is the mental model you need to write correct concurrent embedded firmware and correct web services.
§ 4.1
What Async Actually Is — And What It Is Not

Most programmers first encounter async through a framework that hides the mechanism — Python asyncio, Node.js, Django Channels, Go goroutines. In Rust, you must understand the mechanism because Rust exposes it explicitly, and code that ignores it breaks in subtle ways.

The core insight: async/await in Rust is a compile-time transformation, not a runtime feature. When you write async fn, the compiler does not create a thread or anything involving the OS. It identifies every .await point, identifies which local variables must survive across each await, and generates a struct — a state machine — that stores exactly those variables and implements the Future trait. That Future is then handed to an executor, which drives it to completion.

Zero-Cost Async

An async function's state is a struct — no heap required.

In Node.js every async function creates a Promise on the heap. In Python every coroutine is a heap-allocated object with its own frame. In Go every goroutine starts with a roughly 2 KB heap-allocated stack that the runtime grows on demand. All of these accumulate and keep a garbage collector busy.

In Rust, an async function's state machine is sized at compile time. It can live on the stack, in a static array, or on the heap — wherever the caller places it. Embassy uses a static arena. Tokio uses the heap. The Future has no opinion about where it lives. This is why Embassy runs concurrent tasks on a Pico 2 with no heap allocator.
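The compile-time sizing is directly observable on a desktop toolchain with std::mem::size_of_val. A minimal sketch — the names step and work are illustrative, not from this chapter:

```rust
use std::mem::size_of_val;

async fn step() -> u32 {
    1
}

async fn work() -> u32 {
    // `a` must survive the second await, so it is stored
    // inside the generated state machine.
    let a = step().await;
    let b = step().await;
    a + b
}

fn main() {
    // Calling an async fn runs nothing. It merely constructs
    // the state machine value, right here on the stack.
    let fut = work();
    let bytes = size_of_val(&fut);
    assert!(bytes > 0);
    println!("work()'s state machine occupies {bytes} bytes on the stack");
}
```

No executor, no heap: fut is an ordinary stack value whose size the compiler computed, which is exactly what lets Embassy place it in a static arena instead.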

§ 4.2
The Future Trait and Poll Model

A Future is any type implementing:

the Future trait — core interface
pub trait Future {
    type Output;   // type produced when complete
    fn poll(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>,
    ) -> Poll<Self::Output>;
}
enum Poll<T> {
    Ready(T),   // done — here is the result
    Pending,   // not done — wake me when conditions change
}

The Context<'_> carries a Waker. When the thing the Future is waiting for happens — a timer fires, a GPIO edge arrives — it calls waker.wake(), which tells the executor to poll this task again. This is the complete mechanism: poll → Pending + register waker → event fires → waker.wake() → executor re-polls → Ready.

THE POLL-WAKEUP CYCLE
───────────────────────────────────────────────
Executor              Task             Hardware

  poll(cx) ──────────▶
                         Not ready yet.
                          Registers cx.waker()
                          with timer hardware.
              Pending ◀──────────

  (runs other tasks)
                                         Timer fires
                                          waker.wake()
  poll(cx) ──────────▶
                         Timer elapsed.
                          Continuing work.
              Ready(v) ◀──────────

On the Pico: WFI instruction between wakeups.
CPU clock gates entirely. Near-zero idle power.
Figure 4.1 — The executor never spins. It sleeps until woken by hardware events.
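The cycle in Figure 4.1 can be reproduced on the desktop in a few dependency-free lines. This is a sketch, not Embassy's implementation: YieldOnce stands in for a timer future, and the hand-rolled no-op Waker stands in for an executor's real wakeup plumbing.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// A future that returns Pending on the first poll, Ready on the second.
struct YieldOnce {
    polled: bool,
}

impl Future for YieldOnce {
    type Output = &'static str;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.polled {
            Poll::Ready("done")
        } else {
            self.polled = true;
            // Ask to be polled again immediately. A real future would
            // hand this waker to timer hardware or an ISR instead.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// A Waker that does nothing — enough to drive poll() by hand.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn main() {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = YieldOnce { polled: false };
    let mut fut = Pin::new(&mut fut); // YieldOnce is Unpin, so Pin::new is fine
    assert!(matches!(fut.as_mut().poll(&mut cx), Poll::Pending));
    assert!(matches!(fut.as_mut().poll(&mut cx), Poll::Ready("done")));
    println!("poll cycle: Pending, then Ready");
}
```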
§ 4.3
The State Machine — What the Compiler Actually Generates
state_machine.rs — async to state machine
// What you write:
async fn blink(mut led: Output<'static>) {
    loop {
        led.set_high();
        Timer::after_millis(500).await;  // await A
        led.set_low();
        Timer::after_millis(500).await;  // await B
    }
}

// What the compiler conceptually generates:
// (actual output is more complex but this is the mental model)
enum BlinkSM {
    Start  { led: Output<'static> },
    AwaitA { led: Output<'static>, timer: TimerFuture },
    AwaitB { led: Output<'static>, timer: TimerFuture },
}
// Size = max(variant sizes) = known at compile time.
// Embassy's task arena allocates exactly sizeof(BlinkSM) bytes.
// No heap. No GC. No surprise allocation at runtime.
§ 4.4
Embassy vs Tokio — Same Model, Different Worlds
comparison.rs — Embassy and Tokio side by side
// ── EMBASSY: no_std, no heap, static tasks ──
#![no_std]
#![no_main]
#[embassy_executor::task]
async fn blink_task(led: PIN_25) { /* ... */ }

#[embassy_executor::main]
async fn main(spawner: Spawner) {
    let p = embassy_rp::init(Default::default());
    spawner.spawn(blink_task(p.PIN_25)).unwrap();
    // Tasks: statically allocated in fixed-size arena
    // Wakeups: hardware ISRs via embassy_rp interrupt handlers
    // Sleep: WFI — CPU gates clock until next interrupt
}

// ── TOKIO: std, heap, thread pool ──
#[tokio::main]
async fn main() {
    tokio::spawn(handle_request());   // boxed on heap
    tokio::spawn(stream_alerts());    // boxed on heap
    // Tasks: Box<dyn Future> on the heap, work-stealing thread pool
    // Wakeups: OS epoll (Linux) / kqueue (macOS) / IOCP (Windows)
    // Sleep: OS thread yield — scheduler decides next task
}

// KEY RULES FOR BOTH:
// 1. Never block inside async — use .await or spawn_blocking
// 2. Never hold a sync Mutex across an .await
// 3. Drop Futures correctly — cancellation may leave state incomplete
§ 4.5
Pin — Why Futures Cannot Move After First Poll

You noticed Pin<&mut Self> in the Future trait. A Pin guarantees the value will not move in memory after it is first polled. This exists because async state machines can contain self-referential data — a local variable and a reference pointing into it. If the struct moved, the reference would become a dangling pointer. Pin prevents movement. The executor promises to give the Future the same memory address on every subsequent poll call.

In practice you almost never write Pin directly. The .await syntax handles it. tokio::pin!() and the futures crate's pin_mut!() — the macro you will see throughout Embassy code — handle it when you need to poll the same Future across multiple select! calls. The rule to remember: once a Future is polled, never move it.
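Since Rust 1.68 the standard library also ships std::pin::pin!, the same idea as those macros. A minimal sketch — the no-op Waker is hand-rolled only so the example needs no external crates, and an async block with no pending awaits completes on its very first poll:

```rust
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// A Waker that does nothing — enough to drive poll() by hand.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn main() {
    // pin! anchors the future to this stack frame. From here on the
    // borrow checker will not let it move — exactly the guarantee
    // that poll()'s Pin<&mut Self> receiver demands.
    let mut fut = pin!(async { 21 * 2 });
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    // No .await inside the block, so the first poll runs to completion.
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(v) => assert_eq!(v, 42),
        Poll::Pending => unreachable!(),
    }
}
```

Note that fut.as_mut() lets you poll repeatedly without giving up the pinned reference — the same pattern the select! macros rely on.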

§ 4.6
Three Pitfalls That Will Bite You
1 — Blocking inside async

Embassy is single-threaded per core. One blocking call — a spin-wait, a blocking delay, a blocking I2C write — freezes the entire executor for its duration. No other tasks run. Use Timer::after_millis(n).await, not cortex_m::asm::delay(). Use embassy_rp's async I2C driver, not the blocking one. In Tokio, wrap any CPU-bound or blocking-I/O work in tokio::task::spawn_blocking(|| expensive_work()).

2 — Holding a Mutex across .await
mutex_pitfall.rs
// WRONG — std Mutex held across await = potential deadlock
async fn bad(m: Arc<std::sync::Mutex<State>>) {
    let mut g = m.lock().unwrap();
    do_async_work().await;  // lock held while yielded — deadlock if another task needs it
    g.x += 1;
}

// CORRECT option A — async work first, then lock briefly
async fn good_a(m: Arc<std::sync::Mutex<State>>) {
    let result = do_async_work().await;  // async first
    m.lock().unwrap().x += result;        // brief lock, no await inside
}

// CORRECT option B — use tokio::sync::Mutex (yields instead of blocking)
async fn good_b(m: Arc<tokio::sync::Mutex<State>>) {
    let mut g = m.lock().await;  // yields while waiting — no deadlock
    g.x += 1;
}
// In Embassy: embassy_sync::mutex::Mutex — same pattern
3 — Cancellation Safety

When you select() two Futures and one resolves first, the other is dropped. If that dropped Future had partially completed some operation — sent half a command, incremented a counter — the partial state remains. Always structure operations so that dropping them mid-execution leaves the system consistent. Embassy I/O futures are generally cancellation-safe. Complex application futures require careful design.
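The hazard is easy to reproduce on the desktop. In this sketch — hand-rolled illustrative types, not Embassy's — TwoIncrements models an operation with two steps separated by a yield point. Dropping it after the first poll, as select() does to the losing branch, leaves the counter half-updated:

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Increments the counter, yields once, then increments again.
struct TwoIncrements<'a> {
    counter: &'a Cell<u32>,
    yielded: bool,
}

impl Future for TwoIncrements<'_> {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if !self.yielded {
            self.counter.set(self.counter.get() + 1); // first half of the operation
            self.yielded = true;
            Poll::Pending
        } else {
            self.counter.set(self.counter.get() + 1); // second half
            Poll::Ready(())
        }
    }
}

/// A Waker that does nothing — enough to drive poll() by hand.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn main() {
    let counter = Cell::new(0u32);
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    {
        let mut fut = TwoIncrements { counter: &counter, yielded: false };
        let mut fut = Pin::new(&mut fut);
        assert!(matches!(fut.as_mut().poll(&mut cx), Poll::Pending));
        // fut is dropped here — as if select! picked the other branch.
    }
    // Half the operation happened; nothing rolled it back.
    assert_eq!(counter.get(), 1);
    println!("counter after cancellation: {}", counter.get());
}
```

A cancellation-safe design would either perform the whole side effect before the first yield point or undo it in Drop.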

§ 4.7
Exercises
Exercise 4.1 — Write a Future by Hand

Implement Future manually for a countdown

Without using async/await, implement Future for a CountdownFuture struct that resolves after N polls. In poll(): if count is zero, return Ready(()). Otherwise decrement, call cx.waker().wake_by_ref(), and return Pending. Build a simple blocking spin executor to drive it. After this exercise, you will understand exactly what .await desugars to — it is not magic, it is this loop.

Exercise 4.2 — Two Embassy Tasks

Producer/consumer with a shared counter

Write a producer task that increments a shared counter every 100 ms, and a consumer task that reads it every 500 ms and displays it on your TM1637. Use embassy_sync::mutex::Mutex. Verify neither task blocks the other, and that you never hold the mutex across an await point. Since the producer runs five times for each display refresh, the counter should advance by roughly 5 between successive readings.

Exercise 4.3 — select! for Button or Timeout

Non-blocking event multiplexing

Using Embassy's select(), write a task that waits for either a GPIO button press on PIN_15 or a 5-second timeout. Display "btn" or "time" on the TM1637 to indicate which arrived first. This is the fundamental embassy pattern for non-blocking event handling — used in every real-world firmware that reacts to multiple independent events.