Threads and the memory model

Threads were added in 0.3.0. A thread is an OS thread. spawn starts one and join waits for it; both are always-available builtins like alloc and read_file, and neither is gated behind a paradigm. Channels arrived in 0.3.1, mutexes and condition variables in 0.3.2, and the thread pool plus the non-blocking channel operations in 0.3.3.

For a task-oriented walkthrough, see the concurrency guide. For the stdlib API surface, see std.concurrent.

Spawn and join

spawn(f: () -> void) -> (thread, error) takes a lambda literal of type () -> void, written at the call site. The error exists when the operating system refuses the thread, and the must-handle rule makes the caller face it. A closure variable cannot be spawned, since only the literal site knows the environment layout the runtime copies; wrap the call in a literal instead.

join(t: thread) -> error blocks until the body returns. thread is an opaque builtin type like error. The handle is a record in the generational heap and join retires it, so a second join of the same handle faults through the same check that catches a use after free. Join what you spawn: a thread still running when main returns dies mid-work.

func main() -> int32 {
    t, e := spawn(lambda () -> void {
        println("worker")
    })
    if e.exists() {
        printerr(e)
        return 1
    }
    je := join(t)
    je.ignore()
    println("done")
    return 0
}

What crosses a spawn

A spawned lambda captures outer variables by immutable copy, like every lambda, and the copies live in a private heap block the runtime frees when the body returns. A thread therefore never reads another thread’s stack and never mutates another thread’s locals.

Scalars, strings, fixed arrays, structs, enums, tuples, raw pointers, and handle structs such as AtomicInt cross freely as captures. Capturing a slice, a closure, or an interface value is a compile error, wherever it sits, including buried in a struct or enum field, since each may view the spawning frame. A captured managed *T becomes a borrow inside the thread: the thread reads through it, and freeing or moving the binding there is a compile error. The ownership pass tracks direct bindings only, so a pointer laundered through an aggregate falls to the runtime generation backstop, the division of labor the ownership rules already document.
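
A minimal sketch of the copy rule, using only constructs that appear elsewhere on this page. The variable n is illustrative, and the @paradigm procedural line assumes that mut and reassignment need it, as the mutex example suggests. Under the rule stated above, the worker should see the value n held when spawn ran, whatever main assigns afterwards.

```dusk
@paradigm procedural

func main() -> int32 {
    mut n: int64 = 1
    t, e := spawn(lambda () -> void {
        // n here is the thread's private copy, taken when spawn ran
        println(n)
    })
    if e.exists() {
        printerr(e)
        return 1
    }
    // reassigning the outer n does not reach the thread's copy
    n = 2
    je := join(t)
    je.ignore()
    println(n)
    return 0
}
```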

Channels

A channel is a bounded, thread-safe queue in std.concurrent.channel, an ordinary generic struct over runtime shims, not a compiler type. Channel<T> holds at most the capacity given at construction, always at least one.

@import std.concurrent.channel

func main() -> int32 {
    jobs: Channel<int64> = chan_new(8)
    e := chan_send(jobs, 42)
    e.ignore()
    v, re := chan_recv(jobs)
    re.ignore()
    println(v)
    chan_close(jobs)
    chan_free(jobs)
    return 0
}

chan_new<T>(cap: int64) -> Channel<T> sizes the element from the binding annotation, the same rule alloc uses, so a bare jobs := chan_new(8) cannot pin T and is a compile error. A capacity below one, or exhausted memory, is fatal rather than a returned error, which is the allocator’s contract.

chan_send(c, x) -> error copies the value in and blocks while the channel is full. Its error exists when the channel is closed, whether it was closed before the call or while the sender waited. chan_recv(c) -> (T, error) copies the oldest value out and blocks while the channel is empty. Its error exists only once the channel is closed and drained, so a loop breaking on e.exists() consumes everything that was sent. The value beside that error is the zero pattern for T and means nothing. When T is a managed pointer that zero is null, and dereferencing it faults by name as a null dereference. chan_close(c) is idempotent, wakes every blocked sender and receiver, and discards nothing already buffered.
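
The blocking operations combine into the usual producer and consumer shape. The sketch below assumes a single sender that closes the channel once it has sent everything, and it tracks loop exit with an integer flag (open, an illustrative name) because this page does not show a break statement.

```dusk
@paradigm procedural
@import std.concurrent.channel

func main() -> int32 {
    jobs: Channel<int64> = chan_new(2)
    t, e := spawn(lambda () -> void {
        mut i: int64 = 0
        while i < 5 {
            // blocks whenever two values are already buffered
            se := chan_send(jobs, i)
            se.ignore()
            i = i + 1
        }
        // the only sender is finished; closing discards nothing buffered
        chan_close(jobs)
    })
    if e.exists() {
        printerr(e)
        chan_free(jobs)
        return 1
    }
    mut open: int64 = 1
    while open == 1 {
        v, re := chan_recv(jobs)
        if re.exists() {
            // closed and drained: v is the zero pattern, so skip it
            open = 0
        }
        if open == 1 {
            println(v)
        }
    }
    je := join(t)
    je.ignore()
    chan_free(jobs)
    return 0
}
```

Because the receive error appears only after the buffer is empty, the consumer sees every value the producer sent before the loop ends.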

A channel element must be safe to carry to another thread, the same rule spawn captures follow: an element type containing a slice, a closure, or an interface value, wherever it sits, is a compile error at the instantiation, since each may view the sending frame and the ring would deliver a dangling view. Send heap-owned data instead.

The handle is one word and copies freely, including into a spawned lambda’s captures, and every copy names the same channel. It is deliberately exempt from the single-owner rule because it is not a managed pointer: a channel is a sharing point, and aliasing it is its purpose.

Ownership across a channel

Ownership crosses a thread boundary by moving a managed pointer through a channel. chan_send(c, move(p)) kills the sender’s name at compile time through the ordinary argument-position move, and the receiver’s q, e := chan_recv(c) binds a fresh owner through the ordinary call-returns-ownership rule. Sending without move leaves the sender holding a live name, so the sender and receiver then share the record with no order between them. The generation check backstops a free racing a use, best effort, as the memory model says.
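
A minimal sketch of a moved send, assuming Channel accepts a managed pointer element written as *Vector<int64> and that the vector helpers behave as they do in the mutex example later on this page. The names boxes, p, and q are illustrative.

```dusk
@import std.concurrent.channel
@import std.vector

func main() -> int32 {
    boxes: Channel<*Vector<int64>> = chan_new(1)
    t, e := spawn(lambda () -> void {
        // q is a fresh owner, so this thread is the one that frees the vector
        q, re := chan_recv(boxes)
        // nothing closes the channel before the send, so no closed error here
        re.ignore()
        println(vec_len(q))
        vec_free(q)
        free(q)
    })
    if e.exists() {
        printerr(e)
        chan_free(boxes)
        return 1
    }
    p: *Vector<int64> = alloc(vec_new())
    vec_push(p, 7)
    // move(p) ends the sender's name; p must not be used after this line
    se := chan_send(boxes, move(p))
    se.ignore()
    // close, join, then free; the close keeps the buffered pointer
    chan_close(boxes)
    je := join(t)
    je.ignore()
    chan_free(boxes)
    return 0
}
```

Exactly one name owns the vector at each point: p until the send, q after the receive.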

A moved send that the channel refuses loses the record. When chan_send(c, move(p)) returns the closed error, the value never entered the ring, the sender’s name is already dead, and no name anywhere reaches the allocation again, so it leaks. The same applies to managed pointers still buffered when chan_free runs, since the ring holds raw bytes and frees none of them. Neither is corruption, and neither happens in the sanctioned protocol where senders finish before the close, but a design that closes under active movers pays in leaked records, not faults.

Non-blocking and timed operations

Added in 0.3.3, three operations refuse instead of parking:

chan_try_send(c, x) -> error reports “channel is full” without waiting for room.
chan_try_recv(c) -> (T, error) reports “channel is empty” without waiting for a value.
chan_recv_timeout(c, ms) -> (T, error) parks at most ms milliseconds and reports “receive timed out”. The wait is measured against a monotonic clock, so a wall-clock step cannot stretch or shrink it.

Each still reports the closed message its blocking twin uses, and the value beside any of these errors is the zero pattern for T. A tick loop parks on chan_recv_timeout, does a round of work, and loops back in, which is the event-loop shape the async release builds on.
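
The tick loop can be sketched as follows. The channel name events, the 50 ms period, and the three-pass bound are illustrative, and the got flag stands in for an else branch, which this page does not show.

```dusk
@paradigm procedural
@import std.concurrent.channel

func main() -> int32 {
    events: Channel<int64> = chan_new(4)
    // queue one event without waiting; a full channel would refuse here
    se := chan_try_send(events, 7)
    se.ignore()
    mut tick: int64 = 0
    while tick < 3 {
        v, e := chan_recv_timeout(events, 50)
        mut got: int64 = 1
        if e.exists() {
            // timed out or closed: v is the zero pattern, so skip it
            got = 0
        }
        if got == 1 {
            println(v)
        }
        // one round of periodic work per pass, with or without a value
        tick = tick + 1
    }
    chan_close(events)
    chan_free(events)
    return 0
}
```

The first pass should return the queued value at once, and the later passes should each wait out the timeout before looping.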

Channel shutdown

Shutdown follows one order: close the channel, join every thread that touches it, then chan_free it. Freeing a channel while a thread is blocked inside a send or receive is fatal with a named message, caught best effort. Using a channel after chan_free is undefined. That is the raw layer’s honor system, since the one-word handle carries no generation.

Mutexes and condition variables

std.concurrent.sync carries Mutex and Condvar, ordinary structs over runtime shims like the channel. The blessed shape for shared mutable state is a *raw buffer guarded by one mutex: lock, touch the buffer, unlock.

@paradigm procedural
@import std.concurrent.sync
@import std.vector

func main() -> int32 {
    m := mutex_new()
    counter: *raw int64 = alloc_bytes(8)
    counter[0] = 0
    handles: *Vector<thread> = alloc(vec_new())
    mut w: int64 = 0
    while w < 4 {
        t, e := spawn(lambda () -> void {
            mut i: int64 = 0
            while i < 2500 {
                lock(m)
                counter[0] = counter[0] + 1
                unlock(m)
                i = i + 1
            }
        })
        if e.exists() {
            printerr(e)
            return 1
        }
        vec_push(handles, t)
        w = w + 1
    }
    mut k: int64 = 0
    while k < vec_len(handles) {
        je := join(vec_get(handles, k))
        je.ignore()
        k = k + 1
    }
    println(counter[0])
    mutex_free(m)
    free(counter)
    vec_free(handles)
    free(handles)
    return 0
}

This prints exactly 10000: an unlock happens before the next lock of the same mutex, so every increment reads the previous one’s write.

lock(m) blocks until the mutex is free and unlock(m) releases it. An unlock happens before the lock that next acquires the same mutex, which is the ordering that makes the guarded memory safe to touch. Inside a function body the idiom is lock(m) followed by defer unlock(m), so every return path releases. The handle is one word and copies freely, including into a spawned lambda’s captures, and every copy names the same lock.
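
The lock-then-defer idiom can be sketched in a small helper. bump is a hypothetical function, not part of std.concurrent.sync, and the sketch assumes a user function may take a Mutex and a *raw int64 as parameters and return void.

```dusk
@paradigm procedural
@import std.concurrent.sync

// bump is a hypothetical helper, not a library function
func bump(m: Mutex, counter: *raw int64) -> void {
    lock(m)
    // registered once, runs on every return path out of bump
    defer unlock(m)
    counter[0] = counter[0] + 1
}

func main() -> int32 {
    m := mutex_new()
    counter: *raw int64 = alloc_bytes(8)
    counter[0] = 0
    bump(m, counter)
    bump(m, counter)
    println(counter[0])
    mutex_free(m)
    free(counter)
    return 0
}
```

Keeping the lock and its release inside one function means a return added later cannot leave the mutex held.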

Misuse faults by name

The mutex is the error-checking kind, so relocking a mutex the thread already holds and unlocking a mutex the thread does not hold (both undefined in the default pthread flavor) fault by name. The runtime adds the rest: a trylock probe makes freeing a held mutex fatal, an operation on a mutex already freed faults as an invalid mutex rather than a misleading holder message, and a waiter count makes freeing a condition variable a thread waits on fatal instead of the silent forever-hang the bare destroy gives.

Condition variables

cond_wait(cv, m) releases the mutex, sleeps until cond_signal(cv) wakes one waiter or cond_broadcast(cv) wakes all, and reacquires the mutex before returning. The caller must hold the mutex, every concurrent wait on one condition variable must name the same mutex, and wakeups can be spurious, so a wait always sits in a loop that rechecks its predicate under the lock. This fragment requires the procedural paradigm for while:

lock(m)
while buf[5] == 0 {
    cond_wait(notempty, m)
}
// consume under the lock, then
unlock(m)

Free a condition variable only after every waiter has left it. Freeing one a thread still waits on is fatal by name. A condition variable wait has no timeout, so a predicate nothing ever makes true is a deadlock. A channel receive is the wait that can time out, through chan_recv_timeout.
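
The waiting loop above needs a counterpart that makes the predicate true. A sketch of that side, reusing the fragment's names m, buf, and notempty, and assuming a nonzero buf[5] means data is available:

```dusk
lock(m)
// change the predicate under the same mutex the waiter holds
buf[5] = 1
// wake one waiter; it rechecks buf[5] after reacquiring m
cond_signal(notempty)
unlock(m)
```

cond_broadcast(notempty) takes the place of cond_signal when one change can let several waiters proceed.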

The thread pool

Added in 0.3.3. The pool is a process singleton of OS threads that runs fire-and-forget tasks, the substrate the async release schedules onto. submit is an always-available builtin like spawn and shares its whole argument rule: one lambda literal of type () -> void, captures copied to a private heap block, the same slice, closure, and interface capture ban, and a captured managed pointer borrowed, not owned. It returns only an error, because the pool owns the task and results flow through a channel.

@import std.concurrent.channel
@import std.concurrent.pool

func main() -> int32 {
    pe := pool_start(ncpu())
    if pe.exists() {
        printerr(pe)
        return 1
    }
    done: Channel<int64> = chan_new(8)
    se := submit(lambda () -> void {
        we := chan_send(done, 42)
        we.ignore()
    })
    se.ignore()
    v, re := chan_recv(done)
    re.ignore()
    println(v)
    // Shutdown order: close the channel, stop the pool, then free the channel
    chan_close(done)
    pool_shutdown()
    chan_free(done)
    return 0
}

pool_start(workers) -> error in std.concurrent.pool starts the singleton with a fixed worker count, ncpu() -> int64 being the natural count. The error exists when the count is below one, the pool is already running, it was already shut down, or the operating system refuses a worker thread. A refused start leaves the pool startable again, but a successful start is the only one the process gets, and after a shutdown the pool stays down.

A submit never blocks the submitter, whatever the queue holds, and its error exists only when the pool is not running, in which case the task body never runs.

pool_shutdown() stops new submissions, runs everything already queued to completion, joins the workers, and is idempotent. When two threads race into it, the loser waits for the winner, so every caller returns holding the drain guarantee. A task still running at shutdown finishes normally, and a submission it makes after the flag flips is refused like any other, but a pool task calling pool_shutdown itself is fatal by name, since the worker would otherwise join itself or wait forever on its own completion.

Submission order is queue order, but tasks run on many workers at once, so nothing about completion order is promised. Queuing a task happens before its body runs, and everything a body did is visible to whoever receives its completion through a channel, the ordering the channel edge already provides. Shut the pool down before main returns for the same reason threads are joined: a worker that is mid-task when the process exits dies mid-write.
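
A sketch of several tasks reporting through one channel, assuming four tasks and illustrative names n and total. Completion order is not promised, so the example adds the results up rather than printing them one by one.

```dusk
@paradigm procedural
@import std.concurrent.channel
@import std.concurrent.pool

func main() -> int32 {
    pe := pool_start(ncpu())
    if pe.exists() {
        printerr(pe)
        return 1
    }
    // capacity matches the task count, so no task parks in chan_send
    done: Channel<int64> = chan_new(4)
    mut w: int64 = 0
    while w < 4 {
        // copy the counter into a fresh binding for the task to capture
        n: int64 = w
        se := submit(lambda () -> void {
            we := chan_send(done, n)
            we.ignore()
        })
        se.ignore()
        w = w + 1
    }
    // values arrive in completion order; the sum does not depend on it
    mut total: int64 = 0
    mut k: int64 = 0
    while k < 4 {
        v, re := chan_recv(done)
        re.ignore()
        total = total + v
        k = k + 1
    }
    println(total)
    chan_close(done)
    pool_shutdown()
    chan_free(done)
    return 0
}
```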

The memory model

dusk does not detect data races. When two threads touch the same memory, at least one writes, and no sanctioned path orders the accesses, the program has a data race and its behavior is undefined, exactly as in the C the runtime compiles down to.

The sanctioned paths provide the ordering they name:

Capture at spawn copies values into the thread’s private environment.
The sequentially consistent atomics in std.concurrent.atomic order the accesses they mediate.
A chan_recv happens after the chan_send that delivered the value.
An unlock happens before the next lock of the same mutex.
join orders everything the thread did before everything the joiner does after.

Sharing built by hand out of *raw T buffers is on the raw layer’s honor system across threads, exactly as it is within one, unless a mutex guards every touch.
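
The join edge is enough on its own when only one side touches the buffer at a time. A minimal sketch, assuming indexed writes through a *raw pointer need the procedural paradigm as in the mutex example; cell is an illustrative name.

```dusk
@paradigm procedural

func main() -> int32 {
    cell: *raw int64 = alloc_bytes(8)
    cell[0] = 0
    t, e := spawn(lambda () -> void {
        // the only write made while the thread runs
        cell[0] = 99
    })
    if e.exists() {
        printerr(e)
        free(cell)
        return 1
    }
    // main does not touch cell between spawn and join
    je := join(t)
    je.ignore()
    // join orders the thread's write before this read
    println(cell[0])
    free(cell)
    return 0
}
```

Reading cell[0] before the join would be an unordered access against the thread's write, which is the data race the model leaves undefined.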

The generational heap is thread safe, so alloc and free from any thread are defined, and the dereference check stays armed on every thread. In a program whose frees and uses are ordered by a sanctioned path, the check keeps its guarantee: a use after free, a double free, or a double join faults deterministically instead of corrupting memory. In a program that races, the check degrades to a best-effort backstop. Checking and using are two steps, so a dereference racing the free of the same allocation can pass the check and then touch retired memory, and a fat pointer overwritten while another thread reads its sixteen bytes can tear into a mismatched pair. Freed blocks stay parked in the runtime’s free list rather than returning to the operating system, which bounds the blast radius, but none of this makes a race defined.