Prog Concurrency API Design
Status: proposed
Date: 2026-03-21
Problem
Bosatsu/Prog currently has a synchronous execution model on every backend in this repository:
- JVM evaluator: `run_prog` steps one `Prog` value inline.
- Python runtime: `ProgExt.run` steps one `Prog` value inline.
- C runtime: `bsts_Bosatsu_Prog_run` steps one `Prog` value inline.
That is enough for sequencing effects, but it leaves two missing pieces:
- We cannot start a child `Prog` and later join or cancel it.
- We cannot move expensive pure work off the main thread, IO thread, or future event-loop thread.
For a future libuv-backed C runtime, the second point is required: CPU-heavy work must not run on the event loop. For JVM and Python, the same API should still make sense even if the concrete scheduler is different.
Decision Summary
- Add `JoinHandle` and `JoinResult` to `Bosatsu/Prog`.
- Add `start`, `join`, `cancel`, and `compute` to `Bosatsu/Prog`.
- Make `join` return the child outcome as a value, not through the caller’s `Prog` error channel.
- Define cancellation as cooperative and best-effort across all backends.
- Treat `compute` as the standard “shift CPU work off the main scheduler” primitive.
- Do not add `race`, `par_map`, `poll`, masking, or finalizers in this first API.
Proposed API
```
package Bosatsu/Prog

export (
  ...,
  JoinHandle,
  JoinResult(),
  start,
  join,
  cancel,
  compute,
)

external struct JoinHandle[err: +*, a: +*]

enum JoinResult[err: +*, a: +*]:
  Succeeded(value: a)
  Errored(error: err)
  Canceled

external def start[err, a](prog: Prog[err, a]) -> forall f. Prog[f, JoinHandle[err, a]]
external def join[err, a](handle: JoinHandle[err, a]) -> forall f. Prog[f, JoinResult[err, a]]
external def cancel[err, a](handle: JoinHandle[err, a]) -> forall f. Prog[f, Unit]

# Run a pure, potentially expensive computation on the backend's compute pool
# or worker threads rather than the main scheduler thread.
external def compute[a](thunk: Unit -> a) -> forall err. Prog[err, a]
```
Why join Returns JoinResult[err, a]
The sketched type
```
join(h: JoinHandle[e, a]) -> Prog[e, JoinResult[a]]
```
mixes two different concerns:
- did `join` itself fail?
- what happened in the child fiber?
For this API, the important failure is the child outcome, not a second effect-level error channel for join. Returning JoinResult[err, a] makes the contract simpler:
- `join` itself is total in the typed `Prog` error channel.
- child success, child error, and cancellation all live in one value.
- repeated `join` calls can deterministically return the same terminal value.
This is the closest Bosatsu analogue to cats-effect’s Outcome, but simplified because Bosatsu does not yet have finalizers or a need to return Prog inside the success case.
Semantics
start
- `start(p)` is still lazy like every other `Prog` constructor; nothing happens until the returned `Prog` is executed.
- Once executed, the backend creates a child task and returns a `JoinHandle` immediately.
- The child may already be running, queued, or even completed by the time the parent receives the handle. That race is allowed.
- `JoinHandle` is opaque and may be shared freely.
join
- `join(h)` waits until `h` reaches a terminal state.
- The terminal states are `Succeeded(a)`, `Errored(e)`, and `Canceled`.
- `join` is idempotent: every successful call returns the same terminal `JoinResult`.
- Multiple fibers may join the same handle.
cancel
- `cancel(h)` is idempotent.
- If the child is already complete, `cancel` is a no-op.
- Otherwise, `cancel` records a cancellation request.
- Backends should stop scheduling the child at the next cancellation checkpoint.
- If a backend can interrupt a blocked operation, it should do so. If it cannot, work may continue in the background and its result is discarded.
- If cancellation wins the race against completion, `join(h)` returns `Canceled`.
Cancellation checkpoints
Cancellation is not preemptive in the middle of arbitrary pure code. The shared contract is:
- before entering a backend effect, the runtime may observe cancellation
- after resuming from a backend wait, the runtime may observe cancellation
- `compute` is a cancellation boundary
- long pure Bosatsu loops are not required to notice cancellation until they reach an effect boundary
This is the only semantics that works across JVM threads, Python threads, and libuv worker tasks without unsafe thread killing.
compute
- `compute(thunk)` runs `thunk(())` away from the main scheduler thread.
- `compute` is for pure, CPU-bound work.
- If the caller wants background compute, use `start(compute(thunk))`.
- `compute` does not promise parallel speedup on Python because of the GIL, but it still gives a portable “not on the main/event-loop thread” guarantee.
- On libuv, `compute` is the standard way to keep the event loop responsive while doing CPU work.
- This “pure only” restriction is not optional. Bosatsu functions are pure and do not throw exceptions, so a value of type `Unit -> a` has no typed failure mode. If a later API is needed for shifting blocking effectful work, that should be a separate combinator over `Prog`, not a broader meaning for `compute`.
Example Usage
```
from Bosatsu/Prog import await, compute, join, start, JoinResult()

background_hash =
  handle <- compute(() -> expensive_hash(input_bytes)).start()
  join(handle)
```
The result is a Prog returning a JoinResult, because child completion, child error, and cancellation are all explicit values. Higher combinators such as race or par_map2 can be layered on top once the base semantics are stable.
Shared Runtime Shape
Backends should implement JoinHandle with a small shared state machine:
- `Running(cancel_requested, waiters...)`
- `Succeeded(a)`
- `Errored(e)`
- `Canceled`
waiters here means backend-owned continuations, condition variables, promises, or event-loop callbacks. The Bosatsu surface does not expose this representation.
No new public Prog tag is required for the API itself. On JVM and Python, the new operations can still be exposed as ordinary Prog externals. A libuv runtime will likely need richer internal scheduler machinery, but that can stay behind the same Bosatsu API.
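As a concrete illustration, the shared state machine can be sketched in Python. All names here (such as `JoinState`) are illustrative, not part of any backend; a `threading.Event` stands in for the backend-owned waiter mechanism.

```python
import threading

class JoinState:
    """Model of the shared JoinHandle state machine: Running -> terminal.

    Terminal results are tuples: ("succeeded", a), ("errored", e), ("canceled",).
    """
    def __init__(self):
        self._lock = threading.Lock()
        self._done = threading.Event()   # stand-in for backend-owned waiters
        self.cancel_requested = False
        self.result = None               # None while Running

    def _settle(self, result):
        # First terminal transition wins; later transitions are ignored.
        with self._lock:
            if self.result is None:
                self.result = result
        self._done.set()

    def succeed(self, value):
        self._settle(("succeeded", value))

    def error(self, err):
        self._settle(("errored", err))

    def canceled(self):
        self._settle(("canceled",))

    def request_cancel(self):
        # Idempotent and a no-op after completion; the child observes the
        # flag at its next cancellation checkpoint.
        with self._lock:
            if self.result is None:
                self.cancel_requested = True

    def join(self):
        # Every join call returns the same terminal result.
        self._done.wait()
        return self.result
```

The key invariants the backends must preserve are visible here: exactly one terminal transition, cancel-after-complete is a no-op, and repeated joins observe the same value.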
Shared Bosatsu/IO/Core Process Contract
The existing Bosatsu/IO/Core API already commits us to a specific process model:
```
spawn(cmd: String, args: List[String], stdio: StdioConfig) -> Prog[IOError, SpawnResult]
wait(p: Process) -> Prog[IOError, Int]
```
To keep all backends aligned, this design commits to the following shared semantics:
- `spawn` uses argv semantics, not shell semantics. `cmd` is the program to execute and `args` are literal argument strings.
- `spawn` inherits the current working directory and environment of the host runtime unless a future Bosatsu API adds explicit `cwd` or `env` fields.
- `spawn` raises `IOError` only if process creation or requested stdio setup fails. A child process exiting non-zero is observed only through `wait`.
- `SpawnResult.stdin`, `SpawnResult.stdout`, and `SpawnResult.stderr` are `Some(handle)` exactly when the corresponding `Stdio` entry was `Pipe`. For `Inherit`, `Null`, and `UseHandle`, the returned field is `None`.
- `UseHandle(handle)` validates direction at runtime: `stdin` requires a readable handle; `stdout` and `stderr` require writable handles. Closed or invalid handles raise `IOError`.
- `wait(p)` is idempotent and may be called multiple times or concurrently. Every successful call returns the same cached exit code.
- `wait(p)` does not implicitly close or drain pipe handles returned from `spawn`. If the caller requested `Pipe`, those returned handles remain ordinary Bosatsu handles with their own lifetime.
- Canceling a Bosatsu fiber blocked in `wait(p)` never kills the external process. It only cancels that Bosatsu wait.
- If a backend has to emulate `UseHandle` with internal copy tasks rather than native OS-level stdio inheritance, `wait(p)` must not complete until those backend-owned bridge tasks have also settled. Otherwise `wait` could return before the requested redirection is actually complete.
- `wait(p)` returns a single `Int`, so this API intentionally collapses normal exit and signal termination into one value. On POSIX backends that expose signal termination separately or as negative codes, Bosatsu should normalize signal termination to `128 + signal_number`.
- If Bosatsu later needs to distinguish ordinary exit from signal termination, that should be a new structured `ExitStatus` API rather than a silent change to `wait`.
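The `128 + signal_number` normalization rule is small enough to pin down with a sketch. The helper name `normalize_exit` is hypothetical; the negative-code convention matches how Python's `Popen.returncode` reports signal termination on POSIX.

```python
def normalize_exit(raw: int) -> int:
    """Normalize a raw wait() result to the single-Int Bosatsu contract.

    POSIX-style backends often report signal termination as a negative
    code (e.g. -15 for SIGTERM); the shared contract maps that to
    128 + signal_number. Ordinary exit codes pass through unchanged.
    """
    if raw < 0:
        return 128 + (-raw)
    return raw
```

So a child killed by SIGKILL (signal 9) is observed as 137, matching the convention most shells use.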
Backend Notes
Scala (MatchlessToValue + cats-effect)
- The Scala-side implementation should live in `MatchlessToValue`, not in the current synchronous `PredefImpl.run_prog` loop.
- Assume the interpreter runs in a cats-effect `F[_]`, with the needed capabilities supplied by cats-effect typeclasses rather than JVM-specific thread primitives.
- `start` should use cats-effect fiber spawning over the interpreted child `Prog`, and `JoinHandle` should wrap that fiber or a small cats-effect state object built from `Ref`/`Deferred`.
- `join` should wait on the cats-effect fiber and translate its terminal state into Bosatsu `JoinResult`.
- `cancel` should delegate to cats-effect fiber cancellation, which already has the right cooperative semantics for this API.
- `compute` should also be interpreted inside `MatchlessToValue` on top of cats-effect. On JVM and Scala Native that can run on the cats-effect scheduler in the normal way. On Scala.js, cats-effect still gives portable fiber concurrency and cancellation, but it does not by itself create a background CPU thread; if we need literal off-main-thread CPU execution there, that is a later Web Worker integration problem.
- The current Scala evaluator stack is not concurrency-safe as written. `MatchlessToValue` has mutable interpreter state, and `tool/Output.scala` explicitly notes that `MatchlessToValue` is not currently thread-safe. A real implementation needs to make interpreter state instance-local and safe for multiple in-flight fibers before we rely on concurrent `start`/`join` within one process.
- Shared Scala process support should not be hard-wired directly to `java.lang.ProcessBuilder`, because Scala.js does not have that API.
- `ProgRuntime[F]` should instead carry a process backend, with distinct implementations for JVM, `bosatsu_node` under Node.js, and browser-oriented Scala.js.
- Browser Scala.js should fail `spawn` and `wait` with `Unsupported`.
- `bosatsu_node` should provide a Node-specific implementation over Node `child_process`, not pretend that generic browser Scala.js can run OS processes.
- `fs2.io.process.ProcessBuilder` is still relevant as an existing cross-JVM/Node API already used in this repository, but Bosatsu should not make its process semantics depend directly on fs2’s `Resource` lifecycle or pipe-only process model.
Python
- `start` and `compute` can use `threading.Thread` or a small executor.
- `join` can wait on a `threading.Condition` or `Event`.
- `cancel` is cooperative only; Python cannot safely kill an arbitrary thread.
- `compute` still matters even with the GIL because it keeps the caller’s main thread free and works correctly for native code that releases the GIL.
- `spawn` should continue to use `subprocess.Popen` with `shell=False`.
- `Stdio.Inherit` maps to `None`, `Pipe` to `subprocess.PIPE`, `Null` to `subprocess.DEVNULL`, and `UseHandle` to the existing Python file object carried by the Bosatsu handle.
- The Python `Process` runtime object should cache the exit code and own one shared waiter result so repeated or concurrent `wait` calls observe the same completion rather than racing independent `Popen.wait()` calls.
- On POSIX, Python reports signal termination as a negative return code. Bosatsu should normalize that to `128 + signal_number` before returning from `wait`.
- Canceling a Bosatsu fiber blocked in `wait(process)` must not kill the process.
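A minimal sketch of those bullets, assuming a hypothetical `Fiber` wrapper. One deliberate simplification: the thunk receives the fiber so it can poll the cancellation flag; a real runtime would thread cancellation through the interpreter's checkpoints instead.

```python
import threading

class Fiber:
    """Sketch of start/join/cancel on threading primitives.

    cancel() only sets a flag; the running thunk must poll `canceled`
    at its own checkpoints, mirroring the cooperative contract.
    """
    def __init__(self, thunk):
        self.canceled = threading.Event()
        self._done = threading.Event()
        self._result = None
        t = threading.Thread(target=self._run, args=(thunk,), daemon=True)
        t.start()

    def _run(self, thunk):
        try:
            value = thunk(self)
            if self.canceled.is_set():
                outcome = ("canceled",)     # result is discarded
            else:
                outcome = ("succeeded", value)
        except Exception as e:              # a real runtime carries typed errors
            outcome = ("errored", e)
        self._result = outcome
        self._done.set()

    def cancel(self):
        self.canceled.set()                 # cooperative: just record the request

    def join(self):
        self._done.wait()                   # every joiner sees the same outcome
        return self._result
```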
C with libuv
- The current C runtime in this repository is not libuv-based today. It uses a synchronous `Prog` loop, `FILE*`-backed handles, blocking `fread`/`fwrite`, `fopen`, and `nanosleep`, and it still reports `spawn` and `wait` as unsupported.
- A real libuv backend should not wrap the existing blocking C runtime in worker threads. If we depend on libuv, the C runtime should use libuv directly for the `Bosatsu/Prog` scheduler and for the `Bosatsu/IO/Core` primitives.
- That means the C backend needs a genuine event-loop-driven fiber scheduler, a suspend/resume effect ABI, and a new opaque `Handle` representation built on `uv_file`, `uv_stream_t`, `uv_process_t`, `uv_timer_t`, `uv_fs_t`, and `uv_work_t`.
- The detailed plan is specified in the next section.
Detailed Scala/MatchlessToValue Design
Core split: pure evaluation stays pure, Prog execution becomes F
The important Scala design choice is to keep two layers distinct:
- `MatchlessToValue` expression evaluation remains pure and produces `Value`.
- Executing a `Bosatsu/Prog::Prog[...]` value becomes effectful and runs in cats-effect `F[_]`.
That means we do not try to make the whole Bosatsu evaluator asynchronous. We only make the Prog runner asynchronous.
Recommended API shape
Keep the current pure API and add an effectful Prog runner adjacent to it.
Recommended shape:
```scala
object MatchlessToValue {
  def traverse[F[_]: Functor, A](
      me: F[Expr[A]]
  )(resolve: (A, PackageName, Identifier) => Eval[Value]): F[Eval[Value]]

  def runProgF[F[_]: cats.effect.kernel.Async](
      prog: Value,
      runtime: ProgRuntime[F]
  ): F[Either[Value, Value]]

  def runProgMainF[F[_]: cats.effect.kernel.Async](
      main: Value,
      args: List[String],
      runtime: ProgRuntime[F]
  ): F[ProgRunResult]

  def runProgTestF[F[_]: cats.effect.kernel.Async](
      progTest: Value,
      args: List[String],
      runtime: ProgRuntime[F]
  ): F[Either[Value, Value]]
}
```
ProgRuntime[F] should own:
- stdin/stdout/stderr abstractions
- file/process/time externals
- fiber spawning and cancellation support
- any state needed for `Var`, `JoinHandle`, and `compute`
Why this split matters on Scala.js
On Scala.js we cannot synchronously block waiting for IO.
So the rule should be:
- core evaluation code never calls `Await.result`
- shared Scala core never calls `unsafeRunSync`
- `runProgF` returns `F[...]`
- only the outermost Scala.js host boundary converts that `F` to `Future`
In other words, Scala.js support is not “special await support inside MatchlessToValue”. It is “keep the whole Prog execution path inside IO, then convert to Future only where the JavaScript host needs it”.
What happens at the Scala.js boundary
The Scala.js adapter should:
- choose concrete `IO` for `F`
- run command/evaluation logic in `IO`
- call `unsafeToFuture()` only at the topmost JS integration boundary
That boundary may be:
- a UI event handler
- a Node-facing API
- a JS-exported method
Every layer below that boundary stays in IO, not Future.
Current code paths that need to change
The current shared Scala code assumes a synchronous Prog runner in a few places.
Output
Right now:
- `Output.EvaluationResult` stores `Eval[Value]`
- `Output.RunMainResult` stores `Eval[PredefImpl.ProgRunResult]`
- `Output.TestOutput` stores `Option[Eval[Test]]`
That shape can mostly stay if effectful Prog execution happens before constructing the output.
EvalCommand
Right now EvalCommand.runOutput is pure and uses:
- `result.value.map(...)`
- `PredefImpl.runProgMainWithSystemStdin(...)`
For the MatchlessToValue + cats-effect path, runOutput should become effectful in the surrounding command F:
- evaluate the main value purely as today
- if
--runis requested, callMatchlessToValue.runProgMainF - once the
F[ProgRunResult]completes, build an ordinaryOutput.RunMainResult(Eval.now(result))
That keeps Output simple while moving the actual program execution into F.
TestCommand
The same pattern applies to ProgTest:
- pure evaluation still discovers and builds the `ProgTest` value
- command execution calls `runProgTestF` inside `F`
- once the test value is available, build ordinary `Test` output data
LibraryEvaluation and Evaluation
These types should keep their pure value-evaluation responsibilities.
Add effectful helpers rather than making the whole type effectful:
- `evaluateMainValue` stays pure
- new `runMainF(...)` helpers execute returned `Main` values in `F`
- new `runProgTestF(...)` helpers execute returned `ProgTest` values in `F`
This keeps the pure compiler/evaluator pipeline reusable and isolates concurrency to the Prog boundary.
start, join, and cancel in cats-effect
Inside the Scala runtime, the implementation should directly use cats-effect fibers.
Recommended mapping:
- `start` -> `Spawn[F].start(runProgF(child, runtime))`
- `join` -> wait on the cats-effect fiber and map its result to `JoinResult`
- `cancel` -> `fiber.cancel`

If extra bookkeeping is needed for Bosatsu handles or repeated joins, wrap the cats-effect fiber in a small runtime object built from `Ref[F, State]` and `Deferred[F, JoinResult[...]]`.
compute on Scala.js
This is the one place where Scala.js is materially different.
On JVM and Scala Native:
- `compute` can run on the cats-effect runtime in the normal way
- the runtime can use its normal scheduler/fiber machinery
On Scala.js:
- `start`, `join`, and `cancel` still work as fiber concurrency
- `compute` cannot promise a real background CPU thread in shared code
- the shared implementation should therefore be documented as fairness-only on Scala.js: it yields back to the event loop and then evaluates the pure thunk
- if Bosatsu later needs actual off-main-thread compute on Scala.js, that is a separate Web Worker integration, not something `MatchlessToValue` can solve by itself
So the portable contract is:
- on JVM/Native, `compute` may move work to another runtime thread
- on Scala.js, `compute` preserves async/fiber structure but does not create real parallel CPU execution
Process support in MatchlessToValue, bosatsu_node, and browser Scala.js
Bosatsu/IO/Core.spawn and wait need a concrete effectful runtime design too, and this is the one place where JVM and Scala.js cannot share a single host API.
The important design rule is:
- shared `MatchlessToValue` code depends only on a process capability inside `ProgRuntime[F]`
- JVM provides one implementation of that capability
- `bosatsu_node` provides a Node-specific Scala.js implementation
- browser-oriented Scala.js provides an unsupported implementation
That avoids baking `java.lang.ProcessBuilder` assumptions into shared Scala code.
Why not just use java.lang.ProcessBuilder everywhere
Because Scala.js does not have it.
That means the old JVM-only Predef strategy cannot be the shared design for process support once MatchlessToValue becomes the primary runtime.
What fs2 gives us
This repository already depends on fs2-io on both JVM and Scala.js, and cliJS already uses fs2.io.process.ProcessBuilder under bosatsu_node.
That matters because fs2 demonstrates:
- there is already a shared Scala API for launching child processes on both JVM and Scala.js
- on Scala.js, that fs2 implementation is Node-specific and delegates to Node `child_process.spawn`
- so a working Node-backed process layer for `bosatsu_node` is realistic
But fs2 is not a drop-in Bosatsu process runtime contract.
Why not:
- fs2 `ProcessBuilder` models command, args, cwd, env, and pipe-based stdio only; it does not directly model Bosatsu `StdioConfig`
- fs2 `spawn` returns a `Resource`, and releasing that resource kills the child if it is still running
- fs2 `Process` stream docs explicitly warn that canceling in-progress stdin/stdout/stderr work may kill the process
- on Scala.js, fs2 process support is a Node runtime story, not a browser story
So the right design is:
- use fs2 as precedent and optional implementation substrate where it fits
- keep Bosatsu’s own `Process`/`Handle` semantics as the source of truth
- make the Node Scala.js runtime explicit rather than pretending all Scala.js targets are equal
Preferred platform split
JVM
On JVM, the ProgRuntime[F] process backend should continue to use java.lang.ProcessBuilder plus cats-effect-managed state.
Recommended JVM runtime object:
- the underlying `java.lang.Process`
- a cached `Deferred[F, Int]` or equivalent shared completion cell for the normalized exit code
- optional bridge fibers for `UseHandle(stdin)`, `UseHandle(stdout)`, and `UseHandle(stderr)`
- a shared completion result for those bridge fibers so repeated `wait` calls do not rerun cleanup
JVM spawn:
- build a `ProcessBuilder(cmd :: args)` with no shell wrapper
- do not override cwd or environment, so the existing Bosatsu API keeps inheriting both
- map `Inherit`, `Null`, and `Pipe` directly to `ProcessBuilder.Redirect.INHERIT`, `Redirect.DISCARD`, and `Redirect.PIPE`
- for `Pipe`, return Bosatsu `Handle` values around the child streams exactly as the current JVM evaluator does
- for `UseHandle`, configure that child stream as `Redirect.PIPE`, then start a cats-effect bridge fiber:
  - `stdin = UseHandle(h)` copies from the supplied readable Bosatsu handle into the child stdin stream, then closes child stdin on EOF
  - `stdout = UseHandle(h)` or `stderr = UseHandle(h)` copies from the child stream to the supplied writable Bosatsu handle
- if bridge setup fails, fail `spawn` with `IOError`
- only literal `Pipe` returns `Some(handle)` in `SpawnResult`
JVM wait:
- if exit code is already cached, return it immediately
- otherwise wait on `process.onExit` inside `F` rather than blocking a thread in `waitFor`
- normalize the resulting exit code as needed for the shared Bosatsu contract
- if the process used `UseHandle` bridges, wait for those bridge fibers to settle before returning success from `wait`
- if a bridge fiber failed with an IO-level error, surface that as a `Prog[IOError, Int]` failure from `wait`
- canceling the Bosatsu wait fiber does not destroy the process
bosatsu_node
For bosatsu_node, the Scala.js runtime should provide a separate Node-specific process backend.
Recommended Node runtime substrate:
- use a small Scala.js facade over Node `child_process.spawn`
- use Node stream objects directly for child stdin/stdout/stderr
- keep the Bosatsu runtime object in Scala, just as on JVM, but back it with Node handles instead of Java process objects
This is better than trying to pretend browser Scala.js can participate, and it is also a better semantic fit than relying on raw fs2 Resource lifecycle as the Bosatsu process object.
Node spawn should:
- call Node `child_process.spawn(cmd, args, options)` with no shell wrapper
- map Bosatsu `Stdio` directly to Node stdio entries where possible:
  - `Inherit` -> `inherit`
  - `Null` -> `ignore`
  - `Pipe` -> `pipe`
  - `UseHandle(handle)` -> pass the underlying Node stream or file descriptor when the Bosatsu handle wraps one
- if a Bosatsu handle cannot be natively handed to Node for `UseHandle`, fall back to the same bridge-fiber strategy used on JVM
- return `Some(handle)` only for literal `Pipe`
- cache both the normal exit code and signal termination information from the Node child-process events
Node wait should:
- if exit has already been observed, return the cached normalized code immediately
- otherwise wait on a shared `Deferred[F, Int]` completed from Node child-process exit events
- if Node reports normal exit, return that exit code
- if Node reports signal termination, map the signal name through `os.constants.signals` and return `128 + signal_number`
- if Node reports an unknown signal name that cannot be mapped, fail `wait` with `IOError`
- if `UseHandle` fell back to bridge fibers, wait for those bridge fibers before completing `wait`
- canceling the Bosatsu wait fiber does not call `child.kill()`
This gives bosatsu_node real process support without requiring impossible browser APIs.
Browser Scala.js
For browser-oriented Scala.js runtimes:
- `spawn` should fail with `IOError.Unsupported`
- `wait` should fail with `IOError.Unsupported`
- this should be wired in as an explicit browser runtime choice, not discovered accidentally at runtime after trying to import Node modules
So the policy is:
- generic shared Scala.js process support: unsupported
- `bosatsu_node`: supported through a Node-specific backend
- JVM: supported through the JVM backend
No internal Future in shared core
The shared Scala implementation should not switch to Future internally just because Scala.js ends in a Future.
Reasons:
- the rest of the tooling already composes in `F`
- cats-effect gives cancellation and fiber semantics that `Future` does not
- converting to `Future` too early would throw away the exact semantics we are trying to add for `start`, `join`, and `cancel`
So the design should be:
- shared core: `Eval` for pure value construction, `F` for `Prog` execution
- Scala.js shell: `IO` at the boundary, then `unsafeToFuture()`
Concrete design choice
To minimize later ambiguity, this design commits to the following for Scala:
- keep `MatchlessToValue.traverse` pure
- add effectful `runProgF`/`runProgMainF`/`runProgTestF` helpers
- do not block anywhere in shared Scala code
- do not introduce `Future` into shared evaluator code
- on Scala.js, convert `IO` to `Future` only at the outermost host boundary
- accept that shared `compute` on Scala.js is fairness-only unless a later Web Worker backend exists
- make process support a runtime capability with three explicit cases: JVM supported, `bosatsu_node` supported, browser Scala.js unsupported
Detailed Python Runtime Design
Process support in ProgExt.py
The current Python runtime already has the right overall shape for spawn and wait: it uses subprocess.Popen, it returns Bosatsu handles for Pipe, and it caches the exit code on the process object.
The design should keep that shape and tighten the concurrency details.
spawn
Recommended Python spawn behavior:
- continue to call `subprocess.Popen([cmd, *args], shell=False, stdin=..., stdout=..., stderr=...)`
- keep inheriting cwd and environment, because the Bosatsu API does not yet expose either
- map `Inherit` to `None`, `Pipe` to `subprocess.PIPE`, `Null` to `subprocess.DEVNULL`, and `UseHandle` to the existing Python stream object stored in the Bosatsu handle
- validate readable vs writable directions exactly as the current implementation already does
- return `Some(handle)` only for streams configured as `Pipe`
- keep the current UTF-8-oriented process pipe behavior; the Python runtime’s byte helpers already peel through `.buffer` when present, so one Bosatsu handle can still support both UTF-8 and byte operations
wait
For correctness once Bosatsu adds concurrency, the Python Process object should carry a little more shared state than it does today:
- the underlying `subprocess.Popen`
- cached normalized exit code
- a lock
- an `Event` or equivalent shared completion signal
- a flag indicating whether a background waiter thread has already been started
Recommended Python wait behavior:
- if the exit code is already cached, return it immediately
- otherwise, ensure exactly one daemon waiter thread is started for that process
- that waiter thread calls `Popen.wait()`, normalizes the result, stores it, and signals the shared completion event
- all Bosatsu `wait(process)` calls block on that one shared completion event rather than issuing their own competing `Popen.wait()` calls
- on POSIX, if `Popen.wait()` reports a negative return code, normalize it to `128 + signal_number`
- canceling a Bosatsu fiber blocked in `wait(process)` does not kill the external process
This keeps the Python runtime simple, preserves the current subprocess-based design, and gives repeated/concurrent wait calls a precise shared result model.
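The shared-waiter model can be sketched directly on `subprocess` and `threading`. The `ProcessState` name is hypothetical; the point is that many concurrent `wait` calls share one `Popen.wait()` and one cached normalized code.

```python
import subprocess
import threading

class ProcessState:
    """One shared waiter: many concurrent wait() calls, one Popen.wait()."""
    def __init__(self, popen):
        self._popen = popen
        self._lock = threading.Lock()
        self._done = threading.Event()
        self._waiter_started = False
        self.exit_code = None  # cached normalized exit code

    def wait(self):
        with self._lock:
            if self.exit_code is not None:
                return self.exit_code            # already cached
            if not self._waiter_started:         # exactly one waiter thread
                self._waiter_started = True
                threading.Thread(target=self._waiter, daemon=True).start()
        self._done.wait()                        # all callers share one completion
        return self.exit_code

    def _waiter(self):
        raw = self._popen.wait()
        code = 128 + (-raw) if raw < 0 else raw  # POSIX signal normalization
        with self._lock:
            self.exit_code = code
        self._done.set()
```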
Detailed C/libuv Runtime Design
Why the current C runtime must change
Today the C runtime is centered on `c_runtime/bosatsu_ext_Bosatsu_l_Prog.c`, where:
- `bsts_Bosatsu_Prog_run` is a single synchronous stack machine.
- `ProgTagEffect` calls a C callback that must return the next `Prog` immediately.
- `Bosatsu/IO/Core` uses blocking stdio and POSIX-style calls in `c_runtime/bosatsu_ext_Bosatsu_l_IO_l_Core.c`.
That model cannot support a real libuv backend:
- `uv_run()` is the central libuv loop and is not reentrant.
- libuv handles are long-lived, and requests are short-lived objects that complete later in callbacks.
- file system requests, timers, process exit, stream reads, and `uv_queue_work()` all complete asynchronously.
So the libuv backend cannot keep the current “effect callback returns the next Prog now” contract internally. It needs a scheduler that can suspend a fiber and later resume it from a libuv callback.
High-level runtime shape
Keep the public Bosatsu API unchanged and replace the C runtime internals with:
BSTS_Prog_Runtime
- owns one `uv_loop_t`
- owns the scheduler queue of ready fibers
- owns one `uv_idle_t` used to drain the ready queue
- tracks the root fiber result
- tracks outstanding child fibers and open runtime-owned handles
BSTS_Prog_Fiber
- current `Prog` node being evaluated
- continuation stack for `flat_map` and `recover`
- state: `Ready`, `Running`, `Suspended`, `Done`
- cancellation flag
- pointer to optional join state
BSTS_Prog_Join
- terminal result slot: `Succeeded`, `Errored`, `Canceled`, or still running
- intrusive list of fibers waiting in `join`
- ownership count so the runtime can free it after all waiters and handles are gone
BSTS_Prog_Request
- heap object used as the baton for each in-flight libuv request
- stores the owning fiber
- stores request-kind-specific payload such as buffers, paths, offsets, or write data
- stores cleanup and cancellation hooks
Root execution flow
`bsts_Bosatsu_Prog_run_main` and `bsts_Bosatsu_Prog_run_test` should become:
- allocate a fresh `uv_loop_t` with `uv_loop_init`
- initialize a `BSTS_Prog_Runtime` around that loop
- initialize a scheduler `uv_idle_t`
- create the root fiber from the Bosatsu `Main` or `ProgTest`
- enqueue the root fiber on the ready queue
- start the idle handle so ready fibers are drained
- call `uv_run(loop, UV_RUN_DEFAULT)`
- once the root fiber is terminal, cancel any detached child fibers that are still running
- continue draining close callbacks until `uv_loop_alive(loop)` is false
- call `uv_loop_close`
This matches libuv’s requirement that the loop can only be closed after all handles and requests are closed.
Scheduler design
The runtime should use a cooperative fiber scheduler on top of the single libuv loop thread.
Ready queue
- Fibers that can continue immediately are pushed onto a ready queue.
- A single `uv_idle_t` callback drains that queue.
- When the queue becomes empty, stop the idle handle.
- When the queue transitions from empty to non-empty, start the idle handle again.
Using `uv_idle_t` is important because active idle handles force libuv to poll with a zero timeout instead of blocking for I/O. That lets pure Bosatsu fibers continue to make progress without reentering `uv_run()` from inside callbacks.
Step budget
Each time a fiber is scheduled, it should run only up to a fixed step budget, for example 1024 Prog nodes, before being re-enqueued.
This prevents one long pure Bosatsu loop from monopolizing the event loop thread and starving timers, process exit callbacks, stream reads, or completed filesystem requests.
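The step-budget idea can be modeled in Python with generators standing in for fibers. The names are illustrative, and the budget of 4 is deliberately tiny so the interleaving is visible (the text suggests something like 1024 Prog nodes in practice).

```python
from collections import deque

STEP_BUDGET = 4  # tiny for illustration; a real runtime would use e.g. 1024

def run_ready_queue(fibers):
    """Drain a ready queue cooperatively.

    Each fiber runs at most STEP_BUDGET steps per activation, then is
    re-enqueued, so one long fiber cannot starve the others.
    """
    ready = deque(fibers)
    trace = []                      # which fiber ran each step, for inspection
    while ready:
        fiber_id, gen = ready.popleft()
        for _ in range(STEP_BUDGET):
            try:
                next(gen)           # one "Prog node" of work
                trace.append(fiber_id)
            except StopIteration:
                break               # fiber finished; do not requeue
        else:
            ready.append((fiber_id, gen))  # budget exhausted, requeue
    return trace

def fiber(n_steps):
    """A pure fiber that takes n_steps scheduler steps to finish."""
    for _ in range(n_steps):
        yield
```

A 6-step fiber interleaved with a 2-step fiber runs 4 steps, yields to the short fiber, then finishes, rather than monopolizing the queue.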
No direct recursive resume from callbacks
libuv callbacks should never directly run an entire fiber to completion. They should:
- write the fiber’s next `Prog` value
- mark the fiber `Ready`
- enqueue it
- return
The idle scheduler callback is the only place that drains ready fibers. This avoids callback nesting and respects the fact that uv_run() is not reentrant.
Internal effect ABI
The current C runtime treats `ProgTagEffect` as:
- a payload argument
- a pure callback `BValue -> BValue`
That is too weak for libuv because an effect may complete later. The public Prog tag can stay the same, but the internal C ABI of the effect payload should change to a start/resume protocol.
Recommended internal shape:
```c
typedef enum {
  BSTS_EFFECT_IMMEDIATE,
  BSTS_EFFECT_PENDING
} BSTS_Effect_Result_Tag;

typedef struct {
  BSTS_Effect_Result_Tag tag;
  BValue next_prog; /* valid only for IMMEDIATE */
} BSTS_Effect_Result;

typedef BSTS_Effect_Result (*BSTS_Effect_Start)(
    BSTS_Prog_Runtime *runtime,
    BSTS_Prog_Fiber *fiber,
    BValue arg,
    void *effect_data
);
```
Then:
- `bsts_prog_effect1/2/3/4` create `ProgTagEffect` values holding the effect argument and an opaque effect descriptor.
- When the fiber stepper sees `ProgTagEffect`, it calls the descriptor’s `start(...)`.
- If the result is `BSTS_EFFECT_IMMEDIATE`, evaluation continues in the same fiber activation with `next_prog`.
- If the result is `BSTS_EFFECT_PENDING`, the fiber becomes `Suspended` and will only resume from a later libuv callback.
The effect descriptor should also carry:
- a cancel hook
- a cleanup hook
- an effect kind tag for debugging
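The start/resume protocol can be modeled compactly in Python (illustrative names only; the real ABI is the C shape shown earlier in this section):

```python
IMMEDIATE, PENDING = "immediate", "pending"

class Stepper:
    """Model of the effect start/resume protocol.

    An effect's start() either returns the next value now (IMMEDIATE),
    or parks the fiber (PENDING) until a later callback resumes it.
    """
    def __init__(self):
        self.suspended = set()  # fiber ids currently waiting on a callback

    def run_effect(self, fiber_id, start):
        tag, payload = start()
        if tag == IMMEDIATE:
            return payload           # continue in this same activation
        self.suspended.add(fiber_id) # fiber is now Suspended
        return None

    def resume(self, fiber_id, value):
        # Called later from the event-loop callback with the result;
        # a real runtime would also re-enqueue the fiber here.
        self.suspended.discard(fiber_id)
        return value
```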
Fiber semantics in the libuv runtime
The synchronous `Pure`, `Raise`, `FlatMap`, `Recover`, and `ApplyFix` logic can remain the same as today.
Only the Effect case changes:
- synchronous effects such as `pure`, `raise_error`, `observe`, `new_var`, and most `Var` operations complete immediately
- asynchronous effects such as `sleep`, `compute`, stream reads, stream writes, file reads, file writes, `spawn`, and `wait` suspend the current fiber
- the completion callback later installs either `prog_pure(value)` or `prog_raise_error(err)` as the fiber’s next `Prog` and re-enqueues the fiber
This lets existing Bosatsu code continue to think of Prog as one monad, while the C runtime turns that monad into a libuv-managed continuation machine.
JoinHandle implementation in C
`start`, `join`, and `cancel` should be implemented directly in the fiber scheduler.
C start
- Create a child `BSTS_Prog_Fiber`.
- Create a `BSTS_Prog_Join` state object in `Running`.
- Point the child at that join state.
- Enqueue the child.
- Return the opaque `JoinHandle`.
C join
- If the join state is terminal, return that `JoinResult` immediately.
- Otherwise, add the current fiber to the join waiters list and suspend it.
- When the child reaches a terminal state, all join waiters are re-enqueued.
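The join-state bookkeeping can be sketched without libuv. The fixed-size waiter array and `int` fiber ids below are simplifications for illustration, not the real runtime representation:

```c
#include <stddef.h>

typedef enum { JOIN_RUNNING, JOIN_SUCCEEDED, JOIN_ERRORED, JOIN_CANCELED } Join_State;

#define MAX_WAITERS 8

typedef struct {
    Join_State state;
    int result;                 /* stands in for the JoinResult payload */
    int waiters[MAX_WAITERS];   /* ids of suspended joining fibers */
    int n_waiters;
} Join;

/* join: if the state is terminal, the caller gets the result at once
   (returns 1); otherwise the caller registers as a waiter and suspends
   (returns 0). */
static int join_poll(Join *j, int fiber_id) {
    if (j->state != JOIN_RUNNING) return 1;
    j->waiters[j->n_waiters++] = fiber_id;
    return 0;
}

/* Child completion: record the terminal state, copy every waiter id
   into `ready` for re-enqueueing, and clear the waiter list. Returns
   the number of fibers to wake. */
static int join_complete(Join *j, Join_State terminal, int result, int *ready) {
    j->state = terminal;
    j->result = result;
    for (int i = 0; i < j->n_waiters; i++) ready[i] = j->waiters[i];
    int n = j->n_waiters;
    j->n_waiters = 0;
    return n;
}
```

Because the terminal state is cached on the join object, repeated joins after completion return immediately, which matches the repeated-join tests planned later in the rollout.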
C cancel
- Set the child’s cancellation flag.
- If the child is currently suspended in a request that supports cancellation, invoke its cancel hook.
- If the child is ready but not running, leave it in the queue and let the next scheduler checkpoint observe cancellation.
- If the child is already terminal, do nothing.
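That decision tree, isolated as plain C with illustrative fields standing in for the real fiber state:

```c
/* Illustrative fiber states and cancel bookkeeping. */
typedef enum { F_READY, F_RUNNING, F_SUSPENDED, F_TERMINAL } FState;

typedef struct {
    FState state;
    int cancel_requested;
    int has_cancel_hook;   /* the suspended request supports cancellation */
    int hook_invoked;
} CFiber;

static void cancel_fiber(CFiber *f) {
    if (f->state == F_TERMINAL) return;     /* already terminal: nothing to do */
    f->cancel_requested = 1;                /* always set the flag */
    if (f->state == F_SUSPENDED && f->has_cancel_hook)
        f->hook_invoked = 1;                /* invoke the request's cancel hook */
    /* F_READY: leave the fiber in the queue; the next scheduler
       checkpoint observes the flag and terminates it cooperatively. */
}
```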
Cancellation policy for pending libuv work
Cancellation must be effect-specific.
Canceling compute
- Use `uv_queue_work`.
- If `uv_cancel()` succeeds before the work starts, the after-work callback reports `UV_ECANCELED` and the fiber becomes `Canceled`.
- If the work has already started, cancellation becomes best-effort only. The worker thread runs to completion, but the after-work callback checks the fiber cancellation flag and discards the computed value.
Filesystem requests
- Requests such as `uv_fs_open`, `uv_fs_read`, `uv_fs_write`, `uv_fs_close`, `uv_fs_scandir`, `uv_fs_lstat`, and `uv_fs_rename` should be issued asynchronously with callbacks.
- If `uv_cancel()` succeeds, the callback will eventually run with `UV_ECANCELED`.
- If the request is already executing, cancellation becomes best-effort only and the callback result is discarded if the fiber is already canceled.
Stream reads and writes
- `uv_write_t` requests are canceled by `uv_close()` of the underlying handle; libuv reports the write callback with `UV_ECANCELED`.
- A pending stream read is canceled by removing the Bosatsu read waiter from the handle’s pending-read queue. If the handle has no pending reads and no buffered demand, `uv_read_stop()` may be called.
- Canceling a Bosatsu fiber that is waiting on a stream read does not close the stream itself.
Canceling sleep
- Implement `sleep` with a one-shot `uv_timer_t`.
- Canceling `sleep` stops the timer and closes the timer handle.
- The fiber then resumes as `Canceled`.
Canceling wait(process)
- Canceling a fiber blocked in `Bosatsu/IO/Core::wait` only cancels the Bosatsu wait.
- It does not kill the external process.
- If we later want process termination, that should be a separate process API, not part of fiber cancellation.
C Handle representation
The current C runtime stores:
```c
typedef struct {
  BSTS_Handle_Kind kind;
  FILE *file;
  int readable;
  int writable;
  int close_on_close;
  int closed;
} BSTS_Core_Handle;
```
That must be replaced in the libuv backend with an internal union:
```c
typedef enum {
  BSTS_HANDLE_FILE,
  BSTS_HANDLE_STREAM
} BSTS_Handle_Kind;

typedef struct BSTS_Core_Handle BSTS_Core_Handle;

struct BSTS_Core_Handle {
  BSTS_Handle_Kind kind;
  int readable;
  int writable;
  int closed;
  int closing;
  int eof;
  int close_on_close;
  union {
    struct {
      uv_file fd;
      int64_t offset;
      int append_mode;
    } file;
    struct {
      uv_stream_t *stream;
      ByteQueue read_buffer;
      PendingRead *reads_head;
      PendingRead *reads_tail;
      PendingWrite *writes_head;
      PendingWrite *writes_tail;
      size_t buffered_bytes;
      int read_started;
      int backpressured;
    } stream;
  } as;
};
```
The Bosatsu Handle surface stays opaque. Only the runtime representation changes.
Regular files vs streams
The libuv backend should not force everything through one abstraction internally.
Regular-file handles
Use `uv_file` plus `uv_fs_*` requests:
- `open_file` -> `uv_fs_open`
- `read_bytes` and `read_utf8` -> `uv_fs_read`
- `write_bytes` and `write_utf8` -> `uv_fs_write`
- `close` -> `uv_fs_close`
- `stat` -> `uv_fs_lstat`
- `rename` -> `uv_fs_rename`
- `mkdir` -> `uv_fs_mkdir`
- `list_dir` -> `uv_fs_scandir`
- temp paths -> `uv_fs_mkdtemp` and `uv_fs_mkstemp`
Each regular-file handle should track a logical offset:
- for `Read`, start at `0` and increment after each successful read
- for `WriteTruncate`, start at `0`
- for `Append`, open with append flags and let the OS append semantics decide the actual write position
Stream handles
Use `uv_stream_t` plus stream requests:
- stdin/stdout/stderr
- process pipes created by `spawn`
- any future socket-like handles
For streams:
- reading is driven by `uv_read_start`
- writing is driven by `uv_write`
- closing is driven by `uv_close`
stdin/stdout/stderr under libuv
Do not keep wrapping C stdio stdin, stdout, and stderr as `FILE*`.
Instead:
- detect the underlying stdio kind
- if it is a TTY, wrap it with `uv_tty_t`
- otherwise wrap it with `uv_pipe_t` using the inherited file descriptor
This keeps Bosatsu stdio on libuv-native stream handles, which is required for nonblocking reads and async writes.
Read semantics
The Bosatsu surface already has one opaque Handle, but the runtime must distinguish file reads from stream reads.
read_bytes
For regular files:
- issue one `uv_fs_read` request for up to `max_bytes`
- on `result == 0`, return `None`
- otherwise return exactly the bytes read
- advance the file offset by the number of bytes read
For streams:
- maintain a byte queue on the handle
- keep `uv_read_start` active while the stream is readable
- if buffered bytes are available, satisfy the Bosatsu read immediately
- if EOF has already been observed and the buffer is empty, return `None`
- otherwise register the fiber as a pending read waiter and suspend it
read_utf8
For regular files:
- issue `uv_fs_read` for a raw byte chunk
- decode only complete UTF-8 prefixes
- if the returned bytes end mid-code-point, keep the trailing partial bytes in a small per-handle decode buffer and prepend them to the next read
For streams:
- read raw bytes into the same stream byte queue used by `read_bytes`
- decode from that queue only when at least one full UTF-8 code point is available
- if the queue ends in a partial code point and EOF has not occurred, wait for more bytes
- if EOF occurs with an incomplete trailing sequence, return `InvalidUtf8`
This is stricter and more correct than the current `fread` implementation, which can fail merely because a chunk boundary split a multibyte UTF-8 code point.
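The boundary-detection step reduces to a small pure function. This is a sketch of one way to find the longest complete-code-point prefix; validating the bytes themselves (overlong encodings, invalid continuations) is a separate decoding concern:

```c
#include <stddef.h>

/* Length of the longest prefix of buf that ends on a UTF-8 code point
   boundary. The bytes after that prefix are the partial trailing
   sequence to keep back in the per-handle decode buffer. */
static size_t utf8_complete_prefix(const unsigned char *buf, size_t len) {
    if (len == 0) return 0;
    /* A code point is at most 4 bytes, so scan back at most 3 bytes
       from the end looking for a lead byte. */
    size_t i = len;
    while (i > 0 && len - i < 4) {
        unsigned char b = buf[i - 1];
        if ((b & 0x80) == 0x00) return i;            /* ASCII: complete */
        if ((b & 0xC0) == 0xC0) {                    /* lead byte */
            size_t need = (b & 0xE0) == 0xC0 ? 2
                        : (b & 0xF0) == 0xE0 ? 3 : 4;
            size_t have = len - (i - 1);
            return have >= need ? len : i - 1;       /* hold back partial */
        }
        i--;                                          /* continuation byte */
    }
    return len;  /* 4+ trailing continuation bytes: let the decoder reject */
}
```

With this helper, `read_utf8` decodes `utf8_complete_prefix(buf, n)` bytes and stashes the remainder for the next chunk, so a chunk boundary can never split a code point.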
Backpressure for stream reads
If the libuv backend keeps `uv_read_start` active forever, unread data can accumulate without bound. The runtime should therefore implement explicit buffering thresholds.
Recommended policy:
- high-water mark: 1 MiB of buffered unread stream data
- low-water mark: 256 KiB
- once buffered unread data reaches the high-water mark, call `uv_read_stop`
- once consumers drain the buffer below the low-water mark, call `uv_read_start` again
This keeps process pipes and stdin responsive without turning unread output into unbounded memory growth.
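The hysteresis reduces to two small transitions. This sketch models only the decision; the actual `uv_read_stop`/`uv_read_start` calls are left to the caller, and the struct fields are illustrative:

```c
#include <stddef.h>

#define HIGH_WATER (1024u * 1024u)   /* 1 MiB of buffered unread data */
#define LOW_WATER  (256u * 1024u)    /* 256 KiB */

typedef struct {
    size_t buffered_bytes;
    int read_started;   /* is uv_read_start currently active? */
} StreamBuf;

/* Called after bytes are buffered; returns 1 if the caller should
   issue uv_read_stop. */
static int on_bytes_buffered(StreamBuf *s, size_t n) {
    s->buffered_bytes += n;
    if (s->read_started && s->buffered_bytes >= HIGH_WATER) {
        s->read_started = 0;
        return 1;
    }
    return 0;
}

/* Called after a consumer drains bytes; returns 1 if the caller should
   issue uv_read_start again. */
static int on_bytes_consumed(StreamBuf *s, size_t n) {
    s->buffered_bytes -= n;
    if (!s->read_started && s->buffered_bytes < LOW_WATER) {
        s->read_started = 1;
        return 1;
    }
    return 0;
}
```

The gap between the two marks is what prevents rapid stop/start flapping when the buffer hovers near a single threshold.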
Write semantics
write_bytes and write_utf8
For regular files:
- allocate a request baton owning the bytes to write
- issue `uv_fs_write`
- on success, advance the file offset for non-append files
- resume the suspended fiber with `Unit`
For streams:
- copy or retain the bytes until the `uv_write_t` callback fires
- queue writes in order per handle
- resume the waiting fiber from the write callback
flush
To match the current JVM, Python, and C meanings, `flush` should not mean durable disk sync.
Instead:
- for regular files, `flush` waits for all prior write requests on that handle to complete
- for streams, `flush` waits for all queued `uv_write_t` requests on that handle to complete
- no `uv_fs_fsync` is implied by `flush`
If Bosatsu later needs durable sync, that should be a separate explicit file API.
Closing handles
Closing regular files
- mark the handle closed
- if there are no in-flight requests, issue `uv_fs_close`
- if there are in-flight requests, defer the close until the last callback completes
Closing streams
- mark the handle closed
- call `uv_close`
- free the libuv stream object only from `close_cb`

This follows libuv’s rule that handle memory must stay valid until `close_cb`.
Bosatsu/IO/Core mapping in the libuv backend
The built-in effectful IO surface should map directly to libuv:
- `open_file` -> `uv_fs_open`
- `create_temp_file` -> `uv_fs_mkstemp`
- `create_temp_dir` -> `uv_fs_mkdtemp`
- `list_dir` -> `uv_fs_scandir` + `uv_fs_scandir_next`
- `stat` -> `uv_fs_lstat`
- `mkdir` -> `uv_fs_mkdir`, with a Bosatsu-side recursion helper issuing repeated requests
- `remove` -> `uv_fs_unlink` or `uv_fs_rmdir`, with the recursive tree walk built from `uv_fs_lstat` and `uv_fs_scandir`
- `rename` -> `uv_fs_rename`
- `get_env` -> `uv_os_getenv`
- `spawn` -> `uv_spawn`
- `wait` -> process exit callback over `uv_process_t`
- `now_wall` -> `uv_clock_gettime(UV_CLOCK_REALTIME, ...)` when available, otherwise the existing platform fallback
- `now_mono` -> `uv_hrtime()` or `uv_clock_gettime(UV_CLOCK_MONOTONIC, ...)`
- `sleep` -> `uv_timer_t`
Process implementation
The current C runtime reports spawn and wait as unsupported. The libuv backend should make them first-class.
Recommended Process runtime object:
- `uv_process_t process`
- a cached normalized Bosatsu exit code
- a completion flag
- an intrusive list of Bosatsu fibers waiting in `wait`
- optional wrapped handles for child stdin/stdout/stderr pipes
Process spawn
- build `uv_process_options_t` with argv semantics and no shell wrapper
- leave `cwd = NULL` and `env = NULL` so the current Bosatsu API continues to inherit both
- map Bosatsu `Stdio` to `uv_stdio_container_t`:
  - `Inherit` maps to `UV_INHERIT_FD` or `UV_INHERIT_STREAM`
  - `Null` maps to `UV_IGNORE`
  - `Pipe` creates a fresh `uv_pipe_t` and uses `UV_CREATE_PIPE` plus direction flags from the child-process perspective:
    - child stdin uses `UV_READABLE_PIPE`
    - child stdout and child stderr use `UV_WRITABLE_PIPE`
  - `UseHandle(handle)` should use native inheritance, not a background copy task:
    - if the Bosatsu handle wraps a `uv_stream_t`, use `UV_INHERIT_STREAM`
    - if the Bosatsu handle wraps a regular-file descriptor, use `UV_INHERIT_FD`
- validate direction before spawning: stdin requires a readable Bosatsu handle; stdout/stderr require writable Bosatsu handles
- return `Some(handle)` only for literal `Pipe`; `UseHandle` still returns `None` in `SpawnResult`
- if any step fails after allocating temporary `uv_pipe_t` handles, close them and fail `spawn` with `IOError`
Process wait
- if the process has already exited, return the cached exit code immediately
- otherwise add the current fiber to the process waiters list and suspend it
- in the `uv_exit_cb`, normalize libuv’s `(exit_status, term_signal)` pair to the Bosatsu `Int` contract:
  - if `term_signal == 0`, return `exit_status`
  - otherwise return `128 + term_signal`
- cache that normalized code and wake all waiters
- canceling one Bosatsu `wait(process)` removes only that waiter; it does not call `uv_process_kill`
- `wait` does not close or drain any returned `Pipe` handles
- close the `uv_process_t` handle only after the exit callback
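The exit-code normalization itself is a one-liner; a sketch matching the contract above:

```c
/* Normalize libuv's (exit_status, term_signal) pair to the single
   Bosatsu Int exit code: plain exit status when no signal, otherwise
   the conventional 128 + signal number. */
static long normalize_exit_code(long exit_status, int term_signal) {
    return term_signal == 0 ? exit_status : 128 + term_signal;
}
```

This matches the common shell convention, e.g. a SIGKILL (signal 9) termination reports as 137.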
compute implementation
`compute` should use `uv_queue_work`.
Worker-side execution
- the work callback evaluates the Bosatsu thunk `Unit -> a`
- it stores the resulting `BValue` in the request baton
- it does not call libuv APIs other than thread-safe primitives allowed from worker threads
Loop-side completion
- the after-work callback runs on the libuv loop thread
- if the status is `UV_ECANCELED`, resume the fiber as canceled
- otherwise, if the fiber was canceled while the work was already running, discard the result and mark the fiber canceled
- otherwise resume the fiber with `prog_pure(result)`

This gives `compute` the right “off the event loop” behavior without inventing a second scheduler.
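The after-work decision can be isolated from libuv as a pure function over the request baton; `UV_ECANCELED` is modeled here as a flag, and the field names are illustrative:

```c
typedef enum { OUTCOME_CANCELED, OUTCOME_VALUE } Outcome;

typedef struct {
    int uv_canceled;        /* after-work status was UV_ECANCELED */
    int fiber_cancel_flag;  /* fiber canceled while work was running */
    int result;             /* value produced by the worker thread */
} ComputeBaton;

/* Runs on the loop thread: decide whether the fiber resumes with a
   value (prog_pure(result)) or as canceled. */
static Outcome compute_outcome(const ComputeBaton *b, int *value_out) {
    if (b->uv_canceled) return OUTCOME_CANCELED;       /* never started */
    if (b->fiber_cancel_flag) return OUTCOME_CANCELED; /* discard result */
    *value_out = b->result;
    return OUTCOME_VALUE;
}
```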
libuv sleep
Implement `sleep` with a one-shot `uv_timer_t`:
- create the timer handle
- `uv_timer_start(..., timeout_ms, 0)`
- on the timer callback, close the timer handle and resume the fiber with `Unit`
- on cancellation, stop and close the timer handle and mark the fiber canceled
Var under libuv
`Var` does not need libuv. The current atomic/CAS implementation can stay:
- it is already nonblocking
- it is already effectful only at `Prog` execution time
- it composes with the fiber scheduler without extra event-loop work
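For illustration, a lock-free modify loop in the style such a CAS-based `Var` implies, using C11 `stdatomic.h`; the real runtime representation may differ:

```c
#include <stdatomic.h>

typedef struct {
    _Atomic long value;
} BVar;

/* Atomically replace the value with f(old) and return the old value.
   Retries on races, never blocks, so it needs no event-loop support
   and composes with the fiber scheduler as-is. */
static long bvar_modify(BVar *v, long (*f)(long)) {
    long old = atomic_load(&v->value);
    while (!atomic_compare_exchange_weak(&v->value, &old, f(old))) {
        /* `old` was refreshed by the failed CAS; recompute and retry */
    }
    return old;
}

static long add_one(long x) { return x + 1; }
```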
Migration from the current C implementation
The current files should not be incrementally peppered with `#ifdef LIBUV` until they are unreadable.
Recommended implementation split:
- keep the current synchronous runtime as a separate backend for environments that do not want libuv
- add a libuv runtime implementation in separate files
- select the backend at build time
Suggested file layout:
- `c_runtime/bosatsu_prog_uv.h`
- `c_runtime/bosatsu_prog_uv.c`
- `c_runtime/bosatsu_ext_Bosatsu_l_Prog_uv.c`
- `c_runtime/bosatsu_ext_Bosatsu_l_IO_l_Core_uv.c`
- `c_runtime/Makefile` or build config changes to link libuv
The existing synchronous files remain the reference implementation for the old backend, but the libuv backend becomes the concurrency-capable one.
Concrete change from today’s C runtime
The migration is not “add a few libuv calls”. It is:
- replace the single blocking `bsts_Bosatsu_Prog_run` loop with a fiber scheduler on top of `uv_run`
- replace `FILE*` handles with a `uv_file` / `uv_stream_t` union
- replace blocking `fread`/`fwrite`/`fopen`/`nanosleep` paths with `uv_fs_*`, `uv_read_start`, `uv_write`, `uv_timer_t`, and `uv_spawn`
- implement `spawn` and `wait` for real
- add the request batons, close callbacks, and cleanup paths required by libuv
- keep the Bosatsu package surface unchanged
Chosen policy for the libuv backend
To minimize later decision churn, this design commits to the following:
- use libuv directly for all built-in C backend IO and scheduling
- do not keep `FILE*` in the libuv backend
- do not reinterpret `flush` as `fsync`
- do not kill external processes when canceling a Bosatsu fiber blocked in `wait(process)`
- do not reenter `uv_run()` from callbacks
- use a scheduler step budget for fairness
- use `uv_queue_work` for `compute`
- use `uv_fs_*` for regular files and `uv_stream_t` operations for streams
libuv + libgc interaction
The libuv backend is not just “an event loop dependency”. It is a multi-threaded runtime:
- the event loop itself runs on one thread
- `uv_queue_work()` runs callbacks on libuv worker threads
- asynchronous filesystem operations also use libuv’s global threadpool internally

At the same time, Bosatsu’s C runtime allocates nearly all runtime objects with Boehm GC (`GC_malloc`, `GC_malloc_atomic`, etc.), so the libuv backend must be designed as a multithreaded GC client from day one.
What libuv guarantees
Per libuv documentation:
- libuv provides cross-platform threading and synchronization primitives
- libuv has a global threadpool
- that threadpool is used for `uv_queue_work()` and for filesystem operations

So if we adopt libuv, the C runtime is definitely multi-threaded even if Bosatsu user code never calls `start`.
What libgc requires
Per bdwgc upstream:
- the collector is built with multi-threading support enabled by default unless explicitly disabled
- multithreaded client code should define `GC_THREADS` before including `gc.h`
- threads created by third-party libraries normally require explicit registration with `GC_register_my_thread()`
- such explicitly registered threads must later call `GC_unregister_my_thread()`
This is the most important integration constraint in the entire design.
Consequence for the Bosatsu libuv backend
The event-loop thread is easy:
- initialize Boehm early with `GC_INIT()`
- define `GC_THREADS` in runtime source files that use the collector
- treat the main/event-loop thread as the implicitly registered main thread
The worker-thread story is the real design point:
- libuv worker threads are created by libuv, not by Bosatsu code
- bdwgc explicitly calls out threads created by third-party libraries as the case that normally requires manual registration
- therefore any libuv worker thread that touches GC-managed data must register itself with Boehm first
Required runtime rule
For correctness, the libuv backend should follow this rule:
- no `uv_work_cb` may allocate GC memory or manipulate pointers to the GC heap unless that worker thread has been explicitly registered with Boehm

That means `compute` cannot be implemented safely by simply calling the Bosatsu thunk on a raw libuv worker thread without GC registration.
Recommended implementation strategy
There are two realistic approaches.
Preferred approach
Register libuv worker threads with Boehm when they first enter Bosatsu-managed work:
- call `GC_allow_register_threads()` once on the main thread after initialization and before the first worker-thread registration
- at the start of a `uv_work_cb` that will touch Bosatsu values, obtain the stack base and call `GC_register_my_thread(...)`
- run the Bosatsu work callback
- before the worker thread returns to libuv, call `GC_unregister_my_thread()`
This follows the upstream guidance for third-party-created threads and works on Unix and Windows.
Conservative fallback
Keep libuv worker callbacks free of GC heap interaction:
- worker threads operate only on plain C buffers and POD state
- all GC-managed `BValue` creation/publication happens back on the loop thread in the after-work callback

This is easier for filesystem requests that already return plain C data from libuv, but it is too restrictive for `compute`, whose whole job is to evaluate a Bosatsu thunk.
So the fallback is useful for some request batons, but it is not sufficient as the only model once `compute` exists.
Cross-platform libuv + libgc runtime rules
For every platform, the libuv backend should assume:
- a thread-safe Boehm build
- `GC_THREADS` is defined when compiling libuv-backend runtime sources that include `gc.h`
- `GC_INIT()` runs on the main thread before Bosatsu runtime work starts
- `GC_allow_register_threads()` is called before the first manual worker-thread registration
- any libuv-created worker thread must call `GC_register_my_thread(...)` before touching GC-managed values
- that worker thread must later call `GC_unregister_my_thread()`
- no reliance on deprecated implicit thread discovery
These are not macOS-specific or Linux-specific rules. They are the shared runtime contract for any Bosatsu backend that combines libuv worker threads with Boehm-managed heap objects.
The doc intentionally chooses explicit registration over implicit discovery because bdwgc documents `GC_use_threads_discovery()` as deprecated and less robust.
Current platform build plan
This section is specifically about the platforms Bosatsu uses today.
Ubuntu and Debian
For Linux builds on Ubuntu/Debian, the build should assume:
- `libuv1-dev`
- `libgc-dev`
- `pkg-config`
- normal C toolchain packages such as `build-essential`
Why this is enough:
- Ubuntu’s `libuv1-dev` package description explicitly includes process/thread management, pipes, and work queues
- Debian’s `libuv1-dev` file list includes `uv/threadpool.h`, `libuv.pc`, and `libuv-static.pc`
- upstream libuv’s threadpool and threading APIs are part of the standard library surface, not an optional “concurrency flavor”

So for Ubuntu/Debian, the answer to “is libuv already built with concurrency support?” is effectively yes. We should treat the distro `libuv1-dev` package as the normal libuv with the threadpool/thread APIs available, because that is exactly what the package exposes.
Recommended install line:
```shell
sudo apt-get update
sudo apt-get install -y build-essential pkg-config libuv1-dev libgc-dev
```
Recommended build flag discovery:
```shell
pkg-config --cflags --libs libuv bdw-gc
```
Minimum Bosatsu-specific compile setting:
```shell
CPPFLAGS="-DGC_THREADS $(pkg-config --cflags libuv bdw-gc)"
LIBS="$(pkg-config --libs libuv bdw-gc) -lm"
```
macOS with Homebrew
For macOS builds, the build should assume:
- `brew install libuv bdw-gc pkg-config`
- Apple Clang as the compiler
- Homebrew bottles for both `libuv` and `bdw-gc` are available on current Apple Silicon and Intel macOS releases
What about Boehm threading support on Homebrew?
- the Homebrew `bdw-gc` formula builds with CMake and does not pass an option disabling threads
- upstream bdwgc states that multi-threading support is enabled by default unless explicitly disabled
So the right conclusion is:
- Homebrew `bdw-gc` on macOS should be suitable for a multithreaded Bosatsu runtime
- the same cross-platform libuv + libgc runtime rules above still apply on macOS
Recommended install line:
```shell
brew install libuv bdw-gc pkg-config
```
Recommended build flag discovery:
```shell
pkg-config --cflags --libs libuv bdw-gc
```
Minimum Bosatsu-specific compile setting:
```shell
CPPFLAGS="-DGC_THREADS $(pkg-config --cflags libuv bdw-gc)"
LIBS="$(pkg-config --libs libuv bdw-gc) -lm"
```
If pkg-config metadata is missing or unreliable on a local Homebrew setup, the build should fall back to explicit prefixes:
```shell
UV_PREFIX="$(brew --prefix libuv)"
GC_PREFIX="$(brew --prefix bdw-gc)"
CPPFLAGS="-DGC_THREADS -I$UV_PREFIX/include -I$GC_PREFIX/include"
LDFLAGS="-L$UV_PREFIX/lib -L$GC_PREFIX/lib"
LIBS="-luv -lgc -lm"
```
This mirrors the fallback style already used by the current `c_runtime/Makefile` for bdw-gc.
Build-system implication for Bosatsu
PR 1 should extend the current C build in this style:
- keep `pkg-config` as the preferred mechanism
- add `libuv` discovery alongside `bdw-gc`
- add `-DGC_THREADS` to the libuv backend compile flags
- preserve explicit Homebrew-prefix fallbacks for macOS
- preserve the old backend build target so missing libuv does not break the current runtime
Future work: Windows support
Windows should be treated as a later, separate enablement step rather than part of the first libuv rollout.
Why Windows is a distinct project
Windows support adds two independent moving parts:
- libuv itself uses a different host API surface on Windows
- Boehm thread integration is different enough on Windows that we should validate it separately
The good news is that both upstreams support Windows:
- libuv upstream documents Windows as a first-class platform and says CMake is the supported build method there
- bdwgc upstream documents support for recent Windows versions and Win32 threads
- vcpkg currently ships both `libuv` and `bdwgc` packages, including Windows triplets
Recommended Windows dependency strategy
When Bosatsu eventually adds Windows C runtime support, prefer:
- CMake-based C runtime build
- MSVC toolchain
- vcpkg for dependency acquisition
Recommended future bootstrap:
```shell
vcpkg install libuv bdwgc
cmake -B build -S . -DCMAKE_TOOLCHAIN_FILE=<vcpkg-root>/scripts/buildsystems/vcpkg.cmake
cmake --build build --config Release
```
This avoids inventing Bosatsu-specific dependency instructions for Windows in the first pass.
Windows-specific validation concerns
Windows does not change the shared libuv + libgc rules above, but it does need separate validation under the Windows toolchain and host APIs.
Windows-specific runtime work items
A future Windows enablement PR should explicitly cover:
- build and link of the libuv backend under MSVC
- Boehm thread registration on libuv worker threads
- stdio handle initialization for Windows console and pipe handles
- path and process behavior on Windows via libuv
- CI on `windows-latest`
The Windows work should not be merged piecemeal until:
- dependency acquisition is reproducible
- the runtime passes the same basic `Prog`, `IO/Core`, `compute`, and `start`/`join`/`cancel` tests as the Unix targets
References
- libuv event loop: https://docs.libuv.org/en/v1.x/loop.html
- libuv basics, handles vs requests: https://docs.libuv.org/en/v1.x/guide/basics.html
- libuv filesystem APIs: https://docs.libuv.org/en/v1.x/fs.html
- libuv filesystem guide: https://docs.libuv.org/en/v1.x/guide/filesystem.html
- libuv threadpool and `uv_queue_work`: https://docs.libuv.org/en/v1.x/threadpool.html
- libuv request cancellation: https://docs.libuv.org/en/v1.x/request.html
- libuv handle closing: https://docs.libuv.org/en/v1.x/handle.html
- libuv timers: https://docs.libuv.org/en/v1.x/timer.html
- libuv processes: https://docs.libuv.org/en/v1.x/process.html
- libuv threading primitives: https://docs.libuv.org/en/stable/threading.html
- Ubuntu `libuv1-dev` package description: https://launchpad.net/ubuntu/jammy/%2Bpackage/libuv1-dev
- Debian `libuv1-dev` file list: https://packages.debian.org/bullseye/amd64/libuv1-dev/filelist
- Homebrew `libuv` formula: https://formulae.brew.sh/formula/libuv
- Homebrew `bdw-gc` formula: https://formulae.brew.sh/formula/bdw-gc
- Homebrew `bdw-gc` formula code: https://github.com/Homebrew/homebrew-core/blob/7c5d67716e44ab8c2ca27d1866c6a0504e17bf23/Formula/b/bdw-gc.rb
- Homebrew `libuv` formula code: https://github.com/Homebrew/homebrew-core/blob/cbaf449cc66bec68e7d12e7f81712284e94d1b25/Formula/lib/libuv.rb
- bdwgc README: https://github.com/bdwgc/bdwgc
- bdwgc interface notes: https://www.hboehm.info/gc/gcinterface.html
- bdwgc simple example and threading notes: https://www.hboehm.info/gc/simple_example.html
- bdwgc `gc.h` thread registration API: https://raw.githubusercontent.com/bdwgc/bdwgc/master/include/gc/gc.h
- libuv upstream build instructions: https://github.com/libuv/libuv
- vcpkg `libuv` package: https://vcpkg.io/en/package/libuv
- vcpkg `bdwgc` package: https://vcpkg.io/en/package/bdwgc
- Python `subprocess` documentation: https://docs.python.org/3/library/subprocess.html
- Java `Process` documentation: https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/lang/Process.html
- Node `child_process` documentation: https://nodejs.org/api/child_process.html
- fs2 JVM process sources (`fs2-io_3` 3.13.0): https://repo1.maven.org/maven2/co/fs2/fs2-io_3/3.13.0/fs2-io_3-3.13.0-sources.jar
- fs2 Scala.js process sources (`fs2-io_sjs1_3` 3.13.0): https://repo1.maven.org/maven2/co/fs2/fs2-io_sjs1_3/3.13.0/fs2-io_sjs1_3-3.13.0-sources.jar
Non-goals
- No resource-safe cancellation story in v1. Bosatsu does not yet have `bracket`, masking, or finalizers.
- No guarantee of immediate cancellation of pure code.
- No scheduler tuning API, priorities, or affinity knobs.
- No `race`, `timeout`, `par_map`, or `poll` in the initial surface.
- No separate “blocking foreign call offload” API in this design. The built-in C backend should move its own file/process/timer operations to libuv directly, and `compute` remains pure-only.
Follow-up APIs After This One
Once the base handle semantics exist, the next useful layer is likely:
- `poll(handle) -> Prog[f, Option[JoinResult[err, a]]]`
- `race(left, right)`
- `par_map2`
- `yield_now` or `cede` if libuv fairness becomes important for long pure loops

Those should wait until the semantics of `start`, `join`, `cancel`, and `compute` are proven across all backends.
Staged Rollout And PR Plan
The rollout should maximize these properties:
- the repo stays green after every PR
- the current synchronous C backend keeps working until libuv parity exists
- public Bosatsu surface area grows only when all required backends in this repo can support it
- the hardest semantic feature, `cancel`, lands last
Rollout principles
- Add the libuv C backend first, but do not expose new Bosatsu concurrency functions until the libuv backend can already run the current `Bosatsu/IO/Core` API.
- Keep the current synchronous C runtime in the tree throughout the migration as a separate backend.
- Add internal runtime machinery before exposing public API, both in the libuv C backend and in Scala `MatchlessToValue`.
- Keep the current synchronous Scala `Predef` runner available until the effectful `MatchlessToValue` runner has enough parity to take over the command paths that execute `Prog`.
- Ship public concurrency in three API slices:
  1. `compute`
  2. `JoinHandle` + `JoinResult` + `start` + `join`
  3. `cancel`
- Every PR must have tests targeted to the exact slice it changes.
PR 1: libuv build plumbing, no behavior change
Scope:
- add libuv as an optional C runtime dependency
- add separate libuv runtime source files and build targets
- keep the existing synchronous C backend as the default/reference backend
- add CI or local build coverage that the libuv backend compiles
Do not do yet:
- no public Bosatsu API changes
- no semantic changes to existing runtime behavior
- no attempt to port IO functions yet
Acceptance:
- existing tests stay green on the current default backend
- the libuv backend builds successfully as an alternate target
PR 2: internal libuv Prog scheduler, still no public concurrency API
Scope:
- add `BSTS_Prog_Runtime`, `BSTS_Prog_Fiber`, a ready queue, and the `uv_idle_t` scheduler
- replace the old single blocking C `Prog` loop in the libuv backend with the new suspend/resume engine
- support the immediate effects needed for current `Prog` behavior: `pure`, `raise_error`, `flat_map`, `recover`, `apply_fix`, `observe`, and `Var`
Do not do yet:
- no new Bosatsu exports
- no async IO/Core port beyond what is needed to boot the runtime
- no `start`, `join`, `cancel`, or `compute`
Acceptance:
- existing `Prog` and `Var` tests pass on the libuv backend
- the libuv runtime can run simple `Main` and `ProgTest` values with no concurrency features
PR 3: libuv time primitives
Scope:
- port `now_wall`
- port `now_mono`
- port `sleep` using `uv_timer_t`

Reason:
`sleep` is the first true async effect and validates the suspend/resume machinery before file and process work.
Acceptance:
- current time and sleep behavior matches existing Bosatsu semantics
- the repo stays green on the default backend, and libuv-specific tests pass
PR 4: libuv stream-handle foundation for stdio writes
Scope:
- replace `FILE*` stdout/stderr handling in the libuv backend with `uv_stream_t`
- implement `write_utf8`
- implement `write_bytes`
- implement `flush` and `close` for stream handles
Do not do yet:
- no readable stream buffering yet
- no regular-file support yet
Acceptance:
- Bosatsu stdout/stderr programs work on the libuv backend
- `flush` semantics remain aligned with JVM/Python
PR 5: libuv stream reads and buffering
Scope:
- implement stdin and pipe reading on `uv_stream_t`
- implement `read_bytes`
- implement `read_utf8`
- add per-handle buffering and backpressure thresholds
Do not do yet:
- no process support yet
- no spawned-pipe tests yet; those belong with `spawn` in PR 8
Acceptance:
- stdin and generic `uv_stream_t` read semantics match Bosatsu expectations
- UTF-8 chunking is correct across multibyte boundaries
PR 6: libuv regular-file operations
Scope:
- implement the `uv_file`-based handle branch
- port `open_file`
- port regular-file `read_bytes`, `read_utf8`, `write_bytes`, `write_utf8`
- port `close` and `flush` for regular files
- port temp file and temp dir creation
Acceptance:
- existing file IO Bosatsu programs run on the libuv backend
- append/read/write offset behavior is covered by tests
PR 7: libuv path, stat, directory, and environment functions
Scope:
- port `list_dir`
- port `stat`
- port `mkdir`
- port `remove`
- port `rename`
- port `get_env`
Acceptance:
- the non-process `Bosatsu/IO/Core` surface is complete on the libuv backend
- directory and recursive remove semantics are covered by tests
PR 8: libuv process support for the existing IO surface
Scope:
- implement `spawn`
- implement `wait`
- implement stdio mapping for `Inherit`, `Null`, `Pipe`, and `UseHandle`
- implement child stdio pipe mapping with `uv_process_t` and `uv_pipe_t`
- normalize process exit vs signal termination to the Bosatsu `Int` contract
- cache the process exit status for repeated waits
Acceptance:
- the existing `Bosatsu/IO/Core` API is now fully implemented on the libuv backend
- no new Bosatsu concurrency functions have been exposed yet
- `spawn`/`wait` tests cover `Pipe`, `UseHandle`, repeated `wait`, and signal/exit-code normalization
- CI includes libuv backend coverage for representative existing IO programs
PR 9: internal MatchlessToValue effectful runner, no public API change
Scope:
- add `ProgRuntime[F]` plus `runProgF` / `runProgMainF` / `runProgTestF`
- implement current non-process `Prog` semantics in the effectful Scala runner
- port the Scala-side time primitives needed by current `Prog` execution
- keep the synchronous `Predef` runner available in parallel
- add representative parity tests for JVM `Main` and `ProgTest` execution on the new runner
Do not do yet:
- no public Bosatsu concurrency exports yet
- no switch of existing command paths by default yet
- no `spawn`/`wait` parity requirement yet
- no requirement yet that Scala process support be complete on all hosts
Acceptance:
- the effectful JVM Scala runner can execute representative `Prog`, `Main`, `ProgTest`, `Var`, and `sleep` programs
- no public Bosatsu surface area changes in this PR
PR 10: MatchlessToValue current-surface parity and Scala.js runtime split
Scope:
- port the remaining current `Bosatsu/IO/Core` operations exercised by the Scala runtime paths, at minimum stdin/stdout/stderr, file/env/time, and process support
- implement JVM `spawn`/`wait` parity on top of the effectful runner, including `UseHandle` bridging and exit normalization
- add the explicit Scala.js runtime split:
  - `bosatsu_node` process backend over Node `child_process`
  - browser Scala.js process backend that reports `Unsupported`
- switch the Scala command paths that execute `Prog` to the effectful runner
- add tests for JVM parity and the explicit Node/browser process split
Acceptance:
- existing Scala command and test paths stay green on JVM with the effectful runner
- `bosatsu_node` remains supported, with process behavior coming from the Node backend rather than hidden JVM assumptions, and browser Scala.js reports an explicit `Unsupported` for `spawn`/`wait`
- no public Bosatsu concurrency functions have been exposed yet
PR 11: switch on public compute across all backends
Scope:
- add `compute` to `Bosatsu/Prog`
- implement it in:
  - Scala `MatchlessToValue` + cats-effect on JVM
  - Scala.js with the documented fairness-only semantics
  - Python runtime
  - C libuv backend with `uv_queue_work`
- add docs and tests for `compute`
Why this is its own PR:
- `compute` is useful immediately
- it is much simpler than full child-fiber handles
- it exercises the core “off the main scheduler” story without adding `JoinHandle`
Acceptance:
- `compute` works on JVM Scala, Scala.js with the documented limitation, Python, and libuv C
- the public API grows by exactly one function in this PR
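A minimal sketch of the intended `compute` behavior, assuming a backend with a dedicated worker pool: the thunk runs off the main scheduler thread, and its result re-enters the program as an ordinary value. The thread pool here stands in for cats-effect's compute pool, Python worker threads, or libuv's `uv_queue_work` queue; the `compute` function itself is an illustrative model, not the real runtime binding.

```python
import concurrent.futures

# Worker pool standing in for the backend's compute threads.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def compute(thunk):
    """Submit `thunk` to the worker pool immediately and return a
    zero-argument step; running the step waits for the worker and
    yields the result back to the caller's scheduler."""
    future = _pool.submit(thunk)
    return future.result  # joining the future = rejoining the scheduler

step = compute(lambda: sum(range(1_000_000)))
print(step())  # 499999500000, computed off the calling thread
```

The key property is that a CPU-heavy thunk never runs on the thread driving the event loop, which is exactly the constraint the libuv backend imposes.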
PR 12: add JoinHandle, JoinResult, start, and join
Scope:
- add `JoinHandle` and `JoinResult` to `Bosatsu/Prog`
- add `start`
- add `join`
- implement the feature in:
  - Scala `MatchlessToValue` + cats-effect on JVM and Scala.js
  - Python runtime
  - C libuv backend
- add tests for repeated joins, join after completion, join from multiple waiters, and child error propagation
Do not do yet:
- no public `cancel` yet
Why this split is useful:
- it keeps child-fiber semantics reviewable without mixing in cancellation
- `start` + `join` already enable useful parallel composition
Acceptance:
- child success and child error semantics match across all backends
- repo remains green without having committed to public cancellation yet
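The semantics under test can be sketched with a minimal, hypothetical Python model of `start`/`join` using one thread per child: the outcome is returned to the joiner as a `JoinResult`-style value (`Succeeded`/`Errored`), never raised into the joiner's error channel, and repeated joins or multiple waiters all observe the same cached outcome. The thread-per-child scheduler is an illustrative assumption, not any backend's actual implementation.

```python
import threading

class JoinHandle:
    """Handle for a child started with `start`; caches its one outcome."""
    def __init__(self, thunk):
        self._result = None
        self._done = threading.Event()

        def run():
            try:
                self._result = ("Succeeded", thunk())
            except Exception as err:
                self._result = ("Errored", err)  # child error stays a value
            self._done.set()

        threading.Thread(target=run, daemon=True).start()

def start(thunk):
    return JoinHandle(thunk)

def join(handle):
    handle._done.wait()      # any number of waiters may block here
    return handle._result    # same cached outcome every time

h = start(lambda: 6 * 7)
print(join(h))  # ('Succeeded', 42)
print(join(h))  # repeated join sees the same cached outcome
```

Note that a child exception surfaces as `("Errored", err)` from `join`, matching the decision that `join` reports the child outcome as a value rather than through the caller's error channel.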
PR 13: add cancel
Scope:
- add `cancel` to `Bosatsu/Prog`
- implement request-specific cancel hooks in all runtimes
- add tests for:
  - cancel before completion
  - cancel after completion
  - cancel of `compute`
  - cancel of `sleep`
  - repeated cancel
  - cancel of a blocked `join`
  - cancel of a fiber blocked in `wait` (process) on backends that support processes
Why cancel is last:
- it is the most semantics-heavy part
- it forces us to define behavior for each pending effect
- by this point the scheduler, IO backend, and join handles already exist
Acceptance:
- `cancel` behavior is documented and tested across Scala, Python, and libuv C
- canceling a Bosatsu `wait` on an external process does not kill that process on backends that support process APIs
- no previously shipped API needs to change shape
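Cooperative, best-effort cancellation can be sketched with a hypothetical Python model: `cancel` only sets a flag, and the child checks that flag between pending effects, so cancel after completion is a no-op, repeated cancel is idempotent, and the joiner observes a distinct `Canceled` outcome rather than an error. The list-of-steps child and thread scheduler are illustrative assumptions.

```python
import threading
import time

class JoinHandle:
    """Child modeled as a sequence of effect steps with cancel checks."""
    def __init__(self, steps):
        self._cancel = threading.Event()
        self._done = threading.Event()
        self._result = None

        def run():
            value = None
            for step in steps:
                if self._cancel.is_set():    # cooperative check point
                    self._result = ("Canceled",)
                    break
                value = step()
            else:
                self._result = ("Succeeded", value)
            self._done.set()

        threading.Thread(target=run, daemon=True).start()

def cancel(handle):
    handle._cancel.set()   # best-effort request; never blocks, never throws

def join(handle):
    handle._done.wait()
    return handle._result

h = JoinHandle([lambda: time.sleep(0.2), lambda: 42])
cancel(h)
cancel(h)       # repeated cancel is idempotent
print(join(h))
```

A step already in flight (the `sleep` here) is allowed to finish; the cancellation takes effect at the next check point, which is what "cooperative and best-effort" means in practice.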
PR 14: optional higher-level Bosatsu combinators
Only after the base concurrency surface is stable should we add Bosatsu-level helpers such as:
- `poll`
- `race`
- `par_map2`
- `yield_now` or `cede`
These should stay out of the critical path for the runtime migration.
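One reason these helpers can safely wait: given `start`/`join`, most of them are expressible as library-level combinators with no new runtime support. A hypothetical Python sketch (threads standing in for fibers, names illustrative) of `par_map2` built purely on top of `start`/`join`:

```python
import threading

def start(thunk):
    """Run `thunk` on a child thread; return a (result-box, done) handle."""
    box = {}
    done = threading.Event()

    def run():
        box["value"] = thunk()
        done.set()

    threading.Thread(target=run, daemon=True).start()
    return (box, done)

def join(handle):
    box, done = handle
    done.wait()
    return box["value"]

def par_map2(thunk_a, thunk_b, f):
    """Run both thunks in parallel, then combine their results with `f`:
    nothing here needs runtime support beyond start/join."""
    ha, hb = start(thunk_a), start(thunk_b)
    return f(join(ha), join(hb))

print(par_map2(lambda: 2, lambda: 3, lambda a, b: a + b))  # 5
```

`race` and `poll` need slightly more (first-completion signaling, non-blocking status), which is part of why they deserve their own design pass rather than riding along with the base surface.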
PR dependencies and parallel tracks
The numbering above is the recommended merge order, not a claim that every PR must be developed strictly one after another. In the dependency bullets below, PR A -> PR B means PR B depends on PR A, so PR A must land first.
Dependencies:
- `PR 1 -> PR 2`
- `PR 2 -> PR 3`
- `PR 2 -> PR 4`
- `PR 4 -> PR 5`, because readable `uv_stream_t` support should build on the same stream-handle representation introduced for writes
- `PR 2 -> PR 6`
- `PR 2 -> PR 7`
- `(PR 4, PR 5, PR 6) -> PR 8`, because `spawn`/`wait` need stream pipes plus `UseHandle` over already-existing Bosatsu handles, including file-backed handles
- `PR 7` is not a narrow technical prerequisite for `spawn` itself, but it should land before or with `PR 8` so PR 8 can honestly be the “libuv reaches current `Bosatsu/IO/Core` parity” milestone
- `PR 9 -> PR 10`
- `(PR 8, PR 10) -> PR 11`
- `PR 11 -> PR 12 -> PR 13 -> PR 14`
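These dependency edges form a DAG over PRs 1 through 14, and the numeric merge order can be checked mechanically. A small Python sketch encoding the edges above (soft orderings like "PR 7 before or with PR 8" are omitted) and verifying that the recommended order respects every "A -> B means B depends on A" edge:

```python
# b: set of PRs that must land before PR b, taken from the bullets above.
deps = {
    2: {1}, 3: {2}, 4: {2}, 5: {4}, 6: {2}, 7: {2},
    8: {4, 5, 6}, 10: {9}, 11: {8, 10}, 12: {11}, 13: {12}, 14: {13},
}

order = list(range(1, 15))                 # the recommended merge order
position = {pr: i for i, pr in enumerate(order)}

# Every prerequisite must sit earlier in the order than its dependent.
ok = all(position[a] < position[b] for b, before in deps.items() for a in before)
print(ok)  # True: numeric order is a valid topological order of the DAG
```

The check also makes the parallelism claim concrete: any PR with no path between it and another (for example PR 3 and PR 6, or the PR 9/10 track and the libuv track) can be developed concurrently.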
Parallelizable work:
- after PR 2, the libuv C track can split into four mostly independent branches:
  - `PR 3` for time primitives
  - `PR 4 -> PR 5` for stream handles, reads, and buffering
  - `PR 6` for regular files
  - `PR 7` for path/stat/directory/env
  - `PR 8` should wait for the stream and file-handle work it depends on, then merge as the parity-closing process PR
- the Scala internal migration can run in parallel with most of the libuv parity work: `PR 9 -> PR 10`
- in practice, that means there are two major pre-public parallel tracks:
  - C/libuv parity: `PR 1 -> PR 2 -> {PR 3, PR 4 -> PR 5, PR 6, PR 7} -> PR 8`
  - Scala internal migration: `PR 9 -> PR 10`
- the public concurrency surface should stay serial even if multiple people are available: `PR 11` then `PR 12` then `PR 13`
- `PR 14` is intentionally optional and should wait until the base public surface has stabilized
Why this PR ordering is the safest
This order keeps the repository working at every stage because:
- the libuv backend is introduced before we depend on it for public API
- the current C backend remains available while libuv reaches parity
- the Scala runtime migration happens internally before any public concurrency surface depends on it
- public API additions are small and sequential
- `cancel` only lands after the simpler pieces are already proven
Smallest sensible public slices
To answer the “can we add a few functions at a time?” question directly:
- yes, and the right slices are still `compute` first, then `start`/`join`, then `cancel`
- adding `start` without `join` is not useful enough
- adding `cancel` before `start`/`join` would force semantics before the handle model exists
- adding all four at once is unnecessary risk
- but those public slices should only begin after libuv C parity and the internal Scala runner migration are already in place