Minimal IO Functions for Basic Tools (core_alpha + Prog)
Status: proposed
Date: 2026-02-20
Issue: https://github.com/johnynek/bosatsu/issues/1744
PR: https://github.com/johnynek/bosatsu/pull/1745
Problem
core_alpha currently exposes only Bosatsu/IO/Std (print, println, stderr variants, and read_stdin_utf8_bytes).
That is enough for simple CLI output/input, but not enough for basic tool-building tasks (files, directories, process spawning, env lookup, and time/sleep).
The target API is a minimal core IO surface adapted to Bosatsu style and implemented uniformly on JVM evaluator runtime, transpiled Python runtime, and C runtime.
Decision Summary
- Add a new package Bosatsu/IO/Core in core_alpha containing the file/process/env/time primitives and related IO types.
- Keep Bosatsu/Prog as the single effect boundary; every effectful operation returns Prog[IOError, a].
- Keep Bosatsu/IO/Error::IOError as the shared error channel.
- Keep argv at the Main boundary (Main(args -> ...)) and do not duplicate it in Bosatsu/IO/Core.
- Represent Path as an opaque Bosatsu struct Path(to_String: String) and export path_sep plus path helpers.
- Represent Instant/Duration as opaque Bosatsu structs backed by nanosecond Int fields.
- Keep Bosatsu/IO/Std source-compatible and add read_line and read_all_stdin.
- Do not add direct exit to Bosatsu/IO/Core; model termination with Main return codes and explicit result types.
Where in core_alpha
New package
- test_workspace/Bosatsu/IO/Core.bosatsu
Existing package updates
- test_workspace/Bosatsu/IO/Std.bosatsu: wrappers over Core (stdout, stderr, write_text, read_text, flush) plus read_line and read_all_stdin.
- test_workspace/core_alpha_conf.json: add Bosatsu/IO/Core to exported_packages.
Runtime implementation files
- JVM evaluator: core/src/main/scala/dev/bosatsu/Predef.scala
- Python runtime: test_workspace/ProgExt.py, test_workspace/Prog.bosatsu_externals
- C runtime: new c_runtime/bosatsu_ext_Bosatsu_l_IO_l_Core.c and .h, plus c_runtime/Makefile
Bosatsu API Shape (adapted)
Naming follows current stdlib style (snake_case), except where a type name appears in the identifier (string_to_Path, path_to_String). Bosatsu/IO/Core exports externals directly (no duplicated *_impl wrappers).
package Bosatsu/IO/Core
from Bosatsu/Prog import Prog
from Bosatsu/IO/Error import IOError
export (
  Path,
  path_sep,
  string_to_Path,
  path_to_String,
  path_join,
  path_parent,
  path_file_name,
  Handle,
  Process,
  Instant,
  Duration,
  FileKind(),
  FileStat(),
  OpenMode(),
  Stdio(),
  StdioConfig(),
  SpawnResult(),
  stdin,
  stdout,
  stderr,
  read_text,
  write_text,
  flush,
  close,
  open_file,
  list_dir,
  stat,
  mkdir,
  remove,
  rename,
  get_env,
  spawn,
  wait,
  now_wall,
  now_mono,
  sleep,
)
external path_sep: String
# Opaque in public API because constructor is not exported.
struct Path(to_String: String)
# Lift/projection helpers stay in Bosatsu so callers only depend on Path.
# `None` means the input is not a valid cross-platform path in the
# portable profile described below.
def string_to_Path(s: String) -> Option[Path]
def path_to_String(path: Path) -> String
def path_join(base: Path, child: Path) -> Path
def path_parent(path: Path) -> Option[Path]
def path_file_name(path: Path) -> Option[String]
external struct Handle
external struct Process
# Opaque in public API because constructors are not exported.
struct Instant(epoch_nanos: Int)
struct Duration(to_nanos: Int)
enum FileKind:
  File
  Dir
  Symlink
  Other
struct FileStat(kind: FileKind, size_bytes: Int, mtime: Instant)
enum OpenMode:
  Read
  WriteTruncate
  Append
enum Stdio:
  Inherit
  Pipe
  Null
  UseHandle(handle: Handle)
struct StdioConfig(stdin: Stdio, stdout: Stdio, stderr: Stdio)
struct SpawnResult(
  proc: Process,
  stdin: Option[Handle],
  stdout: Option[Handle],
  stderr: Option[Handle],
)
external stdin: Handle
external stdout: Handle
external stderr: Handle
external def read_text(h: Handle, max_chars: Int) -> Prog[IOError, Option[String]]
external def write_text(h: Handle, s: String) -> Prog[IOError, Unit]
external def flush(h: Handle) -> Prog[IOError, Unit]
external def close(h: Handle) -> Prog[IOError, Unit]
external def open_file(path: Path, mode: OpenMode) -> Prog[IOError, Handle]
external def list_dir(path: Path) -> Prog[IOError, List[Path]]
external def stat(path: Path) -> Prog[IOError, Option[FileStat]]
external def mkdir(path: Path, recursive: Bool) -> Prog[IOError, Unit]
external def remove(path: Path, recursive: Bool) -> Prog[IOError, Unit]
external def rename(from: Path, to: Path) -> Prog[IOError, Unit]
external def get_env(name: String) -> Prog[IOError, Option[String]]
external def spawn(cmd: String, args: List[String], stdio: StdioConfig) -> Prog[IOError, SpawnResult]
external def wait(p: Process) -> Prog[IOError, Int]
external now_wall: Prog[IOError, Instant]
external now_mono: Prog[IOError, Duration]
external def sleep(d: Duration) -> Prog[IOError, Unit]
Path representation tradeoff
- Chosen shape: native struct Path(to_String: String) with hidden constructor, plus path_sep and helper APIs.
- Advantage: callers manipulate Path values without runtime-specific wrapper allocation.
- Cost: runtime implementations still convert Path.to_String to platform-native path objects at IO call boundaries.
- Alternative considered: external struct Path parsed once per value. That can reduce repeated conversion but increases runtime payload complexity and portability risk.
- Follow-up option: if profiling shows conversion overhead, keep the same surface API and switch internals to external struct Path.
Path parsing and cross-platform behavior
- string_to_Path: String -> Option[Path] is intentionally partial because not every String can be used as a path on every target runtime.
- POSIX baseline: pathnames are slash-separated byte sequences; NUL is not allowed, slash is the separator, null pathname is invalid, and exactly two leading slashes have implementation-defined meaning.
- Java baseline (java.nio.file.FileSystem.getPath): parsing is implementation-dependent, uses platform path rules, and throws InvalidPathException for rejected strings (for example, NUL on UNIX).
- Python baseline (pathlib): PurePath parsing is lexical (no filesystem access), while concrete IO operations apply host filesystem validation later.
- To get deterministic behavior across macOS, Linux/Unix, and Windows, string_to_Path uses a Bosatsu-level portable parser instead of delegating directly to host-native parsing.
- Parsing policy:
  - Accept separators / and \ in input; normalize stored Path.to_String to /.
  - Accept roots in these forms: relative (a/b), POSIX absolute (/a/b), Windows drive absolute (C:/a/b), UNC (//server/share/a).
  - Reject Windows drive-relative form (C:tmp/file) because meaning depends on per-drive current directory.
  - Reject Windows device namespace prefixes (\\\\?\\, \\\\.\\) in v1.
  - Reject ambiguous POSIX-like paths that start with exactly // unless they parse as UNC with non-empty server/share components.
  - Reject strings containing NUL (\\u0000) or control characters \\u0001..\\u001F.
  - Reject path components containing Windows-reserved characters < > : " | ? * (except the drive colon in C:/...).
  - Reject Windows reserved device names as components (case-insensitive): CON, PRN, AUX, NUL, COM1..COM9, LPT1..LPT9 (including with extensions like NUL.txt).
  - Reject components with trailing space or trailing dot to avoid Windows shell/API mismatch.
  - Keep . and .. as lexical components; do not resolve symlinks or normalize away .. at parse time.
  - Do not enforce PATH_MAX/NAME_MAX at parse time; those checks remain runtime/filesystem specific.
- path_join(base: Path, child: Path) -> Path stays total because both inputs are already validated Path values.
- Why Option[Path] instead of total String -> Path: parsing failure is expected for non-portable or malformed inputs, and callers can handle None without exceptions in pure code.
- Representative rejected inputs:
  - "" (empty string)
  - "a\u0000b"
  - "C:tmp\\x"
  - "foo/<bar>"
  - "NUL.txt"
  - "dir/ends-with-dot."
- Representative accepted flow:
  - string_to_Path("src") -> Some(child)
  - string_to_Path("/tmp") -> Some(base)
  - path_join(base, child) -> /tmp/src
- path_join, path_parent, and path_file_name are lexical Bosatsu helpers (not externals); runtime-specific path handling is required only at IO call boundaries such as open_file, stat, and list_dir.
- References:
  - POSIX pathname definition and resolution: https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap03.html#tag_03_271, https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap04.html#tag_04_11
  - Java parsing contract (FileSystem.getPath): https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/nio/file/FileSystem.html#getPath(java.lang.String,java.lang.String...)
  - Python pathlib lexical behavior and flavor differences: https://docs.python.org/3/library/pathlib.html
  - Windows naming constraints: https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file
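
The parsing policy above can be sketched in Python, one of the three target runtimes. This is an illustration only, not the Bosatsu implementation: the lowercase names string_to_path and path_join are hypothetical, a validated path is modelled as its normalized string, and three behaviours the policy leaves open are assumptions here (empty components from repeated or trailing slashes are dropped, three or more leading slashes are treated as a single root slash, and path_join expects a relative child).

```python
import re
from typing import Optional

RESERVED_CHARS = set('<>:"|?*')
DEVICE_NAMES = {"CON", "PRN", "AUX", "NUL"} | {
    f"{prefix}{n}" for prefix in ("COM", "LPT") for n in range(1, 10)
}
DRIVE = re.compile(r"[A-Za-z]:")


def valid_component(c: str) -> bool:
    """Check one non-empty path component against the portable rules."""
    if c in (".", ".."):
        return True  # kept as lexical components, never resolved here
    if c[-1] in " .":
        return False  # trailing space or trailing dot
    if any(ch in RESERVED_CHARS for ch in c):
        return False
    # Device names are rejected even with an extension (NUL.txt).
    return c.split(".")[0].upper() not in DEVICE_NAMES


def string_to_path(s: str) -> Optional[str]:
    """Return the normalized path string, or None if s is not portable."""
    if s == "" or any(ord(ch) < 0x20 for ch in s):
        return None  # empty string, NUL, or another control character
    s = s.replace("\\", "/")
    if s.startswith(("//?/", "//./")):
        return None  # Windows device namespace prefixes
    if DRIVE.match(s):
        if s[2:3] != "/":
            return None  # drive-relative form such as C:tmp/file
        root, rest = s[:3], s[3:]
    elif s.startswith("//") and not s.startswith("///"):
        parts = s[2:].split("/")
        if len(parts) < 2 or not parts[0] or not parts[1]:
            return None  # exactly two leading slashes but not a UNC root
        if not (valid_component(parts[0]) and valid_component(parts[1])):
            return None
        root, rest = f"//{parts[0]}/{parts[1]}/", "/".join(parts[2:])
    elif s.startswith("/"):
        root, rest = "/", s.lstrip("/")  # assumption: extra slashes collapse
    else:
        root, rest = "", s
    # Assumption: empty components are dropped rather than rejected.
    comps = [c for c in rest.split("/") if c != ""]
    if not all(valid_component(c) for c in comps):
        return None
    joined = root + "/".join(comps)
    # Drop the separator left behind by a bare UNC share root.
    return joined.rstrip("/") if root.startswith("//") and not comps else joined


def path_join(base: str, child: str) -> str:
    # Assumption: child is relative; the handling of an absolute child
    # is not specified in the policy.
    return base.rstrip("/") + "/" + child
```

With these definitions, the representative inputs listed above behave as described: each rejected string yields None, and joining the parsed /tmp and src values yields /tmp/src.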
Process termination tradeoff
- This design intentionally omits direct exit from Bosatsu/IO/Core.
- Abrupt process termination from inside arbitrary Prog code makes cleanup and structured error handling less safe.
- Preferred pattern: return status from Main (Prog[err, Int]), or encode early termination in program types such as Prog[IOError, Result[Int, a]].
Mapping from the requested primitives
- stdin -> Bosatsu/IO/Core::stdin
- stdout -> Bosatsu/IO/Core::stdout
- stderr -> Bosatsu/IO/Core::stderr
- readText -> read_text
- writeText -> write_text
- flush -> flush
- close -> close
- openFile -> open_file
- listDir -> list_dir
- stat -> stat
- mkdir -> mkdir
- remove -> remove
- rename -> rename
- argv -> Main(args -> ...) argument list (not duplicated in Bosatsu/IO/Core)
- getEnv -> get_env
- exit -> intentionally omitted; use Main return codes and typed early-termination (Result[Int, a]) instead
- spawn -> spawn
- wait -> wait
- nowWall -> now_wall
- nowMono -> now_mono
- sleep -> sleep
Type mapping in Bosatsu / Predef / core_alpha
- String, Int, Bool, List[a], Option[a], Unit map to existing Bosatsu/Predef builtins.
- Path maps to opaque struct Path(to_String: String) in Bosatsu/IO/Core with helpers (string_to_Path, path_to_String, path_join, path_parent, path_file_name) and external path_sep.
- Int64 sizeBytes maps to Int (Bosatsu Int is arbitrary precision; runtimes convert native 64-bit values).
- Instant maps to opaque struct Instant(epoch_nanos: Int).
- Duration maps to opaque struct Duration(to_nanos: Int).
- FileKind, FileStat, OpenMode, Stdio, StdioConfig, SpawnResult are new enum/struct types in Bosatsu/IO/Core.
- Handle and Process map to opaque external struct types in Bosatsu/IO/Core.
- Program entrypoint args use Bosatsu/Prog::Main(run: List[String] -> forall err. Prog[err, Int]).
- Constructors for Path, Instant, and Duration are intentionally hidden from consumers (type exported, constructor not exported).
Runtime semantics (shared contract)
- read_text returns None only for EOF; otherwise Some(chunk) where chunk length is 1..max_chars.
- read_text with max_chars <= 0 fails with InvalidArgument.
- list_dir returns child paths sorted by path_to_String for deterministic behavior.
- stat returns None for a missing path, Some(FileStat(...)) otherwise.
- stat.kind is Symlink when the path itself is a symlink (lstat-style classification).
- remove(recursive = true) removes directory trees without following symlinks.
- spawn never invokes a shell; cmd + args are executed directly.
- wait is idempotent: once complete, repeated waits return the same exit code.
- Time precision is nanoseconds.
- now_wall returns a wall-clock timestamp value (Instant) whose intended encoding is UNIX epoch nanoseconds.
- now_mono returns a monotonic elapsed-time reading (Duration) and is not affected by wall-clock changes.
- sleep consumes Duration in nanoseconds and fails with InvalidArgument for negative durations.
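
As an illustration of the stat and list_dir items in this contract, here is a minimal Python sketch. It is not the planned ProgExt.py code: the names classify and stat_path are hypothetical, FileStat is modelled as a plain tuple, and paths are plain strings that use / as the separator.

```python
import os
import stat as stat_mod
from typing import List, Optional, Tuple


def classify(mode: int) -> str:
    """Map a st_mode value to a FileKind name."""
    if stat_mod.S_ISLNK(mode):
        return "Symlink"
    if stat_mod.S_ISREG(mode):
        return "File"
    if stat_mod.S_ISDIR(mode):
        return "Dir"
    return "Other"


def stat_path(path: str) -> Optional[Tuple[str, int, int]]:
    """Return (kind, size_bytes, mtime_epoch_nanos), or None if missing."""
    try:
        st = os.lstat(path)  # lstat: a final symlink is not followed
    except FileNotFoundError:
        return None
    return (classify(st.st_mode), st.st_size, st.st_mtime_ns)


def list_dir(path: str) -> List[str]:
    """Return child paths in sorted order so results are deterministic."""
    base = path.rstrip("/")
    return sorted(f"{base}/{name}" for name in os.listdir(path))
```

Because os.lstat is used instead of os.stat, a symlink is reported as Symlink even when its target is a regular file or is missing.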
Bosatsu/IO/Std additions
- Add read_line: Prog[IOError, Option[String]].
- Add read_all_stdin: Prog[IOError, String].
- read_line blocks until newline (\n) or EOF; returns None only when EOF is reached before reading any characters.
- read_line strips trailing line ending (\n or \r\n) from returned text.
- read_all_stdin reads until EOF (stdin stream closed), not merely until “nothing currently available”.
- Both functions are implemented in Bosatsu/IO/Std via repeated Bosatsu/IO/Core::read_text(stdin, chunk_size).
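
The following Python sketch shows one way read_line and read_all_stdin could be layered on a read_text primitive with the EOF contract described above. It is a sketch under assumptions, not the Bosatsu/IO/Std source: errors are modelled as ValueError instead of IOError, read_all is a hypothetical name, and read_line uses a chunk size of 1 so that no characters past the newline are consumed (a buffered reader would be an alternative).

```python
import io
from typing import Optional, TextIO


def read_text(h: TextIO, max_chars: int) -> Optional[str]:
    """None only at EOF; otherwise a chunk of 1..max_chars characters."""
    if max_chars <= 0:
        raise ValueError("InvalidArgument: max_chars must be positive")
    chunk = h.read(max_chars)
    return chunk if chunk else None


def read_line(h: TextIO) -> Optional[str]:
    """Read up to a newline or EOF; None only if EOF arrives first."""
    chars = []
    saw_newline = False
    while True:
        c = read_text(h, 1)  # one character at a time: no over-read
        if c is None:
            if not chars:
                return None  # EOF before any character was read
            break
        if c == "\n":
            saw_newline = True
            break
        chars.append(c)
    line = "".join(chars)
    # Strip "\r" only when it was part of a "\r\n" line ending.
    if saw_newline and line.endswith("\r"):
        line = line[:-1]
    return line


def read_all(h: TextIO, chunk_size: int = 4096) -> str:
    """Keep reading chunks until read_text reports EOF."""
    parts = []
    while (chunk := read_text(h, chunk_size)) is not None:
        parts.append(chunk)
    return "".join(parts)
```

Note that an empty line produces an empty string, not None, so callers can distinguish a blank line from end of input.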
Java/Python/C implementation plan
JVM (Predef.scala)
- Extend jvmExternals with all Bosatsu/IO/Core symbols.
- Add internal runtime objects:
  - HandleValue (stdin, stdout, stderr, file, child pipe read/write)
  - ProcessValue (wrap java.lang.Process, cached exit status)
- Implement IO effects using prog_effect dispatch:
  - Files: java.nio.file.Files + java.io streams
  - Process: ProcessBuilder + redirected streams
  - Env: System.getenv
  - Path conversion: convert Path.to_String to java.nio.file.Path at each filesystem call
  - Wall clock: java.time.Instant.now() converted to epoch nanoseconds
  - Monotonic clock: System.nanoTime()
  - Sleep: Thread.sleep/LockSupport.parkNanos from duration nanoseconds
- Keep existing ProgRunResult testability by preserving capture-mode behavior for stdio when run under evaluator tests.
Python (ProgExt.py + externals map)
- Add runtime classes for Handle and Process wrapper values (Path/Instant/Duration remain Bosatsu struct values).
- Add new ProgExt constructors returning effect(...) thunks for each primitive.
- Implement using stdlib:
  - Files/dirs/stat/remove/rename/path ops: os, pathlib, shutil
  - Spawn/wait/pipes: subprocess.Popen
  - Wall clock: time.time_ns
  - Monotonic clock: time.monotonic_ns
  - Sleep: time.sleep(nanos / 1_000_000_000.0)
  - Env: os.environ.get
- Update test_workspace/Prog.bosatsu_externals for Bosatsu/IO/Core symbol remapping.
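
To make the spawn/wait part of this plan concrete, here is a minimal sketch built on subprocess.Popen. It deliberately leaves out the ProgExt effect(...) wrapping, whose shape is not shown in this document, and it omits the UseHandle case. The Process class, the string constants standing in for Stdio cases, and the tuple standing in for SpawnResult are all hypothetical.

```python
import subprocess
import sys
from typing import List, Optional

# Hypothetical stand-ins for the Stdio enum cases (UseHandle omitted).
INHERIT, PIPE, NULL = "Inherit", "Pipe", "Null"

# Inherit maps to None, which tells Popen to reuse the parent's stream.
STDIO_ARGS = {INHERIT: None, PIPE: subprocess.PIPE, NULL: subprocess.DEVNULL}


class Process:
    """Wraps a Popen object and caches the exit code so wait is idempotent."""

    def __init__(self, popen: subprocess.Popen):
        self._popen = popen
        self._exit_code: Optional[int] = None

    def wait(self) -> int:
        if self._exit_code is None:
            self._exit_code = self._popen.wait()
        return self._exit_code


def spawn(cmd: str, args: List[str],
          stdin: str = INHERIT, stdout: str = INHERIT, stderr: str = INHERIT):
    """Run cmd with args directly (no shell) and return process plus pipes."""
    popen = subprocess.Popen(
        [cmd, *args],
        shell=False,  # cmd + args are executed directly
        stdin=STDIO_ARGS[stdin],
        stdout=STDIO_ARGS[stdout],
        stderr=STDIO_ARGS[stderr],
        text=True,
    )
    # Each pipe end is None unless the matching stream was set to Pipe,
    # mirroring the Option[Handle] fields of SpawnResult.
    return Process(popen), popen.stdin, popen.stdout, popen.stderr
```

A usage example under the same assumptions: spawn(sys.executable, ["-c", "print('hi')"], stdin=NULL, stdout=PIPE) returns a readable stdout pipe and None for the other two handles.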
C (c_runtime)
- Add bosatsu_ext_Bosatsu_l_IO_l_Core.c/.h and include in build/install targets.
- Represent opaque runtime handles/processes as alloc_external(...) payloads (Path/Instant/Duration stay Bosatsu struct values).
- Implement using POSIX APIs first (same approach as current IO/Error errno mapping):
  - Files/dirs/stat/remove/rename/path ops: open/fopen, readdir, lstat, mkdir, unlink/rmdir, rename
  - Spawn/wait/pipes: fork/execvp/pipe/waitpid (or posix_spawn variant)
  - Wall clock: clock_gettime(CLOCK_REALTIME, ...)
  - Monotonic clock: clock_gettime(CLOCK_MONOTONIC, ...)
  - Sleep: nanosleep
  - Env: getenv
- Reuse and extend current errno-to-IOError mapper (c_runtime/bosatsu_ext_Bosatsu_l_IO_l_Core.c) into shared helpers for IO/Core.
Compatibility and migration
- Bosatsu/IO/Std remains source-compatible.
- Existing print/println behavior is preserved.
- New code can choose either high-level IO/Std or low-level IO/Core.
- argv remains available at the Main(args -> ...) boundary.
- Program termination should flow through Main return codes rather than IO/Core::exit.
- This design assumes the post-#1748 Bosatsu/Prog shape (Prog[err, res] with no reader env).
Tests and conformance
- Add package-level evaluator tests in core/src/test/scala/dev/bosatsu/EvaluationTest.scala for each primitive family.
- Add Python transpile + execute tests using Prog.bosatsu_externals mappings.
- Add C transpile + runtime tests linked against c_runtime for file/process/time operations.
- Add cross-runtime parity tests for:
  - EOF semantics
  - error tag parity (IOError variants)
  - spawn stdio pipe behavior
  - wall/monotonic clock semantics
  - sleep minimum behavior
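
As a rough idea of what the clock and sleep parity checks could assert, the sketch below models the three time primitives in Python with plain nanosecond integers. The function check_sleep_minimum is a hypothetical helper, not an existing test, and errors are modelled as ValueError in place of the IOError InvalidArgument case.

```python
import time


def now_wall() -> int:
    """Wall-clock reading as UNIX epoch nanoseconds."""
    return time.time_ns()


def now_mono() -> int:
    """Monotonic reading in nanoseconds; unaffected by wall-clock changes."""
    return time.monotonic_ns()


def sleep(nanos: int) -> None:
    """Sleep for a non-negative duration given in nanoseconds."""
    if nanos < 0:
        raise ValueError("InvalidArgument: negative duration")
    time.sleep(nanos / 1_000_000_000.0)


def check_sleep_minimum(nanos: int) -> bool:
    """True when the monotonic clock advanced by at least nanos."""
    start = now_mono()
    sleep(nanos)
    return now_mono() - start >= nanos
```

A parity test would run the equivalent check on each runtime and compare outcomes; how much tolerance to allow for coarse clock resolution on a given platform is left open here.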
Rollout
- Land this design doc.
- Open issue: add minimal set of IO functions for basic tools.
- Implement Bosatsu/IO/Core + runtime bindings in JVM/Python/C.
- Rebase Bosatsu/IO/Std on IO/Core wrappers and add read_line/read_all_stdin.
- Regenerate core_alpha docs and release a new core_alpha version.