Compiler Leveraging Type Inhabitedness

Status: proposed
Date: 2026-02-19 Issue: https://github.com/johnynek/bosatsu/issues/1724

Goal

Use type inhabitedness in the compiler to:

identify uninhabited types with explicit, complete rules (in a supported fragment),
improve match totality by excluding impossible constructors,
let users omit uninhabited branch cases,
prune unreachable branches before codegen.

Motivation

We already rely on totality and strong typing, but we do not currently leverage full type inhabitedness information during match checking.

Examples we want to classify statically:

forall a. a is uninhabited.
struct Nothing(never: Nothing) is uninhabited.
A struct with at least one uninhabited field is uninhabited.
An enum where every constructor is uninhabited is uninhabited.
A function type returning an uninhabited type is uninhabited when all its arguments are inhabited.

Current State

Package.inferBodyUnopt runs TotalityCheck on untyped Expr before rank-n inference.
This cannot use final instantiated scrutinee types (for example Option[NothingLike]) to remove impossible constructors during coverage.

Scope and Completeness

Full inhabitation for unrestricted higher-rank/higher-kinded polymorphism is not a practical target here.
This design defines a complete decision procedure for a supported fragment and a conservative fallback outside it.

Supported complete fragment:

kind-Type (value) types,
closed or locally-quantified types built from:
ADTs from TypeEnv,
function types,
forall / exists over kind-Type variables.

Outside the fragment (for example higher-kinded quantifier-heavy cases), analysis returns Unknown and totality behavior remains conservative (no new pruning).

Formal Rules

Let I(t) mean “type t is inhabited”, U(t) mean “type t is uninhabited”.

Base and constructors

Base literal types (Int, String, Char, Float64, Bool, Unit) are inhabited.
For constructor C : τ1, ..., τn -> T:
I(C) iff I(τ1) && ... && I(τn).
For struct type T (single constructor):
U(T) iff at least one field type is uninhabited.
For enum type T:
U(T) iff every constructor is uninhabited.

Functions

For Fn(τ1, ..., τn) -> ρ:

U(Fn(...)->ρ) iff U(ρ) and all arguments are inhabited.
Equivalent inhabited form:
I(Fn(...)->ρ) iff I(ρ) or at least one argument is uninhabited.

This captures the requested rule:

returning uninhabited type + all args inhabited => uninhabited,
returning uninhabited type + some arg uninhabited => inhabited (vacuous function).

Quantifiers

For kind-Type quantifiers:

I(forall a. t) iff for all assignments of a, I(t).
U(forall a. t) iff there exists an assignment of a with U(t).
I(exists a. t) iff there exists an assignment of a with I(t).
U(exists a. t) iff for all assignments of a, U(t).

This yields:

U(forall a. a) (pick an uninhabited assignment),
I(forall a. a -> a) (identity),
I(forall a. List[a]) (empty list constructor),
U(forall a. (a, a)) if tuple construction requires I(a).

Recursive ADTs

For mutually recursive type constants, define equations:

I(T(args)) = OR over constructors c of T: (exists c-existentials. AND over c-fields fi: I(fi))

Solve as least fixed point over the finite equation graph for the query.
Least fixed point gives expected inductive behavior:

struct Nothing(never: Nothing) => uninhabited,
recursion only through impossible constructor paths stays uninhabited.

Algorithm Design

New module: core/src/main/scala/dev/bosatsu/Inhabitedness.scala

Core API:

verdict(t: Type, env: TypeEnv[Kind.Arg]): Inhabitedness.Verdict
constructorReachable(scrutinee: Type, cons: (PackageName, Constructor), env): Reachability

Proposed result types:

Inhabited
Uninhabited
Unknown (outside supported complete fragment)

Implementation strategy:

Normalize type (Type.normalize), split quantifiers, and classify by shape.
Build/evaluate equations with memoization by (Type, quantifier context).
Use finite quantifier elimination for kind-Type vars (boolean inhabited/uninhabited assignments).
Solve recursive SCCs with fixed-point iteration.
If unsupported kind-level features are encountered, return Unknown.

Using Inhabitedness in Matches

Introduce typed totality checking that uses scrutinee type + constructor reachability.

Coverage domain

For match scrutinee type S, define:

reachableConstructors(S) = constructors of S whose argument bundle is inhabited under current type instantiation.
Coverage requires only these constructors.
Constructors proven uninhabited are removed from missing-branch requirements.

Effects

Users may omit branches for uninhabited constructors.
Branches that explicitly match uninhabited constructors are reported as unreachable.
Pattern-set operations still work, but domain expansion (topFor) filters out unreachable constructors.

Pipeline Changes

Current:

SourceConverter -> TotalityCheck on Expr -> Infer -> TypedExprNormalization

Proposed:

SourceConverter -> Infer -> TypedTotalityCheck (inhabitedness-aware) -> TypedExprNormalization

Details:

Keep cheap source-level pattern-shape validation where helpful.
Move exhaustiveness/unreachable match checks to typed AST.
Package.inferBodyUnopt should run typed totality after successful inference.
Diagnostics stay at source regions via typed tags already attached to TypedExpr.

Match Pruning and Codegen

Branch pruning should run after typed totality (or as part of normalization):

Remove branches that are impossible from inhabitedness facts.
Keep source order of remaining branches.
Preserve guard semantics (guarded branches still do not count toward totality unless unguarded fallback exists).

Codegen benefits:

smaller match matrices,
fewer runtime constructor checks,
fewer dead branches emitted in Matchless IR.

Edge Cases

Polymorphic matches with free kind-Type vars:
prune only when constructor is unreachable for all assignments.
Outside supported fragment:
Unknown means no pruning and existing conservative totality behavior.
If a scrutinee type itself is uninhabited:
every branch is unreachable,
totality is vacuously satisfied.
Optional future extension: allow empty match syntax explicitly.

Test Plan

Add tests across TotalityCheck replacement + inference + normalization:

forall a. a classified uninhabited.
Recursive self-field struct classified uninhabited.
Struct with one uninhabited field classified uninhabited.
Enum with one inhabited and one uninhabited constructor:
missing inhabited constructor still errors,
missing uninhabited constructor does not error.
Explicit branch on uninhabited constructor is unreachable.
Function rule:
Unit -> Nothing uninhabited,
Nothing -> Nothing inhabited.
Guard interactions:
guarded branch still excluded from totality coverage as today.
Regression tests for existing match behavior when no inhabitedness facts apply.

Rollout Plan

Phase 1: add Inhabitedness module + unit tests.
Phase 2: add typed totality checker and wire into Package.inferBodyUnopt.
Phase 3: apply branch pruning in normalization/Matchless lowering.
Phase 4 (optional): syntax/AST support for empty matches on provably uninhabited scrutinees.

Trade-offs

Pros:

Better exhaustiveness precision.
Cleaner user code (no mandatory impossible cases).
Better generated code via dead-branch pruning.

Costs:

More compiler complexity (new analysis + typed totality phase).
Some cases remain conservative (Unknown) by design.
Totality diagnostics now depend on successful type inference.

The source code for this page can be found here.

Next: Pattern Guards Design