feat: add lambda expression support to BAML compiler (#3302)
## What problems was I solving
BAML had no support for lambda expressions (anonymous functions /
closures). Users couldn't write inline callbacks, pass functions to
higher-order methods like `.map()`, or use closures that capture
variables from enclosing scopes. Every transformation required a named
function definition, making code verbose and blocking a broad class of
functional programming patterns — from simple transforms to closures
returned from functions, IIFEs, and nested lambdas with shared mutable
state.
After this PR, lambda expressions are a first-class feature across the
entire compiler stack: they parse, format, type-check with bidirectional
inference, compile to bytecode with cell-based capture semantics, and
execute in the VM with full GC support. Users can write lambdas with
minimal annotation — parameter and return types are inferred from
context when possible.
## What user-facing changes did I ship
- **Lambda expression syntax**: `(params) -> [RetType] { body }` with
optional type annotations, optional return type, optional generic
parameters (`<T>(x: T) -> T { x }`), and optional `throws` clauses
- **Bidirectional type inference**: lambda parameter types inferred from
call context (e.g. `items.map((x) -> { x * 2 })` infers `x: int` from
`Array<int>.map`)
- **Closure capture semantics**: lambdas capture variables by shared
reference via cells — mutations in the closure are visible to the parent
and vice versa
- **Transitive capture propagation**: nested lambdas automatically
capture variables from grandparent+ scopes through intermediate lambdas
- **New container methods**: `Array.map<U>`, `Map.map<U>`,
`Map.map_keys<U>`, `Map.map_values<U>` — higher-order methods that
accept lambda callbacks
- **Error diagnostics**: type mismatches, arity mismatches, missing
param annotations (when no context available), and param type conflicts
are all reported
- **Test projects**: `lambda_basic` (9 functions), `lambda_advanced`
(~33 functions), `lambda_errors` (4 error cases) with full snapshot
coverage across all 8 compiler phases
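The shared-reference capture semantics listed above can be pictured with a small Rust analogy. This is only a sketch: `Rc<RefCell<_>>` stands in for a BAML cell, and `shared_capture_demo` is a hypothetical name, not code from this PR.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Returns (value the parent sees after the closure mutates,
///          value the closure sees after the parent mutates).
fn shared_capture_demo() -> (i64, i64) {
    // The captured local lives in a shared cell instead of a plain slot.
    let counter = Rc::new(RefCell::new(0_i64));

    // The closure holds a second handle to the same cell.
    let captured = Rc::clone(&counter);
    let bump = move || {
        *captured.borrow_mut() += 1;
        *captured.borrow()
    };

    bump(); // closure writes through the cell
    let seen_by_parent = *counter.borrow();

    *counter.borrow_mut() = 10; // parent writes through the same cell
    let seen_by_closure = bump();

    (seen_by_parent, seen_by_closure)
}
```

Both sides observe each other's writes because neither holds a copy of the value, only a handle to the cell.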
## How I implemented it
This was a full-stack feature spanning every compiler layer, implemented
across 18 commits in two tasks: first the frontend (parser → AST → HIR →
TIR + formatter), then the backend (MIR → emit → bytecode → VM → GC).
### 1. Syntax & Parser
- **`syntax_kind.rs`** — Added `LAMBDA_EXPR` node kind to the CST
- **`parser.rs`** — Added `looks_like_lambda()` and
`looks_like_generic_lambda()` disambiguation predicates with depth-aware
paren scanning (up to 64 tokens lookahead). Added `parse_lambda_expr()`
with `parse_lambda_parameter_list()` (optional type annotations unlike
regular function params). Hooked into `parse_primary_expr` for both
`LParen` (regular lambdas) and `Less` (generic lambdas)
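A minimal sketch of the depth-aware paren scan behind a predicate like `looks_like_lambda()` follows. The `Tok` enum and the exact acceptance rule (a matching `)` followed by `->`) are assumptions for illustration; the real parser's token kinds and checks are not shown in this description.

```rust
/// Hypothetical token kinds, not the real lexer's set.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Tok {
    LParen,
    RParen,
    Arrow,
    LBrace,
    Ident,
    Colon,
    Comma,
    Other,
}

/// Lookahead budget quoted in the description above.
const MAX_LOOKAHEAD: usize = 64;

/// Returns true if `tokens[start..]` has the shape `( ... ) -> ...`.
fn looks_like_lambda(tokens: &[Tok], start: usize) -> bool {
    if tokens.get(start) != Some(&Tok::LParen) {
        return false;
    }
    let mut depth = 0usize;
    for (offset, tok) in tokens[start..].iter().take(MAX_LOOKAHEAD).enumerate() {
        match tok {
            Tok::LParen => depth += 1,
            Tok::RParen => {
                depth -= 1;
                if depth == 0 {
                    // Found the paren that closes the opening one:
                    // a lambda must continue with `->`.
                    return tokens.get(start + offset + 1) == Some(&Tok::Arrow);
                }
            }
            _ => {}
        }
    }
    // Budget exhausted or input ended before the parens balanced.
    false
}
```

Tracking depth is what keeps a nested group such as `((a), b) -> ...` from being cut off at the first `)`.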
### 2. AST Lowering
- **`ast.rs`** — Added `Expr::Lambda(Box<FunctionDef>)` variant reusing
the existing `FunctionDef` struct with synthetic name `"<anonymous
function>"`
- **`lower_expr_body.rs`** — Added `lower_lambda_expr` with a fresh
`LoweringContext` for the lambda's own `ExprBody` arena. Made
`lower_params` public via `lower_cst.rs`
### 3. HIR Scope Registration & Capture Analysis
- **`hir/builder.rs`** — Added pass that pushes `ScopeKind::Lambda`,
registers params, and recursively walks the lambda's own `ExprBody`.
`ScopeBindings.captures` records which parent-scope variables each
lambda references. `ScopeBindings.captured_names` on function scopes
marks which locals are captured by any descendant lambda — the
foundation for cell wrapping in MIR
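The capture sets recorded here are, in essence, free-variable sets. The sketch below computes them over a hypothetical miniature expression tree (`Expr`, `free_vars`, and `nested_capture_demo` are illustrative names); the actual HIR builder walks scopes rather than a tree like this.

```rust
use std::collections::BTreeSet;

/// Hypothetical miniature expression tree: only what capture analysis needs.
enum Expr {
    Var(&'static str),
    Add(Box<Expr>, Box<Expr>),
    Lambda { params: Vec<&'static str>, body: Box<Expr> },
}

/// Names referenced by `expr` that are not bound inside it.
/// For a lambda, this is its capture set.
fn free_vars(expr: &Expr) -> BTreeSet<&'static str> {
    match expr {
        Expr::Var(name) => BTreeSet::from([*name]),
        Expr::Add(lhs, rhs) => {
            let mut names = free_vars(lhs);
            names.extend(free_vars(rhs));
            names
        }
        Expr::Lambda { params, body } => {
            // Whatever an inner lambda needs and this lambda does not bind
            // becomes a capture of this lambda too (transitive capture).
            let mut names = free_vars(body);
            for param in params {
                names.remove(param);
            }
            names
        }
    }
}

/// Models `(x) -> { (y) -> { x + y + base } }`.
/// Returns (captures of the outer lambda, captures of the inner lambda).
fn nested_capture_demo() -> (Vec<&'static str>, Vec<&'static str>) {
    let inner = Expr::Lambda {
        params: vec!["y"],
        body: Box::new(Expr::Add(
            Box::new(Expr::Add(Box::new(Expr::Var("x")), Box::new(Expr::Var("y")))),
            Box::new(Expr::Var("base")),
        )),
    };
    let inner_captures: Vec<_> = free_vars(&inner).into_iter().collect();
    let outer = Expr::Lambda { params: vec!["x"], body: Box::new(inner) };
    let outer_captures: Vec<_> = free_vars(&outer).into_iter().collect();
    (outer_captures, inner_captures)
}
```

In this model the enclosing function's `captured_names` would be the union of its lambdas' capture sets, restricted to the function's own locals.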
### 4. TIR Type Inference
- **`tir/builder.rs`** — Five major additions:
1. **`infer_expr` (synthesis)**: Lambda with no expected type — requires
annotated params, infers return from body
2. **`check_expr` (checking)**: Lambda with expected `Ty::Function` —
decomposes expected type for bidirectional param/return inference (the
key to `items.map((x) -> { x * 2 })`)
3. **`infer_lambda_body` helper**: Save/restore approach for `locals`,
`declared_types`, `return_ty`, `generic_params`, and `expressions` (to
prevent ExprId collisions between lambda and parent arenas). Lambda
params are seeded on top of parent locals so captures work naturally
4. **Two-pass generic call inference**: Non-lambda args are inferred
first to bind type vars, then lambda args are checked with resolved
bindings (e.g. `apply((x) -> { x * 2 }, 21)` binds `T=int` from `21`
before checking the lambda)
5. **Method-level generic resolution**: Fixed `resolve_member_access` to
include method-level generics in bindings so `items.map<U>(...)`
resolves correctly
- **`infer_context.rs`** — Added `CannotInferLambdaParamType` diagnostic
variant
- **`normalize.rs`** — Fixed `Optional`/`Union` subtyping to work in
both directions
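The two-pass generic call inference described above can be sketched in a toy type system. Everything here is an assumption made for illustration: the `Ty`/`Arg` model, the `infer_call` helper, and the signature `apply<T, U>(f: (T) -> U, x: T) -> U` (the description does not give `apply`'s real signature). A lambda is modeled as a function from resolved parameter types to its body type.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Int,
    Var(&'static str),
    Fn(Vec<Ty>, Box<Ty>),
}

enum Arg {
    /// Non-lambda argument whose type can be synthesized directly.
    Value(Ty),
    /// Unannotated lambda: maps resolved parameter types to its body type.
    Lambda(fn(&[Ty]) -> Ty),
}

type Bindings = HashMap<&'static str, Ty>;

fn substitute(ty: &Ty, bindings: &Bindings) -> Ty {
    match ty {
        Ty::Var(name) => bindings.get(name).cloned().unwrap_or_else(|| ty.clone()),
        Ty::Fn(params, ret) => Ty::Fn(
            params.iter().map(|p| substitute(p, bindings)).collect(),
            Box::new(substitute(ret, bindings)),
        ),
        Ty::Int => Ty::Int,
    }
}

fn infer_call(params: &[Ty], ret: &Ty, args: &[Arg]) -> Result<Ty, String> {
    let mut bindings = Bindings::new();

    // Pass 1: non-lambda arguments bind type variables.
    for (param, arg) in params.iter().zip(args) {
        if let (Ty::Var(name), Arg::Value(actual)) = (param, arg) {
            bindings.insert(*name, actual.clone());
        }
    }

    // Pass 2: lambdas are checked against the now-resolved parameter types.
    for (param, arg) in params.iter().zip(args) {
        if let (Ty::Fn(lambda_params, lambda_ret), Arg::Lambda(body_type)) = (param, arg) {
            let resolved: Vec<Ty> =
                lambda_params.iter().map(|p| substitute(p, &bindings)).collect();
            if resolved.iter().any(|t| matches!(t, Ty::Var(_))) {
                return Err("cannot infer lambda parameter type".to_string());
            }
            let actual_ret = body_type(&resolved);
            match substitute(lambda_ret, &bindings) {
                // Unbound return variable: the lambda body decides it.
                Ty::Var(name) => {
                    bindings.insert(name, actual_ret);
                }
                expected if expected != actual_ret => {
                    return Err(format!("expected {:?}, found {:?}", expected, actual_ret));
                }
                _ => {}
            }
        }
    }
    Ok(substitute(ret, &bindings))
}

/// Body type of `(x) -> { x * 2 }` in this toy model: same type as `x`.
fn times_two_body(params: &[Ty]) -> Ty {
    params[0].clone()
}

/// Models `apply((x) -> { x * 2 }, 21)` under the assumed signature.
fn apply_demo() -> Result<Ty, String> {
    let params = [Ty::Fn(vec![Ty::Var("T")], Box::new(Ty::Var("U"))), Ty::Var("T")];
    infer_call(&params, &Ty::Var("U"), &[Arg::Lambda(times_two_body), Arg::Value(Ty::Int)])
}
```

Although the lambda comes first in the argument list, it is only checked after `21` has bound `T`, which is the point of deferring lambda arguments to a second pass. With no non-lambda argument to bind `T`, the sketch reports the analogue of `CannotInferLambdaParamType`.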
### 5. Formatter
- **`expressions.rs`** — Added `LambdaExpr`, `GenericParamList`,
`ThrowsClause` structs with full `FromCST`/`Printable` impls,
dispatching on `LAMBDA_EXPR` CST node
- **`tokens.rs`** / **`lib.rs`** — Supporting token and top-level
dispatch additions
### 6. Container Builtins
- **`containers.baml`** — Added `Array.map<U>(self, f: (T) -> U) ->
U[]`, `Map.map<U>(self, f: (K, V) -> U) -> U[]`, `Map.map_keys<U>`,
`Map.map_values<U>`
- **`array.rs`** / **`map.rs`** — VM implementations that invoke
closures via `call_indirect`
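As a rough Rust analogue of the two signatures spelled out above, the sketch below mirrors `Array.map<U>` and `Map.map<U>`. The helper names `array_map` and `map_map` are hypothetical, and the use of `BTreeMap` (sorted iteration) is an assumption; the description does not state BAML's map iteration order.

```rust
use std::collections::BTreeMap;

/// Analogue of `Array.map<U>(self, f: (T) -> U) -> U[]`.
fn array_map<T, U>(items: &[T], f: impl Fn(&T) -> U) -> Vec<U> {
    items.iter().map(f).collect()
}

/// Analogue of `Map.map<U>(self, f: (K, V) -> U) -> U[]`.
/// Note that the result is an array, not another map.
fn map_map<K: Ord, V, U>(entries: &BTreeMap<K, V>, f: impl Fn(&K, &V) -> U) -> Vec<U> {
    entries.iter().map(|(key, value)| f(key, value)).collect()
}

fn container_map_demo() -> (Vec<i32>, Vec<String>) {
    let doubled = array_map(&[1, 2, 3], |x| x * 2);

    let mut scores = BTreeMap::new();
    scores.insert("a", 1);
    scores.insert("b", 2);
    let labels = map_map(&scores, |key, value| format!("{}={}", key, value));

    (doubled, labels)
}
```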
### 7. VM Structural Foundation
- **`types.rs`** — `Object::Closure { function: HeapPtr, captures:
Vec<Value> }` and `Object::Cell { value: Value }` with full `Display`,
`ObjectType`, and `value_type_tag` support
- **`bytecode.rs`** — 7 new instructions: `MakeClosure`, `MakeCell`,
`LoadDeref`, `StoreDeref`, `LoadCapture`, `StoreCapture`, and
`CaptureRef` (7th not in original plan — needed for forwarding cell
pointers to inner closures)
- **`gc.rs`** — `add_references_to_worklist` and
`fixup_object_references` trace Closure/Cell objects
- **`vm.rs`** — `collect_frame_roots` and `apply_frame_forwarding`
ensure frame function pointers survive GC relocation. Execution handlers
for all 7 instructions. `resolve_callable_target`, `load_function`,
`execute_call_from_locals_offset`, and `allocate_real_locals_for_frame`
all extended for `Object::Closure`
- **`lib.rs` (engine)** — Frame root collection and forwarding in GC
cycle
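To make the roles of the cell and capture instructions concrete, here is a toy heap with one method per instruction. The `Heap` type, the method names, and the reading of `LoadCapture`/`StoreCapture` as "go through the cell" are assumptions inferred from this description, not the VM's actual code; the `function` pointers are placeholders that are never dereferenced.

```rust
type HeapPtr = usize;

#[derive(Clone, Copy, Debug, PartialEq)]
enum Value {
    Int(i64),
    Ptr(HeapPtr),
}

#[allow(dead_code)] // `function` is stored but never read in this sketch
enum Object {
    Cell { value: Value },
    Closure { function: HeapPtr, captures: Vec<Value> },
}

struct Heap {
    objects: Vec<Object>,
}

impl Heap {
    fn alloc(&mut self, object: Object) -> HeapPtr {
        self.objects.push(object);
        self.objects.len() - 1
    }

    fn cell_ptr(&self, value: Value) -> HeapPtr {
        match value {
            Value::Ptr(ptr) if matches!(self.objects[ptr], Object::Cell { .. }) => ptr,
            other => panic!("expected a cell pointer, found {:?}", other),
        }
    }

    /// MakeCell: box a value so it can be shared.
    fn make_cell(&mut self, value: Value) -> Value {
        Value::Ptr(self.alloc(Object::Cell { value }))
    }

    /// LoadDeref: read through a cell pointer.
    fn load_deref(&self, cell: Value) -> Value {
        match &self.objects[self.cell_ptr(cell)] {
            Object::Cell { value } => *value,
            Object::Closure { .. } => unreachable!(),
        }
    }

    /// StoreDeref: write through a cell pointer.
    fn store_deref(&mut self, cell: Value, new_value: Value) {
        let ptr = self.cell_ptr(cell);
        self.objects[ptr] = Object::Cell { value: new_value };
    }

    /// MakeClosure: pair a function with its captured cell pointers.
    fn make_closure(&mut self, function: HeapPtr, captures: Vec<Value>) -> HeapPtr {
        self.alloc(Object::Closure { function, captures })
    }

    /// CaptureRef: the raw cell pointer, without reading through it.
    fn capture_ref(&self, closure: HeapPtr, idx: usize) -> Value {
        match &self.objects[closure] {
            Object::Closure { captures, .. } => captures[idx],
            Object::Cell { .. } => panic!("expected a closure"),
        }
    }

    /// LoadCapture: read the value inside the captured cell.
    fn load_capture(&self, closure: HeapPtr, idx: usize) -> Value {
        self.load_deref(self.capture_ref(closure, idx))
    }

    /// StoreCapture: write the value inside the captured cell.
    fn store_capture(&mut self, closure: HeapPtr, idx: usize, new_value: Value) {
        let cell = self.capture_ref(closure, idx);
        self.store_deref(cell, new_value);
    }
}

/// Parent local -> outer closure -> inner closure, all sharing one cell.
fn transitive_capture_demo() -> (Value, Value) {
    let mut heap = Heap { objects: Vec::new() };
    let n_cell = heap.make_cell(Value::Int(1));
    let outer = heap.make_closure(0, vec![n_cell]);
    // Forward the same cell to the inner closure instead of copying its value.
    let forwarded = heap.capture_ref(outer, 0);
    let inner = heap.make_closure(1, vec![forwarded]);
    heap.store_capture(inner, 0, Value::Int(42));
    (heap.load_deref(n_cell), heap.load_capture(outer, 0))
}
```

Had the inner closure been built from `load_capture` instead of `capture_ref`, it would have received the integer rather than the cell, and its write could not reach the parent.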
### 8. MIR IR & Lowering
- **`ir.rs`** — `Rvalue::MakeClosure { lambda_idx, captures }`,
`Place::Capture(idx)`, `MirFunction.lambdas: Vec<MirFunction>`,
`LocalDecl.is_captured: bool`
- **`lower.rs`** — Key migration: `expr_types` key changed from
`AstExprId` to `(FileScopeId, AstExprId)` with `current_scope` tracking,
eliminating ExprId collisions. Added `lower_lambda` method with full
save/restore of parent state (builder, body, source_map, locals,
exit_block, loop/catch context, pending_lambdas, capture_indices). HIR
`captured_names` → `is_captured` flag. `capture_indices:
Option<HashMap<Name, usize>>` for lambda body variable resolution.
`transitive_captures_needed: Vec<Name>` for nested lambda propagation
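The reason for the `expr_types` key migration can be shown in a few lines. In this sketch the ID types are plain integers and the type strings are placeholders; the point is only that two arenas which both start numbering at 0 collide under an `AstExprId` key and stay distinct under `(FileScopeId, AstExprId)`.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the compiler's ID types.
type AstExprId = u32;
type FileScopeId = u32;

/// Returns (entries kept with the old key, entries kept with the new key).
fn key_collision_demo() -> (usize, usize) {
    let parent_scope: FileScopeId = 0;
    let lambda_scope: FileScopeId = 1;

    // Old scheme: the lambda's expression 0 overwrites the parent's expression 0.
    let mut by_expr_only: HashMap<AstExprId, &str> = HashMap::new();
    by_expr_only.insert(0, "int[]");
    by_expr_only.insert(0, "int");

    // New scheme: the scope component keeps the two arenas apart.
    let mut by_scope_and_expr: HashMap<(FileScopeId, AstExprId), &str> = HashMap::new();
    by_scope_and_expr.insert((parent_scope, 0), "int[]");
    by_scope_and_expr.insert((lambda_scope, 0), "int");

    (by_expr_only.len(), by_scope_and_expr.len())
}
```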
### 9. Emit Layer
- **`pull_semantics.rs`** — `make_closure()`, `load_capture()`,
`store_capture_value()` trait methods + `walk_rvalue_pull` and
`walk_place_pull` arms
- **`emit.rs`** — `lambda_object_indices`, `lambda_names`,
`captured_locals`, `loading_for_closure_capture` fields on
`StackifyCodegen`. Cell-wrapping preamble: `LoadVar → MakeCell →
StoreVar` for each `is_captured` local. Parent-side deref:
`LoadDeref`/`StoreDeref` instead of `LoadVar`/`StoreVar`. Capture
operand loading: `LoadVar` (bypassing deref) when
`loading_for_closure_capture` is set. Lambda body: `Place::Capture(idx)`
→ `LoadCapture(idx)` / `StoreCapture(idx)`. Transitive forwarding:
`CaptureRef(idx)` via `load_capture` when `loading_for_closure_capture`
is set
- **`lib.rs` (emit)** — `compile_lambdas_flat()` recursively compiles
lambda `MirFunction`s into bytecode `Function` objects, registering them
in `program.objects`. `lower_let_body` changed to return lambdas
alongside the body
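The cell-wrapping preamble and the choice between `LoadVar` and `LoadDeref` can be sketched as a tiny emitter. The `Instr` operands (a local slot index on `LoadDeref`/`StoreDeref`) and the helper names are assumptions; the description names the instructions but not their encoding.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Instr {
    LoadVar(usize),
    StoreVar(usize),
    MakeCell,
    LoadDeref(usize),
    StoreDeref(usize),
}

struct LocalDecl {
    is_captured: bool,
}

/// Preamble: wrap every captured local in a cell, in place.
fn emit_preamble(locals: &[LocalDecl]) -> Vec<Instr> {
    let mut code = Vec::new();
    for (slot, local) in locals.iter().enumerate() {
        if local.is_captured {
            code.extend([Instr::LoadVar(slot), Instr::MakeCell, Instr::StoreVar(slot)]);
        }
    }
    code
}

/// Ordinary reads of a captured local go through the cell; when the operand
/// is being loaded as a closure capture, the cell pointer itself is wanted.
fn emit_load(locals: &[LocalDecl], slot: usize, loading_for_closure_capture: bool) -> Instr {
    if locals[slot].is_captured && !loading_for_closure_capture {
        Instr::LoadDeref(slot)
    } else {
        Instr::LoadVar(slot)
    }
}

/// Writes to a captured local go through the cell as well.
fn emit_store(locals: &[LocalDecl], slot: usize) -> Instr {
    if locals[slot].is_captured {
        Instr::StoreDeref(slot)
    } else {
        Instr::StoreVar(slot)
    }
}
```

For locals `[plain, captured]`, the preamble touches only slot 1, and a read of slot 1 becomes `LoadDeref(1)` unless it feeds a `MakeClosure` capture list.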
### 10. Tests
- **`compiler2_tir/mod.rs`** — 216+ lines of TIR snapshot
infrastructure: `format_lambda_signature`, `render_expr_body_untyped`,
lambda capture annotations, compound argument expansion
- **52+ new snapshot files** across lexer, parser, HIR, TIR, MIR,
diagnostics, codegen, and formatter phases for all three lambda test
projects
- **Existing snapshot updates** — Minor codegen snapshot changes in 8
existing test projects (format_checks, control_flow, etc.) due to new
container methods
## Deviations from the plans
This feature was implemented across two plans:
- **Plan A** ("Lambda Expression Support"): parser → AST → HIR → TIR +
formatter
- **Plan B** ("Lambda Closure Compilation and Runtime Support"): MIR →
emit → VM → GC
### Implemented as planned
- **Plan A**: Test projects, parser with disambiguation,
`Expr::Lambda(Box<FunctionDef>)`, HIR lambda scope pass, formatter
`LambdaExpr` — all match plan
- **Plan B**: Phase 1 structural foundation (VM types, GC, bytecode, MIR
IR), Phase 2 ExprId fix + emit infrastructure, Phase 3 non-capturing
lambda lowering, Phase 4 capturing lambdas with cells, Phase 5
transitive captures — all match plan
### Deviations/surprises
- **Save/restore vs nested builder (Plan A)**: Plan A discussed both
approaches and preferred nested builder for TIR. Implementation used the
simpler save/restore approach, noting ExprId collisions as fixable (and
Plan B fixed them via the `(FileScopeId, AstExprId)` key migration)
- **Two-pass call inference (Plan A)**: Plan A specified a single pass
with a `contains_typevar` check. Implementation went further with a true
two-pass approach (non-lambda args first, then lambda args) for better
generic resolution
- **Method-level generics fix (Plan A)**: Plan A said
`resolve_member_access` needed "no changes." Implementation found and
fixed a bug where method-level generic params (like `<U>` on
`Array.map<U>`) weren't being added to bindings
- **Optional/Union subtyping fix (Plan A)**: Plan A said `normalize.rs`
needed "no changes." A bug was found and fixed where `Optional<T>`
wasn't recognized as a subtype of `Union` types containing `T` and
`Null`
- **7th bytecode instruction `CaptureRef` (Plan B)**: Plan B specified 6
instructions. Implementation added a 7th — `CaptureRef` pushes the raw
cell pointer from a closure's captures without reading through the cell.
Necessary for forwarding captured cells to inner closures during
transitive capture propagation
- **`loading_for_closure_capture` flag (Plan B)**: Emit layer uses a
boolean on `StackifyCodegen` to distinguish "load for closure capture"
(cell pointer) vs "load for use" (cell value). Plan B mentioned the
concept but didn't detail this mechanism
- **`lower_let_body` returns lambdas (Plan B)**: Plan B didn't account
for lambdas inside let-binding initializers. Implementation changed
`lower_let_body` to return `Option<(MirFunctionBody, Vec<MirFunction>)>`
### Additions not in plans
- **`lambda_advanced` test project**: 334-line comprehensive test suite
exercising generic inference, chained maps, nested generics,
optional/union patterns, shadowing, and complex composition
- **Builtin `map`/`map_keys`/`map_values` methods**: Added to `Array<T>`
and `Map<K, V>` in `containers.baml` with VM implementations — needed to
test real lambda-as-argument patterns
- **216+ lines of TIR test infrastructure**: Dedicated snapshot
rendering for lambda signatures, lambda bodies, capture annotations, and
compound argument expansion
- **LSP and onionskin exhaustive match updates**: `check.rs` and
`compiler.rs` needed `Expr::Lambda` arms
### Items planned but not implemented
- **FFI/codegen for external targets** (Python, TypeScript) — out of
scope
- **PPIR streaming support for closures** — out of scope
- **Closure inlining/devirtualization** — out of scope
- **GC optimizations** (escape analysis, open/closed upvalue
transitions) — out of scope
- **Recursive closures** (closures that capture themselves) — out of
scope
- **Expression-body lambdas** (without braces) — out of scope
- **Default lambda params** — out of scope
## How to verify it
### Setup
```bash
git fetch
git checkout vbv/add-lambda-expression-support-to-baml-compiler
```
### Automated Tests
```bash
# Full test suite
cargo test -p baml_tests
# Lambda-specific tests
cargo test -p baml_tests -- lambda_basic
cargo test -p baml_tests -- lambda_advanced
cargo test -p baml_tests -- lambda_errors
# Component tests
cargo test -p bex_vm
cargo test -p baml_compiler_parser
cargo test -p baml_compiler2_tir
```
### Manual Testing
- [ ] Review `lambda_basic.baml` — verify each lambda pattern
(annotated, inferred, zero-param, capture, nested, generic, multi-param)
- [ ] Review `lambda_advanced.baml` — verify complex patterns (chained
maps, nested captures, higher-order return, IIFEs, shadowing)
- [ ] Review `lambda_errors.baml` — verify each error case produces
clear diagnostics
- [ ] Review TIR snapshots — verify correct inferred types for all
lambda expressions
- [ ] Review MIR snapshots — verify `[captured]` annotations and
`make_closure` rvalues with capture operands
- [ ] Review codegen snapshots — verify cell-wrapping preamble,
`LoadDeref`/`StoreDeref`, `LoadCapture`/`StoreCapture`, `CaptureRef`,
`MakeClosure` instructions
## Description for the changelog
Add lambda expression support to BAML: anonymous functions with
`(params) -> { body }` syntax, bidirectional type inference, cell-based
closure captures with shared mutation semantics, transitive capture
propagation for nested lambdas, and higher-order container methods
(`Array.map`, `Map.map`, `Map.map_keys`, `Map.map_values`).
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
  * Lambda expressions: generics, parameter/return annotations, optional
    throws, block bodies, closures with captured variables.
  * Container mapping helpers: Array.map, Map.map, Map.map_keys,
    Map.map_values.
* **Bug Fixes / Diagnostics**
  * New diagnostic for uninferrable lambda parameter types.
* **Tests**
  * Extensive new unit/integration tests exercising lambdas, captures, map
    usage, and error cases.
* **Runtime / Tooling**
  * VM, bytecode, GC, formatter and diagnostic integrations updated to
    support closures, capture cells, and new bytecode.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>