Proposal: Handling Special Values in Infr

Status: Draft / Future Consideration

Date: 2026-03-21

R's Special Values

R has a rich set of special values, each with distinct semantics:

| Value | R Type | Meaning |
| --- | --- | --- |
| `NA` | logical | Generic missing value |
| `NA_real_` | double | Missing numeric |
| `NA_integer_` | integer | Missing integer |
| `NA_character_` | character | Missing character |
| `NA_complex_` | complex | Missing complex |
| `NaN` | double | Not a Number (e.g., `0/0`) |
| `Inf` / `-Inf` | double | Positive/negative infinity |
| `NULL` | NULL | Absence of a value |
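
The table can be checked in any R session. The snippet below is a small sketch using only base R; the variable names are illustrative. It also shows two behaviors that matter later in this proposal: `is.na()` returns TRUE for `NaN`, and `NULL` is a zero-length object rather than a missing element.

```r
# Storage type of each special value, as reported by base R.
special_types <- c(
  "NA"            = typeof(NA),
  "NA_real_"      = typeof(NA_real_),
  "NA_integer_"   = typeof(NA_integer_),
  "NA_character_" = typeof(NA_character_),
  "NA_complex_"   = typeof(NA_complex_),
  "NaN"           = typeof(NaN),
  "Inf"           = typeof(Inf),
  "NULL"          = typeof(NULL)
)
print(special_types)

# NaN counts as missing for is.na(), but NA is not NaN.
nan_is_na <- is.na(NaN)    # TRUE
na_is_nan <- is.nan(NA)    # FALSE

# NULL is the absence of a value: a zero-length object, not an element.
null_length <- length(NULL)   # 0
na_length   <- length(NA)     # 1
```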

Current State in Infr

All special values are fully supported as literals: they are lexed, parsed, type-inferred, and transpiled correctly. The type checker infers each one to a base type. Note that bare NA, which R itself stores as logical, is inferred as numeric:

NA, NA_real_ → numeric
NA_integer_ → integer
NA_character_ → character
NA_complex_ → complex
NaN, Inf → numeric
NULL → null

NULL has first-class type support: nullable types (T? as shorthand for T | NULL) and is.null() type narrowing in conditionals both work today.

NA, NaN, and Inf have no special type-level representation — they are simply members of their base type.

Precedent: TypeScript and JavaScript

TypeScript faced a similar problem with JavaScript's null and undefined. The parallels are instructive:

| R | JavaScript | TypeScript Solution |
| --- | --- | --- |
| `NULL` | `null` | `T \| null` union type |
| `NA` | `undefined` (loosely) | `T \| undefined` union type |
| `is.null()` | `=== null` | Control-flow narrowing |
| `is.na()` | `=== undefined` | Control-flow narrowing |
| `NaN`, `Inf` | `NaN`, `Infinity` | No special type treatment |

What TypeScript did

strictNullChecks (TS 2.0, 2016) — Before this flag, null and undefined were assignable to every type. With it on, they became distinct types requiring explicit unions (string | null). This is widely considered TypeScript's most impactful strictness flag.
Control-flow narrowing — if (x !== null) narrows string | null → string. No new syntax was needed; TypeScript recognized existing JS idioms as type guards.
NaN and Infinity got no special types — They remain number. Number.isNaN() returns boolean but does not narrow. TypeScript decided the complexity wasn't worth it.

Key difference: NA is harder than `undefined`

TypeScript's approach maps well onto Infr's NULL handling (already implemented). But NA is a fundamentally different beast:

undefined doesn't propagate as itself: 1 + undefined produces NaN, a value of type number rather than undefined.
NA propagates silently: 1 + NA produces NA (same type, infectious).
undefined is a property of the variable: a variable is either undefined or it isn't.
NA is a property of individual vector elements: a numeric vector can contain a mix of real values and NAs. This is an element-level concept with no JS equivalent.

This means full NA tracking is genuinely novel territory beyond what TypeScript attempted.
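
Both properties of NA are easy to observe in base R. The following is a minimal illustration with arbitrary variable names; it shows NA surviving arithmetic with its type intact, and a single vector whose elements differ in NA-ness.

```r
# NA propagates through arithmetic and keeps the numeric type.
scalar_result <- 1 + NA_real_
typeof(scalar_result)   # "double"
is.na(scalar_result)    # TRUE

# NA-ness is per element: one vector mixes real values and NA.
x <- c(1, NA, 3)
element_flags <- is.na(x)   # FALSE TRUE FALSE
shifted <- x + 1            # 2 NA 4

# Comparisons propagate too, so `x == NA` cannot be used to test for NA.
compared <- x == NA         # NA NA NA
```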

Proposal: Phased Approach

Phase 1: Document Current Behavior + `na.rm` Lint

Effort: Low | Value: Medium | Priority: Do now

No type-system changes. Two concrete deliverables:

Spec update — Add a "Special Values" section to infr-spec.md formalizing that NA, NaN, Inf are valid literals inferred to their base types.
na.rm diagnostic — In strict mode, warn when calling aggregate functions (mean(), sum(), max(), min(), etc.) without na.rm = TRUE on data that hasn't been explicitly filtered. This is a simple heuristic lint — no new types needed — and it catches one of the most common classes of R bugs.
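
The heuristic can be prototyped in plain R before it is built into the checker. The sketch below is only one possible shape and rests on several assumptions: `find_missing_na_rm` and the `aggregates` list are hypothetical names, it walks R's own parsed expressions rather than Infr's AST, it treats any explicit `na.rm` argument as a deliberate choice (looser than the rule stated above), and it does not try to detect whether the data was filtered beforehand.

```r
# Hypothetical list of aggregates the lint would watch.
aggregates <- c("mean", "sum", "max", "min")

# Return the text of every aggregate call in `expr` that has no na.rm argument.
# Minimal sketch: calls with empty arguments such as x[, 1] are not handled.
find_missing_na_rm <- function(expr) {
  hits <- character(0)
  if (is.call(expr)) {
    fn <- expr[[1]]
    if (is.name(fn) && as.character(fn) %in% aggregates &&
        !("na.rm" %in% names(expr))) {
      hits <- c(hits, deparse(expr))
    }
    # Recurse into the function position and every argument.
    for (part in as.list(expr)) {
      hits <- c(hits, find_missing_na_rm(part))
    }
  }
  hits
}

flagged <- find_missing_na_rm(quote(z <- max(a) + min(b, na.rm = TRUE)))
print(flagged)   # "max(a)"
```

Only `max(a)` is reported, because `min()` already passes `na.rm`. A real diagnostic would also need source locations and a way to suppress the warning on data known to be complete.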

Phase 2A: `is.na()` / `is.nan()` Type Narrowing

Effort: Medium | Value: Medium | Priority: Next

Extend the existing narrowing infrastructure (which handles is.null() today) to recognize is.na(), is.nan(), is.finite(), and is.infinite() as type guards. Initially these would serve as documentation and intent markers without changing inferred types, since NA isn't a separate type yet.

if (!is.na(x)) {
  # Checker recognizes this branch as NA-safe
}

This lays the groundwork for Phase 2B by establishing the control-flow patterns.
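
The four predicates overlap, and any narrowing rules would have to respect that. The base-R snippet below tabulates what each one returns for an ordinary number, NA, NaN, and both infinities; the `values` and `guards` names are illustrative.

```r
values <- c(1, NA, NaN, Inf, -Inf)

# One row per value, one column per predicate.
guards <- data.frame(
  value       = as.character(values),
  is.na       = is.na(values),
  is.nan      = is.nan(values),
  is.finite   = is.finite(values),
  is.infinite = is.infinite(values)
)
print(guards)
```

In particular, `!is.na(x)` rules out NaN as well as NA, while `is.finite(x)` rules out NA, NaN, and both infinities at once.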

Phase 2B: `strictNaChecks` — Explicit NA Types

Effort: High | Value: High | Priority: Design carefully, implement later

Introduce an na type modifier, analogous to how NULL works today. Under a strictNaChecks flag (likely a strictness level in infr.toml):

NA becomes its own type rather than collapsing into numeric.
Vectors that might contain NA are typed as na | numeric (paralleling T | NULL).
is.na() narrows na | numeric → numeric in the false branch.
c(1, NA, 3) infers na | numeric instead of numeric.

x: na | numeric <- c(1, NA, 3)

if (!is.na(x)) {
  x + 1   # OK: x is numeric here
}

x + 1     # Warning: x might be NA

Open design questions:

Element-level vs variable-level: In R, NA is a property of individual vector elements, not the variable. Should na | numeric mean "this vector contains at least one NA" or "this scalar might be NA"? The former is more accurate but much harder to track.
Propagation rules: 1 + NA → NA. Should the checker model this? It would require tracking NA-ness through every arithmetic/comparison operator.
Ergonomics: If every unfiltered data-frame column becomes na | T, annotation burden could be heavy. TypeScript mitigated this with ! (non-null assertion); Infr might need something similar for NA.
Interaction with na.rm: Functions like mean(x, na.rm = TRUE) should strip na from their return type. This requires literal-value overload resolution (dispatching on na.rm = TRUE vs FALSE).
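
The element-level versus variable-level question already has two distinct idioms in base R, which may help frame the design. The sketch below uses illustrative variable names. It also bears on the guard example above: in R 4.2.0 and later, an `if` condition longer than one element is an error, so `if (!is.na(x))` fits most naturally when `x` is a scalar.

```r
x <- c(1, NA, 3)

# Variable-level question: does the vector contain any NA at all?
has_na <- anyNA(x)                # TRUE

# Element-level question: which positions are NA?
na_positions <- which(is.na(x))   # 2

# Filtering gives a vector that is NA-free as a whole.
clean <- x[!is.na(x)]
clean_has_na <- anyNA(clean)      # FALSE

# For a scalar, is.na() yields one logical and fits an `if` condition.
first <- x[1]
if (!is.na(first)) {
  first_plus_one <- first + 1     # 2
}
```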

Phase 3: NA-Aware Function Signatures

Effort: Very High | Value: High | Priority: Future / experimental

If Phase 2B is implemented, declaration files could express NA behavior through overloads:

mean(x: na | numeric, na.rm: FALSE) -> na | numeric
mean(x: na | numeric, na.rm: TRUE) -> numeric
mean(x: numeric) -> numeric

This would enable the checker to trace NA-ness through data pipelines and warn only where it matters — a powerful capability, but one that requires significant investment in overload resolution and propagation tracking.
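
The three signatures correspond to behavior that can be observed in base R today. The snippet below is a small illustration with arbitrary variable names; the last case is an edge that the overloads would need to account for.

```r
with_na    <- c(1, NA, 3)
without_na <- c(1, 2, 3)

keep_na <- mean(with_na, na.rm = FALSE)   # NA
drop_na <- mean(with_na, na.rm = TRUE)    # 2
plain   <- mean(without_na)               # 2

# Edge case: if every element is NA, na.rm = TRUE leaves an empty vector,
# and the mean of an empty vector is NaN, which is.na() also reports as TRUE.
all_missing <- mean(c(NA_real_, NA_real_), na.rm = TRUE)   # NaN
```

Since NaN has no separate type in this proposal, a return type of numeric for the `na.rm: TRUE` overload still holds in the edge case, but a later `is.na()` check on the result could be TRUE.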

Summary

| Phase | Effort | Value | Depends On |
| --- | --- | --- | --- |
| Phase 1: Spec + `na.rm` lint | Low | Medium | Nothing |
| Phase 2A: Narrowing as intent markers | Medium | Medium | Phase 1 |
| Phase 2B: `strictNaChecks` with `na` type | High | High | Phase 2A |
| Phase 3: NA-aware function signatures | Very High | High | Phase 2B |

The key insight from TypeScript's experience: strictNullChecks was transformative but took years to design, ship, and migrate the ecosystem. Infr's NULL handling already mirrors it. strictNaChecks would be breaking genuinely new ground — worth pursuing, but deserving of careful, incremental design.
