nw-tlaplus-verification
TLA+ formal verification for design correctness and PBT pipeline integration
Best use case
nw-tlaplus-verification is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
TLA+ formal verification for design correctness and PBT pipeline integration
Teams using nw-tlaplus-verification should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/nw-tlaplus-verification/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How nw-tlaplus-verification Compares
| Feature / Agent | nw-tlaplus-verification | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
TLA+ formal verification for design correctness and PBT pipeline integration
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# TLA+ / Formal Verification
When and how to use TLA+ for design verification. Complements PBT (which verifies implementation).
## Decision Tree: When to Use TLA+ vs PBT vs Both
```
Is the risk in the DESIGN or the IMPLEMENTATION?
|
+-- Design risk (protocol correctness, distributed coordination, concurrency)
| -> Does the system involve concurrent or distributed state?
| Yes -> Use TLA+ for design verification
| Then use PBT to verify implementation matches design
| No -> PBT alone is likely sufficient
|
+-- Implementation risk (edge cases, serialization, data transforms)
| -> Use PBT alone
|
+-- Both
-> TLA+ validates design, PBT validates implementation
```
### Use TLA+ When:
- Design bug would cause data loss or significant customer impact
- System involves concurrent or distributed state manipulation
- Subtle interactions between components are hard to reason about informally
- Informal reasoning or testing has already failed to prevent bugs
### Skip TLA+ When:
- Simple CRUD with straightforward business logic
- UI/UX behavior
- Performance optimization (TLA+ models correctness, not performance)
- Design is well-understood; risk is only in implementation bugs
- Rapid prototyping where design changes frequently
## TLA+ in 60 Seconds
TLA+ describes **what** a system should do, not how. A specification consists of:
- **Variables**: State components
- **Init**: Valid starting states
- **Next**: How system transitions between states
- **Invariants**: Conditions that must hold in every reachable state
TLC model checker exhaustively explores all reachable states within bounded model. If invariant violated, TLC provides counterexample trace -- shortest path from initial state to violation.
## PlusCal as Entry Point
PlusCal is imperative-looking language transpiling to TLA+. Most programmers find it easier to learn first.
```
(* --algorithm Transfer
variables accounts = [a |-> 100, b |-> 100];
process Transfer \in 1..2
variables from, to, amount;
begin
Pick:
from := "a"; to := "b"; amount := 50;
Check:
if accounts[from] >= amount then
Debit:
accounts[from] := accounts[from] - amount;
Credit:
accounts[to] := accounts[to] + amount;
end if;
end process;
end algorithm; *)
\* Invariant: total money is conserved
MoneyConserved == accounts["a"] + accounts["b"] = 200
```
**Labels** define atomicity boundaries. Everything between two labels executes atomically. Concurrency interleavings happen at label boundaries.
PlusCal limitation: can't express all TLA+ patterns (complex fairness, some non-determinism forms). Start with PlusCal, switch to raw TLA+ when needed.
## TLC Model Checker Workflow
1. Write spec in PlusCal or TLA+
2. Configure model: concrete values for constants, invariants to check
3. Run TLC: exhaustive breadth-first exploration of all reachable states
4. If violation found: TLC provides counterexample trace (shortest path to violation)
5. Fix design, re-check
6. Gradually increase model parameters (more nodes, more messages)
### Model Configuration (.cfg file)
```
SPECIFICATION Spec
CONSTANT
Nodes = {n1, n2, n3}
MaxMessages = 4
INVARIANT
MoneyConserved
NoDoubleDelivery
```
### Managing State Space Explosion
State space grows O(constants^variables). 3 nodes = ~1,000 states; 4 nodes = ~100,000. Most bugs manifest with 2-3 nodes.
Mitigation:
- Start with 2-3 nodes, low bounds (most bugs manifest with small configs)
- Use symmetry reduction for interchangeable elements
- Use state constraints to limit exploration
- Use simulation mode (`-simulate`) for large models (random sampling instead of exhaustive)
## Common Modeling Patterns
### Message Passing
Network as set of in-flight messages. Send adds to set, receive removes. Unreliable network: messages removed without receipt (loss) or kept after receipt (duplication).
### Shared Memory
Variables represent memory locations. Locks modeled as variables indicating holding process.
### Failures
Node crash: non-deterministic removal from active set. Volatile state lost, persistent retained. Restart: rejoin with fresh volatile state.
### Leader Election
Nodes propose, vote, become leader when majority achieved. Safety: at most one leader per term.
### Transactions
Two-phase commit: resource managers prepare, transaction manager commits when all prepared. Safety: no partial commits.
## Safety and Liveness
**Safety** ("nothing bad happens"): Expressed as invariants. Violated by finite counterexample trace. Example: "Two processes never hold the same lock."
**Liveness** ("something good eventually happens"): Requires fairness assumptions. Cannot be violated by finite prefix. Example: "Every request eventually gets a response."
Start with safety. Add liveness after safety established.
## The TLA+ to PBT Pipeline
Key integration point between formal verification and testing:
1. **TLA+ specification**: Identify invariants, safety properties, state machine transitions
2. **Model check with TLC**: Verify design correctness
3. **Implementation phase**: Code the verified design
4. **PBT phase**: Translate TLA+ invariants into PBT properties
Properties discovered during TLA+ become PBT test oracles:
- TLA+ invariant "total money conserved" becomes `assert sum(all_accounts) == initial_total`
- TLA+ safety "no double-delivery" becomes stateful PBT postcondition
- TLA+ state machine transitions map to stateful PBT commands
TLA+ invariants are exhaustive. PBT samples. If TLA+ proves A, B, and C, your PBT must check all three.
### What Each Catches
| TLA+ Catches (Design) | PBT Catches (Implementation) |
|----------------------|----------------------------|
| Protocol deadlocks | Off-by-one errors |
| Consistency violations under failure | Incorrect serialization |
| Missing error handling paths | Memory management issues |
| Race conditions in algorithms | Edge cases in data transforms |
Neither catches: performance bugs | usability issues | integration with external systems.
## Tooling
- **VS Code extension** (`alygin.vscode-tlaplus`): Primary IDE. Syntax highlighting, model checking, counterexample visualization. Requires Java 11+.
- **TLA+ Toolbox**: Original Eclipse-based IDE. Still functional but VS Code preferred.
- **Command-line TLC**: `java -jar tla2tools.jar -config Spec.cfg Spec.tla`
- **Apalache**: Symbolic model checker using SMT (Z3). Use when TLC hits state explosion (>10M states).
- **Community Modules**: Reusable TLA+ modules (FinSets, Sequences, Bags, Json, TLCExt).
## Real-World Adoption
- **AWS**: DynamoDB, S3, EBS -- found subtle bugs in every system, including potential data loss. Engineers learned TLA+ in 2-3 weeks.
- **Azure Cosmos DB**: All five consistency levels formally specified. Specs open-sourced.
- **Elasticsearch**: Replication, cluster coordination, shard allocation. Models open-sourced.
- **MongoDB, CockroachDB, Kafka**: Protocol verification.
## Iterative Development Approach
1. Start with 2 processes, 1 message type
2. Add failure modes (crash, restart, message loss)
3. Add more processes (verify N-participant scaling)
4. Add liveness properties (verify progress under fairness)
5. Add optimizations (verify optimized paths maintain correctness)Related Skills
nw-source-verification
Source reputation tiers, cross-referencing methodology, bias detection, and citation format requirements
nw-formal-verification-tlaplus
TLA+ and PlusCal for specifying distributed system invariants. Decision heuristics for when formal verification adds value, key patterns, state explosion management, and alternatives comparison.
nw-ux-web-patterns
Web UI design patterns for product owners. Load when designing web application interfaces, writing web-specific acceptance criteria, or evaluating responsive designs.
nw-ux-tui-patterns
Terminal UI and CLI design patterns for product owners. Load when designing command-line tools, interactive terminal applications, or writing CLI-specific acceptance criteria.
nw-ux-principles
Core UX principles for product owners. Load when evaluating interface designs, writing acceptance criteria with UX requirements, or reviewing wireframes and mockups.
nw-ux-emotional-design
Emotional design and delight patterns for product owners. Load when designing onboarding flows, empty states, first-run experiences, or evaluating the emotional quality of an interface.
nw-ux-desktop-patterns
Desktop application UI patterns for product owners. Load when designing native or cross-platform desktop applications, writing desktop-specific acceptance criteria, or evaluating panel layouts and keyboard workflows.
nw-user-story-mapping
User story mapping for backlog management and outcome-based prioritization. Load during Phase 2.5 (User Story Mapping) to produce story-map.md and prioritization.md.
nw-tr-review-criteria
Review dimensions and scoring for root cause analysis quality assessment
nw-test-refactoring-catalog
Detailed refactoring mechanics with step-by-step procedures, and test code smell catalog with detection patterns and before/after examples
nw-test-organization-conventions
Test directory structure patterns by architecture style, language conventions, naming rules, and fixture placement. Decision tree for selecting test organization strategy.
nw-test-design-mandates
Four design mandates for acceptance tests - hexagonal boundary enforcement, business language abstraction, user journey completeness, walking skeleton strategy, and pure function extraction