RVS Research Landscape

Overview

Verification Stack: Model → Silicon

L7

Model — Does the neural network converge? ML theory.

L6

Quantization — Does Q4_0 ≈ FP32? Error bounds.

L5

Compiler — C/LLVM IR → correct assembly? CompCert, Alive2. Vector codegen = open gap.

L4

Compute Kernel — Does assembly implement the spec? (the innermost vectorized hot loop, not OS kernel) ★ RVS — 13 kernels proven.

L3

ISA Spec — Machine-readable formal spec. Sail (RISC-V), ASL (ARM), Wasm spec (W3C). Trusted.

L2

RTL — Does hardware implement ISA? riscv-formal, Arm CCA, Intel FV.

L1

Silicon — Does the chip match RTL? EDA / fab verification.

L1 = physical, L7 = application (cf. OSI model). Lower layers are trusted by upper layers. Click A1 for a concrete walkthrough using llama.cpp as running example.

Section 1

Formal Work by Layer × ISA

RISC-V

Wasm

ARM

x86

LoongArch

L5

CompCert (no RVV)
Alive2 (opt only)
Next codegen TV

Empty
Wasm codegen unverified

Alive2 (opt only)
Empty No NEON/SVE codegen

Alive2 (opt only)
Empty No AVX codegen

Empty
No LLVM backend FV

L4

★ RVS
13 kernels UNSAT
3-layer pipeline

★ RVS
Cross-ISA proof
RVV == Wasm SIMD128

Empty
ASL spec exists, no symbolic SIMD compute kernel prover

Empty
No Sail model. ACL2 partial.

Empty
No formal ISA spec

L3

Sail Official (Cambridge)

Wasm spec W3C formal semantics

ASL Arm Spec Language (ASLi)

Partial ACL2 (incomplete)

Empty Manual only

L2

riscv-formal SymbiYosys

N/A (no hardware)

Arm CCA Internal FV

Intel FV Internal

Vendor internal

L7 (Model), L6 (Quantization), L1 (Silicon) omitted — not in our scope. Shown in overview above.

Why RISC-V and Wasm are center

Our pipeline requires both a formal L3 spec and a symbolic execution toolchain (the L3→L4 bridge). Only RISC-V and Wasm have both.

ISA

L3 Spec

L3→L4 Bridge

Status

RISC-V

Sail (complete, machine-readable)

Isla (symbolic executor → SMT)

★ Ready

Wasm

W3C spec (formal semantics)

Reference interp (+ simpler semantics)

★ Ready

ARM

ASL (complete, machine-readable)

ASLi = interpreter only (not symbolic)

No bridge

x86

Intel SAE = prose (ACL2 partial)

Nothing

No spec

LoongArch

Manual only

Nothing

This is the technical reason RISC-V and Wasm are our primary targets. ARM has L3 but not the bridge; x86 has neither. Building an ASL→SMT symbolic executor (an "Isla for ARM") is a significant engineering effort and a potential future research direction — but not ours.

Section 2

RISC-V L4 Ecosystem: Extension × Concern

RISC-V is modular — each extension family has independent instructions. Within each, compute kernel verification decomposes into four concerns by solver technique.

Integer
QBV (bitvector)

Memory
QBV + Array

Control Flow
Induction / BMC

Float
FP theory

RV64I/MBase + Multiply

Trivial

Scalar ops are simple. Pipeline supports them; no research novelty. riscv-formal covers L2.

Trivial

Standard load/store

Trivial

Branch/jump

N/A

RV64VVector SIMD

★ RVS

13 kernels

Q4_0–Q6_K, vecadd, conv1d, ReLU

Partial

Loads axiomatized as fresh symbols. vse32 store trace captured (72 write events).

Partial

Single iteration verified. vsetvli, strip-mining loops, mask (vm=0) TBD.

Skip

FP16 scale factor in quant dot products. Different solver domain.

Zvk*Vector Crypto

Sail spec ready (2024). Reuse pipeline.

Empty

Crypto I/O simpler

Empty

Simpler control flow

N/A

Zb*Bitmanip

Empty

clz, ctz, rotate — straightforward BV

N/A

RV64F/DScalar Float

N/A

Skip

Berkeley FPU (L2). FP theory at L4.

RV64AAtomic

N/A

Empty

Needs memory model

N/A

same BV solver ←——→ new technique ←————→ different domain

"Trivial" = our pipeline handles it, but scalar ops lack the SIMD complexity that makes L4 verification a research contribution. The tool covers them; the paper does not.

Section 3

Wasm L4 Ecosystem: Proposal × Concern

Wasm is also modular (proposals). The decomposition mirrors RISC-V — each row maps to a RISC-V extension.

Integer
QBV

Memory
Linear mem model

Control Flow
Structured (simpler)

Float
IEEE 754

SIMD128128-bit fixed vectors

★ RVS

Cross-ISA proof

RVV == Wasm SIMD128 (Q8_0). Maps to RV64V integer.

Empty

Linear memory is simpler than RV (no PMA/PMP). Maps to RV64V memory.

Empty

Structured CF (block/loop/if) — no arbitrary jumps. Simpler than RV.

Empty

f32x4, f64x2. No FP16 native. Maps to RV64V float.

Relaxed SIMDPlatform-dependent

Same as SIMD128

relaxed_madd etc.

WebCrypto+ SIMD crypto kernels

Empty

N/A

eWASM / zkWasmSmart contracts + ZK

Empty

Smart contract correctness (CosmWasm, Polkadot). ZK execution proofs need formal Wasm spec.

Empty

Deterministic memory needed for ZK.

Empty

Gas metering + structured CF.

N/A

ThreadsShared memory + atomics

N/A

Empty

Needs Wasm memory model. Maps to RV64A.

N/A

RISC-V ↔ Wasm Correspondence

RV64V integer

↔

SIMD128 integer

Zvk* crypto

↔

WebCrypto + SIMD

RV64V memory

↔

Linear memory (simpler)

RV branches

↔

Structured CF (simpler)

RV64A atomics

↔

Threads proposal

RV64F/D float

↔

f32x4 / f64x2 (both IEEE 754)

Wasm is structurally simpler: no PMA/PMP (linear memory), no arbitrary jumps (structured CF), no variable-length vectors (fixed 128-bit). This makes cross-ISA proofs tractable — the hard part is always on the RISC-V side.

Section 4

Expansion Map: Adjacency-Based Territory Growth

Next targets are adjacent to conquered territory. Two parallel fronts: Compiler (L5) expands vertically, Crypto expands horizontally.

RVV Int

Wasm Int

RVV Crypto

Wasm Crypto

L5

wave 2

codegen TV

wave 4

future

L4

★ done

13 kernels

★ done

cross-ISA

wave 2

Zvk* AES/SHA

wave 3

cross-ISA crypto

Wave 1 (done) RVV integer + Wasm cross-ISA Wave 2 (next, parallel) L5 compiler TV + Zvk* crypto Wave 3 Wasm crypto cross-ISA Wave 4+ Expand codegen to all targets

Why L5 first? We already hold both endpoints: MLIR/LLVM IR specs (top) and Isla ISA traces (bottom). The compiler codegen gap in between is the last unverified link — and nobody does it for any vector ISA. Vertically adjacent to our L4 territory, maximum infrastructure reuse, highest impact.

Section 5

Roadmap & Conference Deadlines

1

L4 Integer SIMD + Cross-ISA + Multi-Config FMCAD 2026May 2026

Current paper. 13 kernels, 3-layer pipeline, RVV==Wasm, multi-config (SEW=8/16/32).

2

L5 Compiler Codegen Translation Validation PLDI / CAV 2027Nov 2026

LLVM IR → RVV asm. Nobody does this for any vector ISA. Highest impact.

3

Zvk* + Wasm Cross-ISA Crypto CCS / S&P 2027Apr 2027

Zvk* Sail spec ready. Cross-ISA: Zvk* == Wasm SIMD128 crypto. web3 (eWASM, zkWasm).

4

L4 Control Flow FMCAD / DAC 2027Feb 2027

vsetvli + bounded loops + mask. Closes partial. Standalone or extension.

5

Wasm + web3 Verification Tool paper

Smart contracts, zkWasm correctness. Natural extension after crypto cross-ISA.

Appendix A — Reference Only (not in scope)

ARM L4 Ecosystem: Extension × Concern

ARM has a mature ISA spec (ASL) and rich extension set. Included for landscape completeness; not in our research scope.

Integer
QBV

Memory
QBV + Array

Control Flow
Induction / BMC

Float
FP theory

AArch64Base (v8/v9)

Trivial

Scalar ops. ASL spec complete. Arm ISA FV team covers L2.

Trivial

Standard load/store. TrustZone address space.

Trivial

Branch/compare.

N/A

NEON128-bit SIMD (v7/v8)

Empty

ASL spec exists. ASLi can extract semantics but no symbolic kernel prover built. Fixed 128-bit vectors.

Empty

NEON load/store lanes.

Empty

No predication (unlike SVE).

Empty

FP16/BF16 in NEON. Nobody at L4.

SVE / SVE2Scalable Vector (v9)

Empty

Variable-length (128–2048 bit). VLA model like RVV. ASL spec exists. Highest research value for ARM L4.

Empty

Gather/scatter, first-faulting loads.

Empty

Predicate-first model (unique to SVE). More expressive than RVV masks.

Empty

SVE FP ops.

CE / AES / SHACrypto Extensions

Empty

AES/SHA/SM3/SM4 instructions. ASL spec complete. No L4 prover.

N/A

SMEScalable Matrix (v9.2)

Empty

Matrix tile operations. Very new (2023). Outer product + accumulate.

Empty

Streaming mode memory.

Empty

Streaming SVE mode transition.

Empty

BF16/FP16 matrix ops.

ARM's L3 (ASL) is strong — machine-readable, maintained by Arm Ltd. The gap is L4: nobody has built a symbolic kernel verifier on top of ASL for NEON/SVE/CE. ASLi (interpreter) exists but is not a symbolic executor. The closest analog to our Isla would be building an ASL-to-SMT pipeline, which is a significant engineering effort.

Appendix B — Reference Only (not in scope)

x86 L4 Ecosystem: Extension × Concern

x86 has the largest deployed SIMD surface area but the weakest formal spec. Included for completeness.

Integer
QBV

Memory
QBV + Array

Control Flow
Induction / BMC

Float
FP theory

x86-64Base scalar

Trivial

ACL2 model (Centaur/AMD). Intel FV internal. K Framework partial model.

Partial

Complex memory model (TSO). Intel internal verification.

Trivial

Branch/jump.

N/A

SSE / AVX2128/256-bit SIMD

Empty

No formal spec (Intel SAE is prose). ACL2 has some SSE models. Widest deployment, worst formalization.

Empty

Aligned/unaligned loads.

Empty

No masking (pre-AVX-512).

Empty

SSE/AVX FP ops.

AVX-512512-bit SIMD + mask

Empty

No Sail model. Most complex x86 SIMD. Opmask registers (like RVV vm). ~5000 instruction variants.

Empty

Gather/scatter.

Empty

Opmask predication.

Empty

FP16 (AVX-512 FP16).

AES-NI / SHACrypto

Empty

AESENC/AESDEC/SHA instructions. No formal spec. Intel whitepaper only.

N/A

AMXAdvanced Matrix (Sapphire Rapids+)

Empty

Tile-based matrix multiply. Very new. Analogous to ARM SME.

Empty

Tile load/store.

Empty

TILECFG state.

Empty

BF16/FP16 tiles.

x86's fundamental problem is L3: there is no machine-readable formal ISA spec. Intel's Software Architecture Manual (SAE) is prose; ACL2 models (Centaur) cover a subset. This makes x86 the hardest target for our pipeline — you'd need to build the ISA model first. Alive2 covers LLVM optimizations (L5 partial) but not codegen to specific SIMD extensions.

References

Cell References

Click any cell above to jump here. Click a cell ID below to jump back.

A — Overview

A1–A7

How L7→L1 works in practice: llama.cpp as a running example

Understanding how LLM inference works with llama.cpp — Omri Mallis (tokenization → transformer → KV cache → sampling, engineer-friendly)
Explore llama.cpp architecture and inference workflow — Arm Learning Paths (prefill/decode stages, GGML graph, NEON/SVE kernels)
Understanding internals of llama.cpp — Siddharth Jain (C++ context, buffer, forward pass)
The stack maps to: Model (L7) → Quantization/GGUF (L6) → C++/GGML compiler (L5) → dot product kernel (L4) → ISA instructions (L3) → RTL (L2) → silicon (L1).

A3

L5 Compiler

CompCert — compcert.org
Alive2 — alive2.llvm.org

A4

L4 Compute Kernel — our work

RVS — github.com/yiidtw/rvs

A5

L3 ISA Spec

Sail RISC-V — github.com/riscv/sail-riscv
W3C Wasm Spec — webassembly.github.io/spec
ARM ASL — Arm Architecture Specification Language (published by Arm Ltd.)

A6

L2 RTL

riscv-formal — github.com/YosysHQ/riscv-formal

B — Layer × ISA

B1

L5 × RISC-V

CompCert — compcert.org (no RVV backend)
Alive2 — alive2.llvm.org (LLVM IR opt validation only, not codegen)

B6

L4 × RISC-V — our work

RVS Demo — rvs.kfa.sh (13 kernels, 3-layer pipeline)
Source — github.com/yiidtw/rvs

B7

L4 × Wasm — our work

RVS Demo — rvs.kfa.sh (cross-ISA proof: RVV == Wasm SIMD128)
Source — github.com/yiidtw/rvs

B11

L3 × RISC-V

Sail RISC-V — official formal ISA model (Cambridge / RISC-V International)

B12

L3 × Wasm

W3C WebAssembly Specification — formal operational semantics

B13

L3 × ARM

ASL (Arm Architecture Specification Language) — machine-readable spec maintained by Arm Ltd.
ASLi — ASL interpreter by Alastair Reid (Arm Research). Interprets ASL specs but does NOT produce symbolic SMT traces; it is a concrete interpreter, not a symbolic executor like Isla.

B14

L3 × x86

ACL2 x86 model (Centaur/AMD) — partial formal model for x86 ISA in ACL2 theorem prover. Covers scalar subset; SIMD incomplete.
Intel SAE (Software Architecture Manual) is prose only — no machine-readable formal spec.

B16

L2 × RISC-V

riscv-formal — RTL vs ISA verification using SymbiYosys (Clifford Wolf / YosysHQ)

C — L3 Spec & Bridge

C1

RISC-V L3 Spec

Sail — complete, machine-readable, covers RV64IMAFDC + V extension

C2

RISC-V L3→L4 Bridge

Isla — symbolic executor for Sail models, produces SMT traces (Cambridge REMS group)

C4

Wasm L3 Spec

W3C WebAssembly Specification — formal typing and reduction rules

C7

ARM L3 Spec

ASL — Arm Architecture Specification Language. Machine-readable, maintained by Arm Ltd. Published with each architecture release.

C8

ARM L3→L4 Bridge (missing)

ASLi is an interpreter, not a symbolic executor. It cannot produce SMT traces from ASL specs. Building an "Isla for ARM" (ASL→SMT) is an open engineering challenge.

D — RISC-V Extension × Concern

D5

RV64V × Integer — our work

RVS — 13 kernels: Q4_0–Q6_K, vecadd, conv1d, ReLU (all UNSAT)

D9

Zvk* × Integer

Sail Zvk* — vector crypto extensions added to Sail model (2024)

E — Wasm Proposal × Concern

E1

SIMD128 × Integer — our work

Wasm SIMD proposal — github.com/WebAssembly/simd
RVS cross-ISA proof — RV64V == Wasm SIMD128

F — Expansion Map

F1

L5 × RVV Integer — Compiler Codegen TV — next target

CompCert — compcert.org (verified C compiler, no RVV backend)
Alive2 — alive2.llvm.org (LLVM IR opt only, not codegen)
Gap: nobody verifies LLVM vector codegen (IR → RVV asm) for any ISA. We hold both endpoints — MLIR/LLVM IR specs (top) + Isla ISA traces (bottom). Translation validation per compilation is the most practical path.