Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

EVM as a Virtual Machine

The Ethereum Virtual Machine (EVM) is a stack-based virtual CPU that executes Ethereum bytecode. This bytecode is the compiled form of Ethereum smart contracts, usually written in Solidity, Vyper, or another EVM-compatible language.

At the lowest level:

  • The EVM processes opcodes (ADD, PUSH, JUMP, SSTORE, etc.).

  • Execution state includes:

    • Stack: LIFO data stack for operand storage.
    • Memory: transient byte array for temporary values.
    • Storage: persistent key-value store for each contract.
    • Program counter: points to the next opcode.
    • Gas: metering system that charges for execution steps.

EVM as an Interpreter Pattern

From the free monad and GoF design patterns perspective:

  • Instruction Set = EVM opcodes (similar to your enum Console).
  • Program AST = Actually, EVM code is already compiled, so it’s a flat sequence of instructions, not a rich high-level AST. High-level languages like Solidity first have a compiler that builds an AST, optimizes it, and then emits EVM bytecode.
  • Interpreter = The EVM specification defines how each opcode changes the machine state.
  • Multiple interpreters = While the EVM spec is fixed, different clients (Geth, Nethermind, Besu, Erigon) are alternative implementations of the same interpreter.

In GoF terms:

  • The EVM is an Interpreter over an instruction language (bytecode).
  • Opcodes are Command objects executed in sequence.
  • The program state (stack, memory, storage) acts like the context in Interpreter pattern.

EVM and Free Monad Analogy

If we model the EVM in a free monad style:

  1. Syntax: Enum of opcodes with parameters, e.g.

    #![allow(unused)]
    fn main() {
    enum EvmInstr<A> {
        Add(A),
        Push(U256, A),
        SStore(U256, U256, A),
        // ...
    }
    }
  2. Program: Free<EvmInstr, ()> would represent an abstract Ethereum program before compilation.

  3. Semantics: Interpreter function that executes these instructions against the EVM state.

The difference:

  • The real EVM executes a linear bytecode sequence (already defunctionalized into an array of low-level instructions).
  • A free monad version would allow you to build programs algebraically, chain them with flat_map, and then interpret or compile them into EVM bytecode.

EVM and CPS / Defunctionalization

The EVM’s real execution loop is very similar to a defunctionalized CPS interpreter:

  • In CPS: “do instruction, then call continuation”.
  • In EVM: “execute opcode, then increment program counter” is equivalent to applying the continuation for that instruction.
  • Defunctionalization: The “continuations” are already turned into a position in bytecode (program counter). This is the tag identifying the next action.
  • The interpreter (switch or match on opcode) is the apply function of the defunctionalized continuations.

So:

  • CPS form: Smart contract execution as “do opcode → continuation”.
  • Defunctionalized form: Replace continuations with pc + bytecode array.
  • Free monad form: Build higher-level EVM programs before compiling them into the defunctionalized form.

Key Insight

EVM execution = defunctionalized CPS interpreter for Ethereum bytecode.

ConceptFree Monad WorldEVM Reality
Instruction typeEnum of opcodesFixed EVM opcodes
Program representationAST built from Free<F, A>Linear bytecode array
ContinuationClosure in FlatMapProgram counter (PC)
InterpreterPattern match on instruction enumSwitch on opcode, mutate VM state
Multiple interpretersDifferent interpreters for same ASTMultiple Ethereum client implementations

Mini EVM

Here’s a mini-EVM in Rust using the same free monad style from the Console example, so you can clearly see the mapping to Ethereum’s real EVM execution model.

Instruction Set (EVM subset)

#![allow(unused)]
fn main() {
#[derive(Clone)]
enum EvmInstr<A> {
    Push(u64, A),              // Push constant to stack
    Add(A),                    // Pop two, push sum
    SStore(u64, u64, A),       // Store key-value
    SLoad(u64, fn(u64) -> A),  // Load value from storage
}
}

Free Monad Core

#![allow(unused)]
fn main() {
#[derive(Clone)]
enum Free<F, A> {
    Pure(A),
    Suspend(F),
    FlatMap(Box<Free<F, A>>, Box<dyn Fn(A) -> Free<F, A>>),
}

impl<F: Clone + 'static, A: 'static> Free<F, A> {
    fn pure(a: A) -> Self {
        Free::Pure(a)
    }
    fn flat_map<B: 'static, G>(self, f: G) -> Free<F, B>
    where
        G: Fn(A) -> Free<F, B> + 'static,
    {
        match self {
            Free::Pure(a) => f(a),
            other => Free::FlatMap(Box::new(other), Box::new(f)),
        }
    }
}
}

Smart Constructors

#![allow(unused)]
fn main() {
fn push(val: u64) -> Free<EvmInstr<()>, ()> {
    Free::Suspend(EvmInstr::Push(val, ()))
}

fn add() -> Free<EvmInstr<()>, ()> {
    Free::Suspend(EvmInstr::Add(()))
}

fn sstore(key: u64, value: u64) -> Free<EvmInstr<()>, ()> {
    Free::Suspend(EvmInstr::SStore(key, value, ()))
}

fn sload(key: u64) -> Free<EvmInstr<u64>, u64> {
    Free::Suspend(EvmInstr::SLoad(key, Free::pure))
}
}

EVM State

#![allow(unused)]
fn main() {
use std::collections::HashMap;

struct EvmState {
    stack: Vec<u64>,
    storage: HashMap<u64, u64>,
}
}

Interpreter

#![allow(unused)]
fn main() {
fn run_evm<A>(mut prog: Free<EvmInstr<A>, A>, state: &mut EvmState) -> A {
    loop {
        match prog {
            Free::Pure(a) => return a,
            Free::Suspend(EvmInstr::Push(v, next)) => {
                state.stack.push(v);
                prog = Free::Pure(next);
            }
            Free::Suspend(EvmInstr::Add(next)) => {
                let b = state.stack.pop().unwrap();
                let a = state.stack.pop().unwrap();
                state.stack.push(a + b);
                prog = Free::Pure(next);
            }
            Free::Suspend(EvmInstr::SStore(k, v, next)) => {
                state.storage.insert(k, v);
                prog = Free::Pure(next);
            }
            Free::Suspend(EvmInstr::SLoad(k, cont)) => {
                let val = *state.storage.get(&k).unwrap_or(&0);
                prog = cont(val);
            }
            Free::FlatMap(inner, cont) => match *inner {
                Free::Pure(a) => prog = cont(a),
                Free::Suspend(op) => prog = Free::Suspend(op),
                _ => unreachable!(),
            },
        }
    }
}
}

Example Program

fn main() {
    let program = push(2)
        .flat_map(|_| push(3))
        .flat_map(|_| add())
        .flat_map(|_| sstore(1, 5))
        .flat_map(|_| sload(1))
        .flat_map(|val| push(val));

    let mut state = EvmState {
        stack: vec![],
        storage: HashMap::new(),
    };

    run_evm(program, &mut state);

    println!("Final stack: {:?}", state.stack);
    println!("Storage: {:?}", state.storage);
}

How This Maps to the Real EVM

  • EvmInstr enum = EVM opcode set (syntax).
  • Free<EvmInstr, A> = Abstract EVM program before compilation.
  • run_evm = Interpreter (like Geth, Besu, Nethermind).
  • EvmState = EVM stack, memory, and storage.
  • The FlatMap continuations here are exactly what the real EVM defunctionalizes into a program counter and bytecode array.

Here is the defunctionalized mini-EVM: continuations are replaced by a program counter.

Core idea

  • Continuation in Free FlatMap becomes an integer pc.
  • Program is a flat Vec<Op>.
  • The interpreter is a loop that reads code[pc], mutates state, then updates pc.

Rust code

use std::collections::HashMap

#[derive(Clone, Debug)]
enum Op {
    Push(u64),
    Add,
    SStore(u64, u64),
    SLoad(u64),       // pushes value at key
    Halt,
}

#[derive(Default)]
struct VM {
    pc: usize,
    stack: Vec<u64>,
    storage: HashMap<u64, u64>,
    code: Vec<Op>,
}

impl VM {
    fn new(code: Vec<Op>) -> Self {
        VM { code, ..Default::default() }
    }

    fn step(&mut self) -> bool {
        match self.code.get(self.pc).cloned().unwrap_or(Op::Halt) {
            Op::Push(v) => {
                self.stack.push(v);
                self.pc += 1;
            }
            Op::Add => {
                let b = self.stack.pop().expect("stack underflow")
                let a = self.stack.pop().expect("stack underflow")
                self.stack.push(a + b);
                self.pc += 1;
            }
            Op::SStore(k, v) => {
                self.storage.insert(k, v);
                self.pc += 1;
            }
            Op::SLoad(k) => {
                let v = *self.storage.get(&k).unwrap_or(&0);
                self.stack.push(v);
                self.pc += 1;
            }
            Op::Halt => return false,
        }
        true
    }

    fn run(&mut self) {
        while self.step() {}
    }
}

fn main() {
    // Program: push 2; push 3; add; sstore 1,5; sload 1; halt
    let code = vec![
        Op::Push(2),
        Op::Push(3),
        Op::Add,
        Op::SStore(1, 5),
        Op::SLoad(1),
        Op::Halt,
    ]

    let mut vm = VM::new(code)
    vm.run()

    println!("Final stack: {:?}", vm.stack)
    println!("Storage: {:?}", vm.storage)
}

Mapping to Free and CPS

  • Free continuation Fn(A) -> Free is now the integer pc.
  • match on Op is the defunctionalized apply function.
  • Vec<Op> is the defunctionalized program - a linear bytecode.
  • The loop while self.step() is the CPS trampoline without closures.