Benchmark Host and Guest Programs

Prerequisite

Before you start, please install the latest Nexus zkVM and read how to create a Nexus host project. We strongly recommend you follow that guide first to understand the basic concepts of host and guest programs and how they work with the Nexus zkVM.

This guide demonstrates how to benchmark the performance of Nexus zkVM using a Fibonacci sequence calculation as an example. We’ll analyze both the guest program’s RISC-V cycle count and the host program’s execution time for various Fibonacci sequence lengths.

After running cargo nexus host benchmark, a new Rust benchmark project directory is created with the following structure:

./benchmark
├── Cargo.lock
├── Cargo.toml
├── rust-toolchain.toml
└── src
    ├── main.rs
    └── guest
        ├── Cargo.toml
        ├── rust-toolchain.toml
        └── src
            └── main.rs

The guest program is located in src/guest/src/main.rs, and the host program is located in src/main.rs.

Host program

First, we take a look at the host program. In short, our host program will:

  1. Compile the guest program to a RISC-V ELF binary.

  2. Prove the execution of the guest program with the public input and public output from environment variables.

  3. View the print logs from the guest program.

  4. Check the exit code of the proved program. If the exit code is ExitSuccess, it means the guest program execution completed successfully as we expect it to.

  5. Verify the proof against the public input and public output pair read from environment variables.

To benchmark, we will also add a few lines of code to measure how long compiling, proving, and verifying the guest program take, and to print the proof size.

To implement the above, let’s modify the host program src/main.rs as follows:

use nexus_sdk::{
    ByGuestCompilation, Local, Prover, Verifiable, Viewable,
    compile::{Compile, Compiler, cargo::CargoPackager},
    stwo::seq::Stwo,
};
use std::env;
use std::time::Instant;

const PACKAGE: &str = "guest";

fn main() {
    println!("=== Nexus zkVM Host Program Execution ===");

    let start_time = Instant::now();

    print!("1. Compiling guest program...");
    let compile_start = Instant::now();
    let mut prover_compiler = Compiler::<CargoPackager>::new(PACKAGE);
    let prover: Stwo<Local> =
        Stwo::compile(&mut prover_compiler).expect("failed to compile guest program");
    let compile_duration = compile_start.elapsed();
    println!(" {} ms", compile_duration.as_millis());

    let elf = prover.elf.clone(); // save elf for use with verification

    // Read public_input from environment variable, default to 10 if not set
    let public_input: u32 = env::var("PUBLIC_INPUT")
        .ok()
        .and_then(|s| s.parse().ok())
        .unwrap_or(10);

    // Read public_output from environment variable, default to 89 if not set
    let public_output: u128 = env::var("PUBLIC_OUTPUT")
        .ok()
        .and_then(|s| s.parse().ok())
        .unwrap_or(89);

    println!("Using public_input: {}", public_input);
    println!("Using public_output: {}", public_output);

    print!("\n2. Proving execution of VM...");
    let prove_start = Instant::now();
    let (view, proof) = prover
        .prove_with_input::<(), u32>(&(), &public_input)
        .expect("failed to prove program");
    let prove_duration = prove_start.elapsed();
    println!(" {} ms", prove_duration.as_millis());

    println!("\n3. Execution Logs:");
    println!("-------------------");
    match view.logs() {
        Ok(logs) => println!("{}", logs.join("")),
        Err(e) => eprintln!("Error: Failed to retrieve debug logs - {}", e),
    }
    println!("-------------------");

    match view.exit_code() {
        Ok(code) => {
            if code == nexus_sdk::KnownErrorCodes::ExitSuccess as u32 {
                println!(
                    "\n4. Execution completed successfully (Exit code: {})",
                    code
                );
            } else {
                eprintln!("\n4. Execution failed (Exit code: {})", code);
                return;
            }
        }
        Err(e) => {
            eprintln!("\nError: Failed to retrieve exit code - {}", e);
            return;
        }
    }

    print!("\n5. Verifying execution... ");
    let verify_start = Instant::now();
    proof
        .verify_expected::<u32, u128>(
            &public_input,  // public input passed to the guest
            nexus_sdk::KnownErrorCodes::ExitSuccess as u32,
            &public_output, // expected public output
            &elf,           // expected elf (program binary)
            &[],            // no associated data
        )
        .expect("failed to verify proof");
    let verify_duration = verify_start.elapsed();

    println!("Succeeded!");
    println!("   Verification time: {} ms", verify_duration.as_millis());
    println!("   Proof size: {} bytes", proof.size_estimate());

    let total_duration = start_time.elapsed();
    println!("\n=== Execution and Verification Complete ===");
    println!("Total execution time: {} ms", total_duration.as_millis());
}
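
The listing above repeats the same Instant::now() / elapsed() pattern for each phase. As a minimal sketch, that pattern could be factored into a small helper. Note that timed is a hypothetical name introduced here for illustration; it is not part of nexus_sdk, and the workload in main is only a stand-in for the compile, prove, or verify call.

```rust
use std::time::{Duration, Instant};

/// Runs `f` once and returns its result together with the elapsed wall-clock time.
/// `timed` is a hypothetical helper for illustration, not part of nexus_sdk.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let value = f();
    (value, start.elapsed())
}

fn main() {
    // Stand-in workload; in the host program this closure would wrap
    // the compile, prove, or verify step instead.
    let (sum, elapsed) = timed(|| (1..=1_000u64).sum::<u64>());
    println!("sum = {}, took {} ms", sum, elapsed.as_millis());
}
```

Returning the value alongside the duration keeps the measured call and its result together, so each phase can be timed without scattering start and stop variables through main.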

Guest program

The guest program in src/guest/src/main.rs will be a simple Fibonacci program: it receives a public input n and returns fib(n+1) as the public output.

The public input is declared with #[cfg_attr(target_arch = "riscv32", nexus_rt::public_input(x))], which means that during proving the public input is passed to the guest program as the variable x at runtime.

The public output is the return value of the #[nexus_rt::main] function, which is fib(n+1) in this case.

#![cfg_attr(target_arch = "riscv32", no_std, no_main)]
#[cfg(target_arch = "riscv32")]
use nexus_rt::println;
#[cfg(not(target_arch = "riscv32"))]
use std::println;

use core::ops::Add;

#[derive(Copy, Clone)]
struct BN([u128; 6]);

const ONE: BN = BN([1, 0, 0, 0, 0, 0]);

// carrying_add is unstable, so we define it here
fn adc(x: u128, y: u128, c: bool) -> (u128, bool) {
    let (z1, c1) = x.overflowing_add(y);
    let (z2, c2) = z1.overflowing_add(if c { 1 } else { 0 });
    (z2, c1 || c2)
}

impl Add for BN {
    type Output = BN;
    fn add(self, rhs: Self) -> Self::Output {
        let (a, o) = adc(self.0[0], rhs.0[0], false);
        let (b, o) = adc(self.0[1], rhs.0[1], o);
        let (c, o) = adc(self.0[2], rhs.0[2], o);
        let (d, o) = adc(self.0[3], rhs.0[3], o);
        let (e, o) = adc(self.0[4], rhs.0[4], o);
        let (f, _) = adc(self.0[5], rhs.0[5], o);
        Self([a, b, c, d, e, f])
    }
}

fn fib_iter(n: u32) -> BN {
    let mut a = ONE;
    let mut b = ONE;

    for n in 0..n + 1 {
        if n > 1 {
            let c = a + b;
            a = b;
            b = c;
        }
    }
    b
}

#[nexus_rt::main]
#[cfg_attr(target_arch = "riscv32", nexus_rt::public_input(x))]
fn main(x: u32) -> u128 {
    let b = fib_iter(x);

    // We print the result, this will show in the host program execution log.
    println!("{}", b.0[0]);
    b.0[0]
}
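
The host only verifies successfully when PUBLIC_OUTPUT matches what the guest returns, so it can help to recompute the expected value natively. The following is a sketch under one stated assumption: because the guest returns only the lowest limb b.0[0], and carries propagate only upward, wrapping u128 addition reproduces that limb. The name expected_output is hypothetical and used only for this illustration.

```rust
/// Mirrors the guest's `fib_iter` loop, but keeps only the low 128 bits.
/// `expected_output` is a hypothetical helper name, not part of nexus_rt.
fn expected_output(n: u32) -> u128 {
    let (mut a, mut b) = (1u128, 1u128);
    // The guest updates (a, b) once for every index in 2..=n.
    for _ in 2..=n {
        let c = a.wrapping_add(b); // low limb of the multi-limb addition
        a = b;
        b = c;
    }
    b
}

fn main() {
    // Print input/output pairs in the form the host program reads them.
    for n in 0..20u32 {
        println!("PUBLIC_INPUT={} PUBLIC_OUTPUT={}", n, expected_output(n));
    }
}
```

For n = 10 this yields 89, which matches the defaults hard-coded in the host program.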

Fibonacci Benchmark

The results below were obtained in February 2025 on a MacBook Air M1 with 16 GB of RAM.

Profile host execution time

Because we take the public input and output from environment variables, we can use a simple script to run the benchmark with different inputs and outputs. Save the script below as bench.sh, make it executable with chmod +x bench.sh, and run it with ./bench.sh.

#!/bin/bash

# Arrays for input and output
public_inputs=(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19)
public_outputs=(1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765)

# Loop through the arrays
for i in "${!public_inputs[@]}"; do
    public_input="${public_inputs[$i]}"
    public_output="${public_outputs[$i]}"

    echo "Running with PUBLIC_INPUT=$public_input and PUBLIC_OUTPUT=$public_output"

    # Run the Rust program with environment variables set
    # Ensure the Rust program is built in release mode
    PUBLIC_INPUT=$public_input PUBLIC_OUTPUT=$public_output cargo run --release

    echo "----------------------------------------"
done

Next, we put together a table to show the results. Times are in milliseconds and proof sizes are in bytes.

n-th Fibonacci  Compile (ms)  Prove (ms)  Verify (ms)  Total Time (ms)  Proof size (bytes)
0               161           1592        11           1764             51968
1               149           1522        11           1684             51968
2               148           1514        11           1675             51968
3               151           1538        11           1701             51440
5               151           1477        12           1641             51968
8               128           1465        11           1606             46008
13              136           1454        11           1602             49304
21              138           1440        12           1591             49880
34              149           1551        12           1713             51968
55              149           1422        13           1584             51440
89              151           1432        12           1596             51440
144             153           1427        11           1593             49460
233             149           1434        12           1596             50384
377             138           1417        11           1567             50368
610             148           1453        11           1613             51968
987             150           1510        12           1673             49280
1597            149           1443        11           1604             50384
2584            146           1460        13           1620             50864
4181            147           1449        11           1610             51968
6765            149           1440        11           1602             51968

Notice that the proving time stays roughly constant across the first 20 numbers of the Fibonacci sequence. This is likely because the fixed overhead of setting up the zkVM dominates the cost while the Fibonacci computation itself is still small.
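
To put a number on how flat the proving time is, here is a small sketch that summarizes the Prove (ms) column of the first table. The values are copied from the table; summarize and PROVE_MS are hypothetical names introduced only for this illustration.

```rust
/// Prove (ms) column of the first results table, copied row by row.
const PROVE_MS: [u64; 20] = [
    1592, 1522, 1514, 1538, 1477, 1465, 1454, 1440, 1551, 1422,
    1432, 1427, 1434, 1417, 1453, 1510, 1443, 1460, 1449, 1440,
];

/// Returns (mean, min, max) of a non-empty slice of millisecond timings.
fn summarize(samples: &[u64]) -> (f64, u64, u64) {
    let sum: u64 = samples.iter().sum();
    let mean = sum as f64 / samples.len() as f64;
    let min = *samples.iter().min().expect("samples must be non-empty");
    let max = *samples.iter().max().expect("samples must be non-empty");
    (mean, min, max)
}

fn main() {
    let (mean, min, max) = summarize(&PROVE_MS);
    println!("mean = {:.1} ms, min = {} ms, max = {} ms", mean, min, max);
}
```

The mean is 1472 ms with a range of 1417 ms to 1592 ms, and the variation shows no clear trend with the input.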

Let’s push further and benchmark inputs starting from the 200th Fibonacci number.

n-th Fibonacci  Compile (ms)  Prove (ms)  Verify (ms)  Total Time (ms)  Proof size (bytes)
200             156           17416       189          17762            58544
201             154           17179       183          17518            59904
202             152           17468       183          17804            59248
204             158           17238       181          17579            59248
205             151           17242       184          17579            57920

Now the proving time is noticeably higher, about 17 seconds compared with about 1.5 seconds for the first 20 numbers, as we would expect for a longer computation. The verification time also grows, to roughly 180 ms, but it remains negligible compared to the proving time.
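
The relationship between verification and proving time can be checked directly against the second table. The sketch below computes, for each row, the share of the proving time that verification takes. Row, ROWS, and verify_share are hypothetical names for this illustration; the numbers are copied from the table.

```rust
/// One row of the second results table (times in milliseconds).
struct Row {
    n: u32,
    prove_ms: u64,
    verify_ms: u64,
}

const ROWS: [Row; 5] = [
    Row { n: 200, prove_ms: 17416, verify_ms: 189 },
    Row { n: 201, prove_ms: 17179, verify_ms: 183 },
    Row { n: 202, prove_ms: 17468, verify_ms: 183 },
    Row { n: 204, prove_ms: 17238, verify_ms: 181 },
    Row { n: 205, prove_ms: 17242, verify_ms: 184 },
];

/// Fraction of the proving time that verification takes for one row.
fn verify_share(row: &Row) -> f64 {
    row.verify_ms as f64 / row.prove_ms as f64
}

fn main() {
    for row in ROWS.iter() {
        println!(
            "n = {}: verify is {:.2}% of prove",
            row.n,
            100.0 * verify_share(row)
        );
    }
}
```

In every row the share falls between 1.0% and 1.1%, so verification stays a small fraction of the cost even as both times grow.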

Suggestions for Developers

Based on these benchmark results, here are some key takeaways and suggestions for developers working with Nexus zkVM:

  1. Optimize Guest Programs: Focus on minimizing RISC-V cycles in your guest programs. In the results above, proving is by far the largest component of total execution time, and it grows as the guest performs more work.

  2. Consider Algorithmic Efficiency: When working on computationally demanding scenarios, prioritize efficient algorithm design and implementation. The Fibonacci sequence example demonstrates how complexity can quickly escalate proving time.

  3. Profile Regularly: Regularly profile your Nexus zkVM projects to identify performance bottlenecks and opportunities for optimization.