Every year in the United States, ~50,000 graduating medical students (doctors, for short) are placed into the residencies where they will begin their careers by the National Resident Matching Program (or The Match). To do so, the program finds a stable matching: an assignment of doctors to medical programs (hospitals, for short) such that there is no pair of doctor and hospital that would prefer to be assigned each other over the match(es) they ended up with. Each candidate doctor ranks their preferred hospitals and each hospital ranks their preferences in candidates, and the matching proceeds algorithmically.

One of the original problems in the field of algorithmic mechanism design, the matching program uses the Roth-Peranson algorithm (1999), a modification of the earlier Gale-Shapley algorithm (1962). These contributions in large part earned Roth & Shapley the 2012 Nobel Prize in Economics.

The Match is a classic example of a computation whose transparency and accountability can benefit from private verifiable computation: at present, doctors receive no direct assurance that an algorithmic decision that will shape their careers (and their career earnings) is being executed correctly. However, The Match cannot simply publish a transcript of the matching, as it would harm the privacy of the doctors in particular (“Alice said she wanted to come back home for residency, but she didn’t rank any local hospitals” or “when we interviewed Bob he said fifteen years ago he ranked us highly in The Match, but he only had us seventh”).

Given that anonymizing the doctors can only accomplish so much, a much better solution would be to produce a proof of the correctness of the matching where the preference rankings themselves are kept private. Using the Nexus zkVM, we can do exactly that.

To start, we will follow along with the SDK Quick Start. After installing the zkVM, we run:

$ cargo nexus host matching

to initialize the host and guest programs.

We begin with the guest program. This is the program to be proven, which will need to implement the matching itself. In our case, we will just use the simpler Gale-Shapley approach, which runs in quadratic time. We will not explain how the following implementation works in detail, as there are plenty of existing explainers of the algorithm, its runtime, and its correctness.

matching/src/guest/src/main.rs
#![cfg_attr(target_arch = "riscv32", no_std, no_main)]

extern crate alloc;
use alloc::collections::BTreeSet;
use alloc::{vec, vec::Vec};

fn stable_matching(
    n: usize,
    mut employers: BTreeSet<usize>,
    mut candidates: BTreeSet<usize>,
    employer_prefs: Vec<Vec<usize>>,
    candidate_prefs: Vec<Vec<usize>>,
) -> Vec<Option<usize>> {
    let mut hires: Vec<Option<usize>> = vec![None; n];
    while !employers.is_empty() {
        let next = employers.pop_first().unwrap();
        let prefs = &employer_prefs[next];

        for c in prefs {
            if candidates.contains(c) {
                hires[next] = Some(*c);
                assert!(candidates.remove(c));
                break;
            } else {
                let current = hires
                    .iter()
                    .position(|&x| x.is_some() && *c == x.unwrap())
                    .unwrap();
                if candidate_prefs[*c].iter().position(|&x| next == x)
                    < candidate_prefs[*c].iter().position(|&x| current == x)
                {
                    assert!(employers.insert(current));
                    hires[next] = Some(*c);
                    hires[current] = None;
                    break;
                }
            }
        }
    }

    hires
}

Looking at the interface, we can see the stable matching algorithm takes in a set of candidates (the doctors), a set of employers (the hospitals), preference rankings in matrix form (so that row i of the employer_prefs matrix encodes the ranking for the ith doctor), and a parameter n = employers.len() that counts the number of required matches to be made. It then returns a vector of length n, where hires[j] = i means the jth hospital will hire the ith doctor.

To use this function, we will also need a main function that handles the input and output of the zkVM, especially as we will want the preference matrices themselves to be kept private.

matching/src/guest/src/main.rs
#[nexus_rt::main]
#[nexus_rt::private_input(prefs)]
#[nexus_rt::public_input(lists)]
fn main(
    prefs: (Vec<Vec<usize>>, Vec<Vec<usize>>),
    lists: (BTreeSet<usize>, BTreeSet<usize>),
) -> Vec<Option<usize>> {
    let (mut employer_prefs, mut candidate_prefs) = prefs;
    let (mut employers, mut candidates) = lists;

    let n = employers.len();
    let m = candidates.len();

    nexus_rt::print!("Matching {} employers with {} candidates... ", n, m);

    let hires = stable_matching(n, employers, candidates, employer_prefs, candidate_prefs);

    nexus_rt::println!("completed.");

    hires
}

Note that in the above, we do not actually need to write #[nexus_rt::private_input(prefs)], as all inputs default to private unless they are explicitly declared to be public.

This program includes no proprietary or private information, and so can be publicly distributed, reviewed, and audited for correctness by interested parties. Meanwhile, the public input and output of this program only leaks which hospital each candidate ultimately matched with. This is necessary and sufficient for each doctor to confirm the algorithm did, in fact, assign them to that hospital, and is generally considered to be public information anyway: even if the doctors themselves choose not to publicize it, medical schools and hospitals often publish lists of matchings.

To prove an execution of this guest program, we use the host program. Our host program compiles the guest program and runs it using the preferences provided by both the candidates and hospitals. Since the host program will not be made public, it is perfectly fine for us to include private information like the preference rankings directly within it. The specific ranking matrices we use here were randomly generated, for demonstration purposes.

matching/src/main.rs
use nexus_sdk::{
    compile::{cargo::CargoPackager, Compile, Compiler},
    stwo::seq::Stwo,
    ByGuestCompilation, Local, Prover, Viewable,
};
use std::collections::BTreeSet;

type Matches = Vec<Option<usize>>;
type Prefs = (Vec<Vec<usize>>, Vec<Vec<usize>>);
type Lists = (BTreeSet<usize>, BTreeSet<usize>);

const PACKAGE: &str = "guest";

fn main() {
    println!("Compiling guest program...");
    let mut prover_compiler = Compiler::<CargoPackager>::new(PACKAGE);
    let prover: Stwo<Local> =
        Stwo::compile(&mut prover_compiler).expect("failed to compile guest program");

    let n = 10;
    let m = 8;
    assert!(m < n);

    let mut employers = BTreeSet::<usize>::new();
    let mut candidates = BTreeSet::<usize>::new();

    for i in 0..n {
        employers.insert(i);
        if i < m {
            candidates.insert(i);
        }
    }

    let employer_prefs = vec![
        vec![4, 1, 2, 7, 3, 0, 6, 5],
        vec![0, 1, 4, 5, 7, 2, 6, 3],
	    vec![3, 7, 4, 6, 2, 0, 5, 1],
        vec![4, 6, 0, 2, 7, 3, 5, 1],
        vec![2, 6, 1, 3, 7, 0, 4, 5],
        vec![1, 4, 5, 3, 7, 0, 6, 2],
        vec![1, 5, 4, 0, 6, 2, 7, 3],
        vec![3, 6, 1, 2, 7, 0, 5, 4],
	    vec![7, 5, 3, 4, 1, 6, 2, 0],
        vec![0, 6, 1, 4, 5, 2, 7, 3],
    ];

    let candidate_prefs = vec![
        vec![0, 5, 7, 8, 3, 1, 4, 2, 6, 9],
        vec![8, 2, 0, 9, 3, 4, 5, 6, 1, 7],
        vec![3, 6, 4, 0, 5, 9, 7, 8, 2, 1],
        vec![2, 9, 7, 3, 4, 1, 5, 8, 0, 6],
        vec![7, 5, 9, 4, 6, 8, 0, 1, 3, 2],
        vec![4, 1, 7, 6, 2, 9, 8, 0, 5, 3],
	    vec![0, 5, 3, 6, 8, 7, 4, 1, 9, 2],
        vec![7, 5, 6, 1, 4, 0, 2, 8, 9, 3],
    ];

    println!("Proving execution of vm...");
    let (view, proof) = prover.prove_with_input::<Prefs, Lists>(
        &(employer_prefs, candidate_prefs),
        &(employers, candidates),
    ).expect("failed to prove program");

    println!(
        ">>>>> Logging\n{}<<<<<",
        view.logs().expect("failed to retrieve debug logs").join("")
    );
    assert_eq!(view.exit_code().expect("failed to retrieve exit code"), nexus_sdk::KnownExitCodes::ExitSuccess as u32);

    let hires = view.public_output::<Matches>().expect("failed to retrieve public output");

    print!("Found matching (employer, candidate): ");
    for i in 0..n {
        if i > 0 {
            print!(", ")
        }
        print!(
            "({}, {})",
            i,
            if hires[i].is_some() {
                hires[i].unwrap().to_string()
            } else {
                "None".to_string()
            }
        );
    }
    println!("\n");

    /// the host program would go on to publish `view` and `proof` publicly.
}

Once the program and its public input and output (the view) and the proof are published, any interested party — a participating doctor or hospital, or a third party such as a medical school, journalist, or regulator — can verify the correctness of the execution without needing access to the preferences.

println!("Verifier recompiling guest program...");
let mut verifier_compiler = Compiler::<CargoPackager>::new(PACKAGE);
let path = verifier_compiler.build().expect("failed to (re)compile guest program");

print!("Verifying execution...");
proof.verify_expected_from_program_path::<&str, Lists, Matches>(
  &(employers, candidates),                       // at view.public_input
  nexus_sdk::KnownExitCodes::ExitSuccess as u32,  // at view.exit_code
  &hires,                                         // at view.public_output
  &path,                                          // path to program binary
  &[]                                             // no associated data,
).expect("failed to verify proof");

In the case the publicly claimed matching (the hires vector) has been tampered with to match candidates to hospitals in violation of the Gale-Shapley guarantees, this verification will provably fail.

One limitation of the guest program used above is that a verifier cannot directly confirm that the preferences themselves have not been tampered with. The guarantee of the zkVM is only existential over private inputs: that there exists some private input for which the execution is correct, not necessarily that the private input is what it should be in the context of the broader application. With a bit more effort however, we can even mitigate this concern.

At a high level, we’d extend the system so that when submitting their preferences, a concerned candidate would also supply The Match with a locally-generated, random passphrase known only to them. These passphrases would also be provided to the guest program as private input. The guest program would then go row by row through the preference matrix and compute hashes of each candidate’s preferences salted with their passphrase, and append those records to the public output. The candidate could then go and check the correctness of their hash to verify that their preferences were input honestly. The hash (as a random oracle) also prevents anyone else without a candidate’s passphrase from determining that candidate’s provided preferences.

As a final comment, although in this case we focus on stable matching, this design of using the zkVM for transparency and accountability applies to any form of algorithmic decision making: including decisions produced by more complex decision models such as those created through machine learning.