Announcing the first Arecibo release



We’re happy to announce the first crate release of Arecibo, our fork of Nova. Arecibo is intended to be an “incubation project” for Nova: it tests out implementations of improvements to the Nova folding framework and back-contributes these changes to the upstream project. Please note that, as a project meant to evolve quickly, it makes few guarantees of backward compatibility, and that the Arecibo-specific changes have not been reviewed by the Nova maintainers.

However, we are building our main project, Lurk, on top of Nova, and we need a crate release for Arecibo to support the Lurk release. This note describes the main feature you can find in that crate: our implementation of SuperNova.

Technical Release Note: SuperNova Protocol Integration into Nova #

Introduction #

This release note highlights the recent integration of the SuperNova protocol into Arecibo’s cryptographic suite, marking a significant advancement in the field of cryptographic proofs, particularly in the area of folding schemes. SuperNova, by Abhiram Kothapalli and Srinath Setty, introduces a groundbreaking approach to proving statements about virtual machine execution and customizable circuit operations, significantly reducing the computational costs and complexities traditionally associated with universal circuits in Incrementally Verifiable Computation (IVC).

Nova’s existing folding scheme, effective in handling repeated execution of a universal circuit, faces limitations due to costs proportional to the size of the universal circuit. SuperNova addresses this by allowing the breakdown of the universal circuit into smaller sub-circuits for each alternative instruction, thus reducing the complexity and cost of the folding operation. This integration is especially beneficial for virtual machine modeling and extends to cases where a core set of instructions is augmented with user-defined operations, as seen in languages like Lurk.

Initial testing of this integration within the Lurk proof language, particularly its coprocessor feature in the current Beta release, has shown promising results. The primary audience for this significant development includes developers and researchers in the field of cryptography, who stand to benefit from the enhanced efficiency and reduced costs in proof statement processing.

The overarching goal of integrating SuperNova into Nova is to establish the Non-Uniform Incremental Verifiable Computation (NIVC) paradigm as a standard in folding scheme implementations. By contributing this work back to the Nova repository once stable, we aim to incubate SuperNova’s functionality, ensuring wider adoption and recognition in the cryptographic community.

Technical Overview #

SuperNova enhances Nova’s folding schemes by introducing a more efficient approach to non-uniform Incremental Verifiable Computation (NIVC). It leverages cryptographic primitives that enable a prover and a verifier to fold two N-sized NP instances into a single N-sized instance. This folding is such that the resulting instance is only satisfiable if the original instances are satisfiable, and it specifically applies to a variant of R1CS (Rank-1 Constraint Systems) known as relaxed R1CS.

In each step, SuperNova’s prover folds an R1CS instance, representing the prior step of program execution, into a running instance with an associated witness. This instance, along with advice from the folding scheme, is then fed into an augmented circuit. This circuit comprises a verifier circuit and one of the $k$ functions from the set $\{F_1, \ldots, F_k\}$. The verifier circuit itself includes two key components: one for verifying the non-interactive folding scheme for R1CS, and another for computing a transition function $\phi$.

SuperNova distinguishes itself by using multiple running instances, one for each instruction supported by the machine. The verifier circuit determines which running instance to fold into, guided by the transition function $ϕ$ embedded within it. In its natural design, the size of the verifier circuit scales linearly with $k$. However, SuperNova employs offline memory checking techniques to render the verifier circuit size independent of $k$.

After $N$ steps, the prover is left with $k$ running instances and an R1CS instance representing the final step of the program execution. The prover can prove these by sending the associated witnesses for verification. While the proof size does not depend on the number of steps in the program execution, it is proportional to the sum of the sizes of the circuits for the functions in $\{F_1, \ldots, F_k, \phi\}$[^1]. Although the scheme is not initially zero-knowledge, this issue is resolved similarly to Nova: compressed proofs and zero-knowledge are achieved by invoking a general-purpose zkSNARK, such as Spartan, making SuperNova a powerful addition to the Nova framework for handling complex cryptographic proofs.
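The NIVC bookkeeping described above can be illustrated with a toy model. This is our own sketch, not the crate’s API: `phi`, `fold`, and `nivc_prove` are hypothetical stand-ins, and `fold` merely counts folds instead of performing any cryptography.

```python
# Toy model of SuperNova's NIVC bookkeeping (illustration only, no real
# cryptography): one "running instance" per instruction, with the transition
# function phi choosing which instruction runs (and hence which running
# instance gets folded into) at each step.

def phi(state):
    # Hypothetical transition function: picks the next instruction from state.
    return state % 3  # 3 instructions: F_0, F_1, F_2

def fold(running, step):
    # Stand-in for the folding scheme: a real fold takes a random linear
    # combination of relaxed R1CS instances; here we just count folds.
    return running + step

def nivc_prove(initial_state, num_steps, k=3):
    functions = [lambda s: s + 1, lambda s: 2 * s, lambda s: s - 1]
    running = [0] * k          # one running instance per instruction
    state = initial_state
    for _ in range(num_steps):
        pc = phi(state)        # which instruction executes this step
        state = functions[pc](state)
        running[pc] = fold(running[pc], 1)
    # After N steps the prover holds k running instances; proof size scales
    # with the circuits' sizes, not with the number of steps.
    return state, running
```

The point is structural: each step executes exactly one $F_i$ and folds into exactly one of the $k$ running instances, so the prover’s state stays proportional to $k$ rather than to the number of steps.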

Contributions and Improvements #

Original protocol sketches by the community #

Cryptographic protocols take a village, and this integration of SuperNova is no different. Foundational contributions came from various community researchers:

The project began with Wyatt Benno’s foundational kick-off, which laid the groundwork for integrating the SuperNova protocol with Nova.

Later on, Sun Ming provided a landmark PR with an initial version of the complete folding argument, which kick-started the work of incubating it in Arecibo.

Srinath Setty’s improvements in PR 250, while not directly related to SuperNova, were also incredibly helpful to us for implementing a compressing SNARK. The main feature was the refactor of the pre-processing variant of Spartan, replacing the “grand product argument” from [Quarks §5] with an argument based on log-derivatives, popularized by [Multivariate lookups based on logarithmic derivatives].

A note on how this new argument for the pre-processing SNARK works: if a prover claims that two multilinear polynomials $P(\underline X)$ and $Q(\underline X)$ are permutations of each other, they produce a proof that the following products are equal, for a randomly sampled challenge $r$: $$ \prod_{\underline x \in \{0, 1\}^k} \big( P(\underline x) + r \big) = \prod_{\underline x \in \{0, 1\}^k} \big( Q(\underline x) + r \big). $$

Taking the logarithmic derivative of each side, the claim is equivalent to showing that the following sums are equal instead:

$$ \sum_{\underline x \in \{0, 1\}^k} \frac{1}{ P(\underline x) + r } = \sum_{\underline x \in \{0, 1\}^k} \frac{1}{ Q(\underline x) + r }. $$ Given that Spartan is based on Sumcheck, the second claim is much easier to work with. The key benefits of this new approach are:

  • Reducing proof size by 40 field elements and 4 commitments
  • Removing the need for a second Sumcheck instance by combining all claims into a single instance, saving ~$\log(N)$ hashes for the verifier
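The log-derivative identity above is easy to sanity-check numerically. The sketch below is our own illustration with toy parameters (a small Mersenne prime and hard-coded evaluation tables; real protocols use large cryptographic fields): it verifies that both the product form and the sum-of-inverses form agree on two permuted evaluation tables.

```python
# Numeric check of the log-derivative permutation argument over a toy
# prime field F_p. P and Q are given by their evaluations on {0,1}^3;
# Q is a permutation of P, so both identities hold for any challenge r.

p = 2**31 - 1  # small Mersenne prime, for illustration only

def inv(a):
    return pow(a, p - 2, p)  # Fermat inversion in F_p

def product_side(vals, r):
    out = 1
    for v in vals:
        out = out * ((v + r) % p) % p
    return out

def sum_side(vals, r):
    return sum(inv((v + r) % p) for v in vals) % p

P = [3, 1, 4, 1, 5, 9, 2, 6]  # evaluations of P on {0,1}^3
Q = [9, 1, 6, 4, 2, 5, 1, 3]  # a permutation of P
r = 123456789                 # "random" verifier challenge

assert product_side(P, r) == product_side(Q, r)
assert sum_side(P, r) == sum_side(Q, r)
```

A non-permutation would fail both checks with overwhelming probability over the choice of $r$, which is exactly what the verifier relies on.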

While these improvements are impressive on their own, they are even more beneficial to SuperNova, where the savings scale with the number of instances. Moreover, we were able to reuse the existing Sumcheck batching functionality to handle multiple instances rather seamlessly.

Outline of our contributions #

With the amazing work from the original contributors, much of our work was already done for us.

After some light refactoring, we were able to reduce the number of variables and constraints in the recursive verification circuit, and to streamline the API for dealing with the “program counter” (also represented as $\phi$ in the literature).

We also resolved some correctness issues in the recursive SNARK verification, which is primarily used in our test suite to validate the folding logic rather than to compute a full compressed SNARK.

While working on the batched pre-processing SNARK implementation, we discovered a bug in the original protocol that would have allowed a malicious prover to craft a valid proof for arbitrary public inputs. The vulnerability is described in this note and was disclosed to Setty, who promptly fixed it in Nova PR 274. Unfortunately, this fix would not have scaled to the batched setting and would have resulted in a large efficiency loss. We therefore opted to implement a more efficient counter-measure (called Bounded Witness Sumcheck in our note) that applies to both the direct and batched settings, ensuring uniformity between both variants.

Finally, the biggest chunk of our contribution was implementing a final SNARK that makes sense for folded SuperNova proofs.

Compressing SuperNova proofs #

The current implementation of Nova uses a modified version of the Spartan SNARK protocol as a final compressing SNARK for relaxed R1CS instances. In the move to SuperNova, we implemented a batched version of Spartan that can prove the satisfiability of $n \geq 1$ relaxed R1CS instances in a single, efficiently verifiable SNARK proof.

A naive solution to compressing SuperNova proofs would be for the prover to provide a SNARK proof for each individual circuit in the SuperNova NIVC instance. In the case of SuperNova implemented on a cycle of curves, there will be an additional SNARK proof for the secondary circuit as well. In total this implies that the prover and verifier costs would both scale linearly in the number of circuits.

As currently implemented in the compressed SNARK feature pull request, we provide an alternative construction wherein the multiple Spartan proofs can be batched and proved/verified together. This reduces the cost of the verifier to the complexity of verifying a single Spartan proof (or two, in the case of a cycle of curves).

We will not go into the specifics of the Spartan or the Sumcheck protocols. For that we refer the reader to the fantastic resources linked below. Instead we aim to give a rough idea of the implementation—focusing on some of the subtleties related to batching proofs for circuits of different sizes.

For full details, see SuperNova CompressedSNARK description.

Batching Sumchecks and Spartan #

Given two multivariate polynomials $P_1(x_1, \ldots, x_k)$ and $P_2(x_1, \ldots, x_k)$ (we restrict to the case of $n = 2$ Sumcheck instances, but the arguments readily generalize) together with two claims of the form $$ H_i = \sum_{\underline x \in \{0, 1\}^k} P_i(\underline x), $$ there is a well-known technique to batch the two Sumcheck proofs into one. First, the verifier provides one extra field element $r$ as a challenge to the prover (or, in the non-interactive model, the prover queries the random oracle for $r$), and the prover then proceeds with a single Sumcheck for a claim of the form $$ H_1 + r H_2 = \sum_{\underline x \in \{0, 1\}^k} P_1(\underline x) + r P_2(\underline x). $$ Note that here we are using the notation $\underline x$ as shorthand for the list of inputs $\underline x = (x_1, x_2, \ldots, x_k)$.
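The batched claim holds by linearity of the sum whenever the individual claims do. A minimal numeric sketch of this (our own toy example; `P1`, `P2`, and the small prime are illustrative, not taken from the crate):

```python
# Numeric sketch of Sumcheck batching: summing P_1 + r*P_2 over the boolean
# hypercube {0,1}^k yields H_1 + r*H_2. Polynomials are evaluated pointwise
# over a toy prime field.

from itertools import product

p = 2**31 - 1  # toy field modulus
k = 3

def P1(x):
    return (x[0] + 2 * x[1] * x[2]) % p

def P2(x):
    return (5 * x[0] * x[1] + x[2] + 7) % p

# The two individual Sumcheck claims H_1 and H_2.
H1 = sum(P1(x) for x in product((0, 1), repeat=k)) % p
H2 = sum(P2(x) for x in product((0, 1), repeat=k)) % p

r = 987654321  # verifier challenge
batched = sum((P1(x) + r * P2(x)) % p for x in product((0, 1), repeat=k)) % p
assert batched == (H1 + r * H2) % p
```

The prover then runs one Sumcheck on the combined polynomial instead of two separate ones, at the cost of a single extra challenge.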

The Spartan protocol, naively understood as being built out of two Sumchecks, can similarly be batched. Consider two relaxed R1CS shapes $(A_i, B_i, C_i)$, instances $\mathbb U_i = (\overline W_i, \overline E_i, u_i, \mathsf x_i)$, and witnesses $\mathbb W_i = (W_i, E_i)$ ($i = 1, 2$). Suppose $A_i$ are square $s \times s$ matrices with $s = 2^k$. Let $Z_i = (W_i, u_i, \mathsf x_i)$ be the satisfying relaxed R1CS assignment. Denote by $\widetilde A_i$, $\widetilde W_i$, $\widetilde Z_i$, etc. the multilinear polynomial extensions of the matrices and vectors above.

Recall that the outer Sumcheck of the Spartan protocol involves an additional multilinear extension polynomial $\widetilde{\mathrm{eq}}(\underline r, \underline x)$ which is partially evaluated at a random challenge $\underline \tau = (\tau_1, \tau_2, \ldots, \tau_k)$ provided by the verifier (in practice turning a polynomial zero-check into a polynomial Sumcheck).

Proceeding with the above notation, the rough idea of the batched Spartan protocol is to take the two outer Sumcheck claims $$ 0 = \sum_{\underline x \in \{0, 1\}^k} \widetilde{\mathrm{eq}}(\underline \tau, \underline x) \left[\left(\sum_{\underline y \in \{0, 1\}^k} \widetilde A_i(\underline x, \underline y) \widetilde Z_i(\underline y)\right)\left(\sum_{\underline y \in \{0, 1\}^k} \widetilde B_i(\underline x, \underline y) \widetilde Z_i(\underline y)\right) - \left(u_i \cdot \sum_{\underline y \in \{0, 1\}^k} \widetilde C_i(\underline x, \underline y) \widetilde Z_i(\underline y) + \widetilde E_i(\underline x)\right)\right] $$ and batch them into a single claim of the form $$ 0 = \sum_{\underline x \in \{0, 1\}^k} \widetilde{\mathrm{eq}}(\underline \tau, \underline x) \left(G_1(\underline x) + r G_2(\underline x)\right), $$ where we take $G_i$ to be the inner expression of the summands above. This reduces the prover’s claim about the above sum to an evaluation of the $G_i$ at a random challenge $\underline r_x = (r_{1,x}, r_{2, x}, \ldots, r_{k, x})$. These claimed evaluations are of the form $$ \sum_{\underline y \in \{0, 1\}^k} \widetilde A_i(\underline r_x, \underline y) \widetilde Z_i(\underline y), $$ together with similar expressions for $\widetilde B_i$ and $\widetilde C_i$, and can be proven with another collection of Sumcheck proofs. The evaluations $\widetilde E_i(\underline r_x)$ will be handled at the end in a batched opening of the polynomial commitments.

In practice, the current implementation of Spartan for a single relaxed R1CS instance already batches these “inner” Sumchecks. The batched Spartan protocol simply has a larger set of Sumcheck claims to batch.

A subtlety not yet mentioned has to do with the case when the polynomials entering the Sumcheck claims have a different number of variables. This will happen in practice when batching Spartan proofs corresponding to R1CS shapes whose numbers of constraints differ by a multiplicative factor of 2 or more. Suppose, for example, that $P_1$ is a polynomial in $k_1$ variables and $P_2$ is a polynomial in $k_2$ variables. To account for this difference, we simply consider both as polynomials $P'_1$ and $P'_2$ in $n = \max\{k_1, k_2\}$ variables by introducing new free variables to the original polynomials.

With this perspective, the initial claim $H_i$ is scaled by $2^{n-k_i}$ to account for the summation over the new free variables, since each one doubles the number of occurrences of the terms in the original sum: $$ \sum_{\underline x \in \{0, 1\}^n} P'_i(\underline x) = \sum_{\underline y \in \{0, 1\}^{n-k_i}} \sum_{\underline x \in \{0, 1\}^{k_i}} P'_i(\underline y, \underline x) = \sum_{\underline y \in \{0, 1\}^{n-k_i}} \sum_{\underline x \in \{0, 1\}^{k_i}} P_i(\underline x) = 2^{n-k_i} \cdot H_i. $$
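This scaling is easy to verify on a toy example (our own sketch; the polynomial and the choice of $k = 2$, $n = 4$ are illustrative):

```python
# Check that extending a k-variable polynomial P to an n-variable P' that
# ignores the n-k new variables multiplies its hypercube sum by 2^(n-k).

from itertools import product

k, n = 2, 4

def P(x):
    # A k-variable polynomial, given by its evaluations on {0,1}^2.
    return 3 * x[0] + x[1] + 1

def P_prime(x):
    # The same polynomial viewed in n variables: the first n-k are free.
    return P(x[n - k:])

H = sum(P(x) for x in product((0, 1), repeat=k))
H_prime = sum(P_prime(x) for x in product((0, 1), repeat=n))
assert H_prime == 2 ** (n - k) * H
```

Each free variable doubles the number of hypercube points mapping to the same original evaluation, which is exactly the $2^{n-k_i}$ factor in the equation above.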

As an optimization, we can avoid unnecessary computation due to the extra free variables using the following observation. Let $S^{(j)}_i(X_j)$ denote the univariate polynomial computed by the Sumcheck prover in round $0 \leq j < n$ for the $i$-th claim. In the first $n - k_i$ rounds of the protocol, this polynomial is constant and equal to $2^{n-k_i-j-1} \cdot H_i$. At the point where $k_i$ rounds remain, we will have recovered the original claim, since $S^{(n-k_i-1)}_i(r_{n-k_i-1}) = H_i$.

This approach to Sumcheck batching avoids having to pad the polynomials in different-sized claims, thereby saving memory. Moreover, the optimization described above ensures that the prover performs the same amount of work as it would if the individual claims were proved using independent Sumcheck instances of the respective sizes.

Implementation details #

To summarize, our main contributions in this respect fall into three categories:

  1. We implement a new method SumcheckProof::prove_cubic_with_additive_term_batch as a combination of prove_quad_batch and prove_cubic_with_additive_term. The former is used to batch the inner Sumchecks of the already existing Spartan implementation and the latter is used for the outer Sumcheck.

  2. We introduce a new trait, BatchedRelaxedR1CSSNARKTrait, mirroring the already existing RelaxedR1CSSNARKTrait. The difference is that the batched version accepts a list of relaxed R1CS shapes, instances, and witnesses for the prover and verifier. We then implement this trait using the above batched Spartan protocol.

  3. We define a supernova::CompressedSNARK type that is generic over the above BatchedRelaxedR1CSSNARKTrait. When instantiated with the batched Spartan implementation, it yields a final compressing SNARK for SuperNova proofs with an efficient verifier. Included are benchmarks and tests of correctness. In particular, there are tests checking correctness for NIVC instances where the differing numbers of constraints between the circuits result in Sumchecks over polynomials with different numbers of variables.

Organization of SuperNova contributions #

A lot of the SuperNova contributions provide NIVC versions of already existing code meant for Nova IVC. In this section, we provide a table elucidating some of the structure of the SuperNova code by comparing it to the locations of the existing Nova implementations.

| Component | Nova | SuperNova |
| --- | --- | --- |
| Augmented Circuit | circuit.rs | supernova/circuit.rs |
| Direct Spartan | spartan/snark.rs | spartan/batched.rs |
| Spartan with Spark preprocessing | spartan/ppsnark.rs | spartan/batched_ppsnark.rs |
| (batched) Sumcheck primitives | spartan/Sumcheck.rs | spartan/Sumcheck.rs |

Benchmarks #

We have a few preliminary benchmarks available for our SuperNova implementation. While these benchmarks have a limited scope, they show a nice performance improvement for our approach to batched Sumchecks on the pre-processing SNARK, compared to standard Nova. Our integration with Lurk, on more sizeable circuits, should soon™ show results that are more reflective of real-world workloads.

Closing Thoughts #

In closing, the successful implementation of the Arecibo project truly exemplifies the adage that it takes a village to achieve greatness. Our heartfelt gratitude goes to all the contributors, including Srinath Setty, Abhiram Kothapalli, Sun Ming, Wyatt Benno, and others, whose collective expertise has been instrumental in this endeavor. The rapid iterations and collaborative efforts in the Arecibo repository have been key to the advancement of this implementation. It is now time for us to work on contributing this back to Nova, and on the next stages of our project. Feel free to explore Arecibo; we would be grateful for your feedback!

Resources #

Prior references on Nova, SuperNova, the Sumcheck protocol, and the LogUp sub-protocol used in the pre-processing SNARK.

[^1]: There is a nuance: arguably, our implementation of $\phi$ is the one implied by the paper, since that requires access to the whole witness. In that case, the cost of $\phi$ is actually the sum of $k$ distinct $\phi$ circuits.