Ethereum co-founder Vitalik Buterin and the head of AI at the Ethereum Foundation, Davide Crapis, have outlined a design based on zero-knowledge (ZK) proofs that would let people pay for artificial intelligence and other digital services without revealing who they are or how they use those services, aiming to fix what they describe as a core flaw in current API business models.
The pair argue that today’s approach to charging for high-frequency services, such as large language model (LLM) inference, cloud APIs, or blockchain infrastructure, forces providers into a trade-off between privacy, security, and efficiency.
Two flawed ways to charge for AI and APIs
According to their description, providers today typically rely on one of two models.
The first is a conventional Web2 identity system, in which users log in with email, credit card details, or other personal identifiers. This makes billing and abuse prevention straightforward, but it also ties every request to a real-world identity, creating a detailed usage trail that can fuel profiling and lead to significant privacy risks, especially when prompts contain sensitive information.
The second approach is to charge on-chain for every request. This can reduce reliance on traditional identity, but it introduces new problems: transactions are slow and expensive, and the resulting on-chain history creates a transparent map of a user’s activity that is hard to disguise.
Buterin and Crapis say what is needed instead is a system in which someone can send funds once, then make thousands of paid requests anonymously. The service operator would still be protected against spam and guaranteed payment but would not be able to link any single request to a specific person or tie multiple requests to each other.
How the zero-knowledge credit system works
The proposed “ZK API usage credit” protocol ties anonymity to a financial stake through a series of cryptographic checks.
Under the proposed method, a user first creates a secret key and derives from it an identity commitment that is recorded, along with their initial deposit, in a smart contract. The contract stores these commitments in an on-chain Merkle tree, registering participants without revealing who they are.
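For illustration, a rough sketch of that registration step might look like the following Python, where the hash function, the toy Merkle construction, and every name are stand-ins rather than the primitives the proposal would actually use:

```python
import hashlib
import os

def h(*parts: bytes) -> bytes:
    """Toy hash; a real circuit would favor a ZK-friendly hash function."""
    return hashlib.sha256(b"".join(parts)).digest()

# User side: generate a secret key and derive a public identity commitment from it.
secret_key = os.urandom(32)
identity_commitment = h(b"identity", secret_key)

# Contract side (simplified): commitments become leaves of a Merkle tree, so a user
# can later prove membership without revealing which leaf belongs to them.
def merkle_root(leaves: list[bytes]) -> bytes:
    level = leaves[:]
    while len(level) > 1:
        if len(level) % 2:                      # duplicate the last node on odd levels
            level.append(level[-1])
        level = [h(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

registered = [identity_commitment]              # plus every other user's commitment
deposits = {identity_commitment: 100}           # the deposit recorded alongside the leaf
print(merkle_root(registered).hex())
```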
Each request consumes a numbered “ticket,” and before sending it, the user generates a zero-knowledge proof showing that their current ticket index is still within their allowed credit. In doing so, the proof demonstrates that the total possible spend up to that index is covered by the initial deposit plus the sum of all previously earned refunds, without revealing the size or timing of any individual request.
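Stripped of the cryptography, the statement the proof establishes is a simple inequality over hidden values. A hypothetical check of that relation, assuming a fixed per-request fee cap and one-based ticket numbering (both simplifications, and all names illustrative):

```python
def within_credit(ticket_index: int, max_fee: int,
                  deposit: int, refunds_total: int) -> bool:
    """The relation the zero-knowledge proof attests to without revealing any input:
    even if every ticket up to this index cost the maximum fee, the deposit plus
    accumulated refunds would still cover the total."""
    return ticket_index * max_fee <= deposit + refunds_total

# Example: a 100-unit deposit, a 2-unit fee cap, and 10 units refunded so far
# cover at most 55 tickets.
assert within_credit(ticket_index=55, max_fee=2, deposit=100, refunds_total=10)
assert not within_credit(ticket_index=56, max_fee=2, deposit=100, refunds_total=10)
```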
Because different requests can have different costs, the protocol assumes that each call starts by reserving a maximum fee, which is deducted from the user’s balance. After the server processes the request and learns the actual cost, it issues a signed refund ticket for any unused amount. The user keeps these tickets off-chain and includes their total value in later proofs, allowing them to reclaim unused credit and extend their usable capacity without exposing a detailed spending history.
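A simplified sketch of that refund flow, with an HMAC standing in for the provider’s real signature scheme and every name invented for illustration (in the actual protocol, refund tickets would be verified inside the zero-knowledge proof rather than trusted directly):

```python
import hashlib
import hmac

SERVER_KEY = b"provider-signing-key"        # stand-in for the provider's signing key

def issue_refund(ticket_index: int, max_fee: int, actual_cost: int) -> dict:
    """Server side: after processing a request, sign a refund for the unused reservation."""
    refund = max_fee - actual_cost
    msg = f"{ticket_index}:{refund}".encode()
    sig = hmac.new(SERVER_KEY, msg, hashlib.sha256).hexdigest()
    return {"index": ticket_index, "refund": refund, "sig": sig}

# User side: keep refund tickets off-chain and feed their running total into later proofs.
wallet = [issue_refund(7, max_fee=2, actual_cost=1),
          issue_refund(8, max_fee=2, actual_cost=2)]
refunds_total = sum(t["refund"] for t in wallet)    # 1 unit of credit reclaimed so far
print(refunds_total)
```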
Rate limits, spam controls, and dual staking
Each ticket index is linked to a nullifier derived from the user’s secret key. If a user tries to reuse the same index with a different message, the server can combine the two resulting signals, recover the secret key, and slash the user’s stake. This design limits throughput to what the deposit can safely cover and makes any attempt at double-spending both detectable and expensive.
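One standard way to get that behavior is to construct each signal as a point on a secret line whose intercept is the user’s key: two signals at the same index are two points on the same line, which is enough to solve for the key. A toy version over a small prime field, with the field size, hashes, and derivations chosen purely for illustration:

```python
import hashlib

P = 2**61 - 1                                    # toy prime field, far smaller than a real one

def h_int(*parts: bytes) -> int:
    return int.from_bytes(hashlib.sha256(b"".join(parts)).digest(), "big") % P

def signal(secret_key: int, ticket_index: int, message: bytes) -> tuple[int, int]:
    """Each signal is a point (x, y) on the line y = a1*x + a0, where a0 is the
    secret key and the slope a1 is bound to this particular ticket index."""
    a1 = h_int(secret_key.to_bytes(32, "big"), ticket_index.to_bytes(8, "big"))
    x = h_int(message)
    return x, (a1 * x + secret_key) % P

def recover_secret(p1: tuple[int, int], p2: tuple[int, int]) -> int:
    """Two distinct signals for the same index lie on the same line, so the
    intercept (the secret key) can be solved for directly."""
    (x1, y1), (x2, y2) = p1, p2
    a1 = (y1 - y2) * pow((x1 - x2) % P, -1, P) % P
    return (y1 - a1 * x1) % P

sk = h_int(b"user secret")
s1 = signal(sk, ticket_index=42, message=b"first request")
s2 = signal(sk, ticket_index=42, message=b"second request")   # double-signaling
assert recover_secret(s1, s2) == sk                            # stake can now be slashed
```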
To handle content policy violations, the design introduces a second layer of stake alongside the main rate-limit collateral. The total amount a user locks is split into an RLN (rate-limiting nullifier) stake, governed purely by the mathematics of the protocol and claimable when double-signaling is proven, and a policy stake, governed by the provider’s rules.
When a request passes the cryptographic checks but clearly breaches usage policy, such as asking for weapon-building instructions or for help bypassing security controls, the server can call a function on the contract that burns the policy portion tied to that request’s nullifier. The slashed amount goes to a burn address rather than to the provider, removing any incentive to fabricate violations for profit.
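A condensed sketch of how the two stake buckets might be tracked, written in Python for readability even though this logic would live in the smart contract, with every structure and name here an assumption rather than the authors’ design:

```python
from dataclasses import dataclass

BURN_ADDRESS = "0x000000000000000000000000000000000000dEaD"   # conventional burn address

@dataclass
class StakeAccount:
    rln_stake: int      # claimable only when double-signaling is proven
    policy_stake: int   # burnable when a request provably breaches usage policy

class CreditContract:
    def __init__(self) -> None:
        self.accounts: dict[bytes, StakeAccount] = {}   # keyed by nullifier/commitment
        self.burn_log: list[dict] = []                  # public record of policy burns

    def slash_rln(self, nullifier: bytes) -> int:
        """Pay out the RLN stake once double-signaling has been demonstrated
        (i.e., the recovered secret key matches the registered commitment)."""
        acct = self.accounts[nullifier]
        amount, acct.rln_stake = acct.rln_stake, 0
        return amount

    def burn_policy(self, nullifier: bytes, evidence: str) -> None:
        """Send the policy stake to a burn address rather than to the provider,
        and leave a public record of the burn."""
        acct = self.accounts[nullifier]
        amount, acct.policy_stake = acct.policy_stake, 0
        self.burn_log.append({"nullifier": nullifier.hex(), "to": BURN_ADDRESS,
                              "amount": amount, "evidence": evidence})

# Example: a 100-unit lock split 60/40 between the two buckets.
contract = CreditContract()
contract.accounts[b"\x01" * 32] = StakeAccount(rln_stake=60, policy_stake=40)
contract.burn_policy(b"\x01" * 32, evidence="request matched prohibited-content policy")
```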
Each such burn is recorded on-chain with the relevant nullifier and any accompanying evidence. Although the user’s identity remains concealed, outside observers can review how often a provider resorts to policy slashing and on what stated grounds, adding a measure of public accountability to the system.