"ATOMICALS VIRTUAL MACHINE"
AVM: Smart Contracts on Bitcoin by Simulating The Bitcoin Virtual Machine
Abstract. Until now, all overlay digital asset protocols on Bitcoin have operated according to fixed rules for the creation and transfer of digital assets. Completely flexible (Turing Complete) smart contracts can be created for overlay digital assets by allowing programmers to define state machine logic for creation and transfer rules. We propose a solution, called Atomicals Virtual Machine (AVM), to enable smart contracts by leveraging Bitcoin as a global database, storing smart contract code in transactions and executing it in a sandboxed run-time via overlay digital asset indexers. The original Bitcoin Script Op Code instruction set is used as the programming language since it has all the important properties necessary for efficient execution in resource-constrained environments. State hashes track the overlay transactions and provide an easy way for participants to communicate their synchronization state. The solution we present is the natural evolution of overlay digital asset protocols on Bitcoin and simultaneously serves as a testing ground for the original Bitcoin Op Codes to demonstrate their flexibility and safety.
1. Introduction
The Bitcoin Peer-to-Peer Electronic Cash System [1] has come to be used primarily as a store of value, so-called “digital gold”, with medium of exchange being a somewhat distant secondary use case. Many of the original Op Codes which are required for advanced scripting were disabled by Satoshi Nakamoto prior to his departure. The explanation offered was security, in particular the avoidance of potential denial of service attacks. The removed Op Codes are mainly the arithmetic and binary manipulation operations which developers rely upon even in the most basic programming environments. Without these crucial Op Codes, Bitcoin application developers and therefore end users are limited to narrow categories of usage. Particularly important is OP_CAT (data concatenation) which ultimately can be used to create custom spend and carry-forward constraints called covenants. As a result of these past decisions, Bitcoin can only be digital gold and not also the powerful smart contract system that Satoshi Nakamoto envisioned. In other words, Bitcoin is necessarily “digital gold” because it is not possible to create smart contracts, owing to the fact that crucial Op Codes were disabled in the hopes of protecting the nascent electronic cash system.
“The nature of Bitcoin is such that once version 0.1 was released, the core design was set in stone for the rest of its lifetime. Because of that, I wanted to design it to support every possible transaction type I could think of. The problem was, each thing required special support code and data fields whether it was used or not, and only covered one special case at a time. It would have been an explosion of special cases. The solution was script, which generalizes the problem so transacting parties can describe their transaction as a predicate that the node network evaluates.” – Satoshi Nakamoto [2]
Various protection mechanisms have been added to Bitcoin since the Op Codes were first disabled. One such limit is MAX_SIGOPS which limits the maximum number of Signature Operations allowed in any given transaction. We also have the benefit of hindsight to see how the risks of the original Op Codes have played out in Bitcoin forks such as Bitcoin Cash and Bitcoin Satoshi Vision, both of which reactivated the vast majority of the original Op Codes many years ago. To date there have been no security problems, no denial of service attacks, and practically no controversy with respect to the functionality of the reactivated Op Codes. On the contrary, it has resulted in a significant expansion of development possibilities on those very Bitcoin forks.
Even with the current limitations of smart contracts on Bitcoin, a number of overlay protocols have emerged that allow the creation and transfer of digital assets on Bitcoin itself using overlay protocol indexers. The first major mainstream overlay protocol was the non-fungible token standard called Ordinals, which was quickly followed by the release of a fungible token standard called BRC20. Shortly afterwards, a number of other digital asset protocols emerged, such as Atomicals Digital Objects along with ARC20, a fungible token standard which leverages the satoshi units themselves as the unit of account. Various other overlay protocols were created more recently, such as Bitcoin Stamps and the Runes fungible token protocol in April 2024. The current generation of these overlay protocols works largely in the same way, in that data in Bitcoin transactions is used to create and manage digital assets. Overlay protocol indexers handle the tracking and lifecycle of digital assets by reading data directly from the specially marked Bitcoin transactions. What is missing in all overlay protocols is the ability for programmers to customize the behaviour of digital assets: until now, there has not been a way to create smart contracts for overlay protocols.
We present a method to create and execute smart contracts for various overlay digital assets by simulating the Bitcoin Virtual Machine and its Script Interpreter. The Bitcoin blockchain acts as the timestamp and data provider for smart contract programs stored on-chain; however, the execution of the programs is performed by overlay protocol indexers using a sandboxed run-time. Overlay protocol indexer nodes are operated by application developers, service providers and users alike, which creates a kind of emergent consensus. The concept and technique are generally applicable to any of the overlay protocols, with the appropriate modifications to their respective overlay protocol indexers.
We demonstrate the power and elegance of the original Bitcoin design. This new paradigm can serve as a testing ground for various Op Codes, albeit for digital asset overlay protocols, in the hopes that Bitcoin will eventually reactivate all of the original Op Codes, realizing the ultimate potential of Satoshi Nakamoto’s creation.
2. Bitcoin as Global Database
The Bitcoin network is fundamentally a distributed timestamp server designed for the purpose of solving the double spend problem. More generally, the design of the system lends itself to the transmission and storage of more than mere monetary transactions. In fact, explicit features were included for the express purpose of storing data such as invoices and large files. Satoshi Nakamoto included various Op Codes such as OP_RETURN to allow arbitrary data attachments and OP_PUSHDATA4 which allows data pushes up to 4 Gigabytes in size. Even the very first Bitcoin transaction, called the Genesis Coinbase Transaction, included textual data: “The Times 03/Jan/2009 Chancellor on brink of second bailout for banks”.
Throughout the years there have been numerous attempts to discourage using Bitcoin as a data storage medium, such as restricting the maximum push data size to 520 bytes and limiting the OP_RETURN payload size to 40 bytes (later expanded to 80 bytes). Indeed, data storage was portrayed as an attack vector that would lead to runaway “blockchain bloat” and crowd out pure monetary usages. It seemed like a reasonable protection mechanism at the time.
In recent times, the Bitcoin developers introduced the Segregated Witness (SegWit) and Taproot upgrades, which effectively reintroduced the capability to store larger volumes of arbitrary data, similar to the early versions of Bitcoin. The opportunity was quickly seized by application developers to leverage Bitcoin as the immutable global data ledger. The market for digital assets on Bitcoin has grown exponentially to billions of dollars of market capitalization and has generated hundreds of millions of dollars in network fees paid to miners in a relatively short period of time.
3. Overlay Protocols
Hal Finney introduced the idea of, and predicted the emergence of, “overlaying other protocols onto Bitcoin” that would leverage Bitcoin as a global, decentralized, and consistent database for digital assets. The basic idea is to signal the creation of overlay assets and associate data with a particular transaction history. It adds another dimension to Bitcoin as a monetary system, in that the transaction outputs themselves can represent any other type of digital asset such as tokens, credits, digital media, and even proxies for claims over physical assets.
“In discussion on the BitDNS thread I came up with an idea for overlaying other protocols onto Bitcoin. From one point of view, Bitcoin is a global, decentralized, yet consistent database. This DB is used to record transfers of coins, but it could potentially be used for more. There are many applications for a global consistent database.
Borrowing from my BitDNS description, the way this would work is we would use the mysterious and extravagant “scripting” system to add additional data to regular Bitcoin transactions. These would look like NOPs to current clients and be ignored, but overlay aware clients would look inside this NOP block and see the extra protocol-specific data, and interpret it according to the overlay protocol.
Specifically i could imagine using OP_NOP1 to signal overlay data, then OP_PUSHDATA to push the specific data, then OP_DROP to drop it from the stack, followed by the regular tx opcodes. This will have no effect on regular clients and look like a regular transaction (can be a dummy tx, 0.01 to yourself) but overlay aware code sees a protocol transaction.
As an example, Bitcoin could be used as an inexpensive timestamp service, allowing you to prove that a certain document existed on or before a certain date. All you need to do is create a dummy transaction to yourself, and hack the client to do an OP_PUSHDATA of the hash of the document, then OP_DROP it. The hash will be around for all time in the block chain and stand as proof that the document existed at that date.” – Hal Finney [3]
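The envelope Finney describes (OP_NOP1, a data push, OP_DROP, then the regular opcodes) can be sketched in Python as follows. This is only an illustration of the byte layout in his post, not the encoding used by any current overlay protocol. The opcode byte values are those of Bitcoin Script; the helper names `push_data`, `wrap_overlay` and `read_overlay` are hypothetical, and the sketch only handles pushes of up to 255 bytes.

```python
import hashlib

# Opcode byte values from Bitcoin Script.
OP_PUSHDATA1 = 0x4C
OP_DROP = 0x75
OP_NOP1 = 0xB0


def push_data(data: bytes) -> bytes:
    """Encode a data push for payloads of 1 to 255 bytes (sketch limit)."""
    if not 1 <= len(data) <= 255:
        raise ValueError("this sketch only handles pushes of 1..255 bytes")
    if len(data) <= 75:
        return bytes([len(data)]) + data  # direct push: length byte, then data
    return bytes([OP_PUSHDATA1, len(data)]) + data


def wrap_overlay(payload: bytes, regular_script: bytes) -> bytes:
    """Prefix a regular script with OP_NOP1 <payload> OP_DROP."""
    return bytes([OP_NOP1]) + push_data(payload) + bytes([OP_DROP]) + regular_script


def read_overlay(script: bytes):
    """Return (payload, regular_script), or None if no overlay prefix is present."""
    if len(script) < 3 or script[0] != OP_NOP1:
        return None
    pos = 1
    length = script[pos]
    pos += 1
    if length == OP_PUSHDATA1:
        if pos >= len(script):
            return None
        length = script[pos]
        pos += 1
    elif length == 0 or length > 75:
        return None
    end = pos + length
    if end >= len(script) or script[end] != OP_DROP:
        return None
    return script[pos:end], script[end + 1:]


# Timestamping example from the quote: commit to the hash of a document.
document_hash = hashlib.sha256(b"my document").digest()
script = wrap_overlay(document_hash, bytes([0xAC]))  # 0xAC stands in for the regular opcodes
```

An overlay-aware client would call `read_overlay` on each script and interpret the payload, while a client unaware of the overlay would simply execute the no-op, push, and drop before the regular opcodes.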
The first overlay protocol that has achieved significant adoption is Ordinals Theory. It is described in the Ordinals Handbook: “Individual satoshis can be inscribed with arbitrary content, creating unique Bitcoin-native digital artifacts that can be held in Bitcoin wallets and transferred using Bitcoin transactions. Inscriptions are as durable, immutable, secure, and decentralized as Bitcoin itself.”
Shortly after the growth of Ordinals, developers recognized the need for a fungible token standard on Bitcoin, and the BRC20 standard was created to address that need. BRC20 leveraged Ordinals Theory to create an overlay account model that would be associated with wallet addresses and be able to send and receive token units, essentially creating a layered overlay protocol on top of Ordinals Theory, which is itself an overlay protocol on top of Bitcoin.
Later in 2023, another protocol called Atomicals Protocol Digital Objects was created to address the growing market need for token standards and indexing technology. It is described in the Atomicals Guidebook: “The Atomicals Protocol is a simple, yet flexible protocol for minting, transferring and updating digital objects (traditionally called non-fungible tokens) for unspent transaction output (UTXO) blockchains such as Bitcoin. An “Atomical” is a way to organize the creation, transfer and updates of digital objects – it is essentially a chain of digital ownership defined according to a few simple rules.” The Atomicals Protocol includes a fungible token standard called ARC20 which has the unique property that each unit of a token is backed by at least 1 satoshi unit and operates according to the same rules as sending and receiving Bitcoin itself using the unspent transaction output (UTXO) architecture.
By early 2024 another overlay protocol called Runes was released. It was implemented directly in the Ordinals indexer, finally completing Ordinals Theory with a fungible token standard.
4. State Machines
All of the overlay protocols on Bitcoin thus far have been based on fixed or predefined state machines. All overlay protocols share in common essentially two state machines: one for signalling the creation of digital assets and the other for governing the transfer of those digital assets. The state machine rules are essentially unchangeable and hard-coded in their respective overlay protocol indexers — application developers have no way to customize digital asset behavior.
What is needed is a way for application developers to define the creation and transfer lifecycles of their digital assets. We present a model of dynamic state machine programming that allows application developers to fully customize and define arbitrary rules for their digital assets.
The basic idea is to allow developers to put their smart contract code in the data segments of transactions, making it available for all parties to execute. By having the code stored on the blockchain, it is easy for different parties to synchronize state by executing the logic in the same way. The smart contract programming language should have the following key properties at a minimum:
- Predictable run-time
- Arbitrary flexibility – Turing Completeness [4]
- Efficient execution on resource constrained systems
From the above requirements, we see that Bitcoin Script lends itself very well to being such an instruction set to define creation and transfer rules of digital assets. Virtually any type of rule should be made possible, while also limiting the execution time to prevent denial of service attacks, which essentially amounts to avoiding infinite loops. Bitcoin Script is now generally acknowledged to be Turing Complete, being a Two Stack Pushdown Automaton (2-Stack PDA). It has the benefit of not having any looping instruction, yet it can achieve the same effect as loops using the technique of loop unrolling. Therefore, it has the special property that the run-time of a program is linearly proportional to the size of the program itself.
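Loop unrolling and the resulting linear run-time bound can be illustrated with a toy stack machine. The sketch below is not the AVM interpreter; it supports only two Op Codes (OP_DUP and OP_ADD) and the helper names `unroll` and `run` are hypothetical. It shows that a program without a looping instruction executes exactly one step per instruction.

```python
def unroll(body, times):
    """Repeat a loop body a fixed number of times (loop unrolling)."""
    return list(body) * times


def run(program, stack):
    """Execute a straight-line program; return (final stack, steps taken)."""
    stack = list(stack)
    steps = 0
    for op in program:
        steps += 1  # one step per instruction, no jumps backwards
        if op == "OP_DUP":
            stack.append(stack[-1])
        elif op == "OP_ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        else:
            raise ValueError(f"unsupported op in this sketch: {op}")
    return stack, steps


# "Double the top item five times" written without a loop instruction.
program = unroll(["OP_DUP", "OP_ADD"], 5)
result, steps = run(program, [1])
```

Because the number of steps always equals `len(program)`, an indexer can bound execution time simply by bounding program size.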
Smart contract program code is stored in Bitcoin transactions and overlay protocol indexers execute the code for the various method calls and state transitions. All interested parties execute the same logic and arrive at the identical state transitions, which forms an emergent consensus.
Synchronization of state can be achieved using a state hash which communicates the internal state of an overlay protocol indexer to each other and to external observers. By publishing state hashes for each block, it is easy for various parties to assess whether they are following the same rules and whether they have arrived at the same state with respect to each other.
There is no need for a complex state commitment scheme because all data is stored on chain and timestamped in chronological order, allowing anyone to arrive at the exact same state. This scheme reflects the Bitcoin ethos “Don’t trust, verify” in that each user can validate the entire Bitcoin blockchain and therefore ascertain the states of all smart contract programs.
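A per-block state hash of the kind described above might be computed as in the following sketch. The exact construction is an assumption made for illustration: it chains the previous state hash with the block height and a canonical JSON encoding of the indexer state, using SHA-256. The function name `state_hash` and the example state layout are hypothetical.

```python
import hashlib
import json


def state_hash(prev_hash: bytes, height: int, state: dict) -> bytes:
    """Chain the previous hash with this block's height and the indexer state."""
    # Canonical encoding (sorted keys, fixed separators) so that two indexers
    # holding the same state always serialize it to the same bytes.
    encoded = json.dumps(state, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(prev_hash + height.to_bytes(8, "big") + encoded).digest()


GENESIS = bytes(32)  # assumed all-zero starting hash

# Two indexers that built the same state in a different order agree.
indexer_a = state_hash(GENESIS, 840000, {"contract1": {"supply": 21}, "contract2": {}})
indexer_b = state_hash(GENESIS, 840000, {"contract2": {}, "contract1": {"supply": 21}})

# An indexer following different rules ends up with a different hash.
diverged = state_hash(GENESIS, 840000, {"contract1": {"supply": 22}, "contract2": {}})
```

Comparing a single 32-byte value per block is then enough for two parties to tell whether they have arrived at the same state.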
Any overlay protocol can adopt this dynamic state machine programming technique and allow digital assets to be virtualized into and out of smart contracts. In essence, it means we can create the concept of deposits and withdrawals as a matter of convention, since after all everything else is already a matter of convention that depends on the complete historical record of transactions on the blockchain being used to build up the indexer states. We discuss two kinds of virtualization below: account-based and UTXO-based.
With BRC20, the token balances are account abstractions. Atomicals ARC20 tokens are abstractions that maintain an affinity to the underlying satoshi units themselves. Both are more accurately termed “virtual digital assets” because their existence is an abstraction on top of another digital asset (Bitcoin).
To create a virtual account-based abstraction, we can define a state machine which accepts any type of token to be deposited into the contract and later withdrawn, similar to how the Ethereum blockchain’s Solidity programming language permits methods to be annotated as payable to indicate that Ether may be paid to that method and later withdrawn according to the rules of the smart contract program. The tracking and management of these digital assets could be done with special Op Codes such as OP_FT_WITHDRAW and OP_NFT_WITHDRAW and a mechanism for payable methods to accept tokens inside the smart contract state.
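The deposit and withdrawal bookkeeping an indexer would need for such an account-based abstraction can be sketched as follows. The class `PayableContract` and its rule (a depositor may withdraw at most what they deposited) are hypothetical and chosen only for illustration; in the proposed design the withdrawal rules would be defined by the smart contract program itself.

```python
from collections import defaultdict


class PayableContract:
    """Tracks token units held inside a contract, per depositor and token."""

    def __init__(self):
        self._held = defaultdict(int)  # (depositor, token_id) -> units held

    def deposit(self, depositor: str, token_id: str, amount: int) -> None:
        """Payable entry point: credit tokens sent to the contract."""
        if amount <= 0:
            raise ValueError("deposit amount must be positive")
        self._held[(depositor, token_id)] += amount

    def withdraw(self, depositor: str, token_id: str, amount: int) -> None:
        """Release tokens back out of the contract (example rule only)."""
        if amount <= 0 or self.balance(depositor, token_id) < amount:
            raise ValueError("invalid or insufficient withdrawal")
        self._held[(depositor, token_id)] -= amount

    def balance(self, depositor: str, token_id: str) -> int:
        return self._held.get((depositor, token_id), 0)


contract = PayableContract()
contract.deposit("alice", "example-token", 100)
contract.withdraw("alice", "example-token", 40)
```

Since every indexer replays the same deposits and withdrawals from the blockchain in the same order, they all arrive at the same balances.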
Building upon the account-based abstraction, it is possible to define protected smart contract memory that can only be written to by the contract itself. Recall that Atomicals Digital Objects already provides general purpose key-value storage for non-fungible tokens. In the same way, we can define a memory space that can only be written to using special key-value storage access Op Codes such as OP_KV_GET, OP_KV_EXISTS, OP_KV_DELETE, and OP_KV_PUT for retrieval, existence, deletion and write access respectively. This storage technique elevates smart contracts on Bitcoin to a similar level of functionality as the Ethereum blockchain.
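How an indexer might back these key-value Op Codes with per-contract memory is sketched below. The stack conventions are assumptions made for this example (OP_KV_PUT pops a value and then a key, OP_KV_EXISTS pushes a one-byte true value or an empty false value), and the `Indexer` class is hypothetical. Protection comes from the indexer handing each program only the memory of the contract being executed.

```python
def run_kv_program(program, memory):
    """Run pushes (bytes) and KV Op Codes (str) against one contract's memory."""
    stack = []
    for item in program:
        if isinstance(item, bytes):
            stack.append(item)  # data push
        elif item == "OP_KV_PUT":
            value, key = stack.pop(), stack.pop()
            memory[key] = value
        elif item == "OP_KV_GET":
            stack.append(memory[stack.pop()])
        elif item == "OP_KV_EXISTS":
            stack.append(b"\x01" if stack.pop() in memory else b"")
        elif item == "OP_KV_DELETE":
            memory.pop(stack.pop(), None)
        else:
            raise ValueError(f"unsupported op in this sketch: {item}")
    return stack


class Indexer:
    """Keeps one isolated memory space per contract."""

    def __init__(self):
        self._memories = {}  # contract_id -> {key: value}

    def call(self, contract_id, program):
        # A program only ever sees the memory of the contract it belongs to.
        memory = self._memories.setdefault(contract_id, {})
        return run_kv_program(program, memory)


indexer = Indexer()
indexer.call("contract-a", [b"owner", b"alice", "OP_KV_PUT"])
```

A later call to `contract-a` can read the stored value back, while a program running as a different contract has no way to address it.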
Another approach, from an entirely different angle, is what we call the virtual UTXO (vUTXO) architecture. The virtual UTXO architecture respects the chain of transaction output spends and binds each output to a specific locking script, which can only be unlocked by providing a valid unlocking script. This effectively creates a virtual overlay UTXO set that is stored and managed in the overlay protocol indexers, freeing virtual digital asset developers from the limitations of the restricted instruction set provided by Bitcoin miners. The main caveat is that there must be an expiry time, a refresh policy, and an eviction strategy for handling stale vUTXOs to prevent the overlay protocol indexer state from becoming too large.
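A virtual UTXO set with expiry, refresh and eviction could be organized along the lines of the sketch below. Several simplifications are assumed: a SHA-256 hash lock stands in for a full locking script, age is measured in blocks since the last refresh, and the names `VUtxo`, `VirtualUtxoSet` and `max_age_blocks` are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class VUtxo:
    lock_hash: bytes        # stand-in for a locking script
    refreshed_height: int   # block height of creation or last refresh


class VirtualUtxoSet:
    def __init__(self, max_age_blocks: int):
        self.max_age_blocks = max_age_blocks
        self._entries = {}  # outpoint -> VUtxo

    def add(self, outpoint: str, lock_hash: bytes, height: int) -> None:
        self._entries[outpoint] = VUtxo(lock_hash, height)

    def refresh(self, outpoint: str, height: int) -> None:
        """Refresh policy: reset the age of a vUTXO that is still in use."""
        self._entries[outpoint].refreshed_height = height

    def spend(self, outpoint: str, preimage: bytes) -> bool:
        """Spend succeeds only if the unlocking data satisfies the lock."""
        entry = self._entries.get(outpoint)
        if entry is None or hashlib.sha256(preimage).digest() != entry.lock_hash:
            return False
        del self._entries[outpoint]
        return True

    def evict_stale(self, current_height: int) -> list:
        """Eviction strategy: drop entries older than max_age_blocks."""
        stale = [op for op, e in self._entries.items()
                 if current_height - e.refreshed_height > self.max_age_blocks]
        for op in stale:
            del self._entries[op]
        return stale


vutxos = VirtualUtxoSet(max_age_blocks=10)
lock = hashlib.sha256(b"secret").digest()
vutxos.add("tx1:0", lock, height=100)
vutxos.add("tx2:0", lock, height=100)
```

Running `evict_stale` once per block keeps the indexer state bounded, at the cost that holders must refresh or spend their vUTXOs before they expire.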
To ensure the consistent execution of smart contracts, the AVM interpreter runs in a sandboxed environment which is called by the host indexer. In this way different host indexer programming languages and environments can more easily achieve consensus compatibility by having a canonical way to execute scripts.
The sandboxed interpreter is a stripped down version of the Bitcoin Script Interpreter with some notable differences such as accepting the execution locking script (scriptPubKey) and the unlocking script (scriptSig) directly along with the various other data such as token state and protected memory snapshots.
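The calling convention of such a sandbox can be sketched as a pure function that takes the unlocking and locking scripts and returns only a success flag. This is a toy under stated assumptions, not the AVM interpreter: it supports four Op Codes, represents scripts as Python lists, treats only the empty byte string as false, and uses an assumed `MAX_OPS` limit. Token state and protected memory snapshots, which the real interpreter would also receive, are omitted.

```python
import hashlib

MAX_OPS = 1000  # assumed execution limit for this sketch


def run_sandboxed(unlocking_script, locking_script) -> bool:
    """Evaluate unlocking then locking script on one stack; no I/O, no globals."""
    stack = []
    ops = 0
    for script in (unlocking_script, locking_script):
        for op in script:
            ops += 1
            if ops > MAX_OPS:
                return False
            try:
                if isinstance(op, bytes):
                    stack.append(op)  # data push
                elif op == "OP_DUP":
                    stack.append(stack[-1])
                elif op == "OP_CAT":
                    b, a = stack.pop(), stack.pop()
                    stack.append(a + b)
                elif op == "OP_SHA256":
                    stack.append(hashlib.sha256(stack.pop()).digest())
                elif op == "OP_EQUAL":
                    b, a = stack.pop(), stack.pop()
                    stack.append(b"\x01" if a == b else b"")
                else:
                    return False  # unknown op: fail closed
            except IndexError:
                return False  # stack underflow
    # Simplified truthiness: succeed if the top item is non-empty.
    return bool(stack) and stack[-1] != b""


# Locking script: the two pushed items must concatenate to a known preimage.
expected = hashlib.sha256(b"helloworld").digest()
locking = ["OP_CAT", "OP_SHA256", expected, "OP_EQUAL"]
ok = run_sandboxed([b"hello", b"world"], locking)
```

Because the function depends only on its arguments, host indexers written in different languages can call the same interpreter and obtain the same result for the same inputs.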
8. Conclusion
We have proposed the Atomicals Virtual Machine (AVM), a smart contract system for overlay digital assets on Bitcoin that works by simulating the Bitcoin Virtual Machine. Up until now, overlay digital assets on Bitcoin were governed entirely by predefined state transition rules that allow nothing more than the creation and transfer of those digital assets. To solve this, we proposed a general technique to allow smart contracts by leveraging Bitcoin as a global database and storing smart contracts in transactions for execution in a sandboxed run-time via the overlay digital asset indexers. The original Bitcoin Script instruction set is sufficiently capable and powerful because it is a 2-Stack PDA and has been proven to be Turing Complete. By using state hashes, it is easy for participants to validate that indexer states are synchronized correctly. The system is flexible, is a natural evolution of existing overlay protocols, and demonstrates the tremendous capability of the original Bitcoin Script and Virtual Machine.
References
[1] Satoshi Nakamoto, “Bitcoin: A Peer-to-Peer Electronic Cash System”, https://bitcoin.org/bitcoin.pdf, 2008.
[2] Satoshi Nakamoto, “The nature of Bitcoin is such that once version 0.1 was released, the core design was set in stone for the rest of its lifetime…”, https://satoshi.nakamotoinstitute.org/posts/bitcointalk/126/, 2010.
[3] Hal Finney, “In discussion on the BitDNS thread I came up with an idea for overlaying other protocols onto Bitcoin…”, https://bitcointalk.org/index.php?topic=2077.msg26888, 2010.
[4] Wikipedia contributors, “Turing completeness”, Wikipedia, The Free Encyclopedia, https://en.wikipedia.org/wiki/Turing_completeness, 2024.
[5] GitHub: Atomicals AVM Whitepaper.