Taproot is coming
First major Bitcoin update since 2017
More privacy, lower storage requirements, and better handling of complex scripts: that is what the Taproot update is supposed to bring to Bitcoin. A look under the hood.
Bitcoin has not received a major protocol upgrade since 2017. That is expected to change in mid-November, when the update called Taproot is activated. Preparations have been underway for quite some time: back in January 2018, Bitcoin Core developer Gregory Maxwell published a first draft of Taproot on the Bitcoin mailing list. Even then, the goal of the draft was more anonymity and efficiency.
The foundations for Taproot were laid in the turbulent Bitcoin year of 2017 with the introduction of Segregated Witness (SegWit). Disputes over potential block size changes and the introduction of SegWit eventually led to a split of Bitcoin.
Backward-compatible update
Like SegWit, Taproot is a soft fork, so the changes to the protocol are backward compatible. This is an advantage because, in a decentralized network, it can take some time until participants switch to new software and all transactions use a new format. Currently, around 80 percent of transactions in the Bitcoin network use the SegWit format.
The Taproot upgrade introduces the new Pay-to-Taproot (P2TR) transaction type, which combines the concepts of Schnorr signatures and “Merklized Alternative Script Trees” (MAST). It also increases transaction privacy and requires significantly less storage space for complex output conditions. In total, the upgrade is composed of three Bitcoin Improvement Proposals (BIPs):
- Schnorr signatures (BIP340)
- SegWit output conditions (BIP341)
- Validation of Taproot scripts (BIP342)
The above BIPs define a standard for Schnorr signatures and the Taproot construction. In addition, BIP341 builds on the earlier Merklized Alternative Script Trees (MAST) proposal. To understand what Taproot and Schnorr signatures improve in detail, we first take a look at the current state of Bitcoin transactions.
How Alice pays for the cappuccino
One of the most important concepts in Bitcoin is the transaction. Transactions “transfer” value in the form of satoshis (sats), the smallest Bitcoin unit, from one address to another on the Bitcoin network. For example, Alice sits in a café, pays for a cappuccino with Bitcoin, and sends the amount to Kate’s public wallet address.
A transaction usually references previous transactions as transaction inputs. The only exception is the coinbase transaction, which pays out the reward to the successful miner of each new block and thereby creates new bitcoins. Transactions are not encrypted, so anyone can look up any transaction stored in the Bitcoin blockchain and view its details.
Output conditions as script
In our café example, Alice transfers, so to speak, the ownership rights to the satoshis spent on the cappuccino to Kate. If Kate later buys something with these satoshis, she in turn passes the ownership rights on. The amount for the cappuccino is stored in an Unspent Transaction Output (UTXO), along with other information. There, Alice specifies who is allowed to spend the satoshis she has just paid for the cappuccino. This contract is recorded as a so-called output condition in the scriptPubKey field of the transaction.
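The UTXO bookkeeping described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Bitcoin's actual data format: the class names `OutPoint`, `TxOut` and `Tx`, the function `apply_tx`, the transaction IDs and all amounts are hypothetical, and the output condition is reduced to a descriptive string instead of a real script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutPoint:
    """Reference to one output of an earlier transaction."""
    txid: str
    index: int


@dataclass(frozen=True)
class TxOut:
    value_sats: int
    script_pubkey: str  # output condition, here only a descriptive placeholder


@dataclass
class Tx:
    txid: str
    inputs: list   # list of OutPoint
    outputs: list  # list of TxOut


def apply_tx(utxo_set: dict, tx: Tx) -> int:
    """Spend the referenced outputs, add the new ones, and return the fee."""
    spent = sum(utxo_set[op].value_sats for op in tx.inputs)  # KeyError if unknown or already spent
    created = sum(out.value_sats for out in tx.outputs)
    if created > spent:
        raise ValueError("outputs exceed inputs")
    for op in tx.inputs:
        del utxo_set[op]
    for i, out in enumerate(tx.outputs):
        utxo_set[OutPoint(tx.txid, i)] = out
    return spent - created


# Alice owns one UTXO of 100,000 sats and pays Kate 5,000 sats for the cappuccino.
utxo_set = {OutPoint("aaaa", 0): TxOut(100_000, "spendable by Alice")}
cappuccino = Tx(
    txid="bbbb",
    inputs=[OutPoint("aaaa", 0)],
    outputs=[TxOut(5_000, "spendable by Kate"), TxOut(94_500, "change, spendable by Alice")],
)
fee = apply_tx(utxo_set, cappuccino)
```

In this model, whatever the outputs do not claim (here 500 sats) is the fee, and Alice's original UTXO disappears from the set once it has been spent.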
Output conditions are scripts of varying complexity written in Bitcoin’s own Script language. Script is a stack-based language that resembles Forth and is not Turing complete. Because scripts can be enriched with custom logic, Bitcoin is also popularly called “programmable money.” In Ethereum and other cryptocurrencies, such programs are referred to as smart contracts, with Ethereum relying on Turing completeness.
Different transaction types
For example, the output conditions for Alice’s cappuccino transaction could be: the satoshis for the cappuccino can only be spent again by someone who can provide a valid signature matching Kate’s wallet address (public key hash). In this case, that is only possible with Kate’s private key. The transaction type just described is called Pay-to-Public-Key-Hash (P2PKH) and is the most frequently used transaction type, accounting for around 78 percent.
The receiving address of a P2PKH transaction can easily be identified in a block explorer by the leading “1” in the address. However, output conditions can also be formulated in a more complex way, for example in a multisignature setup where the script expects several signatures at once. In such a case, another transaction type should be used: with Pay-to-Script-Hash (P2SH), complex output conditions are represented by a hash in the scriptPubKey field of the transaction. The non-hashed plaintext version of the output condition is called the redeem script.
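How such a P2PKH condition is evaluated on a stack can be illustrated with a toy interpreter. The sketch below is not Bitcoin's real Script engine: `toy_hash` uses truncated SHA-256 as a stand-in for the HASH160 operation, signature checking is delegated to a hypothetical `fake_check_sig` callback, and the keys and signatures are placeholder byte strings.

```python
import hashlib


def toy_hash(data: bytes) -> bytes:
    # Stand-in for HASH160 (assumption: truncated SHA-256 instead of SHA-256 + RIPEMD-160).
    return hashlib.sha256(data).digest()[:20]


def run_script(script, check_sig) -> bool:
    """Evaluate a list of data pushes (bytes) and opcodes (str) on a stack."""
    stack = []
    for item in script:
        if isinstance(item, bytes):
            stack.append(item)                      # data push
        elif item == "OP_DUP":
            stack.append(stack[-1])
        elif item == "OP_HASH160":
            stack.append(toy_hash(stack.pop()))
        elif item == "OP_EQUALVERIFY":
            if stack.pop() != stack.pop():
                return False                        # hash mismatch aborts the script
        elif item == "OP_CHECKSIG":
            pubkey, sig = stack.pop(), stack.pop()  # public key is on top, signature below it
            stack.append(b"\x01" if check_sig(sig, pubkey) else b"")
        else:
            raise ValueError(f"unsupported opcode: {item}")
    return bool(stack) and stack[-1] != b""


kate_pubkey = b"kate-public-key"   # placeholder, not a real key
kate_sig = b"kate-signature"       # placeholder, not a real signature


def fake_check_sig(sig: bytes, pubkey: bytes) -> bool:
    # Hypothetical stand-in for real ECDSA verification.
    return sig == kate_sig and pubkey == kate_pubkey


# Output condition chosen by Alice: only the owner of the hashed key may spend.
script_pubkey = ["OP_DUP", "OP_HASH160", toy_hash(kate_pubkey), "OP_EQUALVERIFY", "OP_CHECKSIG"]

spend_ok = run_script([kate_sig, kate_pubkey] + script_pubkey, fake_check_sig)
spend_bad = run_script([b"mallory-signature", b"mallory-public-key"] + script_pubkey, fake_check_sig)
```

The spender first pushes a signature and a public key; the output condition then checks that the key hashes to the committed value and that the signature is valid for it.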
Save storage space
To spend a P2SH output successfully, the spending transaction must contain the matching redeem script. The script must hash to the previously stored script hash, and the transaction must supply additional data, such as signatures, so that the script evaluates successfully. In a complex multisignature setup, or with complex output conditions in general, the redeem script can take up considerable space. Even though a 3-of-5 multisignature setup needs only 3 of the 5 possible signatures to release the satoshis, the redeem script also includes the parts for all alternative paths.
This is reflected in higher transaction costs, because fees are charged for every byte. It also unnecessarily reveals many details about all scripts in the output conditions. It is precisely these two problems that Bitcoin Improvement Proposal 341 aims to minimize.
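A rough back-of-the-envelope calculation shows how quickly the bytes add up. The sketch below assumes a classic bare multisig redeem script with 33-byte compressed public keys and uses the 72-byte signature size quoted in this article; the function names are hypothetical, and the figures ignore the rest of the transaction and any SegWit fee discount.

```python
def multisig_redeem_script_size(n_keys: int, pubkey_len: int = 33) -> int:
    # OP_m + n * (1-byte push opcode + public key) + OP_n + OP_CHECKMULTISIG
    return 1 + n_keys * (1 + pubkey_len) + 1 + 1


def multisig_unlock_size(m_sigs: int, n_keys: int, sig_len: int = 72) -> int:
    redeem = multisig_redeem_script_size(n_keys)
    # Scripts longer than 75 bytes need OP_PUSHDATA1 plus a length byte.
    push_prefix = 1 if redeem <= 75 else 2
    # OP_0 dummy element + m * (push opcode + signature) + pushed redeem script
    return 1 + m_sigs * (1 + sig_len) + push_prefix + redeem


redeem_3_of_5 = multisig_redeem_script_size(5)
unlock_3_of_5 = multisig_unlock_size(3, 5)
schnorr_sig = 64  # a single aggregated Schnorr signature

print(f"3-of-5 redeem script:      {redeem_3_of_5} bytes")
print(f"3-of-5 unlocking data:     {unlock_3_of_5} bytes")
print(f"single Schnorr signature:  {schnorr_sig} bytes")
```

Under these assumptions, all five public keys are published on spending even though only three signatures are used.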
Schnorr signatures
The Schnorr signature was designed between 1989 and 1991 by German mathematics professor Claus Peter Schnorr. It is a cryptographic method for digital signatures and is now making its way into Bitcoin via BIP340. In a digital signature scheme, a sender uses a secret signing key (private key) to compute a value for a digital message, which can be any data. This value allows anyone to verify the non-repudiable authorship and integrity of the message using the public verification key.
Currently, Bitcoin transactions use the Elliptic Curve Digital Signature Algorithm (ECDSA) for signatures. Why the unknown person who invented Bitcoin did not use Schnorr signatures from the beginning is unclear. One reason could be that few cryptographic libraries offered Schnorr implementations at the time. Claus Peter Schnorr’s patent on the signature scheme named after him had expired only shortly before Bitcoin appeared.
Schnorr signatures offer several advantages over ECDSA:
- They are provably secure in the random oracle model, provided the hash function behaves sufficiently randomly and the elliptic curve discrete logarithm problem underlying the signature is sufficiently hard. No such proof exists for ECDSA.
- Schnorr signatures are not malleable. Simply put, signature malleability means that it is technically possible to change the signature of a transaction before it is confirmed without invalidating it.
- In the future, signatures will be only 64 bytes in size instead of up to 72 bytes.
- Public keys used with Schnorr signatures will be encoded in only 32 bytes instead of 33 bytes.
Many keys combined
The linearity property of Schnorr signatures is particularly exciting. It enables multisignatures with “key aggregation”: multiple signers of a transaction can combine their public keys into one aggregated public key and their signatures into a single signature. With ECDSA, multisignatures were only possible in a roundabout way through P2SH/P2WSH. From the outside, the combined public key can no longer be distinguished from a conventional single-signature key.
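The linearity behind key aggregation can be demonstrated with a toy Schnorr scheme. The sketch below is deliberately not BIP340: it works in a tiny multiplicative group modulo the prime 2039 instead of the secp256k1 curve, uses fixed nonces for reproducibility, and aggregates keys naively, which is insecure against rogue-key attacks in practice. All names and numbers are illustrative assumptions.

```python
import hashlib

# Toy group: G generates a subgroup of prime order Q in the integers modulo P (P = 2*Q + 1).
P, Q, G = 2039, 1019, 4


def pubkey(x: int) -> int:
    return pow(G, x, P)


def challenge(r: int, pub: int, msg: bytes) -> int:
    data = f"{r}|{pub}|".encode() + msg
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def verify(pub: int, msg: bytes, sig: tuple) -> bool:
    r, s = sig
    e = challenge(r, pub, msg)
    # Schnorr verification equation: G^s == R * pub^e
    return pow(G, s, P) == (r * pow(pub, e, P)) % P


msg = b"pay Kate 5000 sats"
x1, x2 = 123, 456   # private keys of the two signers
k1, k2 = 77, 901    # nonces (fixed here; they must be random and secret in practice)

# Public keys and nonce commitments simply multiply together ...
agg_pub = (pubkey(x1) * pubkey(x2)) % P
agg_r = (pow(G, k1, P) * pow(G, k2, P)) % P

# ... and the partial signatures simply add up.
e = challenge(agg_r, agg_pub, msg)
s1 = (k1 + e * x1) % Q
s2 = (k2 + e * x2) % Q
agg_sig = (agg_r, (s1 + s2) % Q)

valid = verify(agg_pub, msg, agg_sig)
tampered = verify(agg_pub, msg, (agg_r, (s1 + s2 + 1) % Q))
```

The verifier runs the same check as for a single signer and never learns that two keys were involved. Production schemes such as MuSig add further steps to make this kind of aggregation safe.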
Schnorr signatures further contribute to space savings in the Bitcoin blockchain and, in a sense, have no drawbacks, according to the authors of BIP340.
Output conditions distributed to scripts
In addition to the Schnorr signature, “Merklized Alternative Script Trees” (MAST) play an important role in Taproot. The abbreviation originally combined the two concepts “Abstract Syntax Tree” (AST) and the Merkle tree already used in Bitcoin. Roughly speaking, MAST allows the output conditions of a transaction to be distributed among several scripts.
With “Pay-to-Taproot” (P2TR), a new transaction type was created that uses MAST, among other things. P2TR offers two ways to spend a UTXO. First, a UTXO can be spent via key-path spending by the owner of a private key that matches the public key. Second, a UTXO can be spent via script-path spending if the requirements of any one script within the MAST can be met.
A P2TR output is bound externally to a single public key. Internally, this public key is in turn a combination of one or more public keys and a value derived from the Merkle root of several scripts. Let’s take a look at an example construction of output conditions in a P2TR transaction. We assume a 2-of-3 multisignature, a popular setup that quite a few Bitcoin service providers also offer.
In our example, the service provider holds one key (casa key) and the customer holds two keys, one of which (cold key) is kept in cold storage. At least two keys are required to spend the satoshis, so there are three possible key combinations in total.
This design makes it possible to choose between complex scripts and a simple pay-to-public-key spend, and the choice is made only when the output is spent. Thanks to MAST, the spender does not have to disclose every possible script, but only the one script that is actually used to spend the satoshis.
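The selective disclosure described above can be sketched with a small Merkle tree over the three key combinations. The code follows the tagged-hash construction of BIP341 (`TapLeaf` and `TapBranch` tags, leaf version 0xC0, sorted child hashes), but the script contents are placeholder byte strings rather than valid Bitcoin Script, the name "hot key" for the third key is an assumption, and the final step of tweaking the internal public key with the root is omitted.

```python
import hashlib


def tagged_hash(tag: str, msg: bytes) -> bytes:
    tag_digest = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_digest + tag_digest + msg).digest()


def leaf_hash(script: bytes, leaf_version: int = 0xC0) -> bytes:
    # One length byte is enough for the short placeholder scripts used here.
    assert len(script) < 253
    return tagged_hash("TapLeaf", bytes([leaf_version, len(script)]) + script)


def branch_hash(a: bytes, b: bytes) -> bytes:
    left, right = sorted((a, b))  # children are ordered lexicographically
    return tagged_hash("TapBranch", left + right)


def root_from_proof(leaf: bytes, proof: list) -> bytes:
    """Recompute the Merkle root from one leaf and its sibling hashes."""
    node = leaf
    for sibling in proof:
        node = branch_hash(node, sibling)
    return node


# Placeholder scripts for the three 2-of-3 combinations (not valid Script).
leaves = [
    leaf_hash(b"2-of-2: casa key + cold key"),
    leaf_hash(b"2-of-2: casa key + hot key"),
    leaf_hash(b"2-of-2: cold key + hot key"),
]

inner = branch_hash(leaves[0], leaves[1])
merkle_root = branch_hash(inner, leaves[2])

# Spending with the first script reveals only that script and two sibling hashes.
proof_for_first = [leaves[1], leaves[2]]
first_ok = root_from_proof(leaves[0], proof_for_first) == merkle_root
third_ok = root_from_proof(leaves[2], [inner]) == merkle_root
```

A verifier who sees one script and its proof can confirm that the script belongs to the committed tree, while the other two scripts remain hidden behind their hashes.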
Upgrades in a decentralized network
All Taproot features are optional for wallets. This means that existing wallets do not have to change their functionality and can incorporate and benefit from Taproot’s new capabilities as needed. Technically, the requirements for the soft fork with Taproot have already been in place since Bitcoin Core version 0.21.0. The term soft fork was introduced to distinguish this upgrade method from a hard fork that is not compatible with old versions. A soft fork is a backward-compatible change to the consensus rules that allows non-upgraded nodes to continue to operate in consensus with the new rules.
The Bitcoin community decided that, before the changes can be activated on the network, Bitcoin miners and mining pools must signal their readiness to upgrade. For this purpose, the Speedy Trial mechanism was implemented according to BIP9 and rolled out in release 0.21.1.
90 percent approval in 2,016 blocks
Within 2,016 newly mined blocks, at least 90 percent must signal readiness for the planned soft fork. A miner signals readiness for a planned upgrade by setting a bit in the version field of the block header. The number of blocks is not arbitrary: 2,016 blocks correspond exactly to one difficulty period in the Bitcoin network, after which the difficulty for mining new blocks is automatically readjusted. This readjustment helps, for example, when large amounts of computing power leave the network unexpectedly.
If the required 1,815 blocks (90 percent) are not reached within a difficulty period, the count starts all over again in the next one. Three months were available for the lock-in. On June 12, the 90 percent mark was reached within one difficulty period, and the lock-in has been fixed since then.
Taproot will thus be activated on Bitcoin’s mainnet starting with block 709632, which is expected on November 16. All full nodes should be updated to at least version 0.21.1 by then, because in the end, compliance with the new consensus rules is checked by the full nodes and thus by everyone in the network.
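The threshold arithmetic and the version-bit check can be written down compactly. The sketch below follows the BIP9 convention that the top three bits of the block version are `001` and that each deployment is assigned one of the remaining bits; bit 2 for Taproot is stated here as an assumption, and the function names and sample version values are illustrative.

```python
PERIOD = 2016            # blocks per difficulty period
THRESHOLD_PERCENT = 90
TAPROOT_BIT = 2          # assumption: signalling bit assigned to the Taproot deployment


def required_blocks(period: int = PERIOD, percent: int = THRESHOLD_PERCENT) -> int:
    # Ceiling division in integer arithmetic: 90 % of 2,016 is 1,814.4, so 1,815 blocks.
    return -(-period * percent // 100)


def signals(version: int, bit: int = TAPROOT_BIT) -> bool:
    # BIP9: the top three bits must be 001, then the deployment bit is tested.
    has_bip9_prefix = (version & 0xE0000000) == 0x20000000
    return has_bip9_prefix and ((version >> bit) & 1) == 1


def locked_in(versions: list) -> bool:
    """Check one full difficulty period of block header versions."""
    if len(versions) != PERIOD:
        raise ValueError("expected exactly one difficulty period")
    return sum(signals(v) for v in versions) >= required_blocks()


signalling, silent = 0x20000004, 0x20000000
period_ok = locked_in([signalling] * 1815 + [silent] * 201)
period_short = locked_in([signalling] * 1814 + [silent] * 202)

# The activation height falls exactly on a difficulty period boundary.
activation_on_boundary = 709632 % PERIOD == 0
```

With 1,815 signalling blocks the period locks in; with one block fewer it does not, and counting restarts in the next period.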