Security of Bitcoin and other PoW blockchains

It’s been almost 9 months since my previous blog, but in Bitcoin time it must have felt like decades. While the Earth was barely completing three third of its course around the Sun, thousands of Initial Coin Offerings (ICOs) have come and also vanished (or busted) with so much money that rival the previous dot-com bubble. Riding on Ethereum, an ICO typical issues digital tokens to be used as assets or utilities, and attracts investors with unspoken promises of many X increases in token values.

John Oliver described the current blockchain and cryptocurrency landscape quite eloquently:

everything we don’t understand about money combined with everything we don’t understand about computers.

In this post I’ll focus on explaining the current state of knowledge regarding security of these cryptocurrencies. Security is often lauded as the key advantage of Bitcoin and Ethereum over the traditional banking systems, i.e. cryptocurrencies are more secure. Well, if it were that cut and dry, I wouldn’t be writing this. My hope is that given what we know about the current (in)security, you (readers) may understand more about the computers parts and become more suspicious confused amazed at how this whole thing (i.e. money) even work, let alone having billions of market capitalization.

I’ll start with some background on security of PoW blockchains. I’ll then discuss two important build-up work [1], [2] leading up to the formalization in [3], and end with the quantitative model in [4]. There are other important papers, but the selected are ones that I could make through, barely.

A little background

Blockchain is the buzz word of the day (a close second is AI). There are two main classes of blockchain systems: permissioned and permissionless. The former include Hyperledger, JP Morgan’s Quorum and Microsoft’s Coco. The latter — the original blockchain v1.0 — are made hugely popular by Bitcoin and Ethereum. One key difference between these two classes of systems is the consensus protocol. I have blogged more than once about consensus protocols in traditional distributed systems, i.e. how difficult and subtle they are. Traditional can be roughly understood as before Bitcoin. More specifically, traditional consensus protocols assume that every node knows about the identity of another node in the system. Permissioned blockchain makes adopt this assumption and therefore are able to use such protocols as PBFT without change.

Permissionless blockchains make no assumption about node’s identity. In such a purely decentralized setting, overcoming Sybil attack is challenging, and for which traditional consensus protocols cannot be applied. Proof of Work (PoW) emerges as the solution for achieving consensus. In PoW, nodes compete to solve computationally hard puzzles, and whoever finds a solution get to append a new block to the chain. One typical puzzle is to find such that

for a given value of (hash of previous block) and (pre-defined difficulty). In Bitcoin, is set such that the average time of solving the puzzle, of the entire network as a whole, is about minutes. is adjusted dynamically based on the timestamps of a certain number of previous blocks. For example, if the previous $K$ blocks suggests fast block time ( minutes), is decreased in order to bring the average block time back to minutes. Maintaining an appropriate value for block time is in fact very important for security, as will be clear later.

A detailed comparisons of permissioned vs. permissionless blockchain merits a separate post. I’d shamelessly refer readers to my recent survey paper [5].

What is security?

Consensus protocols determine security of blockchain systems. In classic Byzantine settings, where malicious (or Byzantine) nodes behave arbitrarily, security is the same as safety when the number of Byzantine nodes is bounded. It broadly means that all honest nodes agree on the same value. In PBFT, for example, safety property dictates that if an honest node locally commits to a tuple , then all other honest nodes also do the same.

In PoW, the adversary model is also Byzantine. However, the adversary’s power cannot be expressed simply in terms of number of nodes, because the network size is unknown. Instead, it is bounded by its commanding hashing power relatively to the total network hashing power. For instance, a -bounded adversary (or simply adversary) can solve PoW puzzles of the time, while the remaining nodes together solve of the time. Given this model, security is roughly understood as block immutability: if an honest node puts a block on its blockchain, and remains in the chain for some time, then all other honest nodes will do the same with high probability. This probability is typically quantified with confirmation length parameter , where is considered reasonable security. captures how long a block need to remain in the chain for it to be adopted by all honest nodes. In practice, the probability of tampering with a block grows exponentially smaller as honest nodes wait for more confirmation blocks.

  Notation     Definition  
  Number of nodes
  Adversary bound
  Bound on honest nodes
  Advantage of the honest node,
  Confirmation length
  Block interval: how fast a solution to PoW puzzle is found
  Propagation delay
  Rate at which honest node adopts blocks generated by the adversary
  the chain obtained from cutting off the last blocks of

The table above summarizes key notations for analysis. is the number of Byzantine nodes in permissioned blockchains, and the relative hashing power in permissionless blockchain. It is easy to see why Bitcoin as well as other PoW blockchain cannot be secure when . The specific attack here is that the adversary builds a private chain which will be longer than the chain the honest nodes are working on. At any time, the adversary can broadcast and completely invalidates .

Security of traditional consensus protocols such as PBFT has been formally studied. Depending on network condition, classic BFT protocols require different bounds on the adversary. In fully asynchronous network, no consensus is possible (while also keeping liveness), due to the FLP result. The table below compares values of necessary to achieve security. Recall that partiall synchronous means that there exists an unknown boundi on message delays; whereas synchronous means that that is known.

    BFT (synchronous)     BFT (partially synchronous)     PoW (synchronous & partially synchronous) 
   

Effect of propagation time on Bitcoin security [1]

Decker et al. are the first to measure Bitcoin networks. Published in the P2P conference, just 5 years after the original Bitcoin paper, the paper contains a nice description of Bitcoin at the network level: how blocks and other messages are disseminated in the network.

  • Propagation time is defined as a time when a message is delivered to all nodes in the network. Without complete view of the network, the paper estimates it by connecting a number of clients to the real Bitcoin network. The client passively collects messages for a duration of blocks. Using the message log, propagation time for a block can be estimated from the first time it was heard, until it was sent by all the client’s neighbors.

    The measured propagation time is in average.

    Using the same methodology, Gervais et al. [4] report a similar delay of (the difference is due to faster network in 2016 vs. 2013, and the use of Bitcoin relay network in the later study).

  • Probability of forks due to concurrency is modeled as

    where represents the duration during which a node stayed uninformed of a new block. is proportional, but not the same, as propagation delay, i.e. the higher the propagation delay, the higher is . Decker et al. estimated and , making . Again, a similar result is confirmed later by Gervais et al.

  • An important interpretation of the fork model is that each time a block is found, an equivalent of mining is wasted. As a consequence, the effective total hash power is only . In other words,

    The adversary bound decreases from to .

    We can devise a similar model for Ethereum where is smaller () and is higher (/block). Vitalk himself confirms the bound in Ethereum is only .


Selfish mining [2]

The study from Decker et al. basically refined the original security guarantees of Bitcoin; it is still a good approximation to say the adversary bound is . One year later, Eyal et al. presented the first important attacks on the system showing that must be significantly lower. One implication of this paper is that a mining pool with hash power is able to accumulate more share of the block rewards. But most importantly, it is able to lures honest miners to join the pool, thus snowballing the collective hash power beyond .

A mining pool consists of many miners, and is under control of one identity. It is easier to view a pool as a really powerful miner. The paper [2] is concerned with the scenario where mining pools go rogue. And the prognosis isn’t good.

Rational miners

When studying security of a distributed system, it is common to adopt the dichotomy of honest vs. malicious/Byzantine nodes, in which an honest node always follow the protocol correctly to the letter. When applied to cryptocurrencies setting, this model becomes too simplistic because miners are controlled by rational human agents who are out to maximize a certain objective function (a.k.a money). Eyal et al. therefore adopts a more realistic model that incorporate rational behavior. Although not cited in the paper, the model resembles the Byzantine Altruistic Rational (or BAR) model in classical consensus setting [7]. In particular, the network consists of three types of nodes:

  1. Byzantine: the usual malicious node, whose only goal is to break things.

  2. Altruistic: the usual honest node, who follows blindly.

  3. Rational: the human-like node that adopt any strategy that maximizes its objective function.

The attack

The attacker is called a selfish miner, as its chief strategy involves withholding blocks instead of broadcast to the network.

Its ultimate goal is to increase the ratios of blocks in the blockchain that are generated by the adversary.

Given a blockchain , the number of blocks contributed by the adversary should be bound by its hashing power, i.e. . Recall that (or else, we have simple adversary), a simplified version of the attack works as follows:

  1. It mines on a private chain of block. The public chain on which honest miners are working is . Let be the lead the adversary has over honest miners.

  2. Selectively reveals the blocks in order to invalidate honest miners’ blocks:

    2.1. When , it adopts as the private chain. Thus, .

    2.2. With non-negligible probability, it finds a block before honest miners, or . It keeps mining on .

    • If honest miners find a block , it immediately broadcasts its private block. This results in a probability of that is adopted by the network.

    • If it again finds another block , i.e. , it continues mining on .

    2.3. With a lead of blocks, it broadcasts a private block for every block that the honest miners find. Because , at least private blocks will be adopted by the network, wasting the honest miners’ effort in finding blocks. will be reduced to eventually, since , at which the adversary goes back to step 2.1.

Impact

The adversary’s relative block reward is directly proportional to the fraction blocks it can force to the blockchain. Eyal et al. devise a simple model for this relative reward, using just and . The result is shown in figure below.

Selfish mining

Two important take-aways are:

  1. If the adversary can always convince another node of accepting its block in step 2.2.a, i.e. , it relative reward is always better than honest mining. It means selfish mining is profitable.

  2. If the adversary commands of hashing power, and honest nodes adopts its block randomly with , selfish mining is also profitable.

Take-away number 2 by itself doesn’t necessarily mean security is always compromised, because the adversary doesn’t have majority ().

However, under the BAR model it is almost the same as the adversary having majority of hashing power.

More specifically, selfish mining profitable means that honest mining is not rewarded fairly. In other words, the honest miners with a fraction of hashing power gets less than the block rewards. This motivates them to join the adversary in the latter mining pool, because it then get more reward than its fair share. At , the benefit of joining is already greater than . As the adversary pool gets bigger, the benefit grows faster than linearly (see the figure above), hence even more incentive to join. It’s only a matter of time before the adversary gets beyond the threshold.

This work is clairvoyant

Written in 2014, many of the paper’s warnings did come true.

  • It warned that mining concentration is dangerous. New measurements of two biggest cryptocurrency network reveals that it is a real concern [6]. Although no mining pool has yet amassed , 6 largest Bitcoin pools now control hashing power (11 pools in Ethereum respectively). A corollary is:

    A Byzantine quorum system of 20 could achieve better decentralization than current PoW systems

  • Selfish mining did actually happen in a Japanese alt-coin. How is that for validation!


Security formalization [3]

Both [1] and [2] touched on the upper bound of the adversary power, beyond which is there no security. [1] says the bound is in fact strictly less than . [2] says is sufficient assuming the miners are rational. No work before [3] had formalized the relationship between and security, or even defined exactly what security is. All 12 versions of the paper provides a formal model of the protocol which captures security with and without selfish mining. The newer versions also include partially synchronous communications. The model and proofs change quite a bit between versions; the following are based on the June 2017 version.

The model

The model follows a popular framework for analyzing security of multi-party computation. Fortunately I am not unfamiliar with this framework, thanks to months of pouring through pages and pages of Canetti’s universal composable (UC) security model — an incarnation of this framework — during the early years in my PhD. Granted, Garay et al.’s model is relatively more simple.

  • A miner is an interactive Turing machine (ITM) with an input tape and output tape. It executes a protocol , e.g. the Bitcoin backbone.
  • A random oracle (RO) simulates PoW computation. A miner can query RO for a solution.
  • Adversary control the communication channel and can tamper with the meta data the miners’ input tapes.
  • All processing finishes within a round. In the synchronous setting, all messages are delivered in one round. In the partially synchronous setting, messages sent in round is delivered before round (remember that is unknown to the miners).

An environment supplies input and issues queries to the miner. Its view , made up of the environment input and contents of the output tapes from the miners, is a random variable whose randomness comes from ’s coins and the miners’. represents the visible states of the entire network. Security of the protocol is specified as a predicate over . In particular, satisfies if is false with negligible probability.

The protocol

The core of Bitcoin consists of three functions:

  1. validate: checks if content of the chain is valid. This involves checking validity of every block and the contained transactions, all the way back to the first block of . In practice, a miner would do this only incrementally by maintaining some indices.

  2. maxValid: after receiving multiple chains from neighbors, selects one chain based on some pre-defined conditions. One condition is chain length, which captures the rule for resolving forks in Bitcoin. For example, if , it returns .

  3. pow(b,C): computes PoW solution for a new block content , using as the previous block. It sends up to queries to RO per round, and gets a solution with probability .

The main loop at each miner is simplified as follows:

C = empty
round = 1
while True:
  C' = maxvalid(C, any other chain C' found in input tape)
  if found READ command:
    write C' to output tape

  b = new block found in input tape
  C'' = pow(b, C')
  if C'' != C:
    C = C''
    Diffuse(C)
  else:
    Diffuse(empty)

  round++

It can hardly get any clearer than that.

Security property

So far we have to content with an informal description of security that is roughly equivalent to block immutability (see the earlier section). Now armed with the current model, security is defined as common prefix property. That is:

Common prefix (parameterized by ): for any two honest miners that output and at a round in , it holds that

where essentially represents the confirmation length. This property says that the honest miners’ chains share a large prefix, which is non-empty when their chains are longer than blocks. Let’s see how this implies block immutability: once an honest miner add a block to its chain and remains on the chain blocks later, the common prefix property implies the (sub-)chain containing is the common prefix shared by all honest miners.

Security property under selfish mining

Perhaps the greatest contribution [3] is that it captures selfish mining attacks via chain quality property. Recall the goal of selfish mining is to introduce as many blocks to the chain of honest miners as possible.

Chain quality (parameterized by ): for any honest miner that outputs in , it holds that for any consecutive blocks the ratio of adversarial blocks is at most .

and bound the impacts of selfish mining. For instance, if selfish mining is profitable, hence honest miners are motivated to join the adversary. Note that this property says nothing about attack strategies. Therefore a bound on will be the upper bound on selfish mining: the most clever attack can only claim up to out of blocks.

Proof highlights

The proofs for common prefix and chain quality are based on three random variables:

  • A successful round occurs when at least one honest miner finds a block in the round.
  • A uniquely successfully round occurs when exactly one honest miner finds a block.
  • An adversarial round occurs when the adversary finds a block.

Here is the high-level idea of the proof for common prefix. The first observation (see the paper for the exact lemma) is that the chain of a honest miner grows with . In particular, if at round the chain length is , then at round the chain grows to be at least . Another observation is that if the block of a chain is generated by an uniquely successful round, then the block of any chain will be either or another block contributed by the adversary.

Let’s take a look at the figure above where the adversary tries to split the chain at block . The upper line represents the chain that honest miners work on, whereas the bottom is the adversary’s private chain. The first observation says that the upper chains will grow faster than the lower, because for round . Furthermore, if the block in the private chain is generated in a uniquely private successful round, then in order to maintain the split the adversary is force to produce different blocks from . Combined, the two observations suggest that to ensure common prefix, it is sufficient to set parameters such that the probability of the adversary being able to produce that many blocks to maintain the split is exponentially small. For example, if we can make the chain grow fast and there are many uniquely successful round, the adversary cannot catch up.

Main theorem (synchronous)

The following theorems provide sufficient conditions for achieving common prefix and chain quality. They are not necessary, in the sense that both properties can still be satisfied when those conditions are false.

Common prefix: if and for some small , common prefix holds with

and

Chain quality: if and for some small , chain quality holds with and

The sufficient condition and can be interpreted as follows:

  • corresponds to a setting in which it is hard to find a block in one round. In practice, it translates to a high ratio between block time and block propagation time. In this case, it is sufficient to set close to , i.e. in order to achieve common prefix. The ratio of adversarial block is close to , which is strictly greater than . Hence, selfish mining is profitable. This bound is higher than what [2] observed. In particular, the specific attacks in [2] achieves a bound of when . The result here suggests that it is possible to increase the adversarial block ratio up to . The paper in fact details an attack that achieves this bound.

  • corresponds to a setting where it is very easy to find a block. In practice, it translates to a very small block interval. In this case, , meaning that . Therefore, the sufficient conditions for common prefix requires virtually all miners to be honest. Chain quality becomes quite high, but then it is expected because the adversary is almost non-existent.

    Does it mean setting a small block interval, as Ethereum does, implies virtually no security? NO. The condition is only sufficient.

To the best of my knowledge, necessary conditions for security of PoW blockchains don’t yet exist. So if you’re out to design a new PoW blockchains, your best bet is to set your parameters such that the above sufficient conditions hold.

Main theorem (asynchronous)

Earlier versions of the paper used a rather unrealistic, fully synchronous model, in which all messages are delivered in the same round (and all processing done in the same round).

Let’s first extend it to a synchronous model, where there is a known upper bound on message delay. Under this model, all miners simply wait rounds for all the messages. Traditional consensus protocols, e.g. BFT, treat this the same as fully synchronous. It’s the same as clocking down all the nodes such that they process messages times more slowly. No safety or liveness are affected. PoW consensus is different

In -synchronous model, the honest miner waits for round doing nothing, while the adversary can keep mining. Thus, is increased by a factor of .

As a consequence, for the sufficient conditions to hold, mus be very small to start with. Or if were fixed at the beginning, and the network condition changes, both theorems above fail to hold (again, this doesn’t imply no security).

Now let’s move to the more realistic, partially synchronous model in which is unknown. Most practical, classic BFT consensus work in this model. The high-level ideas for proving common prefix and chain quality properties are similar as in the fully synchronous model. Here I’ll only summarize the results.

Common prefix: if and , then common prefix holds for

Compared to the previous theorem, the lower bound on is now increased by a factor of . And the confirmation length is longer by blocks. It means if for a given , in order for the sufficient conditions to hold, must be time larger than before in the fully synchronous model. Another interpretation is that if becomes larger (corresponding to larger propagation delay in practice), must be times larger. For example, suppose is small such that , or in the fully synchronous model. When the bound of the adversary reduces to , and you’ll have to wait for blocks longer.

Chain quality: if and , then chain quality holds with and .

While common prefix deteriorates with higher , chain quality looks better as the ratio of adversarial block, still bounded by is, lower than before. In practice this makes sense because with higher , is smaller, hence the adversary impact is also smaller.

Comparison to [8]

Pass et al. are first to analyse Bitcoin protocols in the partially synchronous model, which prompted Garay et al. to extend their early versions. I couldn’t follow the proof in [8] to the end, due to unfamiliarity with the formal model. But it appears that the results are similar, at least as far as interpretation and implication to real systems are concerned.


Quantitative security model [4]

Having stayed with me till this point, you may be wondering what are left to discuss, since security have been formalized and theorems proven (and I already said there’re not yet theorems about necessary conditions for security). Well, Gervais et al. set out to quantitatively study security properties in real systems, and quantify the trade-offs between security and transaction throughput. [4]’s high level contribution can be seen as validating the theorems and putting real numbers in the models.

Security = stale block rate

The wisdom in security is that you either have it or you don’t. In the real world, security must be measured and managed as risk. Garay et al. [3] define security as common prefix property, which is security in the traditional, binary sense. In contrast, Gervais et al. measures it in terms of stale block rate. This metric roughly approximates the risk of the common prefix property being violated. It is also a proxy for the risk of selfish mining, because the adversary introduce its blocks to the chain by means of stale blocks.

Real measurements

One very useful results of this work is the measurement of security-related parameters of real Bitcoin and Ethereum networks.

We can only control two parameters: block interval and block size.

I’m going to post the results in verbatim below.

Note the similarity between the results an early work of Decker et al. [1].

Take-aways

I’ll not discuss the quantitative model (Markov Decision Process model) here, simply because I couldn’t understand it. But I’ll summary important take-aways from their simulation and how they are related to the theoretical results in [3].

  1. The lower the block interval, the higher the stale block rate. This is as expected because higher means there are more chance of concurrent blocks. In terms of security, when is high the sufficient condition for common prefix to hold is that the adversary bound becomes smaller. When is smaller, there is less security in the sense that it is easy for the adversary to exceed this bound.

  2. The larger the block size, the higher the stale block rate. This is the result of increased block propagation time as blocks get bigger, and the results in [3] imply that an increase in have the same effect as increase in . However, the interesting finding is

Block propagation time increases linearly up to 4MB, and exponentially after that.

which contrasts my initial impression that propagation time is linear to block size.

  1. Efficient block propagation helps reduce stale block rate. This is consistent with [3], since lower means the adversary bound can be higher for the common prefix property to hold.

  2. Throughput is a function of block size and block interval. But increasing one without lowering the other means lower security. The sweet spot for achieving the same security as current Bitcoin’s is and , or .


Conclusion

The results above are guidelines for designing a secure blockchain system based on PoW consensus. They are not necessary conditions for security, but very useful nevertheless. Alternatives to PoW abound, and the models discussed so far could be easily generalized to them. In particular, the relative adversary power , block interval , propagation delay , confirmation length, etc. are essential parameters in all of those PoW-like consensus.


References

[1] Decker et al. Information propagation in the Bitcoin network. P2P 2013

[2] Eyal et al. Majority is not enough: bitcoin mining is vulnerable. FC 2014

[3] Garay et al. The bitcoin backbone protocol: analysis and applications. Arxiv 2017

[4] Gervais et al. On the security and performance of proof of work blockchains. CCS 2016

[5] Dinh et al. Untangling blockchain: A data processing view of blockchain systems. TKDE 2018

[6] Gencer et al. Decentralization in Bitcoin and Ethereum network. FC 2018

[7] Aiyer et al. BAR fault tolerance for cooperative service. SOSP 2015

[8] Pass et al. Analysis of the Blockchain protocol in asynchronous networks. Arxiv 2016

Written on June 11, 2018