The Data Tier: The Database (State Machine) of TApps
Earlier we explained the consumer-facing front tier of a typical TEA Project TApp (the presentation tier) as well as the application tier that holds the executable files. When these decentralized apps have state changes, it’s up to the third tier, the data tier, to track the events that change the state of the ledger. The data tier is decentralized, so any node can be queried for updates, and the querying node updates its own state based on the result. It’s also dynamic, to match the changing state of TApps on the TEA network. The data tier keeps track of two separate state categories on the TEA network to match the varying requirements of TApps.
Two State Categories: An Eventually Consistent CRDT Database and a Strongly Consistent State Machine
Depending on the needs of their specific business logic, the many TApps running on the TEA network have two state categories:
- A strong-consistency state machine based on Proof of Time.
- An eventually consistent CRDT database built on OrbitDB that can be used ad hoc by TApps.
The first state category, based on Proof of Time, is for transactions requiring strong consistency, such as those involving funds and accounting. The other category is a CRDT database that allows for short-term inconsistencies in the business logic of apps. The TEA Project uses OrbitDB databases built on top of IPFS for this data. For example, the OrbitDB database that’s used for messages in the TEA Party sample application could momentarily be out of sync between different users. For an overview of how OrbitDB compares to other distributed databases, this is a good post for more info.
Traditional internet applications use databases for state management, so we continue using the term database in keeping with existing conventions. In fact, we distinguish between two different types of state: one is a CRDT database, and the other is a strongly consistent state machine. For example, our earliest TApp used a Patricia Trie held entirely in memory as its state storage. This is obviously not a database, but it’s very close to Ethereum’s state machine. So strictly speaking, the concept of a state machine is more precise for our application than a database.
According to the CAP theorem, achieving strong consistency while maintaining decentralization requires some compromise in timeliness (availability). Blockchains are another example where the CAP theorem forces such compromises, which is also why they are generally considered to be very inefficient.
The complexity of the state machine lies in synchronizing the current state between multiple nodes. Because we do not want to use traditional blockchain consensus algorithms to achieve strong consistency, the most crucial task is ensuring that the transaction sequence is consistent across all replicas. We do this by relying on the accurate time provided by the atomic clocks on GPS satellites. The reported time is validated under the supervision of trusted TPM chips and used as the basis for the final ordering of transactions across all replicas. Since time is stable in our universe, it follows that each replica can achieve strong consistency using non-BFT algorithms.
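The idea that a shared clock gives every replica the same transaction sequence can be illustrated with a minimal sketch. The `Tx` type, its field names, and the tie-break on `tx_id` are assumptions made for this example only; they are not taken from the TEA Project code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tx:
    timestamp_ns: int  # hypothetical: TPM-attested GPS time in nanoseconds
    tx_id: str         # hypothetical unique id, used only to break timestamp ties
    payload: str

def canonical_order(received):
    """Sort by attested time, then by id so ties resolve identically on every replica."""
    return sorted(received, key=lambda tx: (tx.timestamp_ns, tx.tx_id))

txs = [
    Tx(1_000, "a", "alice pays bob 5"),
    Tx(1_000, "b", "carol pays dave 2"),
    Tx(2_500, "c", "bob pays erin 1"),
    Tx(900, "d", "erin pays alice 3"),
]

# Two replicas receive the same transactions in different arrival orders.
replica_1 = txs[:]
replica_2 = txs[::-1]

order_1 = [tx.tx_id for tx in canonical_order(replica_1)]
order_2 = [tx.tx_id for tx in canonical_order(replica_2)]
print(order_1, order_2)
```

Because the sort key depends only on data carried by the transaction itself, arrival order does not affect the result, and no voting round between replicas is needed to agree on the sequence.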
The supervision of the GPS clock by the TPM chip is crucial for making the reported time data trustworthy. Bad actors could tamper with a GPS chip to try to double spend or falsely claim transaction priority. But an on-board TPM chip will not sign off on faked GPS data.
The TPM chip can protect against on-board hardware alteration, but what about hacks that originate outside the box? Let’s imagine a hacker who, seeking a way around the TPM protection, rigs together an external device to alter the radio signals received by the GPS unit. This attack can’t be detected by the TPM since it happens outside of the mining machine itself, but we can still detect this type of attack using alternate measures.
The TPM chip will sign transactions and compare them with the transactions received from other replicas. If the GPS radio is not interfered with, the time signal keeps a continuous, steady pace, and all replicas should sign transactions at the same rate. If one replica suffers a GPS radio interference attack, the timestamp signatures from its TPM will differ significantly from those of other nodes and show erratic time jumps not seen in other replicas. All of its data will be verified by Remote Attestation as well as Availability Attestation. The bad mining node (which might be a victim that doesn’t even know it’s under attack) will be alerted and taken offline. The badly timestamped transactions will be removed from the conveyor and marked invalid. These timing consistency checks are similar to the ones used in Solana’s Proof of History.
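A cross-replica timing check of this kind could look roughly like the sketch below. It is an illustration under stated assumptions, not the TEA Project’s actual detection logic: the function name, the `reports` layout, the use of the median as the reference, and the tolerance value are all hypothetical.

```python
from statistics import median

def find_suspect_replicas(reports, tolerance_ms):
    """Flag replicas whose timestamp for any shared transaction strays from the median.

    reports: {replica_id: [timestamp_ms per shared transaction, in the same order]}
    """
    replica_ids = list(reports)
    n_tx = len(next(iter(reports.values())))
    suspects = set()
    for i in range(n_tx):
        # The median across replicas is robust to a single outlier.
        reference = median(reports[r][i] for r in replica_ids)
        for r in replica_ids:
            if abs(reports[r][i] - reference) > tolerance_ms:
                suspects.add(r)
    return suspects

reports = {
    "node-a": [1000, 2000, 3000, 4000],
    "node-b": [1002, 2001, 3003, 4001],
    "node-c": [1001, 2002, 3001, 4002],
    "node-d": [1001, 2950, 2400, 4003],  # erratic jumps, as under GPS interference
}

suspects = find_suspect_replicas(reports, tolerance_ms=50)
print(suspects)
```

Here only `node-d` is flagged, since its second and third timestamps jump far from what the other three replicas report for the same transactions.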
Strong Consistency State Machine
The strong consistency state machine is necessary for billing, where it protects against double-spending attacks. The TEA Project is able to achieve strong consistency quickly by using Proof of Time and by not needing Byzantine fault tolerance, because the possibility of Byzantine faults has already been handled by the layer-one blockchain.
A blockchain has blocks because it has to wait for all nodes / replicas to reach the same state. In other words, all replica nodes must have the same order of events / transactions. Traditional blockchains have to occasionally stop and take stock of where all their nodes are relative to the canonical ledger.
The TEA Project doesn’t have to perform this extra step of stopping to take stock. Using the timestamps from navigation satellites under the watch of hardware attestation, our strong consistency state machine can achieve continuous state updates at a small synchronization cost. The TEA Project is continuous as it has no concept of blocks, and it’s able to sync nodes at very little cost as it has no PoW puzzles to solve.
- It is not necessary for all nodes to periodically reach a consensus on the latest block. But in order to ensure that most replicas can be synchronized to a consistent state, the TEA Project’s state machine requires a short waiting queue to absorb network latency. For example, an earlier timestamped transaction may arrive at a node after later transactions due to network traffic. But eventually all nodes will have the same transaction order.
- When more than 50% of nodes can no longer adjust the order of transactions in the queue, these transactions become immutable and are then sent to the state machine for processing. This is called the conveyor belt algorithm and ensures that the state machine will eventually reach strong consistency. A similar consensus model that also uses physical time for sharding and is able to achieve large-capacity processing is Google Spanner.
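The waiting queue described above can be sketched on a single node as follows. This is a simplified illustration, not the actual conveyor belt implementation: the `Conveyor` class and its methods are hypothetical, and the rule that more than 50% of nodes can no longer reorder a transaction is replaced here by a fixed waiting window.

```python
import heapq

class Conveyor:
    """Hold transactions for a fixed window, then release them in timestamp order."""

    def __init__(self, window_ms):
        self.window_ms = window_ms
        self._heap = []  # min-heap of (timestamp_ms, tx_id)

    def receive(self, timestamp_ms, tx_id):
        # Arrival order does not matter; the heap keeps the earliest timestamp on top.
        heapq.heappush(self._heap, (timestamp_ms, tx_id))

    def release(self, now_ms):
        """Pop every transaction at least window_ms old; these are treated as immutable."""
        released = []
        while self._heap and self._heap[0][0] <= now_ms - self.window_ms:
            released.append(heapq.heappop(self._heap)[1])
        return released

belt = Conveyor(window_ms=100)
belt.receive(1050, "tx-late-sender")   # arrives first
belt.receive(1000, "tx-early-sender")  # earlier timestamp, delayed by the network
belt.receive(1200, "tx-recent")

first_batch = belt.release(now_ms=1160)
second_batch = belt.release(now_ms=1300)
print(first_batch, second_batch)
```

The delayed transaction still leaves the queue ahead of the one that arrived before it, because release order follows timestamps rather than arrival. The sketch does not handle a transaction that arrives after its slot has already been released; a real system would need a rule for that case.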
The Eventually Consistent CRDT Database
For apps that can tolerate short-term inconsistency but must still eventually reach consistency, the ideal is the CRDT algorithm. A CRDT (conflict-free replicated data type) allows conflict-free merges between different replicas and ultimately achieves network-wide consistency. In fact, the business logic of most apps can tolerate short-term inconsistencies to achieve both decentralization and efficiency, which is why we can use an OrbitDB database to track these short-term changes. A typical example of a traditional cloud app that deals with this issue gracefully is Google Docs.
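To make the conflict-free merge concrete, here is a minimal sketch of one of the simplest CRDTs, a grow-only counter. It is a generic textbook example, not OrbitDB’s API or the TEA Project’s data model; the replica names and the `merge` and `value` helpers are chosen for illustration.

```python
def merge(a, b):
    """Merge two grow-only counters by taking the per-replica maximum."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}

def value(counter):
    """The counter's total is the sum of every replica's own count."""
    return sum(counter.values())

# Each replica only increments its own slot and may hold a stale view of the others.
state_x = {"x": 3, "y": 1}
state_y = {"x": 2, "y": 4}

merged_xy = merge(state_x, state_y)
merged_yx = merge(state_y, state_x)
print(merged_xy, value(merged_xy))
```

The merge gives the same result in either direction, and merging a state with itself changes nothing. Those two properties are what let replicas exchange updates in any order, any number of times, and still converge.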
The TEA Project uses CRDT databases because CRDT is cheap and fast.
- CRDT is fast as it has no time delay and doesn’t wait for others. Allowances are made for new transaction reports which are added non-destructively. This is in contrast to the TEA Project’s strong consistency state machine which must wait a minimum amount of time for confirmation.
- CRDT storage cost is cheap, relying on IPFS for decentralized hard drive storage instead of the more expensive RAM storage.
In our sample application TEA Party, all application data that is separate from billing logic is stored in an OrbitDB database using CRDT (for example, the message list). Using CRDT, the host nodes provided by the respective miners do not need to coordinate with each other before accepting an update. Other miner nodes will eventually reach a consistent state on their own, which also ensures that users can communicate with each other quickly. If a user deletes a message from the TEA Party app, some nodes will delete this withdrawn message immediately while other nodes will be a little late due to network delays. In fact, all the cloud-based instant messaging applications we currently use deal with the same inconsistencies when sent messages are withdrawn by users. This will not be alarming to users, as everyone is used to the same minor inconsistencies when using traditional cloud apps.
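The message withdrawal behavior can be sketched with a two-phase set, a CRDT in which removals are recorded as tombstones. This is an illustrative model only: OrbitDB is a JavaScript library with its own store types, and the `MessageSet` class and message ids below are hypothetical.

```python
class MessageSet:
    """Two-phase set: a message can be posted and later withdrawn; withdrawal wins."""

    def __init__(self):
        self.added = set()      # ids of every message ever posted
        self.withdrawn = set()  # tombstones for withdrawn messages

    def post(self, msg_id):
        self.added.add(msg_id)

    def withdraw(self, msg_id):
        self.withdrawn.add(msg_id)

    def merge(self, other):
        # Set union is commutative and idempotent, so sync order does not matter.
        self.added |= other.added
        self.withdrawn |= other.withdrawn

    def visible(self):
        return self.added - self.withdrawn

node_a, node_b = MessageSet(), MessageSet()
node_a.post("m1")
node_a.post("m2")
node_b.merge(node_a)            # node_b learns about both messages

node_a.withdraw("m2")           # the user withdraws m2 via node_a
before_sync = node_b.visible()  # node_b has not heard about the withdrawal yet

node_b.merge(node_a)            # the delayed update arrives
after_sync = node_b.visible()
print(before_sync, after_sync)
```

Between the withdrawal and the next sync, `node_b` still shows the withdrawn message, which is the short-lived inconsistency described above. After the merge both nodes show the same list.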