Horizontal Scaling

When a single server can no longer handle the volume of read queries for your application, it becomes necessary to scale out horizontally. In TypeDB, this is achieved by creating a database cluster that distributes the read load across multiple nodes. This chapter explains how TypeDB’s replication-based architecture enables horizontal scaling for read transactions while also providing high availability and fault tolerance.

Clustering is available via TypeDB Cloud and TypeDB Enterprise. TypeDB CE only supports single-node deployments.

Scaling read throughput with replication

TypeDB’s strategy for horizontal scaling is based on data replication using the Raft consensus algorithm. Because every node has access to all the data, read-only transactions can be executed on any node in the cluster. This allows you to scale your application’s read throughput linearly by simply adding more nodes. As you add nodes, the cluster’s capacity to handle concurrent read queries increases proportionally.
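Since every replica holds the full dataset, read load can be spread evenly across nodes. The following toy sketch (node addresses are illustrative, not a real cluster) shows the idea behind round-robin read distribution: each node added to the rotation takes an equal share of incoming read transactions, so capacity grows with node count.

```python
from itertools import cycle

# Hypothetical node addresses -- illustrative only, not a real cluster.
NODES = ["node1:1729", "node2:1729", "node3:1729"]

def make_read_router(nodes):
    """Round-robin over all replicas: every node holds all the data,
    so any node can serve a read, and adding a node adds capacity."""
    ring = cycle(nodes)
    return lambda: next(ring)

route = make_read_router(NODES)
assignments = [route() for _ in range(6)]
# With 6 reads over 3 nodes, each node serves exactly 2.
```

A real driver performs this distribution for you (see "Interacting with a cluster" below); the point here is only that read throughput scales with the number of replicas.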

The leader-follower model

A TypeDB cluster operates on a leader-follower model. At any given time, the cluster elects a single leader node for each database, while all other nodes act as followers. Followers receive a stream of committed transactions from the leader and apply them to their local copy of the database, keeping them in sync. Followers can only process read transactions.

The leader is exclusively responsible for processing all schema and data writes. Centralizing writes on a single node simplifies consistency and ensures that all changes are applied in a strict order. Write throughput is therefore determined by the capacity of the single leader node and is scaled by increasing its resources (see vertical scaling).
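A minimal sketch of this replication pattern (not TypeDB's actual implementation): the leader appends every committed write to an ordered log, and each follower catches up by applying the log entries it has not yet seen, in leader order. Because only one node ever appends, all replicas converge on the same strict order of changes.

```python
class Leader:
    def __init__(self):
        self.log = []           # ordered log of committed writes

    def commit(self, write):
        self.log.append(write)  # single writer => one strict total order

class Follower:
    def __init__(self):
        self.log = []

    def replicate(self, leader_log):
        # Apply only the entries not yet seen, in the leader's order.
        self.log.extend(leader_log[len(self.log):])

leader = Leader()
followers = [Follower() for _ in range(2)]
for w in ["insert $x isa person;", "insert $y isa company;"]:
    leader.commit(w)
    for f in followers:
        f.replicate(leader.log)
# All replicas now hold identical logs, so all copies agree.
```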

If a leader node fails, the cluster automatically elects a new leader from among the followers, ensuring that the database remains available for writes with minimal interruption.
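The failover condition can be sketched with some simple quorum arithmetic. This is a heavily simplified model (real Raft elections also compare log freshness and term numbers): a new leader can only be elected while a majority of the original cluster is still alive.

```python
def elect_leader(nodes, failed):
    """Simplified election: a surviving node becomes leader only if
    a majority of the cluster is still alive to vote for it.
    Real Raft additionally requires the winner's log to be up to date."""
    live = [n for n in nodes if n not in failed]
    majority = len(nodes) // 2 + 1
    if len(live) < majority:
        return None       # no quorum: writes are unavailable
    return live[0]        # simplified tie-break: first surviving node

# A 3-node cluster survives one failure but not two:
# elect_leader(["a", "b", "c"], failed={"a"})      -> "b"
# elect_leader(["a", "b", "c"], failed={"a", "b"}) -> None
```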

Interacting with a cluster

Interacting with a cluster is very similar to interacting with a single server. The key difference is that the client driver must be configured with the network addresses of all nodes in the cluster.

The driver uses this list to intelligently manage connections. It automatically discovers which node is the current leader for a given database and routes all write transactions to it. For read transactions, the driver can distribute the load across all available nodes (both leader and followers) by setting the read_any_replica option when opening the transaction, effectively using the entire cluster's capacity. This routing is handled transparently, so your application code for opening sessions and running transactions remains the same whether you are connecting to a single node or a full cluster.
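The routing decision itself can be modeled in a few lines. This is a toy model of the driver's behavior, not the driver's actual code: writes (and reads without read_any_replica) go to the discovered leader, while reads with read_any_replica set may land on any replica.

```python
import random

class ClusterRouter:
    """Toy model of driver-side routing. In a real cluster the driver
    discovers the leader itself; here it is passed in directly."""

    def __init__(self, nodes, leader):
        self.nodes = nodes
        self.leader = leader

    def route(self, is_write, read_any_replica=False):
        if is_write or not read_any_replica:
            return self.leader          # writes always target the leader
        return random.choice(self.nodes)  # reads may use any replica

random.seed(0)  # deterministic for the example
router = ClusterRouter(["n1", "n2", "n3"], leader="n1")
router.route(is_write=True)   # -> "n1"
targets = {router.route(is_write=False, read_any_replica=True)
           for _ in range(100)}
# With read_any_replica, reads spread across all three nodes.
```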

Consistency and durability

TypeDB’s replication model provides strong consistency guarantees, even in a distributed read environment. When a write transaction is sent to the leader, it is not confirmed until a majority of nodes in the cluster have durably stored the transaction in their logs. This process guarantees that once a transaction is committed, it is safely replicated and will not be lost. It also ensures that when followers serve read queries, they are providing access to a consistent and up-to-date state of the database.
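The majority requirement also determines how many failures a cluster can tolerate. The arithmetic below follows directly from the commit rule described above: a write is confirmed once a majority has logged it, so the cluster can lose all remaining nodes without losing committed data.

```python
def majority(cluster_size):
    """Smallest number of nodes that must durably log a write
    before the leader confirms the commit."""
    return cluster_size // 2 + 1

def tolerable_failures(cluster_size):
    """Nodes that can fail simultaneously without losing committed
    data or the quorum needed to keep accepting writes."""
    return cluster_size - majority(cluster_size)

# A 3-node cluster commits on 2 acknowledgements and tolerates 1 failure;
# a 5-node cluster commits on 3 acknowledgements and tolerates 2 failures.
```

This is why clusters are typically deployed with an odd number of nodes: moving from 3 nodes to 4 raises the quorum from 2 to 3 without tolerating any additional failures.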