PostgreSQL Sharding: When to Scale Horizontally and How to Do It Right

November 10, 2025

PostgreSQL is excellent—until it isn't. At some point, your single database hits a wall. Here's how to know when you need sharding, and how to do it right.

Sharding is a double-edged sword: it solves single-node scaling limits but introduces cross-shard consistency, query complexity, and operational overhead.

Signs You Need Sharding

Your situation calls for sharding if:

  1. Single table > 100GB and hot (write-heavy)
  2. > 1 billion rows in a single table
  3. Vertical scaling exhausted: Biggest instance still can't keep up

Before You Shard, Exhaust These Alternatives

AlternativeWhen It Works
Better indexing80% of performance issues are missing indexes
Read replicas100K QPS reads → 1 primary + 5 replicas
CachingRedis in front—90% requests from cache
Table partitioningTime-series data by month

PostgreSQL Partitioning: The First Step

Three Strategies

-- HASH PARTITIONING: Distribute by hash CREATE TABLE users ( user_id BIGSERIAL, name TEXT ) PARTITION BY HASH (user_id); -- RANGE PARTITIONING: Distribute by range (time-series) CREATE TABLE events ( id BIGSERIAL, created_at TIMESTAMP NOT NULL ) PARTITION BY RANGE (created_at); -- LIST PARTITIONING: Distribute by discrete values CREATE TABLE orders ( order_id BIGSERIAL, country TEXT ) PARTITION BY LIST (country);

When to Use Each

StrategyBest ForQuery Pattern
HashEven distribution, random accessSingle-shard lookups
RangeTime-series, archivalDate-range queries
ListCategorical dataIN queries by category

Citus: PostgreSQL Extension for Horizontal Scaling

Citus transforms a single PostgreSQL into a distributed system—transparently.

Setup

CREATE EXTENSION citus; SELECT citus_add_node('worker1.example.com', 5432); SELECT citus_add_node('worker2.example.com', 5432);

Creating Distributed Tables

CREATE TABLE events ( tenant_id BIGINT NOT NULL, event_id BIGINT NOT NULL, event_data JSONB ); SELECT create_distributed_table('events', 'tenant_id');

The Cross-Shard Query Problem

This is where sharding gets painful.

Fast: Single-Shard Query

-- Goes to ONE shard (fast, ~10ms) SELECT * FROM orders WHERE user_id = 42;

Slow: Cross-Shard Query

-- Fans out to ALL shards (slow, ~500ms) SELECT COUNT(*) FROM orders GROUP BY country;

Design queries to include the distribution key.

TL;DR

  • Partitioning: Start with PostgreSQL native (range, list, hash)
  • Citus: Use when single-node limits hit
  • Cross-shard pain: Always design for single-shard queries
  • Alternatives first: Indexes, replicas, caching

Ready to scale? Start with partitioning, move to Citus, only then consider application-level sharding.

Home
Blog
GitHub
LinkedIn
X