Design Matters More Than You Think
Writing SQL queries is one skill. Designing the database those queries run against is another, and arguably the more impactful of the two. A well-designed schema makes queries simple, fast, and correct. A poorly designed one means you are forever writing workarounds and wondering why everything is slow.
In trading systems, the stakes are higher than most applications. Data volumes are large, queries are complex, and errors in data integrity can have direct financial consequences.
Normalisation: Organising Data to Reduce Redundancy
Normalisation means structuring your data so that each piece of information is stored exactly once. Instead of writing "Apple Inc." on every one of a million trade records, you store it once in a products table and reference it by symbol.
```sql
CREATE TABLE products (
    symbol    VARCHAR(10) PRIMARY KEY,
    name      VARCHAR(100) NOT NULL,
    sector    VARCHAR(50),
    exchange  VARCHAR(20),
    currency  VARCHAR(3) DEFAULT 'USD'
);

CREATE TABLE accounts (
    account_id   SERIAL PRIMARY KEY,
    name         VARCHAR(100) NOT NULL,
    account_type VARCHAR(20) CHECK (account_type IN ('TRADING', 'CUSTODY', 'MARGIN')),
    created_at   TIMESTAMP DEFAULT NOW()
);

CREATE TABLE trades (
    trade_id        SERIAL PRIMARY KEY,
    symbol          VARCHAR(10) NOT NULL REFERENCES products(symbol),
    account_id      INTEGER NOT NULL REFERENCES accounts(account_id),
    quantity        INTEGER NOT NULL,
    price           DECIMAL(12,4) NOT NULL,
    side            VARCHAR(4) CHECK (side IN ('BUY', 'SELL')),
    trade_date      TIMESTAMP NOT NULL,
    settlement_date DATE,
    status          VARCHAR(10) DEFAULT 'PENDING'
);
```
The REFERENCES keyword creates foreign keys — the database will reject any trade with a symbol that does not exist in products or an account that does not exist in accounts. This is data integrity enforced at the database level, not in your application code where it can be bypassed.
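To see the enforcement in action, here is a sketch of two inserts against the schema above (it assumes account_id 1 already exists in accounts; the error text is PostgreSQL's usual form, abbreviated):

```sql
-- Succeeds: 'AAPL' exists in products
INSERT INTO products (symbol, name) VALUES ('AAPL', 'Apple Inc.');
INSERT INTO trades (symbol, account_id, quantity, price, side, trade_date)
VALUES ('AAPL', 1, 100, 185.5000, 'BUY', NOW());

-- Fails: 'ZZZZ' is not in products, so the foreign key rejects the row
-- ERROR: insert or update on table "trades" violates foreign key constraint
INSERT INTO trades (symbol, account_id, quantity, price, side, trade_date)
VALUES ('ZZZZ', 1, 100, 10.0000, 'BUY', NOW());
```

No application code had to check anything; the bad row simply never enters the table.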
When to Denormalise
Pure normalisation is not always practical. For read-heavy analytics workloads, some denormalisation — storing pre-computed or duplicated data — can dramatically improve query performance. The key is to denormalise intentionally, not accidentally.
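One common pattern is a pre-aggregated summary that dashboards read instead of the raw trades table. The sketch below uses a PostgreSQL materialized view; the view name and column names are illustrative, not part of the schema above:

```sql
-- Materialized view: one row per symbol per day, refreshed on a schedule
CREATE MATERIALIZED VIEW daily_trade_summary AS
SELECT
    symbol,
    trade_date::date      AS day,
    COUNT(*)              AS trade_count,
    SUM(quantity)         AS net_quantity,
    SUM(quantity * price) AS gross_notional
FROM trades
GROUP BY symbol, trade_date::date;

-- Re-run after each batch load; readers then pay no aggregation cost
REFRESH MATERIALIZED VIEW daily_trade_summary;
```

The duplication is deliberate and contained: the view can always be rebuilt from the normalised trades table, so it cannot drift into being the source of truth.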
Indexing Strategy
Indexes are like a book's index — they let the database jump directly to matching rows without scanning the entire table.
```sql
-- The most common query pattern: symbol + date range
CREATE INDEX idx_trades_symbol_date ON trades (symbol, trade_date);

-- Account-based lookups
CREATE INDEX idx_trades_account ON trades (account_id, trade_date);

-- Partial composite index for settlement queries: skips already-settled rows
CREATE INDEX idx_trades_settlement ON trades (status, settlement_date)
    WHERE status != 'SETTLED';
```
Column order in composite indexes matters. An index on (symbol, trade_date) supports queries filtering by symbol alone or by symbol + date, but not by date alone. Design your indexes around your actual query patterns.
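As a sketch, here is how the (symbol, trade_date) index typically behaves for three filters. The comments describe the usual PostgreSQL outcome, not a guarantee, since the planner also weighs table statistics:

```sql
-- Can use idx_trades_symbol_date: the leading column is constrained
SELECT * FROM trades
WHERE symbol = 'AAPL';

-- Can use idx_trades_symbol_date: both columns constrained, in index order
SELECT * FROM trades
WHERE symbol = 'AAPL'
  AND trade_date >= '2024-01-01' AND trade_date < '2024-02-01';

-- Cannot use it efficiently: the leading column (symbol) is unconstrained,
-- so this typically falls back to a full scan or a different index
SELECT * FROM trades
WHERE trade_date >= '2024-01-01' AND trade_date < '2024-02-01';
```

Prefixing EXPLAIN to each query shows which plan the database actually chose.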
The tradeoff: every index speeds up reads but slows down writes, because the database must update every relevant index on each INSERT or UPDATE. For trading systems processing thousands of trades per second, this balance requires careful thought. These performance considerations are also important when thinking about data pipeline design.
Partitioning for Scale
When your trades table reaches hundreds of millions of rows, even well-indexed queries slow down. Partitioning splits a table into smaller physical chunks — almost always by date in finance:
```sql
CREATE TABLE trades (
    trade_id   SERIAL,
    symbol     VARCHAR(10),
    quantity   INTEGER,
    price      DECIMAL(12,4),
    trade_date DATE NOT NULL
) PARTITION BY RANGE (trade_date);

CREATE TABLE trades_2024_q1 PARTITION OF trades
    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');
CREATE TABLE trades_2024_q2 PARTITION OF trades
    FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');
CREATE TABLE trades_2024_q3 PARTITION OF trades
    FOR VALUES FROM ('2024-07-01') TO ('2024-10-01');
CREATE TABLE trades_2024_q4 PARTITION OF trades
    FOR VALUES FROM ('2024-10-01') TO ('2025-01-01');
```
Now a query for January 2024 only scans the Q1 partition — ignoring the other 75% of the year's data. For multi-year tables, this can turn a 30-second query into a sub-second one.
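You can verify the pruning with EXPLAIN. Under the partitioned definition above, the plan should mention only the matching partition; this is a sketch, and the exact plan output varies by PostgreSQL version and settings:

```sql
-- With the filter confined to Q1, the planner prunes trades_2024_q2..q4;
-- the plan scans only trades_2024_q1
EXPLAIN
SELECT symbol, SUM(quantity * price)
FROM trades
WHERE trade_date >= '2024-01-01' AND trade_date < '2024-02-01'
GROUP BY symbol;
```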
Constraints and Data Quality
Financial data needs to be correct. Database constraints are your first line of defence:
```sql
-- Prices must be positive
ALTER TABLE trades ADD CONSTRAINT chk_positive_price CHECK (price > 0);

-- Quantity must be non-zero
ALTER TABLE trades ADD CONSTRAINT chk_nonzero_qty CHECK (quantity != 0);

-- No duplicate trade IDs from external systems
-- (assumes the trades table also has an external_trade_id column)
ALTER TABLE trades ADD CONSTRAINT uq_external_id UNIQUE (external_trade_id);

-- Enum-like constraints
ALTER TABLE trades ADD CONSTRAINT chk_status
    CHECK (status IN ('PENDING', 'FILLED', 'CANCELLED', 'SETTLED'));
```
These constraints catch bad data at the database level, before it propagates through your APIs and reporting systems.
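For instance, a feed that sends a negative price is stopped at the door rather than silently stored. A sketch, again assuming account_id 1 exists; the error message follows PostgreSQL's usual form:

```sql
-- Rejected by chk_positive_price before it reaches any downstream report
-- ERROR: new row for relation "trades" violates check constraint "chk_positive_price"
INSERT INTO trades (symbol, account_id, quantity, price, side, trade_date)
VALUES ('AAPL', 1, 100, -5.0000, 'BUY', NOW());
```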
When Relational Is Not Enough
Relational databases are the default, but they are not ideal for every workload:
- Time series databases handle high-frequency market data more efficiently
- Document databases (MongoDB) work well for semi-structured data like trade confirmations
- Columnar databases (ClickHouse, Redshift) excel at analytical queries over billions of rows
A well-designed production system often uses multiple database technologies — relational for transactional data, TSDB for market data, and a data warehouse for analytics. Understanding cloud database offerings helps you choose the right tool for each workload.
Good database design is foundational. Get it right and everything built on top — your SQL queries, your APIs, your reporting — works smoothly. Get it wrong and you will be fighting the schema for years.
Want to go deeper on Database Design for Trading Systems?
This article covers the essentials, but there's a lot more to learn. Inside Quantt, you'll find hands-on coding exercises, interactive quizzes, and structured lessons that take you from fundamentals to production-ready skills — across 50+ courses in technology, finance, and mathematics.