Why SQL Still Matters
Despite every NoSQL trend that has come and gone over the past two decades, SQL databases remain the backbone of financial systems. Your trade records, position data, client information, risk calculations, regulatory reports — nearly all of it lives in relational databases. And the language to interact with it is SQL.
SQL has been around since the 1970s and it is not going anywhere. In fact, even many "NoSQL" databases have added SQL-like interfaces because the model is just that useful. If you are going to work in finance technology, SQL fluency is not optional.
If you are coming from a Python/Pandas background, the good news is that the concepts map almost directly. SELECT is like choosing columns, WHERE is filtering rows, GROUP BY is .groupby().
The Fundamentals
SELECT, WHERE, ORDER BY
-- Get all Apple trades, most recent first SELECT symbol, quantity, price, trade_date FROM trades WHERE symbol = 'AAPL' ORDER BY trade_date DESC; -- Find large trades in the current year SELECT * FROM trades WHERE quantity * price > 100000 AND trade_date >= '2024-01-01'; -- Multiple conditions SELECT symbol, price, quantity FROM trades WHERE symbol IN ('AAPL', 'GOOGL', 'MSFT') AND side = 'BUY' AND price BETWEEN 100 AND 200;
Aggregations with GROUP BY
This is where SQL starts earning its keep in finance. Summarising millions of trades into a concise report — in a single query:
SELECT symbol, COUNT(*) AS trade_count, SUM(quantity) AS total_volume, AVG(price) AS avg_price, MIN(price) AS min_price, MAX(price) AS max_price, SUM(quantity * price) AS total_notional FROM trades WHERE trade_date >= '2024-01-01' GROUP BY symbol HAVING SUM(quantity * price) > 1000000 ORDER BY total_notional DESC;
The HAVING clause filters after aggregation — you cannot use WHERE for this because the aggregated values do not exist until the grouping is done.
JOINs: Connecting Related Data
Real-world financial data is spread across multiple tables. A trades table references a products table for instrument details, an accounts table for client info, a counterparties table, and so on.
-- Combine trade data with product details SELECT t.trade_date, t.symbol, p.name AS product_name, p.sector, t.quantity, t.price, t.quantity * t.price AS notional FROM trades t INNER JOIN products p ON t.symbol = p.symbol WHERE p.sector = 'Technology' ORDER BY notional DESC;
JOIN Types
| Type | Behaviour |
|---|---|
| INNER JOIN | Only matching rows from both tables |
| LEFT JOIN | All rows from left table, matching from right (nulls if no match) |
| RIGHT JOIN | All rows from right table, matching from left |
| FULL OUTER JOIN | All rows from both tables |
LEFT JOIN is the most common in practice — you want all your trades, with product info where available:
SELECT t.*, p.sector, p.name FROM trades t LEFT JOIN products p ON t.symbol = p.symbol; -- Trades with unknown symbols still appear (sector/name will be NULL)
Understanding JOINs is fundamental to database design — data is normalised into separate tables precisely so you can combine it flexibly at query time.
Window Functions: The Finance Power Tool
Window functions compute a value for each row based on a "window" of related rows — without collapsing the results like GROUP BY does. They are incredibly useful for financial calculations.
-- Running cumulative P&L SELECT trade_date, daily_pnl, SUM(daily_pnl) OVER (ORDER BY trade_date) AS cumulative_pnl FROM daily_results; -- Rank traders by monthly performance SELECT trader_id, month, total_pnl, RANK() OVER (PARTITION BY month ORDER BY total_pnl DESC) AS rank FROM monthly_performance; -- Compare each day to the previous day SELECT trade_date, closing_price, LAG(closing_price) OVER (ORDER BY trade_date) AS prev_close, closing_price - LAG(closing_price) OVER (ORDER BY trade_date) AS daily_change FROM prices WHERE symbol = 'AAPL';
Window functions are covered in much more depth in our advanced SQL guide, but even basic familiarity will make you significantly more productive.
SQL vs Pandas: When to Use Which
A common question from Python developers: "why not just do everything in Pandas?"
Use SQL when:
- Data lives in a database (do not pull millions of rows just to filter in Python)
- You need aggregations across large datasets
- Multiple people or services need the same data
- You want the database to optimise the query plan for you
Use Pandas when:
- You need exploratory analysis with rapid iteration
- Data is already in files (CSV, Parquet)
- You need complex transformations that are awkward in SQL
- You are prototyping before building a production pipeline
In practice, most quant developers use both — SQL to extract and pre-aggregate, Pandas to analyse and model. They are complementary, not competitors.
Getting Started
Focus on SELECT, WHERE, GROUP BY, and JOIN — that covers 80% of what you will use day to day. From there, advanced SQL techniques like CTEs, window functions, and query optimisation will make you genuinely effective. And understanding database design principles will help you build systems that are fast and maintainable from the start.
Want to go deeper on SQL for Financial Data: A Developer's Starting Point?
This article covers the essentials, but there's a lot more to learn. Inside Quantt, you'll find hands-on coding exercises, interactive quizzes, and structured lessons that take you from fundamentals to production-ready skills — across 50+ courses in technology, finance, and mathematics.
Free to get started · No credit card required