Why Python Depth Matters
Most quant research roles and many quant developer roles use Python as the primary language, but interview depth varies significantly. At banks and most hedge funds, Python interviews stop at "can you write clean Python and use pandas." At top quant firms (Two Sigma, Citadel, XTX, AQR, Hudson River Trading research) the questions go further: numpy broadcasting internals, pandas memory layout, vectorisation, the GIL, and how to write Python that won't be the bottleneck in production research workflows.
This guide collects 20 worked examples from recent Two Sigma, Citadel, XTX, AQR and D. E. Shaw research and engineering interviews. For broader context, see our Python fundamentals for quant finance and Python for finance guides.
Section 1: Language Fundamentals (Questions 1-7)
1. List vs tuple
What's the difference between a list and a tuple in Python? When would you choose one over the other?
Answer: Lists are mutable; tuples are immutable. Tuples have lower memory overhead and faster instantiation, and can be used as dictionary keys as long as every element is itself hashable. Use tuples for fixed-structure, record-like data and lists for sequences you'll modify.
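A minimal sketch of those differences, using made-up values (`point`, `prices` and `grid` are illustrative names, not from any interview). The size comparison reflects CPython and may differ on other implementations.

```python
import sys

point = (3, 4)             # fixed-structure record: immutable
prices = [101.5, 102.0]    # sequence we expect to grow

prices.append(102.7)       # lists support in-place mutation

# A tuple of hashable items is hashable, so it works as a dict key.
grid = {point: "node A"}

try:
    point[0] = 5           # tuples reject item assignment
except TypeError as exc:
    error_message = str(exc)

# In CPython a tuple is smaller than a list holding the same elements.
tuple_size = sys.getsizeof((1, 2, 3))
list_size = sys.getsizeof([1, 2, 3])
```

A list used as a key (`{[3, 4]: "node A"}`) would raise `TypeError: unhashable type: 'list'`.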
2. Mutable default arguments
def f(x, lst=[]):
    lst.append(x)
    return lst
What's wrong with this? What does f(1) followed by f(2) return?
Answer: Default arguments are evaluated once, when the function is defined, so both calls share the same list. f(1) returns [1]; f(2) returns [1, 2]. Fix: declare lst=None and add "if lst is None: lst = []" inside the function.
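The bug and the fix can be sketched side by side. The names `f_buggy` and `f_fixed` are illustrative. Note a follow-up that interviewers sometimes probe: both return values of the buggy version are the same list object.

```python
def f_buggy(x, lst=[]):
    # The default list is created once and reused on every call.
    lst.append(x)
    return lst

def f_fixed(x, lst=None):
    # Create a fresh list per call unless the caller supplies one.
    if lst is None:
        lst = []
    lst.append(x)
    return lst

first = f_buggy(1)     # [1] at this point
second = f_buggy(2)    # [1, 2], and `first` now shows [1, 2] as well

fixed_first = f_fixed(1)
fixed_second = f_fixed(2)
```

The shared list is visible on the function object itself as `f_buggy.__defaults__[0]`.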
3. List comprehension vs map
[x**2 for x in range(1000)] vs list(map(lambda x: x**2, range(1000))). Which is faster, and why?
Answer: The list comprehension is typically faster here, for two reasons: (1) the lambda adds a Python function call per element, whereas the comprehension's expression is evaluated inline; (2) comprehensions use a specialised bytecode (LIST_APPEND) to build the result. The caveat: when the mapped function is a built-in and no lambda is needed, map may be marginally faster (e.g., map(int, strings) vs [int(x) for x in strings]).
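Rather than quoting numbers, it is safer to measure on your own interpreter. A small timing harness might look like this; the helper name `best_time` and the repeat counts are arbitrary choices for the sketch.

```python
import timeit

def squares_comprehension(n=1000):
    return [x**2 for x in range(n)]

def squares_map_lambda(n=1000):
    return list(map(lambda x: x**2, range(n)))

def best_time(func, repeats=5, number=200):
    # Best-of-N is less sensitive to noise from other processes.
    return min(timeit.repeat(func, repeat=repeats, number=number))

if __name__ == "__main__":
    print(f"comprehension: {best_time(squares_comprehension):.4f}s")
    print(f"map + lambda:  {best_time(squares_map_lambda):.4f}s")
```

Results vary by Python version and machine, so treat any single run as indicative only.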
4. Garbage collection
Python uses reference counting plus a cycle collector. What's a "cycle" in this context, and why is it special?
Answer: A cycle is a set of objects that reference each other, directly or indirectly. Once nothing else in the program can reach them, they are garbage, but reference counting alone can't free them because each object still has a positive count from the others. The cycle collector periodically scans container objects to find and free such groups. The trade-off: detecting cycles is expensive, so code that creates many of them (e.g., custom data structures with parent-child references) can see garbage-collection-related latency spikes.
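A small demonstration of a parent-child cycle surviving reference counting and then being reclaimed by the cycle collector. The `Node` class is a hypothetical example; automatic collection is switched off only so the before/after states are deterministic.

```python
import gc
import weakref

class Node:
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

def make_cycle():
    parent, child = Node("parent"), Node("child")
    parent.children.append(child)
    child.parent = parent           # parent <-> child reference cycle
    return weakref.ref(parent)      # a weak reference does not keep it alive

gc.disable()                        # pause automatic collection for the demo
probe = make_cycle()                # both nodes are now unreachable
alive_before = probe() is not None  # refcounts never reached zero
gc.collect()                        # cycle collector finds the unreachable pair
alive_after = probe() is not None
gc.enable()
```

A common way to avoid the cycle altogether is to hold the parent through `weakref.ref`, so that reference counting alone can free the structure.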
5. Decorators
Write a decorator that times function execution.
import time
import functools

def timed(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.4f}s")
        return result
    return wrapper

@timed
def slow_op():
    time.sleep(0.5)
The @functools.wraps preserves the wrapped function's metadata (name, docstring, etc.).
6. Context managers
Write a context manager that temporarily changes the working directory.
import os
from contextlib import contextmanager

@contextmanager
def cwd(path):
    old = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old)
The try/finally ensures cleanup even on exception.
7. *args and **kwargs
What's the difference between *args and **kwargs?
Answer: *args collects extra positional arguments into a tuple; **kwargs collects extra keyword arguments into a dict. Together they let a function forward whatever it receives (def wrapper(*args, **kwargs): return inner(*args, **kwargs)).
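A short sketch of both roles, collecting and forwarding. The function names (`describe`, `passthrough`, `notional`) are made up for illustration.

```python
def describe(*args, **kwargs):
    # args arrives as a tuple, kwargs as a dict.
    return args, kwargs

def passthrough(func):
    def wrapper(*args, **kwargs):
        # Forward whatever the caller passed, unchanged.
        return func(*args, **kwargs)
    return wrapper

@passthrough
def notional(price, quantity, *, multiplier=1):
    return price * quantity * multiplier

collected = describe(1, 2, side="buy")
value = notional(10, 3, multiplier=2)
```

The same `*` and `**` syntax unpacks at the call site: `notional(*(10, 3), **{"multiplier": 2})` is equivalent to the call above.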
Section 2: NumPy and Pandas (Questions 8-14)
8. Why is numpy faster than pure Python for arithmetic?
Answer: (1) Vectorised operations are implemented in C, so there is no Python interpreter loop per element. (2) Contiguous memory layout makes good use of the CPU cache. (3) SIMD instructions (AVX, etc.) process several values per instruction. (4) There is no Python object overhead per element (numpy stores primitive C types like int64 and float64).
9. Broadcasting
Two arrays: a shape (3, 1) and b shape (1, 4). What's the shape of a + b?
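Answer: (3, 4). NumPy compares shapes dimension by dimension, starting from the trailing one; two dimensions are compatible if they are equal or if one of them is 1, and a size-1 dimension is stretched to match the other. The result is computed as if on a (3, 4) grid, but the inputs are never physically copied or tiled; only the (3, 4) output is allocated.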
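The "no copy" claim can be checked directly. In this sketch, `np.broadcast_to` exposes the virtual stretch as a read-only view whose stride along the stretched axis is zero.

```python
import numpy as np

a = np.arange(3).reshape(3, 1)   # shape (3, 1): [[0], [1], [2]]
b = np.arange(4).reshape(1, 4)   # shape (1, 4): [[0, 1, 2, 3]]

total = a + b                    # shape (3, 4); total[i, j] == a[i, 0] + b[0, j]

# The stretched axis has stride 0: every column re-reads the same memory,
# so no element of `a` is duplicated.
stretched = np.broadcast_to(a, (3, 4))
column_stride = stretched.strides[1]
```

Shapes such as (3, 2) and (1, 4) would fail, because 2 and 4 are neither equal nor 1.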
10. View vs copy
b = a[2:5]. Does modifying b modify a?
Answer: Yes for numpy arrays - basic slicing returns a view, not a copy. To get a copy, use b = a[2:5].copy() or b = np.array(a[2:5]). (For a plain Python list the answer is no: slicing a list copies.) For pandas DataFrames, slicing behaviour is more complex (the SettingWithCopyWarning exists for this reason).
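A quick way to confirm which operations share memory is `np.shares_memory`. The sketch below also contrasts basic slicing with integer-array ("fancy") indexing, which copies, and with plain Python lists.

```python
import numpy as np

a = np.arange(10)
view = a[2:5]             # basic slice: shares a's buffer
view[0] = 99              # also changes a[2]

copy = a[2:5].copy()      # independent buffer
copy[1] = -1              # a[3] is untouched

fancy = a[[2, 3, 4]]      # integer-array indexing returns a copy

py_list = list(range(10))
py_slice = py_list[2:5]   # plain Python lists copy on slicing
py_slice[0] = 99          # py_list[2] is untouched
```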
11. Pandas indexing
Difference between .loc, .iloc, and bracket notation?
Answer: .loc[] is label-based; .iloc[] is integer-position-based; bracket notation depends on what you pass (string for column, slice for rows). The conventional rule: use .loc and .iloc explicitly for clarity. Avoid bracket notation for ambiguous cases.
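The difference is easiest to see when the index labels are integers that don't match the row positions. The DataFrame below is a toy example built for that purpose.

```python
import pandas as pd

df = pd.DataFrame(
    {"price": [10.0, 11.0, 12.0, 13.0]},
    index=[3, 2, 1, 0],           # integer labels in reverse order
)

by_label = df.loc[0, "price"]     # label 0 is the last row
by_position = df.iloc[0, 0]       # position 0 is the first row

label_slice = df.loc[3:1]         # label slices include both endpoints
position_slice = df.iloc[0:2]     # position slices exclude the end

column = df["price"]              # a string key in brackets selects a column
```

Here `by_label` and `by_position` disagree, which is exactly the ambiguity that explicit `.loc` and `.iloc` calls remove.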
12. Vectorising a calculation
Convert this loop to vectorised numpy:
result = []
for i in range(len(prices)):
    result.append(prices[i] * volumes[i] / 100)
Answer: result = prices * volumes / 100, assuming prices and volumes are numpy arrays of the same length. The multiplication is elementwise and the scalar 100 is broadcast across the result. For a million elements, the vectorised version is typically 50-100x faster than the pure-Python loop.
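A sketch that checks the two versions agree, on synthetic data (the seeded random `prices` and `volumes` are placeholders, not real market data):

```python
import numpy as np

rng = np.random.default_rng(seed=0)      # seeded for reproducibility
prices = rng.uniform(90, 110, size=10_000)
volumes = rng.integers(1, 1_000, size=10_000)

def loop_version(prices, volumes):
    result = []
    for i in range(len(prices)):
        result.append(prices[i] * volumes[i] / 100)
    return result

def vectorised_version(prices, volumes):
    # One elementwise multiply and one scalar divide, both in compiled code.
    return prices * volumes / 100

same = np.allclose(loop_version(prices, volumes),
                   vectorised_version(prices, volumes))
```

Wrapping each function in `timeit` on your own machine is the honest way to back up any speedup figure you quote in an interview.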
13. Memory layout
A pandas DataFrame with 10 columns has 1M rows. The first 5 columns are float64; the last 5 are object (strings). What's the memory layout, and what does this mean for performance?
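Answer: In the classic pandas block manager, columns are grouped into 2-D "blocks" by dtype. The 5 float64 columns are typically consolidated into one contiguous block of raw values. The object columns store only pointers; the strings themselves are separate, variable-size Python objects scattered on the heap, so every access costs an indirection and fast C loops over raw values are not possible. Operations that span dtypes, or that touch the object columns, are slower as a result. Convert to compact homogeneous dtypes when possible (e.g., categorical for strings with few unique values).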
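The cost of object columns shows up clearly in `memory_usage(deep=True)`. This is a sketch on a smaller synthetic frame; the three ticker symbols are arbitrary, and exact byte counts depend on the Python and pandas versions.

```python
import numpy as np
import pandas as pd

n = 100_000
symbols = np.random.default_rng(0).choice(["AAPL", "MSFT", "GOOG"], size=n)
df = pd.DataFrame({
    "price": np.linspace(100.0, 200.0, n),        # float64: 8 bytes per value
    "symbol": pd.Series(symbols, dtype=object),   # pointer plus a str object per value
})

float_bytes = df["price"].memory_usage(index=False, deep=True)
object_bytes = df["symbol"].memory_usage(index=False, deep=True)

# Categorical stores small integer codes plus one copy of each distinct string.
category_bytes = (
    df["symbol"].astype("category").memory_usage(index=False, deep=True)
)
```

Without `deep=True`, pandas counts only the 8-byte pointers for an object column and under-reports its real footprint.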
14. Group-by performance
df.groupby('symbol')['price'].mean() is slow on a 100M-row DataFrame. How would you speed it up?
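Answer: Several options: (1) If the data will be grouped repeatedly, sort by 'symbol' once so each group is contiguous in memory, and pass sort=False when you don't need the output keys sorted. (2) Convert 'symbol' to categorical - grouping on integer codes avoids hashing strings. (3) Use polars instead of pandas - often significantly faster for groupby on large data. (4) If you need several aggregations, compute them in a single groupby pass rather than one pass each. (5) For repeated queries, precompute and cache.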
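Options (2) and (4) can be sketched as follows, on a small synthetic frame standing in for the 100M-row one. The column name `symbol_cat` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200_000
df = pd.DataFrame({
    "symbol": rng.choice(["AAPL", "MSFT", "GOOG", "AMZN"], size=n),
    "price": rng.uniform(50, 500, size=n),
})

baseline = df.groupby("symbol")["price"].mean()

# Categorical keys let pandas group on integer codes instead of strings.
df["symbol_cat"] = df["symbol"].astype("category")
fast = df.groupby("symbol_cat", observed=True, sort=False)["price"].mean()

# Several statistics from one groupby instead of one groupby per statistic.
stats = df.groupby("symbol_cat", observed=True)["price"].agg(
    ["mean", "min", "max"]
)
```

Whether the categorical version wins, and by how much, depends on the number of distinct keys and the pandas version, so it is worth timing both on representative data.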
Section 3: Concurrency and Performance (Questions 15-20)
15. The GIL
What is the Global Interpreter Lock, and why does it matter?
Answer: The GIL serialises bytecode execution across threads in CPython: even with multiple threads, only one runs Python code at a time. CPU-bound pure-Python code therefore doesn't benefit from threading, while I/O-bound code does, because the GIL is released during blocking I/O calls. For CPU-bound parallelism, use multiprocessing (separate processes) or Cython/numba/C extensions that release the GIL.
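A sketch of the two workload types under a thread pool. Here `fake_io` uses `time.sleep` as a stand-in for a network or disk wait, and `cpu_bound` is an arbitrary pure-Python loop; both names are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(task_id, delay=0.05):
    # time.sleep releases the GIL, like a blocking socket or file read.
    time.sleep(delay)
    return task_id

def cpu_bound(n):
    # Pure-Python arithmetic holds the GIL, so threads take turns here.
    return sum(i * i for i in range(n))

def run_threaded(func, inputs, workers=8):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(func, inputs))
    return results, time.perf_counter() - start

io_results, io_elapsed = run_threaded(fake_io, range(8))
cpu_results, cpu_elapsed = run_threaded(cpu_bound, [50_000] * 4)
```

On a standard CPython build, `io_elapsed` should land near one delay rather than eight, while `cpu_elapsed` should be about what a sequential run would take. Swapping in `concurrent.futures.ProcessPoolExecutor` is the usual next step for the CPU-bound case.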
16. asyncio
When should you use asyncio over threading?
Answer: asyncio is best for I/O-bound code with high concurrency (thousands of connections). Threading is reasonable for tens of concurrent I/O operations. asyncio has lower overhead per task (typically <1KB per coroutine vs MBs of stack reserved per thread) but requires the entire codepath to be async-aware: a single blocking call stalls the whole event loop. Blocking code can be offloaded to a thread pool via run_in_executor, but if most of the work ends up there you are back to thread-level concurrency and have lost the benefit.
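A minimal sketch of both patterns: many cheap coroutines awaited together, plus one blocking call pushed to the default thread pool. `fetch` and `blocking_call` are placeholders for real network and legacy synchronous code.

```python
import asyncio
import time

async def fetch(task_id, delay=0.01):
    # asyncio.sleep yields to the event loop, like awaiting a socket.
    await asyncio.sleep(delay)
    return task_id

def blocking_call(x):
    # Synchronous code that would stall the loop if called directly.
    time.sleep(0.01)
    return x * 2

async def main(n_tasks=1000):
    results = await asyncio.gather(*(fetch(i) for i in range(n_tasks)))
    loop = asyncio.get_running_loop()
    offloaded = await loop.run_in_executor(None, blocking_call, 21)
    return results, offloaded

def run_demo(n_tasks=1000):
    return asyncio.run(main(n_tasks))
```

`asyncio.gather` returns results in the order the awaitables were passed, regardless of completion order.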
17. Numba
You have a numerical loop in Python that's the bottleneck. How does numba help?
Answer: Numba JIT-compiles Python functions to machine code via LLVM. For numerical loops, speedups of 100x+ over pure Python are common. The annotation is just @numba.jit(nopython=True), or the shorthand @numba.njit, on the function. Trade-offs: nopython mode supports only a subset of Python and numpy (no arbitrary objects, no pandas), and the first call pays a compilation cost.
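A sketch of the kind of loop numba handles well. The `max_drawdown` function is an illustrative example, and the `try/except` fallback is an assumption added so the snippet still runs, uncompiled, where numba isn't installed.

```python
import numpy as np

try:
    from numba import njit      # shorthand for jit(nopython=True)
except ImportError:
    def njit(func):             # fallback for this sketch: run as plain Python
        return func

@njit
def max_drawdown(prices):
    # Largest peak-to-trough decline, as a fraction of the running peak.
    peak = prices[0]
    worst = 0.0
    for i in range(prices.shape[0]):
        if prices[i] > peak:
            peak = prices[i]
        drawdown = (peak - prices[i]) / peak
        if drawdown > worst:
            worst = drawdown
    return worst

sample = np.array([100.0, 120.0, 90.0, 130.0, 117.0])
result = max_drawdown(sample)   # first call includes compile time under numba
```

When benchmarking, time the second call onwards so the one-off compilation isn't counted.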
18. Cython
When is Cython useful?
Answer: When you need C-level performance for code that's complex enough that numba doesn't apply, and where you want gradual migration from Python. Cython lets you add type annotations to Python code that compile to C. Common in critical pricing libraries that need both Python integration and C-level performance.
19. Memory profiling
Your script uses 5GB of RAM but you expected 1GB. How do you investigate?
Answer: (1) Use memory_profiler to track per-line memory usage. (2) Use pympler to inspect object size breakdowns. (3) Check for unintended object retention - circular references, caches that never evict, mutable default arguments that accumulate data across calls. (4) For pandas, check whether you're holding multiple copies of the same data through chained operations, and use memory_usage(deep=True) to see the real size of object columns.
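As a dependency-free alternative to the third-party tools named above, the standard library's `tracemalloc` can diff two snapshots and rank source lines by allocated memory. This sketch swaps it in for illustration; `build_leaky_cache` is a hypothetical culprit.

```python
import tracemalloc

def build_leaky_cache(n_items=20_000):
    # Hypothetical culprit: a cache that only ever grows.
    return {i: "x" * 100 + str(i) for i in range(n_items)}

tracemalloc.start()
before = tracemalloc.take_snapshot()
cache = build_leaky_cache()
after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Rank source lines by how much memory they allocated between snapshots.
top_stats = after.compare_to(before, "lineno")
biggest = top_stats[0]

if __name__ == "__main__":
    for stat in top_stats[:3]:
        print(stat)
```

Each entry reports the file, line number, size difference and allocation count, which is usually enough to find the line that is holding on to data.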
20. Why is my pandas operation slow?
A pandas operation that should take seconds takes minutes. How do you investigate?
Answer: (1) Check dtypes - an object dtype on a column that should be numeric is a common cause. (2) Check whether you're accidentally creating new DataFrames in a loop (concatenation in a loop is O(n²)). (3) Drop to numpy with .to_numpy() (the modern replacement for .values) when you don't need the index, to avoid pandas index overhead. (4) Profile with %lprun (from line_profiler) to see exactly where time is spent. (5) Consider polars or dask for very large data.
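Point (2) is worth seeing in code. The sketch below builds the same frame two ways; `make_chunk` is a stand-in for whatever produces each piece in a real pipeline.

```python
import numpy as np
import pandas as pd

def make_chunk(i, rows=100):
    return pd.DataFrame({"chunk": i, "value": np.arange(rows, dtype="float64")})

def slow_build(n_chunks=50):
    # Each concat copies everything accumulated so far: O(n^2) overall.
    out = make_chunk(0)
    for i in range(1, n_chunks):
        out = pd.concat([out, make_chunk(i)], ignore_index=True)
    return out

def fast_build(n_chunks=50):
    # Collect the pieces in a list, then copy once.
    chunks = [make_chunk(i) for i in range(n_chunks)]
    return pd.concat(chunks, ignore_index=True)
```

Both functions return identical frames; the gap in run time widens quickly as the number of chunks grows.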
How to Use This Guide
For research-track candidates at top quant firms (Two Sigma, XTX, AQR, Citadel), all 20 questions are likely fair game. For sell-side bank quants and most hedge funds, focus on Sections 1 and 2 - at those firms, questions about numpy/pandas internals and performance come up less often.
For broader prep:
- Quant interview questions hub
- Quant coding interview questions
- Python fundamentals for quant finance
- Python for finance guide