What Quant Research Interviews Test
Quant research interviews differ from quant trader and quant developer interviews in three key ways. First, the technical questions go deeper into statistics and machine learning rather than rapid mental arithmetic. Second, you'll be asked methodology questions specific to financial data - look-ahead bias, regime change, out-of-sample validation - that engineers and traders rarely face. Third, the bar on intellectual humility is unusually high. Researchers who confidently assert wrong things tend to fail; researchers who quantify their uncertainty and reason carefully tend to pass.
This guide collects 25 worked examples from recent Two Sigma, DE Shaw, XTX, Citadel, Renaissance and Hudson River Trading research interviews. For broader context, see our quant researcher salary guide, machine learning finance guide, and statistics for quantitative trading.
Section 1: Statistics Foundations (Questions 1-7)
1. Bias-variance trade-off
You have a model that achieves 99% training accuracy and 65% test accuracy. What's happening, and what would you do?
Answer: Severe overfitting (high variance). Reduce model capacity (smaller network, fewer features), add regularisation (L1/L2), use cross-validation to tune hyperparameters, gather more data, or use ensemble methods (bagging reduces variance).
2. Bias estimation
Let \(X_1, \ldots, X_n\) be iid with variance \(\sigma^2\). Show why \(s^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2\) is a biased estimator of \(\sigma^2\), and give the unbiased version.
Answer: \(E[s^2] = \frac{n-1}{n}\sigma^2\). The unbiased version uses \(n-1\) in the denominator (Bessel's correction). The bias arises because \(\bar{X}\) is computed from the same sample: it is the value that minimises the sum of squared deviations, so deviations from \(\bar{X}\) are on average smaller than deviations from the true mean.
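The size of the bias is easy to check numerically. The sketch below is illustrative only: it assumes standard normal data, a sample size of n = 5 and a hypothetical helper name, and averages both estimators over many simulated samples.

```python
import random

def mean_variance_estimates(n, trials, seed=0):
    """Average the 1/n and 1/(n-1) variance estimators over many N(0, 1) samples."""
    rng = random.Random(seed)
    biased_sum = 0.0
    unbiased_sum = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(xs) / n
        sum_sq = sum((x - sample_mean) ** 2 for x in xs)
        biased_sum += sum_sq / n          # divides by n
        unbiased_sum += sum_sq / (n - 1)  # Bessel's correction
    return biased_sum / trials, unbiased_sum / trials

n = 5
biased, unbiased = mean_variance_estimates(n=n, trials=20000)
print(f"1/n estimator:     {biased:.3f}  (theory: {(n - 1) / n:.3f})")
print(f"1/(n-1) estimator: {unbiased:.3f}  (theory: 1.000)")
```

With a true variance of 1, the first average should sit near (n-1)/n = 0.8 and the second near 1.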
3. Hypothesis testing
You run an A/B test on two trading strategies. Strategy A returns 10% with std dev 2%; Strategy B returns 11% with std dev 2%, both over 60 days. Is B significantly better?
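Answer: Compute the two-sample t-statistic, treating 10% and 11% as sample means over 60 observations each, with a per-observation standard deviation of 2% and independent samples: \(t = \frac{11 - 10}{\sqrt{2^2/60 + 2^2/60}} = \frac{1}{0.365} \approx 2.74\). At α = 0.05 (two-sided critical value ≈ 2.0 with large degrees of freedom), B is significantly better. But discuss whether 60 daily returns is enough; whether the strategies are independent (probably not, in which case a paired test on the return differences is more appropriate); and whether the variance estimate is itself reliable.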
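The arithmetic can be reproduced in a few lines. This is a sketch under the same simplifying assumptions as the calculation above (independent samples, and a normal approximation for the p-value); the function name is illustrative.

```python
import math
from statistics import NormalDist

def two_sample_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t-statistic for the difference in means (B minus A), unpooled variances."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_b - mean_a) / standard_error

t_stat = two_sample_t(10.0, 2.0, 60, 11.0, 2.0, 60)
# Normal approximation to the two-sided p-value; reasonable at ~118 degrees of freedom.
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
print(f"t = {t_stat:.2f}, approximate two-sided p = {p_value:.4f}")
```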
4. Multiple testing
You test 100 trading signals. 5 reject the null at p < 0.05. Are any real?
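Answer: Not necessarily. With 100 tests at α = 0.05 you expect about 5 false positives even if every signal is pure noise, so this result is exactly what chance alone would produce. Apply Bonferroni (p < 0.05/100 = 0.0005 needed) to control the family-wise error rate, or Benjamini-Hochberg to control the false discovery rate. The interviewer is testing whether you raise multiple-testing correction without prompting.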
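Both corrections are short enough to write out by hand. The sketch below is a minimal implementation with made-up p-values: five "hits" just under 0.05 and 95 others spread between 0.06 and 1.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject only p-values at or below alpha / m."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]

def benjamini_hochberg(p_values, alpha=0.05):
    """Reject the k smallest p-values, where k is the largest rank with p_(k) <= k/m * alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected

# Hypothetical p-values for 100 signals: 5 nominal hits, 95 clear misses.
hits = [0.01, 0.02, 0.03, 0.04, 0.045]
misses = [0.05 + 0.01 * (i + 1) for i in range(95)]
p_values = hits + misses

print("Bonferroni rejections:", sum(bonferroni(p_values)))
print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg(p_values)))
```

On these illustrative inputs neither procedure keeps any of the five nominal hits, which matches the intuition that five rejections in 100 tests are unremarkable.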
5. Maximum likelihood
You observe data (x_1, ..., x_n) drawn iid from N(μ, σ²). What are the MLE estimates of μ and σ²?
Answer: \(\hat{\mu} = \bar{x}\), \(\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2\). Note that the MLE of the variance is the biased version (it uses n, not n-1).
6. Bayesian update
You start with a prior of N(0, 1) on a parameter μ. You observe a single value x ~ N(μ, σ²). What's the posterior distribution of μ?
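Answer: \(\mu \mid x \sim N\left(\frac{x}{1 + \sigma^2}, \frac{\sigma^2}{1 + \sigma^2}\right)\). The posterior is Gaussian, with posterior precision equal to prior precision plus observation precision, \(1 + 1/\sigma^2\). The posterior mean is the precision-weighted average of the prior mean (0) and the observation x.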
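The precision-weighting rule translates directly into code. A minimal sketch follows; the function name and the example values (x = 2, σ² = 4) are illustrative.

```python
def gaussian_posterior(x, obs_var, prior_mean=0.0, prior_var=1.0):
    """Posterior mean and variance of mu after one observation x ~ N(mu, obs_var)."""
    prior_precision = 1.0 / prior_var
    obs_precision = 1.0 / obs_var
    post_precision = prior_precision + obs_precision
    post_mean = (prior_precision * prior_mean + obs_precision * x) / post_precision
    return post_mean, 1.0 / post_precision

post_mean, post_var = gaussian_posterior(x=2.0, obs_var=4.0)
print(f"posterior mean = {post_mean:.2f}, posterior variance = {post_var:.2f}")
```

For these inputs the closed form gives a mean of 2/(1+4) = 0.4 and a variance of 4/(1+4) = 0.8: a noisy observation pulls the estimate only part of the way from the prior mean towards x.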
7. Central limit theorem
You're computing the mean of 1000 iid samples from a heavy-tailed distribution (e.g., Cauchy). Does CLT apply?
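Answer: No. The classical CLT requires finite variance, and the Cauchy distribution has neither a finite variance nor a defined mean. The sample mean of n iid Cauchy variables is itself Cauchy with the same scale (not Gaussian), so averaging 1000 samples is no more precise than a single draw. This is a real issue with financial returns: even where variance is finite, heavy tails can make convergence to normality slow, so Gaussian approximations may be poor even with thousands of observations.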
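A small simulation makes the contrast concrete. The sketch below (helper names are illustrative) compares the interquartile range of sample means for standard Cauchy and standard normal draws; the IQR is used because the Cauchy has no variance to estimate.

```python
import math
import random

def cauchy_draw(rng):
    """Standard Cauchy draw via the inverse CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))

def normal_draw(rng):
    return rng.gauss(0.0, 1.0)

def iqr_of_sample_means(draw, n, trials, rng):
    """Interquartile range of the sample mean of n draws, estimated over many trials."""
    means = sorted(sum(draw(rng) for _ in range(n)) / n for _ in range(trials))
    return means[(3 * trials) // 4] - means[trials // 4]

rng = random.Random(0)
trials = 500
cauchy_iqr_single = iqr_of_sample_means(cauchy_draw, 1, trials, rng)
cauchy_iqr_big = iqr_of_sample_means(cauchy_draw, 1000, trials, rng)
normal_iqr_single = iqr_of_sample_means(normal_draw, 1, trials, rng)
normal_iqr_big = iqr_of_sample_means(normal_draw, 1000, trials, rng)

print(f"Cauchy IQR of mean: n=1 {cauchy_iqr_single:.2f}, n=1000 {cauchy_iqr_big:.2f}")
print(f"Normal IQR of mean: n=1 {normal_iqr_single:.2f}, n=1000 {normal_iqr_big:.3f}")
```

The normal IQR should shrink by roughly √1000, while the Cauchy IQR should stay near its single-draw value of 2.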
Section 2: Machine Learning (Questions 8-14)
8. Regularisation
Compare L1 and L2 regularisation. When would you choose one over the other?
Answer: L1 (lasso) penalises absolute weight magnitudes and produces sparse solutions - useful when you suspect many features are irrelevant. L2 (ridge) penalises squared magnitudes; it shrinks all weights without setting any to exactly zero - useful under multicollinearity. Elastic net combines both. With high-dimensional financial data, L1 or elastic net is often preferred over pure L2 when only a few features are expected to carry signal, but the choice should be validated out-of-sample rather than assumed.
9. Cross-validation in time series
Why is K-fold cross-validation problematic for financial time series? What should you use instead?
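Answer: K-fold assigns observations to folds without regard to time order, so most training sets contain data from after the test fold, leaking future information into the model. Shuffling makes this worse, and autocorrelated or overlapping observations leak across fold boundaries even without it. For time series use walk-forward validation: train on data up to time T, test on T+1 to T+H, then advance the window. Be explicit about the validation horizon and the assumed holding period.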
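The train-then-advance scheme can be sketched as a small index generator. The function name and parameters below are illustrative, not a library API; the `expanding` flag switches between a rolling and an expanding training window.

```python
def walk_forward_splits(n_obs, train_size, test_size, expanding=False):
    """Yield (train_indices, test_indices); every test index comes after every train index."""
    end_train = train_size
    while end_train + test_size <= n_obs:
        train_start = 0 if expanding else end_train - train_size
        yield range(train_start, end_train), range(end_train, end_train + test_size)
        end_train += test_size  # advance the window by one test block

splits = list(walk_forward_splits(n_obs=10, train_size=4, test_size=2))
for train, test in splits:
    print("train", list(train), "test", list(test))
```

In practice a gap between the training and test blocks, at least as long as the holding period, is often added so that overlapping labels do not straddle the boundary.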
10. Train-test split with regime change
You have 10 years of data. The first 7 years had bull markets, the last 3 had a bear market. You split chronologically. Your model fails out-of-sample. What happened?
Answer: Regime change. The features that mattered in bull markets (e.g., momentum) often reverse in bear markets. Solutions: include regime indicators as features, use ensemble models that combine regime-specific predictors, or use rolling-window training that gives more weight to recent data.
11. Feature selection
You have 1000 candidate features and 5 years of monthly returns (60 observations). How do you select features?
Answer: With 60 observations, you can't naively use 1000 features. Options: (1) L1 regularisation for automatic selection. (2) Univariate filters first - rank features by correlation with the target, keep the top K. (3) PCA to reduce dimensionality. (4) Demand economic plausibility - if you can't articulate why a feature should matter, exclude it. Any data-driven selection step must sit inside the validation loop, otherwise the selection itself leaks information. With 1000 candidates and only 60 observations, the risk of false positives is enormous.
12. Tree-based methods
When would you choose XGBoost over a linear model for a regression problem on financial data?
Answer: When you suspect non-linearities or interactions between features that a linear model can't capture, and you have enough data to estimate those interactions reliably. With <1000 observations, often the linear model wins; with >10,000 and rich features, gradient boosting usually outperforms.
13. Walk-forward analysis
You backtest a signal on 2010-2020 data with annual retraining. Sharpe is 1.8. You retrain monthly instead. Sharpe drops to 0.6. What's happening?
Answer: Almost certainly overfitting amplified by frequent retraining. With monthly retrains, you're effectively doing 132 retraining events over 11 years, each fitted to recent noise. The annual retrain has only 11 events; less ability to chase noise. The right test: try out-of-sample on 2021-present with each retraining frequency.
14. Model selection: AIC vs BIC
What's the difference between AIC and BIC, and when would you prefer one over the other?
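Answer: AIC = -2 log L + 2k; BIC = -2 log L + k log n, where L is the maximised likelihood, k the number of fitted parameters and n the number of observations. BIC's per-parameter penalty exceeds AIC's once log n > 2 (n ≥ 8), so BIC penalises complexity more and AIC tends to select more complex models. Use BIC when you want a parsimonious model and have enough data; AIC when you care about predictive performance and don't mind extra parameters.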
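A short numerical example shows the two criteria disagreeing. The log-likelihoods, parameter counts and sample size below are hypothetical values chosen for illustration.

```python
import math

def aic(log_likelihood, k):
    return -2.0 * log_likelihood + 2.0 * k

def bic(log_likelihood, k, n):
    return -2.0 * log_likelihood + k * math.log(n)

n = 500
# Hypothetical fits: the larger model gains 6 log-likelihood points for 5 extra parameters.
small = {"log_likelihood": -520.0, "k": 3}
large = {"log_likelihood": -514.0, "k": 8}

aic_small, aic_large = aic(**small), aic(**large)
bic_small, bic_large = bic(**small, n=n), bic(**large, n=n)

print(f"AIC: small {aic_small:.1f}, large {aic_large:.1f}")
print(f"BIC: small {bic_small:.1f}, large {bic_large:.1f}")
```

Lower is better for both. Here AIC favours the larger model while BIC favours the smaller one, because log(500) ≈ 6.2 per parameter is a much stiffer penalty than 2.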
Section 3: Time Series and Signal Design (Questions 15-20)
15. Stationarity
Why is stationarity important for financial time series modelling, and how do you check for it?
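Answer: Most time-series methods (ARMA, GARCH) assume weak stationarity: a constant unconditional mean and variance, with autocovariances that depend only on the lag. ARIMA handles unit-root series by differencing them first. Financial returns are usually approximately stationary; prices are not (they behave like a random walk). Check via the Augmented Dickey-Fuller test (null hypothesis: unit root), the KPSS test (null hypothesis: stationarity), or visual inspection of rolling statistics.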
16. Cointegration
Two stocks A and B both follow random walks. Their prices diverge over time. But the spread A - β·B is stationary. How would you trade this?
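Answer: Trade mean reversion in the spread: buy the spread (long A, short β units of B) when it is below its mean, and sell it when above. Estimate β by regressing A on B (the first step of the Engle-Granger procedure) or via the Johansen procedure, and test that the resulting spread is stationary. Cautions: cointegration relationships can break (especially in stressed markets); transaction costs may eat the alpha; and the spread may stay away from its mean for long periods.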
17. Autocorrelation
You compute the autocorrelation of daily returns of a stock and find significant correlation at lag 1. What does this mean?
Answer: It depends on the sign. Positive lag-1 autocorrelation means daily returns show momentum (continuation); negative autocorrelation means they show reversal. Possible explanations: market microstructure effects (bid-ask bounce can produce negative autocorrelation; persistent trading can produce positive); slow-moving information arrival; or genuine inefficiency. Be skeptical - many "edges" at the lag-1 horizon are bid-ask bounce, not exploitable.
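The bid-ask bounce effect can be reproduced with a toy model. The sketch below assumes a Roll-style setup: a random-walk mid price, with each trade landing at the bid or the ask with equal probability. It works with price changes rather than returns, and all parameter values are illustrative.

```python
import random

def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i - 1] - mean) for i in range(1, n))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den

def simulate_trade_price_changes(n, sigma, half_spread, seed=0):
    """Changes in traded price: random-walk mid plus a random bid/ask offset."""
    rng = random.Random(seed)
    mid = 100.0
    prev_trade = None
    changes = []
    for _ in range(n):
        mid += rng.gauss(0.0, sigma)
        side = rng.choice((-1, 1))  # -1: trade at the bid, +1: trade at the ask
        trade = mid + side * half_spread
        if prev_trade is not None:
            changes.append(trade - prev_trade)
        prev_trade = trade
    return changes

with_bounce = lag1_autocorr(simulate_trade_price_changes(20000, sigma=0.05, half_spread=0.05))
no_bounce = lag1_autocorr(simulate_trade_price_changes(20000, sigma=0.05, half_spread=0.0))
print(f"lag-1 autocorrelation with bounce: {with_bounce:.3f}")
print(f"lag-1 autocorrelation without bounce: {no_bounce:.3f}")
```

In this model the theoretical lag-1 autocorrelation is -c²/(σ² + 2c²) for half-spread c, which is -1/3 for the values used here, even though the mid price is a pure random walk with nothing to exploit.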
18. Volatility clustering
Financial returns show low autocorrelation but absolute returns show high autocorrelation. What's the standard model for this?
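Answer: GARCH (Generalised Autoregressive Conditional Heteroskedasticity). Returns are approximately uncorrelated, but their conditional variance is autocorrelated: in GARCH(1,1), \(\sigma_t^2 = \omega + \alpha r_{t-1}^2 + \beta \sigma_{t-1}^2\), so large moves tend to be followed by large moves of either sign. Realised vol today predicts realised vol tomorrow.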
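The stylised fact can be reproduced by simulating a GARCH(1,1) process with Gaussian shocks. The parameters below (ω = 1e-6, α = 0.10, β = 0.85) are illustrative assumptions, not estimates from real data.

```python
import math
import random

def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i - 1] - mean) for i in range(1, n))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den

def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate returns with conditional variance omega + alpha * r^2 + beta * var."""
    rng = random.Random(seed)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    returns = []
    for _ in range(n):
        r = math.sqrt(var) * rng.gauss(0.0, 1.0)
        returns.append(r)
        var = omega + alpha * r * r + beta * var
    return returns

returns = simulate_garch11(20000, omega=1e-6, alpha=0.10, beta=0.85)
acf_returns = lag1_autocorr(returns)
acf_abs = lag1_autocorr([abs(r) for r in returns])
print(f"lag-1 autocorrelation of returns:   {acf_returns:.3f}")
print(f"lag-1 autocorrelation of |returns|: {acf_abs:.3f}")
```

The raw returns should show autocorrelation close to zero, while the absolute returns show clearly positive autocorrelation.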
19. Spurious correlation
Two unrelated random walks can show R² > 0.5 over thousands of observations. What's going on, and how do you guard against it?
Answer: Both processes are non-stationary (integrated of order 1). The standard regression doesn't handle this - the t-statistic is unreliable. Solutions: regress on differences (returns) instead of levels (prices); test for cointegration before trusting a relationship; use error-correction models if cointegrated.
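The effect is easy to demonstrate by simulation. The sketch below (sample sizes and helper names are illustrative) generates pairs of independent Gaussian random walks and compares R² from regressing levels on levels with R² from regressing differences on differences.

```python
import random

def r_squared(xs, ys):
    """R^2 of a simple linear regression of ys on xs (squared sample correlation)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

def random_walk(n, rng):
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path

def diffs(path):
    return [b - a for a, b in zip(path, path[1:])]

rng = random.Random(0)
trials, n = 200, 500
levels_r2, diffs_r2 = [], []
for _ in range(trials):
    a, b = random_walk(n, rng), random_walk(n, rng)  # independent by construction
    levels_r2.append(r_squared(a, b))
    diffs_r2.append(r_squared(diffs(a), diffs(b)))

mean_levels = sum(levels_r2) / trials
mean_diffs = sum(diffs_r2) / trials
share_high = sum(r2 > 0.5 for r2 in levels_r2) / trials
print(f"mean R^2 on levels: {mean_levels:.3f}, on differences: {mean_diffs:.4f}")
print(f"share of level regressions with R^2 > 0.5: {share_high:.2f}")
```

Differencing removes the shared stochastic trend, so R² on differences collapses towards zero while R² on levels stays large in a sizeable share of trials.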
20. Signal decay
You build a signal with Sharpe 2.0 in 2015-2020 backtest. By 2024 the Sharpe is 0.5. What likely happened?
Answer: Signal decay - other market participants are likely arbitraging the same edge. This is the norm rather than the exception in finance; alphas don't last. Solutions: continually research new signals; use signal portfolios (don't depend on a single signal); be honest about expected decay rates.
Section 4: Practical Research Methodology (Questions 21-25)
21. Look-ahead bias
You build a signal that uses each stock's market cap as a feature. You use today's market cap, computed as today's closing price × shares outstanding. What's wrong?
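Answer: Look-ahead bias - today's close is not known until the end of the session, so a signal that has to be formed before or during today's trading is using information from the future (in effect, using today's price to predict today's return). The fix: lag market cap by one day (use yesterday's close × shares outstanding as known yesterday for today's signal).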
22. Survivorship bias
You backtest a strategy on the current S&P 500 constituents from 2000 to today. Sharpe is 2.0. What's wrong?
Answer: Survivorship bias - you're testing on companies that survived to be in today's index. Companies that were delisted (often due to poor performance) are excluded. Use point-in-time index constituents instead.
23. Snooping bias
You build 100 signals; one has Sharpe 3 backtested. Should you trade it?
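Answer: No - or at least, only with great caution. The best of 100 pure-noise backtests will look impressive: the expected maximum of 100 independent standard normal draws is about 2.5, and under the null of zero true Sharpe the standard error of an annualised Sharpe estimate is roughly 1/√T for T years of data. With a one-year backtest the best noise Sharpe is therefore around 2.5, and a 3 is not remarkable; longer backtests shrink this figure. Apply a multiple-testing correction (Bonferroni, or the deflated Sharpe ratio of Bailey and López de Prado). Trade only if the signal survives correction, has economic intuition, and works out-of-sample on data not used in selection.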
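How large the best pure-noise Sharpe gets depends on the backtest length and on how independent the signals are. The sketch below assumes 100 independent signals, each with one year (252 days) of zero-mean Gaussian daily returns, and averages the best annualised Sharpe over repeated experiments; all names and parameters are illustrative.

```python
import math
import random

def annualised_sharpe(daily_returns, periods_per_year=252):
    n = len(daily_returns)
    mean = sum(daily_returns) / n
    var = sum((r - mean) ** 2 for r in daily_returns) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def best_noise_sharpe(n_signals, n_days, rng):
    """Best annualised Sharpe among n_signals strategies whose true Sharpe is zero."""
    return max(
        annualised_sharpe([rng.gauss(0.0, 0.01) for _ in range(n_days)])
        for _ in range(n_signals)
    )

rng = random.Random(0)
trials = 40
best = [best_noise_sharpe(100, 252, rng) for _ in range(trials)]
avg_best = sum(best) / trials
print(f"average best-of-100 Sharpe from pure noise: {avg_best:.2f}")
```

Under these assumptions the average best Sharpe should land in the region of 2.5, so a headline Sharpe of that order from a short backtest is weak evidence on its own.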
24. Transaction costs
Your simulated strategy has Sharpe 2.0 with no transaction costs. With 10 bps round-trip, Sharpe is 0.3. What's going on?
Answer: The strategy turns over too quickly. High-turnover strategies require very low costs to be profitable. Solutions: (1) reduce turnover (longer holding periods, larger thresholds for trade signals); (2) only trade highly liquid instruments where costs are low; (3) use cost-aware portfolio optimisation that explicitly models the cost-return trade-off.
25. Why your model failed in production
A model that backtested well failed in live trading after 3 months. What are the possible causes, in order of likelihood?
Answer (typical order):
- Look-ahead bias in the backtest (most common; subtle data-pipeline issues).
- Transaction cost mis-modelling - real costs higher than assumed, especially in stressed markets.
- Capacity issues - the strategy works at small size but moves the market at production size.
- Regime change - market conditions differ from training.
- Bug in production code - the live model differs from the backtested one.
- Genuine alpha decay - the edge has been arbitraged away.
A disciplined post-mortem checks these in order and tests each hypothesis with specific diagnostics.
How to Use This Guide
For research-track preparation, work through Sections 1-3 thoroughly. For senior researcher interviews, Section 4 (methodology) is often the hardest - the questions feel obvious but the answers separate experienced researchers from juniors.
For broader prep:
- Quant interview questions hub
- Statistics for quantitative trading
- Machine learning finance guide
- Quant researcher salary guide