The Art of Finding the Best
Almost every interesting problem in quantitative finance is, at its core, an optimisation problem. What is the best portfolio? What model parameters best fit the data? What hedge minimises residual risk? Where should I place a limit order to minimise execution cost?
Optimisation is the mathematical machinery for answering "what is best?" — and it ties together calculus, linear algebra, and probability in one neat package.
The Basic Idea
An optimisation problem has three parts:
- Objective function — what you want to maximise or minimise (portfolio return, tracking error, execution cost)
- Decision variables — what you can control (portfolio weights, hedge ratios, order sizes)
- Constraints — rules you must follow (weights sum to 1, no shorting, maximum position size)
In mathematical terms:
[ \min_{\mathbf{w}} f(\mathbf{w}) \quad \text{subject to} \quad g_i(\mathbf{w}) \leq 0 ]
The Markowitz portfolio optimisation is the classic example: minimise portfolio variance subject to achieving a target return and weights summing to one.
First-Order Conditions
For unconstrained optimisation, the minimum occurs where the derivative (or gradient) equals zero:
[ \nabla f(\mathbf{w}) = \mathbf{0} ]
This is the multivariable extension of "set the derivative to zero" from A-level maths. The gradient ( \nabla f ) is a vector of partial derivatives — one for each variable — and setting it to zero gives you a system of equations to solve.
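As a sanity check, here is a minimal SymPy sketch that solves ( \nabla f = \mathbf{0} ) for a small two-variable quadratic; the objective and its coefficients are invented purely for illustration.

```python
import sympy as sp

w1, w2 = sp.symbols('w1 w2')

# An illustrative quadratic objective (made-up coefficients)
f = 3*w1**2 + 2*w2**2 + w1*w2 - w1 - 2*w2

# Gradient: one partial derivative per variable
grad = [sp.diff(f, v) for v in (w1, w2)]

# Setting the gradient to zero gives a linear system to solve
stationary_point = sp.solve(grad, [w1, w2])
print(stationary_point)  # {w1: 2/23, w2: 11/23}
```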
Second-Order Check
Not every point where the gradient is zero is a minimum — it could be a maximum or a saddle point. The Hessian matrix (matrix of second derivatives) tells you which:
- All positive eigenvalues → minimum
- All negative eigenvalues → maximum
- Mixed signs → saddle point
For the quadratic objective in portfolio theory, the Hessian is ( 2\Sigma ), where ( \Sigma ) is the covariance matrix. It is positive semi-definite by construction, and positive definite whenever no asset is a linear combination of the others. In that case the optimisation is well-behaved: the problem is convex and the minimum is unique.
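In code, the check is one eigenvalue computation. A minimal sketch, with illustrative Hessians rather than anything estimated from data:

```python
import numpy as np

def classify_stationary_point(hessian):
    """Classify a stationary point from the eigenvalues of its (symmetric) Hessian."""
    eigenvalues = np.linalg.eigvalsh(hessian)
    if np.all(eigenvalues > 0):
        return 'minimum'
    if np.all(eigenvalues < 0):
        return 'maximum'
    return 'saddle point'

# Illustrative 2x2 Hessians (made-up numbers)
print(classify_stationary_point(np.array([[2.0, 0.5], [0.5, 1.0]])))   # minimum
print(classify_stationary_point(np.array([[2.0, 0.0], [0.0, -1.0]])))  # saddle point
```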
Constrained Optimisation: Lagrange Multipliers
When you have constraints (and you always do in finance), you use Lagrange multipliers. The idea: at the optimum, the gradient of the objective must be a linear combination of the gradients of the constraints.
[ \nabla f = \sum_i \lambda_i \nabla g_i ]
The ( \lambda_i ) values are Lagrange multipliers, and they have a beautiful interpretation: each one tells you how much the optimal value would improve if you relaxed that constraint slightly. In portfolio terms, the multiplier on the return constraint tells you the marginal cost (in extra risk) of demanding a slightly higher return.
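One worked example makes the machinery concrete. Drop the return constraint for a moment and minimise variance subject only to full investment. The Lagrangian is

[ \mathcal{L}(\mathbf{w}, \lambda) = \mathbf{w}^T \Sigma \mathbf{w} - \lambda (\mathbf{w}^T \mathbf{1} - 1) ]

Setting ( \nabla_{\mathbf{w}} \mathcal{L} = 2\Sigma \mathbf{w} - \lambda \mathbf{1} = \mathbf{0} ) gives ( \mathbf{w} = \frac{\lambda}{2} \Sigma^{-1} \mathbf{1} ), and substituting into ( \mathbf{w}^T \mathbf{1} = 1 ) pins down ( \lambda ), leaving the familiar minimum-variance portfolio:

[ \mathbf{w}^* = \frac{\Sigma^{-1} \mathbf{1}}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}} ]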
Portfolio Optimisation — The Real Thing
The Markowitz problem:
[ \min_{\mathbf{w}} \mathbf{w}^T \Sigma \mathbf{w} \quad \text{s.t.} \quad \mathbf{w}^T \boldsymbol{\mu} = r_{\text{target}}, \quad \mathbf{w}^T \mathbf{1} = 1 ]
This is a quadratic programme: the objective is quadratic in the decision variables and the constraints are linear. With only these equality constraints it has a closed-form solution, and the framework it launched earned Markowitz a share of the 1990 Nobel Memorial Prize in Economics.
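Because both constraints are equalities, the Lagrange conditions and the constraints together form one linear system (the KKT system), which is all "closed form" means here. A minimal NumPy sketch with made-up returns and covariances:

```python
import numpy as np

# Illustrative inputs: 3 assets (made-up numbers)
mu = np.array([0.08, 0.10, 0.12])          # expected returns
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])      # covariance matrix
r_target = 0.10
ones = np.ones(3)

# Stationarity 2*Sigma*w - lam*mu - gam*1 = 0, stacked with the two
# constraints, gives one (n+2) x (n+2) linear system in (w, lam, gam).
kkt = np.block([[2 * Sigma,     -mu[:, None], -ones[:, None]],
                [mu[None, :],   np.zeros((1, 2))],
                [ones[None, :], np.zeros((1, 2))]])
rhs = np.concatenate([np.zeros(3), [r_target, 1.0]])

solution = np.linalg.solve(kkt, rhs)
w_opt, lam, gam = solution[:3], solution[3], solution[4]
print(w_opt, w_opt @ mu, w_opt @ ones)  # weights hit both constraints
```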
In practice, real portfolios add more constraints:
- No shorting: ( w_i \geq 0 )
- Sector limits: total weight in tech ( \leq ) 30%
- Turnover limits: restrict trading to reduce transaction costs
These make the problem harder (no closed-form solution), but modern optimisation libraries handle them comfortably.
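For these harder versions, a modelling library is the practical route. A hedged CVXPY sketch of the long-only problem with a sector cap; the four assets, their statistics, and the "tech" bucket are all invented:

```python
import cvxpy as cp
import numpy as np

# Made-up data: 4 assets, the first two labelled "tech" for the sector cap
mu = np.array([0.09, 0.11, 0.07, 0.06])
Sigma = np.array([[0.0400, 0.0120, 0.0040, 0.0020],
                  [0.0120, 0.0900, 0.0060, 0.0030],
                  [0.0040, 0.0060, 0.0250, 0.0050],
                  [0.0020, 0.0030, 0.0050, 0.0150]])
r_target = 0.08

w = cp.Variable(4)
objective = cp.Minimize(cp.quad_form(w, Sigma))   # portfolio variance
constraints = [
    cp.sum(w) == 1,          # fully invested
    mu @ w >= r_target,      # hit the return target
    w >= 0,                  # no shorting
    cp.sum(w[:2]) <= 0.30,   # "tech" sector cap
]
problem = cp.Problem(objective, constraints)
problem.solve()
print(w.value)
```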
Numerical Methods
When analytical solutions do not exist, we use iterative algorithms:
Gradient Descent
The simplest: step downhill. At each iteration, move in the direction of steepest descent:
[ \mathbf{w}_{k+1} = \mathbf{w}_k - \alpha \nabla f(\mathbf{w}_k) ]
where ( \alpha ) is the step size (learning rate): too large and the iterates overshoot and diverge, too small and convergence crawls. Gradient descent is simple, cheap per iteration, and the foundation of most machine learning optimisation.
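A minimal sketch of the update rule on the unconstrained mean-variance objective ( f(\mathbf{w}) = \mathbf{w}^T \Sigma \mathbf{w} - \boldsymbol{\mu}^T \mathbf{w} ), with made-up inputs; the analytic optimum ( \frac{1}{2} \Sigma^{-1} \boldsymbol{\mu} ) gives an independent check:

```python
import numpy as np

# Made-up inputs
mu = np.array([0.08, 0.10, 0.12])
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])

def grad(w):
    # Gradient of f(w) = w' Sigma w - mu' w
    return 2 * Sigma @ w - mu

w = np.zeros(3)   # starting point
alpha = 0.5       # step size, tuned by hand for this problem
for _ in range(500):
    w = w - alpha * grad(w)

print(w)                                 # gradient-descent answer
print(0.5 * np.linalg.solve(Sigma, mu))  # analytic optimum for comparison
```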
Newton's Method
Uses second-derivative information (the Hessian) for much faster convergence:
[ \mathbf{w}_{k+1} = \mathbf{w}_k - H^{-1} \nabla f(\mathbf{w}_k) ]
where ( H ) is the Hessian of ( f ) evaluated at ( \mathbf{w}_k ). Each step is more expensive (you must form the Hessian and solve a linear system), but far fewer steps are needed. It is a standard choice in model calibration, where precision matters.
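For the quadratic objective used in the gradient-descent sketch above, the Hessian is the constant matrix ( H = 2\Sigma ), so Newton's method reaches the optimum in exactly one step. A minimal illustration with the same made-up inputs:

```python
import numpy as np

mu = np.array([0.08, 0.10, 0.12])
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])

w = np.zeros(3)   # same starting point as before
H = 2 * Sigma     # Hessian of f(w) = w' Sigma w - mu' w

# Solve H s = grad(w) rather than forming the inverse explicitly
step = np.linalg.solve(H, 2 * Sigma @ w - mu)
w_newton = w - step

print(w_newton)                          # one Newton step
print(0.5 * np.linalg.solve(Sigma, mu))  # analytic optimum: they match
```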
Convexity
If the objective function is convex (bowl-shaped), any local minimum is also the global minimum. Quadratic portfolio optimisation is convex, which is why it is so well-behaved. Non-convex problems (common in options calibration) are harder — you might find a local minimum that is not the best overall.
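A toy demonstration of why non-convexity bites: the same optimiser started from different points settles in different local minima. The objective here is invented purely for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# A toy non-convex function with several local minima
f = lambda x: np.sin(3 * x[0]) + 0.1 * x[0] ** 2

for x0 in (-2.0, 0.0, 2.0):
    result = minimize(f, x0=[x0])
    print(f"start {x0:+.1f} -> x* = {result.x[0]:+.3f}, f = {result.fun:.3f}")
```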
Model Calibration
Beyond portfolios, optimisation is used to fit models to data:
- Calibrating Black-Scholes: find the implied volatility that makes the model price match the market price (a root-finding sketch follows this list)
- Fitting yield curves: find the parameters that best reproduce observed bond prices
- Estimating factor models: regression is itself an optimisation — minimise the sum of squared residuals
[ \min_{\boldsymbol{\beta}} \sum_{i=1}^{n} (y_i - \mathbf{x}_i^T \boldsymbol{\beta})^2 ]
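The first bullet is the simplest case: calibrating a single implied volatility is one-dimensional root finding. A minimal sketch with made-up market inputs, using Brent's method from scipy.optimize:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

# Made-up market quote for a 1-year at-the-money call
S, K, T, r = 100.0, 100.0, 1.0, 0.05
market_price = 10.45

# Find the sigma at which model price equals market price
implied_vol = brentq(lambda s: bs_call(S, K, T, r, s) - market_price, 1e-6, 5.0)
print(implied_vol)  # roughly 0.20 for these inputs
```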
Getting Practical
SciPy's optimize module and CVXPY are the standard Python tools:
```python
import numpy as np
from scipy.optimize import minimize

def portfolio_variance(w, cov_matrix):
    return w @ cov_matrix @ w

# Illustrative covariance matrix for four assets (made-up numbers)
cov_matrix = np.array([[0.0400, 0.0120, 0.0040, 0.0020],
                       [0.0120, 0.0900, 0.0060, 0.0030],
                       [0.0040, 0.0060, 0.0250, 0.0050],
                       [0.0020, 0.0030, 0.0050, 0.0150]])

result = minimize(
    portfolio_variance,
    x0=[0.25, 0.25, 0.25, 0.25],
    args=(cov_matrix,),
    method='SLSQP',
    constraints=[{'type': 'eq', 'fun': lambda w: np.sum(w) - 1}],  # fully invested
    bounds=[(0, 1)] * 4,                                           # no shorting
)
print(result.x)  # minimum-variance long-only weights
```
Want to go from theory to working code? Quantt builds optimisation skills progressively, connecting each mathematical concept to its financial application with interactive exercises. It is the fastest way to go from "I sort of understand gradients" to "I can build a portfolio optimiser."