Sensitivity, Conjugates, and the Spread: Convex Optimisation for Sports Trading (Part 2)
Part 1 ended with a claim: the dual variables from your optimisation solver tell you where your system is bottlenecked. Constitution Hill's liquidity constraint had a shadow price of £0.09 per pound. Enable's was zero. The budget constraint cost £0.03.
Those were static numbers — snapshots at a single set of parameters. But markets move. Edges shift. Liquidity appears and disappears. A Saturday afternoon is not one optimisation problem; it's a sequence of them, each slightly different from the last. The real question is: how does the optimal plan change when the inputs change?
That question is sensitivity analysis, and it builds on everything from Part 1 — Lagrangians, KKT, shadow prices. But this post goes further, introducing two ideas that transform how you think about markets:
- The convex conjugate — a single mathematical operation that converts a cost function into a profit function (and back).
- Subgradients and the bid-ask spread — what happens when the cost function has "kinks," and why the spread is not a nuisance but a fundamental feature.
Sensitivity Analysis: How Shadow Prices Respond to Change
Perturbing the constraints
Recall the Saturday staking problem from Part 1:
| Race | Selection | Edge | Max stake | Optimal stake |
|---|---|---|---|---|
| 2:30 Newmarket | Frankel | 5% | £2,000 | £2,000 |
| 3:15 Cheltenham | Constitution Hill | 12% | £500 | £500 |
| 4:00 Ascot | Enable | 3% | £5,000 | £500 |
Budget: £3,000. Shadow price of budget: $\mu^* = 0.03$.
Now suppose the budget changes to £3,000 + $\delta$, where $\delta$ is a small perturbation. How does the optimal expected profit $p^*(\delta)$ respond?
The sensitivity theorem (Boyd & Vandenberghe, 2004, §5.6) states:
$$ \frac{dp^*}{d\delta}\bigg|_{\delta=0} = \mu^* $$
The shadow price is the derivative. An extra £1 of bankroll increases expected profit by exactly £0.03, at least locally. This is not an approximation — for linear programs, it's exact over a range.
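This is easy to verify numerically. The sketch below (assuming SciPy's HiGHS backend, which exposes dual values via `result.ineqlin.marginals`) solves the staking LP at the base budget and at budget + £1, then compares the finite difference with the reported shadow price:

```python
# Sketch: verify dp*/d(delta) = mu* by finite differences.
# Problem data match the Saturday staking example.
import numpy as np
from scipy.optimize import linprog

edges = np.array([0.05, 0.12, 0.03])
bounds = [(0, 2000.0), (0, 500.0), (0, 5000.0)]
A_ub = [[1.0, 1.0, 1.0]]

def profit(budget):
    res = linprog(-edges, A_ub=A_ub, b_ub=[budget], bounds=bounds, method="highs")
    return -res.fun, res

p0, res = profit(3000.0)
mu = -res.ineqlin.marginals[0]   # shadow price of the budget constraint
p1, _ = profit(3001.0)           # perturb the budget by delta = 1

print(f"shadow price mu*  = {mu:.4f}")
print(f"finite difference = {p1 - p0:.4f}")  # matches mu* within the regime
```

Both numbers come out at 0.0300: within the current regime the derivative and the dual agree to numerical precision.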
The sensitivity table
Every constraint has a shadow price. Collect them:
| Constraint | Shadow price $\lambda^*$ | Interpretation |
|---|---|---|
| Budget ≤ £3,000 | £0.03 | Each extra £1 bankroll → £0.03 profit |
| Frankel ≤ £2,000 | £0.02 | Each extra £1 Frankel liquidity → £0.02 profit |
| Constitution Hill ≤ £500 | £0.09 | Each extra £1 CH liquidity → £0.09 profit |
| Enable ≤ £5,000 | £0.00 | Extra Enable liquidity worthless |
When the regime breaks
Shadow prices are valid locally — they hold while the same set of constraints is binding. But push a perturbation far enough and a different constraint becomes active. The optimal basis changes. This is called a basis change (in LP terminology) or a regime transition.
For our problem, what happens if we keep adding bankroll?
- £3,000 → £3,500: Enable gets the extra £500 (edge 3%, the marginal bet). Shadow price of budget stays £0.03.
- £3,500 → £7,500: All of Enable's £5,000 liquidity gets consumed. Budget shadow price is still £0.03 while Enable has room.
- £7,500+: Budget exceeds total available liquidity (£2,000 + £500 + £5,000 = £7,500). Budget shadow price drops to £0.00. Extra money is worthless — there's nothing left to bet on.
The shadow price is piecewise constant, jumping at breakpoints where the active constraints change. Plotting $p^*$ against budget gives a piecewise linear curve — concave, because additional budget is never worth more than the previous increment.
Production insight. In a live system, when you solve the staking problem every few seconds as prices update, most solves will be in the same regime. The shadow prices are stable. When they suddenly jump, a constraint has become active or inactive — a regime change. Log these transitions. They are often the most informative diagnostic for understanding your system's behaviour.
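A minimal sketch of that logging idea, using the same toy problem as a stand-in for a live feed (the solve loop, budget values, and tolerance are illustrative):

```python
# Sketch: flag regime changes by watching for jumps in the dual vector
# between successive solves. Budgets stand in for a live price feed.
import numpy as np
from scipy.optimize import linprog

edges = np.array([0.05, 0.12, 0.03])
bounds = [(0, 2000.0), (0, 500.0), (0, 5000.0)]
A_ub = [[1.0, 1.0, 1.0]]

def duals(budget):
    res = linprog(-edges, A_ub=A_ub, b_ub=[budget], bounds=bounds, method="highs")
    # budget shadow price plus the three liquidity-bound shadow prices
    return np.concatenate([-res.ineqlin.marginals, -res.upper.marginals])

prev = None
for budget in [3000, 3200, 3400, 7400, 7600]:
    d = duals(float(budget))
    if prev is not None and not np.allclose(d, prev, atol=1e-9):
        print(f"regime change at budget £{budget}: duals {prev} -> {d}")
    prev = d
```

The first four solves share one dual vector; only the jump past total liquidity (between £7,400 and £7,600) triggers a log line, which is exactly the signal worth alerting on.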
Sensitivity to edge estimates
Shadow prices also apply to the objective coefficients — the edge estimates. Suppose Constitution Hill's true edge turns out to be $0.12 + \epsilon$ instead of $0.12$:
$$ \frac{dp^*}{d\epsilon} = x_2^* = 500 $$
The sensitivity of optimal profit to the edge estimate equals the optimal stake on that selection. High-stake selections amplify estimation errors. This is why fractional Kelly (staking 25–50% of full Kelly) exists — it dampens the impact of edge mis-estimation by reducing the stakes that multiply those errors.
| Whose edge you mis-estimate | Optimal stake | Profit sensitivity |
|---|---|---|
| Constitution Hill (12%) | £500 | £500 per 1% error |
| Frankel (5%) | £2,000 | £2,000 per 1% error |
| Enable (3%) | £500 | £500 per 1% error |
Frankel has the highest sensitivity — not because its edge is uncertain, but because the stake is largest. A 1% overestimate of Frankel's edge inflates expected profit by £20. A 1% overestimate of Constitution Hill's edge inflates it by only £5. The danger is in the big positions, not the big edges.
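The edge-sensitivity result is also easy to confirm numerically: bump Constitution Hill's edge by a small $\epsilon$ and the profit change per unit $\epsilon$ equals the £500 stake (a sketch with the same illustrative data):

```python
# Sketch: perturb Constitution Hill's edge by eps and confirm
# dp*/d(eps) equals the optimal stake on that selection (£500).
import numpy as np
from scipy.optimize import linprog

bounds = [(0, 2000.0), (0, 500.0), (0, 5000.0)]
A_ub = [[1.0, 1.0, 1.0]]

def profit(edges):
    res = linprog(-np.asarray(edges), A_ub=A_ub, b_ub=[3000.0],
                  bounds=bounds, method="highs")
    return -res.fun

eps = 0.001  # a 0.1% error in the edge estimate
base = profit([0.05, 0.12, 0.03])
bumped = profit([0.05, 0.12 + eps, 0.03])
print(f"dp*/d(eps) ~= {(bumped - base) / eps:.0f}")  # ~500, the CH stake
```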
Convex Functions: Why "Bowl-Shaped" Means "Solvable"
The key property
Before we get to the conjugate, we need the concept beneath it: convex functions.
A function $f$ is convex if the line segment between any two points on its graph lies on or above the graph:
$$ f(tx + (1-t)y) \leq t\,f(x) + (1-t)\,f(y) \quad \text{for all } t \in [0,1] $$
The graph is bowl-shaped: no dips, no bumps, no local traps.
Why does this matter for trading? Three properties:
| Property | What it means | Why it matters |
|---|---|---|
| Local = Global | Every local minimum is the global minimum | Any "good" solution your solver finds is the best solution |
| Unique minimum (strict convexity) | At most one minimiser exists | The optimal staking plan is unique — no ambiguity |
| Efficient algorithms | Polynomial-time solvers exist | You can solve in real time, not overnight |
If the objective is not convex, none of these hold. You can have multiple local minima, each giving a different staking plan, with no way to know which is best without exhaustive search. Convexity is the mathematical guarantee that your solver isn't lying to you.
Where convex functions appear in betting
| Function | Formula | Where it appears |
|---|---|---|
| Linear | $p^T x$ | Expected profit at fixed prices |
| Quadratic | $\frac{1}{2}x^T Q x$ | Market impact cost |
| Negative log | $-\log(x)$ | Kelly criterion (minimising $-\mathbb{E}[\log W]$ is convex) |
| Log-sum-exp | $\log(\sum_i e^{x_i})$ | LMSR cost function — the automated market maker |
| Negative entropy | $\sum_i p_i \log p_i$ | Information loss — the conjugate of log-sum-exp |
| Norms | $\|x\|_1$, $\|x\|_2$ | Exposure measures, regularisation |
Every one of these is convex (or concave, for the log/Kelly case). This is not coincidence. Market mechanisms are designed with convex cost functions because convexity guarantees the three properties above. Hanson built the LMSR on log-sum-exp precisely because it is convex, ensuring a unique price vector at every state.
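A quick numerical spot-check of the convexity definition for log-sum-exp (a sanity check at random points, not a proof):

```python
# Sketch: test f(t x + (1-t) y) <= t f(x) + (1-t) f(y) for
# log-sum-exp at random points. The seed and counts are arbitrary.
import numpy as np

rng = np.random.default_rng(0)

def lse(x):
    return np.log(np.sum(np.exp(x)))

for _ in range(1000):
    x, y = rng.normal(size=3), rng.normal(size=3)
    t = rng.uniform()
    assert lse(t * x + (1 - t) * y) <= t * lse(x) + (1 - t) * lse(y) + 1e-12

print("log-sum-exp passed 1000 random convexity checks")
```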
The Convex Conjugate: Flipping Cost into Profit
The definition
The convex conjugate (or Fenchel conjugate) of a function $f$ is:
$$ f^*(y) = \sup_x \left\{ y^T x - f(x) \right\} $$
Read this as: "Given price vector $y$, what position $x$ maximises revenue $y^T x$ minus cost $f(x)$?"
The answer — $f^*(y)$ — is the maximum achievable profit at price $y$.
The fundamental trading duality. The conjugate transforms a cost function (what it costs to hold a position) into a profit function (the best profit achievable at given prices). Take the conjugate again and you get the original cost function back: $f^{**} = f$ for any closed convex function. Cost and profit are two views of the same object.
A betting example: quadratic impact
Suppose trading in a market has quadratic cost — the price moves against you proportionally to your trade size. A reasonable model for a liquid Betfair favourite:
$$ \text{Cost}(x) = \frac{1}{2}\alpha\, x^2 $$
where $\alpha$ is the impact coefficient (£ per £² staked — how aggressively the price moves). The conjugate is:
$$ \text{Cost}^*(y) = \sup_x \left\{yx - \frac{1}{2}\alpha\, x^2\right\} $$
Differentiate, set to zero: $y - \alpha x = 0$, so $x^* = y/\alpha$. Substitute back:
$$ \text{Cost}^*(y) = \frac{y^2}{2\alpha} $$
Interpretation. The maximum profit grows as the square of the price deviation, scaled by $1/\alpha$:
- Deep market (small $\alpha$, e.g., Honeysuckle in a Grade 1 at Cheltenham): prices barely move per pound traded. Profit per unit price deviation is large ($1/\alpha$ is large): a given mispricing supports a big position before impact erodes the edge.
- Thin market (large $\alpha$, e.g., a seller in a Southwell 6:30): prices move a lot per pound traded. Profit per unit price deviation is small ($1/\alpha$ is small): impact consumes the edge after a few pounds, and the price can also move against you faster.
The impact coefficient $\alpha$ in the cost becomes $1/\alpha$ in the profit. The conjugate inverts the liquidity.
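The derivation is easy to check numerically: compute the supremum with a generic 1-D optimiser and compare against the closed form (the $\alpha$ value below is illustrative):

```python
# Sketch: Cost*(y) = sup_x {y x - alpha x^2 / 2}, computed numerically
# and compared with the closed form y^2 / (2 alpha).
from scipy.optimize import minimize_scalar

alpha = 0.002  # impact coefficient, £ per £^2 staked (illustrative)

def conjugate(y):
    # maximise y*x - cost(x)  <=>  minimise its negation
    res = minimize_scalar(lambda x: -(y * x - 0.5 * alpha * x**2))
    return -res.fun

for y in [0.01, 0.05, 0.10]:  # price deviations
    closed = y**2 / (2 * alpha)
    print(f"y = {y:.2f}: numeric {conjugate(y):.4f}, closed form {closed:.4f}")
```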
The Fenchel–Young inequality: the no-free-lunch law
For any position $x$ and any price $y$:
$$ f(x) + f^*(y) \geq x^T y $$
In English: cost of holding the position plus maximum achievable profit is always at least as large as the raw payout. You can't simultaneously have low cost, high profit, and a big payout. Something has to give.
Equality holds when $y = \nabla f(x)$ — when the price equals the marginal cost. This is the equilibrium condition: the market clears when the price you see is the derivative of the cost function at the current position.
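Both halves of the law can be checked with the quadratic impact cost from above: the inequality holds at random $(x, y)$ pairs, and the gap closes when $y = f'(x) = \alpha x$ ($\alpha$ is illustrative):

```python
# Sketch: Fenchel-Young for f(x) = alpha x^2 / 2, with equality
# exactly when the price equals the marginal cost, y = alpha x.
import numpy as np

alpha = 0.002
f = lambda x: 0.5 * alpha * x**2
f_star = lambda y: y**2 / (2 * alpha)

rng = np.random.default_rng(1)
for _ in range(1000):
    x, y = rng.uniform(-100, 100), rng.uniform(-0.2, 0.2)
    assert f(x) + f_star(y) >= x * y - 1e-9   # the inequality, always

x = 40.0
y = alpha * x                                  # price = marginal cost
gap = f(x) + f_star(y) - x * y
print(f"equality gap at y = alpha*x: {gap:.2e}")  # ~0
```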
The LMSR duality: log-sum-exp ↔ negative entropy
The most important conjugate pair in market making. The Logarithmic Market Scoring Rule (LMSR) has cost function:
$$ C(q) = b \log\left(\sum_{i=1}^n e^{q_i / b}\right) $$
where $q$ is the vector of cumulative quantities traded on each outcome and $b$ is the liquidity parameter, which sets the market maker's worst-case loss: it is bounded by $b \log n$.
The prices are the gradient:
$$ p_i = \nabla_i C(q) = \frac{e^{q_i / b}}{\sum_j e^{q_j / b}} $$
This is the softmax function — the same one used in neural networks for classification. Prices are always positive and sum to 1. They are the implied probability distribution.
The conjugate is:
$$ C^*(p) = b \sum_{i=1}^n p_i \log p_i \quad \text{for } p \in \Delta_{n-1} $$
This is $b$ times the negative Shannon entropy. The duality says:
- Cost side (log-sum-exp): how much the market maker can lose.
- Profit side (negative entropy): how much information the market has aggregated.
The parameter $b$ controls the trade-off. Large $b$ means deeper liquidity (prices move slowly), but the market maker risks more. Small $b$ means thin liquidity (prices move fast), but the market maker's loss is tightly bounded.
On Betfair, you don't see an explicit LMSR. The order book is the market maker. But the same conjugate relationship governs any automated market mechanism — and understanding it tells you exactly how liquidity, price sensitivity, and bounded loss relate.
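The whole package fits in a few lines: the cost, the softmax prices, the conjugate identity $C(q) + C^*(p) = p^T q$ at $p = \nabla C(q)$, and the $b \log n$ loss bound. The values of $b$ and $q$ below are illustrative:

```python
# Sketch: LMSR cost, softmax prices, and the log-sum-exp / negative
# entropy conjugate pair. b and q are illustrative.
import numpy as np

b = 100.0                           # liquidity parameter
q = np.array([50.0, 120.0, 30.0])   # cumulative quantities traded

def C(q):
    return b * np.log(np.sum(np.exp(q / b)))

def prices(q):                      # softmax: the gradient of C
    z = np.exp(q / b)
    return z / z.sum()

p = prices(q)
print(f"prices: {p.round(4)}, sum = {p.sum():.4f}")   # sum to 1

neg_entropy = b * np.sum(p * np.log(p))               # C*(p)
print(f"Fenchel-Young gap: {C(q) + neg_entropy - p @ q:.2e}")  # ~0

print(f"worst-case loss bound b log n = {b * np.log(len(q)):.2f}")
```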
Subgradients and the Bid-Ask Spread
The problem with kinks
Not all cost functions are smooth. On Betfair, the order book is discrete — a stack of limit orders at specific prices. The aggregate cost of buying quantity $q$ is a piecewise linear function: the price is constant until you exhaust the depth at one level, then it jumps to the next.
At each jump, the function has a "kink" — a point where the derivative doesn't exist. Calculus breaks down. But optimisation can't stop.
Subgradients: derivatives that work at kinks
A subgradient of a convex function $f$ at point $x_0$ is any vector $g$ such that:
$$ f(x) \geq f(x_0) + g^T(x - x_0) \quad \text{for all } x $$
At smooth points, the only subgradient is the gradient. At a kink, there are many — the subdifferential $\partial f(x_0)$ is a set.
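The canonical example is $f(x) = |x|$, whose kink at zero has subdifferential $[-1, 1]$. A numerical check that every slope in that interval satisfies the defining inequality, while slopes outside it fail:

```python
# Sketch: at the kink of f(x) = |x|, g is a subgradient at 0 iff
# |x| >= g*x for all x, i.e. iff g lies in [-1, 1].
import numpy as np

xs = np.linspace(-5, 5, 201)

def is_subgradient(g):
    return bool(np.all(np.abs(xs) >= g * xs - 1e-12))  # f(0) = 0

print(is_subgradient(0.7))    # True: interior of [-1, 1]
print(is_subgradient(-1.0))   # True: endpoint
print(is_subgradient(1.3))    # False: slope too steep
```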
The spread is the subdifferential
Consider a single Betfair selection. The cost of acquiring a net position $x$ is:
- Increasing $x$ costs the back price (you're buying).
- Decreasing $x$ earns the lay price (you're selling).
At $x = 0$ (no position), the "slope" of the cost function is ambiguous — it's the back price going up and the lay price going down. The subdifferential at zero is:
$$ \partial C(0) = [p_{\text{lay}}, \, p_{\text{back}}] $$
The subdifferential is the bid-ask spread. The wider the spread, the larger the set of valid subgradients, the more "ambiguity" about the fair price.
| Market condition | Spread | Subdifferential | Meaning |
|---|---|---|---|
| Deep, liquid favourite | 1 tick | Nearly a single point | Price is almost uniquely defined |
| Thin outsider | 10+ ticks | Wide interval | Fair price is highly uncertain |
| Zero spread (theoretical) | 0 | Single point | Perfect market — gradient exists |
This is why market making is profitable. The market maker sits at the kink — quoting both back and lay — and the spread is the set of prices consistent with optimality. The spread is not an inefficiency to be eliminated; it is the subdifferential of the cost function, and it must be non-empty.
Optimality at a kink
The optimality condition for a convex function is:
$$ 0 \in \partial f(x^*) $$
Zero must be a valid subgradient at the optimum. At a kink, this means zero must lie inside the subdifferential interval. For a market maker who values the position at a believed fair price $y$ and maximises $yx - C(x)$, optimality at zero net exposure requires $0 \in y - \partial C(0)$, that is, $y \in [p_{\text{lay}}, p_{\text{back}}]$: staying flat is optimal exactly when the fair price lies inside the spread.
The spread is the region where "do nothing" is optimal. If a price signal falls inside the spread, it provides no actionable information. Only when an opportunity pushes the implied price outside the spread does a trade become worthwhile.
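That decision rule is a one-liner. A sketch with illustrative implied-probability prices (the threshold logic only, not a production signal handler):

```python
# Sketch: "do nothing" is optimal whenever the signal's implied price
# sits inside the spread. Prices are illustrative implied probabilities.
def decide(signal, p_lay, p_back):
    # optimality at zero position: signal inside [p_lay, p_back]
    if signal > p_back:
        return "back"      # value exceeds the cost of buying
    if signal < p_lay:
        return "lay"       # value below what selling earns
    return "no trade"      # zero is a valid subgradient: stay flat

print(decide(0.52, p_lay=0.50, p_back=0.53))  # no trade: inside spread
print(decide(0.56, p_lay=0.50, p_back=0.53))  # back
print(decide(0.48, p_lay=0.50, p_back=0.53))  # lay
```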
Putting It Together: A Python Implementation
Here is a complete worked example: solve the Saturday staking problem from Part 1, extract the dual variables, and perform sensitivity analysis.
"""
Optimal staking with sensitivity analysis using scipy.optimize.linprog.
Demonstrates KKT conditions, shadow prices, and parametric sensitivity.
"""
import numpy as np
from scipy.optimize import linprog
# === Problem data ===
edges = np.array([0.05, 0.12, 0.03]) # expected profit per £1 staked
max_stakes = np.array([2000.0, 500.0, 5000.0]) # liquidity limits
budget = 3000.0 # total bankroll
# linprog minimises c^T x, so negate edges (we want to maximise)
c = -edges
# Budget constraint: x1 + x2 + x3 <= budget
A_ub = np.array([[1.0, 1.0, 1.0]])
b_ub = np.array([budget])
# Bounds: 0 <= x_i <= max_stake_i
bounds = [(0, ms) for ms in max_stakes]
# === Solve ===
result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
print("=== Optimal Staking Plan ===")
names = ["Frankel (2:30)", "Constitution Hill (3:15)", "Enable (4:00)"]
for name, stake in zip(names, result.x):
print(f" {name}: £{stake:,.0f}")
print(f" Expected profit: £{-result.fun:,.2f}")
# === Shadow prices from the dual ===
# The dual values for inequality constraints in scipy are in result.ineqlin
budget_shadow = result.ineqlin.marginals[0]
print(f"\n=== Shadow Prices ===")
print(f" Budget constraint: £{-budget_shadow:.4f} per £1 extra bankroll")
# Upper bound shadow prices (how much more profit per £1 extra liquidity)
upper_shadows = result.upper.marginals
for name, shadow in zip(names, upper_shadows):
print(f" {name} liquidity: £{-shadow:.4f} per £1 extra depth")
# === Sensitivity analysis: sweep budget ===
print(f"\n=== Budget Sensitivity ===")
budgets = [2000, 2500, 3000, 3500, 5000, 7500, 8000]
for b in budgets:
b_ub_sweep = np.array([float(b)])
res = linprog(c, A_ub=A_ub, b_ub=b_ub_sweep, bounds=bounds, method="highs")
shadow = -res.ineqlin.marginals[0]
print(f" Budget £{b:>5,}: profit £{-res.fun:>8,.2f}, "
f"marginal value £{shadow:.4f}/£1")
This prints the optimal stakes, the shadow price of every constraint, and — critically — shows how the marginal value of bankroll decreases as budget increases. At £7,500 the budget shadow price drops to zero: you've run out of things to bet on.
The Takeaway
- Shadow prices are derivatives. The sensitivity of optimal profit to any constraint perturbation equals the corresponding dual variable. Track these in production — they tell you where to focus effort.
- Sensitivity has regimes. Shadow prices are constant within a regime and jump at breakpoints where the binding constraints change. Log the jumps — they're diagnostic gold.
- Edge sensitivity = stake size. The profit impact of mis-estimating an edge is proportional to the optimal stake. The biggest positions carry the most estimation risk, not the biggest edges.
- The convex conjugate flips cost into profit. For any cost function, the conjugate gives the maximum profit at each price. For the LMSR: log-sum-exp (cost) ↔ negative entropy (information aggregated). The liquidity parameter $b$ controls the trade-off.
- The bid-ask spread is the subdifferential. At a kink in the cost function, the set of valid "slopes" is an interval — the spread. Market making is profitable precisely because this interval is non-empty. Signals inside the spread are noise; only signals that breach the spread are actionable.
Key Terms Introduced
| Term | Meaning |
|---|---|
| Sensitivity analysis | Studying how the optimal solution changes when problem parameters (budget, edges, limits) change |
| Basis change / regime transition | The point where a different set of constraints becomes binding, causing shadow prices to jump |
| Convex function | A function for which the line segment between any two points on the graph lies on or above the graph |
| Strict / strong convexity | Guarantees a unique minimum (strict) or quadratic growth away from the minimum (strong) |
| Convex conjugate (Fenchel transform) | $f^*(y) = \sup_x\{y^T x - f(x)\}$ — transforms a cost function into a profit function |
| Fenchel–Young inequality | $f(x) + f^*(y) \geq x^T y$ — cost + max profit ≥ raw payout; equality at equilibrium |
| LMSR (Log Market Scoring Rule) | An automated market maker with cost function log-sum-exp; prices are softmax of quantities |
| Negative entropy | $\sum p_i \log p_i$ — the conjugate of log-sum-exp; measures information concentration |
| Subgradient | A generalised derivative at a point where the function has a kink |
| Subdifferential | The set of all subgradients at a point — an interval at a kink, a single point where smooth |
| Bid-ask spread as subdifferential | The spread on an exchange is the subdifferential of the cost function at zero position |
References
- Boyd, S. & Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press. — Section 5.6 covers sensitivity analysis and the interpretation of dual variables as derivatives. Chapter 3 covers convex functions. Free online at stanford.edu/~boyd/cvxbook.
- Rockafellar, R. T. (1970). Convex Analysis. Princeton University Press. — Chapters 12 (conjugates), 23–25 (subgradients), 28 (Fenchel duality). The mathematical bedrock.
- Hanson, R. (2003). "Combinatorial Information Market Design." Information Systems Frontiers, 5(1), 107–119. — The original LMSR paper. Defines the log-sum-exp cost function and proves bounded loss.
- Hanson, R. (2007). "Logarithmic Market Scoring Rules for Modular Combinatorial Information Aggregation." Journal of Prediction Markets, 1(1), 3–15. — Extended treatment with the log-sum-exp / negative entropy conjugate pair derived explicitly.
- Abernethy, J., Chen, Y. & Vaughan, J. W. (2013). "Efficient Market Making via Convex Optimization, and a Connection to Online Learning." ACM Transactions on Economics and Computation, 1(2). — The definitive paper connecting convex conjugates, market making mechanisms, and online learning. Shows that every cost-function-based market maker corresponds to a convex conjugate pair.
- Bertsimas, D. & Tsitsiklis, J. N. (1997). Introduction to Linear Optimization. Athena Scientific. — Chapter 5 on sensitivity analysis for linear programs. Clear treatment of shadow prices, basis changes, and parametric LP.
- Hiriart-Urruty, J.-B. & Lemaréchal, C. (2001). Fundamentals of Convex Analysis. Springer. — Chapter 4 on subdifferentials. The connection between subgradient sets and non-smooth optimisation.