How to distribute a compute budget between Pre-Training, Fine-Tuning, and Test-Time Compute – and why this trade-off is crucial
Compute Allocation: How do you optimally distribute a fixed compute budget? Pre-Training, Fine-Tuning, and Inference compete for resources. The optimal balance is shifting toward Test-Time Compute.
The economic perspective on Test-Time Scaling.
2022: 90% Pre-Training, 10% everything else. 2025: 60% Pre-Training, 20% Fine-Tuning, 20% Inference. The trend is toward more inference compute, and that shifts the economics of ML.
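As a minimal illustration of this allocation arithmetic, here is a hedged Python sketch that splits an assumed total compute budget across the stages using the 2022 and 2025 profiles quoted above. The total budget figure and the function names are hypothetical, chosen only for the example.

```python
# Minimal sketch: split a fixed compute budget across training stages.
# The total budget below is an assumed, purely illustrative number;
# the stage fractions are the profiles quoted in the text.

TOTAL_BUDGET_FLOPS = 1e24  # hypothetical total budget

# Allocation profiles (fractions of the total budget) from the text.
PROFILES = {
    "2022": {"pre_training": 0.90, "other": 0.10},
    "2025": {"pre_training": 0.60, "fine_tuning": 0.20, "inference": 0.20},
}


def allocate(total: float, profile: dict[str, float]) -> dict[str, float]:
    """Split a total compute budget according to per-stage fractions."""
    if abs(sum(profile.values()) - 1.0) > 1e-9:
        raise ValueError("stage fractions must sum to 1")
    return {stage: total * fraction for stage, fraction in profile.items()}


if __name__ == "__main__":
    for year, profile in PROFILES.items():
        split = allocate(TOTAL_BUDGET_FLOPS, profile)
        formatted = {stage: f"{flops:.2e} FLOPs" for stage, flops in split.items()}
        print(year, formatted)
```

The point of the sketch is only that the same fixed total yields very different per-stage budgets as the fractions shift from the 2022 profile to the 2025 one; it does not model how extra inference compute translates into quality or cost.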