A deep dive into how Affine uses Pareto dominance to evaluate and reward miners, and why winners-take-all creates the right incentives for model improvement.
What is the Pareto Frontier?
In multi-objective optimization, the Pareto frontier (also called the Pareto front or Pareto boundary) represents the set of solutions where no objective can be improved without degrading another objective.
In the context of Affine, we evaluate models across multiple RL environments simultaneously. A model is said to dominate another if it performs at least as well on all environments and strictly better on at least one.
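This dominance test translates directly into a few lines of code. Here is a minimal Python sketch; the function name and score vectors are illustrative, not from Affine's codebase:

```python
def dominates(a: list[float], b: list[float]) -> bool:
    """True if vector `a` performs at least as well as `b` on every
    environment and strictly better on at least one: Pareto dominance."""
    assert len(a) == len(b), "vectors must cover the same environments"
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

# Per-environment scores, e.g. [deduction, abduction] (invented values)
model_a = [0.82, 0.74]
model_b = [0.80, 0.74]
print(dominates(model_a, model_b))  # True: >= everywhere, > on deduction
print(dominates(model_b, model_a))  # False
```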
Why Pareto Dominance?
Traditional single-metric evaluation has a fundamental flaw: models can be optimized to game that specific metric while performing poorly on real-world tasks. By using Pareto dominance across multiple diverse environments, we ensure that:
- Generalization is rewarded — Models must perform well across different task types
- Gaming is difficult — Optimizing for one metric at the expense of others won't help
- True capability improvements surface — Only genuine improvements propagate
The Winners-Take-All Mechanism
Affine employs a winners-take-all reward structure. When a new model achieves Pareto dominance over the current frontier, it captures the majority of rewards (a toy sketch of this payout follows the lists below). This creates powerful incentives:
For Miners
- Strong motivation to improve — Significant rewards for meaningful contributions
- Clear target — The current Pareto frontier provides a benchmark to beat
- Collaborative competition — Miners can build on each other's work
For the Network
- Rapid improvement — Strong incentives drive fast progress
- Quality over quantity — Only genuine improvements are rewarded
- Efficient resource allocation — Compute goes to the best models
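To make the winners-take-all split concrete, here is a toy payout sketch. The 75/25 split, the helper name, and the miner IDs are illustrative assumptions, not Affine's actual reward parameters:

```python
# Toy winners-take-all payout: the dominant model takes most of the pool,
# the remainder is split evenly. All values here are purely illustrative.
def allocate_rewards(miners: list[str], winner: str,
                     pool: float = 1.0, winner_share: float = 0.75) -> dict[str, float]:
    others = [m for m in miners if m != winner]
    if not others:  # sole miner keeps the whole pool
        return {winner: pool}
    rewards = {m: pool * (1 - winner_share) / len(others) for m in others}
    rewards[winner] = pool * winner_share
    return rewards

print(allocate_rewards(["m1", "m2", "m3"], winner="m1"))
# {'m2': 0.125, 'm3': 0.125, 'm1': 0.75}
```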
Evaluation Environments
Models are currently evaluated on two primary environment types:
DED-V2 (Deduction)
Tests a model's ability to derive conclusions from given premises. This evaluates logical reasoning and step-by-step inference capabilities. For example, from "all birds have feathers" and "a sparrow is a bird," a capable model concludes "a sparrow has feathers."
ABD-V2 (Abduction)
Tests a model's ability to infer the best explanation for observed phenomena. This evaluates hypothesis generation and creative reasoning. For example, given wet streets in the morning, a model might infer that it rained overnight as the most plausible explanation.
Mathematical Framework
Let M = {m₁, m₂, ..., mₙ} be the set of submitted models and E = {e₁, e₂, ..., eₖ} be the set of evaluation environments.
For each model mᵢ, we compute a performance vector:
P(mᵢ) = [score(mᵢ, e₁), score(mᵢ, e₂), ..., score(mᵢ, eₖ)]

Model mᵢ Pareto dominates model mⱼ if:

∀ e ∈ E: score(mᵢ, e) ≥ score(mⱼ, e), and
∃ e ∈ E: score(mᵢ, e) > score(mⱼ, e)
The Pareto frontier consists of all non-dominated models.
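To make the definition concrete, here is a minimal Python sketch of extracting the frontier from a set of performance vectors. The model names and scores are invented for illustration; this is not Affine's actual implementation:

```python
def dominates(a: list[float], b: list[float]) -> bool:
    # Repeated here so the snippet runs standalone.
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_frontier(perf: dict[str, list[float]]) -> set[str]:
    """Return the models whose performance vector no other model dominates."""
    return {
        m for m, p in perf.items()
        if not any(dominates(q, p) for n, q in perf.items() if n != m)
    }

# Invented scores on [DED-V2, ABD-V2]
perf = {
    "m1": [0.90, 0.60],
    "m2": [0.70, 0.80],  # trades deduction for abduction: still non-dominated
    "m3": [0.65, 0.55],  # dominated by both m1 and m2
}
print(pareto_frontier(perf))  # {'m1', 'm2'} (set order may vary)
```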
Implications for Model Development
Understanding the Pareto mechanism should inform your training strategy:
- Balance your improvements — Don't sacrifice performance on one environment for another
- Test across all environments — Ensure improvements are consistent
- Aim for Pareto dominance — Partial improvements may not be rewarded (see the self-check sketch below)
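Before submitting, you could compare a candidate's scores against the current frontier along these lines. This is a hedged sketch: submission_check and all values are hypothetical, not part of Affine's tooling:

```python
def dominates(a: list[float], b: list[float]) -> bool:
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def submission_check(candidate: list[float], frontier: list[list[float]]) -> str:
    """Classify a candidate against the current frontier (illustrative only)."""
    if any(dominates(f, candidate) for f in frontier):
        return "dominated: unlikely to be rewarded"
    if all(dominates(candidate, f) for f in frontier):
        return "dominates the whole frontier"
    return "non-dominated: joins the frontier without displacing it"

frontier = [[0.90, 0.60], [0.70, 0.80]]          # current frontier (invented)
print(submission_check([0.92, 0.81], frontier))  # dominates the whole frontier
print(submission_check([0.85, 0.55], frontier))  # dominated: unlikely to be rewarded
```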
Conclusion
The Pareto frontier mechanism is the mathematical heart of Affine's incentive structure. By rewarding models that achieve genuine multi-dimensional improvements, we create a self-improving system that converges toward better and better AI capabilities.