Inference/Test Time Scaling
How does performance improve with more compute during test-time?
In what ways can we use more compute during test-time?
Parallel output - run the model multiple times, then take a majority vote over the solutions.
Longer output - generate a longer chain of thought.
These approaches are categorised as sequential (longer output) vs parallel (multiple outputs).
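The parallel option can be illustrated with a minimal sketch of majority voting. The names `majority_vote`, `parallel_scale`, and `sample_fn` are hypothetical, introduced here only for illustration; `sample_fn` stands in for whatever call returns one final answer from the model, and the sketch assumes answers can be compared as plain strings.

```python
from collections import Counter
from typing import Callable, List


def majority_vote(answers: List[str]) -> str:
    """Return the answer that occurs most often among the samples."""
    if not answers:
        raise ValueError("need at least one answer to vote on")
    # Counter tallies identical answers; most_common(1) picks the top one.
    return Counter(answers).most_common(1)[0][0]


def parallel_scale(sample_fn: Callable[[str], str], prompt: str, n: int) -> str:
    """Spend more test-time compute by drawing n independent samples.

    sample_fn is an assumed placeholder for a single model call that
    returns a final answer string for the given prompt.
    """
    answers = [sample_fn(prompt) for _ in range(n)]
    return majority_vote(answers)
```

In this sketch the compute budget is controlled by `n`: each extra sample costs one more model call. The sequential option instead spends the budget inside a single call by letting the model produce a longer chain of thought, so it has no equivalent wrapper here.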