Inference/Test Time Scaling
How does performance improve with more compute at test time?
In what ways can we use more compute at test time?
Parallel output - running the model multiple times and taking a majority vote over the solutions
Longer output - generating a longer chain of thought
These two approaches are categorised as parallel vs. sequential scaling
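The parallel approach above can be sketched in a few lines. This is a minimal illustration, not any library's API: `sample_model` is a hypothetical stand-in for a call that samples one answer from a model (in practice with temperature > 0 so samples differ), and the final answer is whichever solution appears most often.

```python
from collections import Counter
from typing import Callable, List

def majority_vote(answers: List[str]) -> str:
    """Return the most frequent answer among the sampled solutions."""
    return Counter(answers).most_common(1)[0][0]

def parallel_scaling(sample_model: Callable[[str], str],
                     prompt: str, n_samples: int = 8) -> str:
    # Spend more test-time compute by sampling the model n_samples times
    # (independent runs, so they can execute in parallel), then vote.
    answers = [sample_model(prompt) for _ in range(n_samples)]
    return majority_vote(answers)

# Toy stand-in for a sampled model call (hypothetical, for illustration only)
def toy_model(prompt: str) -> str:
    import random
    return random.choice(["42", "42", "42", "17"])  # mostly-correct sampler

answer = parallel_scaling(toy_model, "What is 6 * 7?", n_samples=8)
```

Accuracy typically improves with `n_samples` as long as the correct answer is the model's modal output, which is why this is often paired with answer normalisation (e.g. extracting just the final number) before voting.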