Inference/Test-Time Scaling

  • How does performance improve with more compute at test time?

    • In what ways can we use more compute at test time?

      • Parallel output - run the model multiple times on the same prompt, then take a majority vote over the solutions

      • Longer output - longer chains of thought, i.e. more reasoning tokens within a single run

      • Categorised as sequential (longer output) vs parallel (multiple outputs)
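
The parallel option above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `sample_answer` is a hypothetical stand-in for one model call that returns a final answer, and `parallel_scale` and `majority_vote` are names made up for this sketch. Ties are broken in favour of the answer that was sampled first, which is an assumption of this example.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence


def majority_vote(answers: Sequence[Hashable]) -> Hashable:
    """Return the most common answer; ties go to the answer seen first."""
    if not answers:
        raise ValueError("need at least one answer")
    # Counter.most_common keeps first-seen order among equal counts.
    return Counter(answers).most_common(1)[0][0]


def parallel_scale(
    sample_answer: Callable[[str], Hashable], prompt: str, n: int
) -> Hashable:
    """Call the (hypothetical) sampler n times on the same prompt and vote."""
    answers = [sample_answer(prompt) for _ in range(n)]
    return majority_vote(answers)


if __name__ == "__main__":
    # Stub sampler that replays canned answers instead of calling a model.
    canned = iter(["42", "41", "42", "42", "17"])
    print(parallel_scale(lambda _prompt: next(canned), "What is 6 * 7?", n=5))
```

Here `n` is the knob for parallel test-time compute: each extra sample costs one more model run. The sequential option has no comparable wrapper, since it spends the extra compute inside a single, longer generation.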
