Thompson Sampling

Randomly take an action according to the probability that you believe it is the optimal action. - Thompson

Problem being solved: We need to explore which of two treatments is more successful, while also minimizing the number of times patients are given the suboptimal treatment.

Working example:
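
A minimal sketch of the two-treatment problem above, assuming each treatment either succeeds or fails (a Bernoulli outcome) and that we start with a uniform Beta(1, 1) prior on each success rate. The function name `thompson_sampling`, the simulated success rates 0.3 and 0.7, and the patient count are all hypothetical values chosen for illustration, not figures from these notes. For each patient, the code draws one plausible success rate per treatment from its current posterior and gives the treatment with the highest draw, so a treatment is chosen about as often as we currently believe it is the best one.

```python
import random


def thompson_sampling(true_success_rates, n_patients, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling over a list of treatments.

    true_success_rates are hidden from the algorithm; they are only used
    to simulate patient outcomes (hypothetical values for this sketch).
    """
    rng = random.Random(seed)
    n_arms = len(true_success_rates)
    successes = [0] * n_arms  # observed successes per treatment
    failures = [0] * n_arms   # observed failures per treatment

    for _ in range(n_patients):
        # Sample a plausible success rate for each treatment from its
        # posterior, Beta(successes + 1, failures + 1), i.e. a Beta(1, 1) prior.
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(n_arms)
        ]
        # Give the treatment whose sampled rate is highest.
        arm = max(range(n_arms), key=lambda i: samples[i])
        # Simulate the outcome and update that treatment's counts.
        if rng.random() < true_success_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return successes, failures


successes, failures = thompson_sampling([0.3, 0.7], n_patients=1000)
pulls = [s + f for s, f in zip(successes, failures)]
print("patients per treatment:", pulls)
```

Early on, both posteriors are wide, so both treatments get tried. As evidence accumulates, the posterior of the weaker treatment rarely produces the highest draw, and most patients end up on the better treatment without a separate exploration schedule.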
