Thompson sampling: pick an action at random, choosing each action with probability equal to your current belief that it is the optimal action.
Problem being solved: you need to explore which of two treatments is more successful, while also minimizing the number of patients who receive the suboptimal treatment.
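A minimal sketch of how this could look for the two-treatment case. The notes do not specify any of the following, so they are assumptions: each outcome is binary (success or failure), each treatment starts with a uniform Beta(1, 1) prior, and patients are simulated from made-up true success rates. The function name `thompson_sampling` and its parameters are hypothetical. Drawing one sample per treatment from its posterior and giving the treatment with the largest sample is a standard way to select each treatment with the probability that it is currently believed to be the best.

```python
import random


def thompson_sampling(true_success_rates, n_patients, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling (illustrative only).

    true_success_rates: assumed, hypothetical success probability of each
        treatment; unknown in a real trial, used here only to simulate outcomes.
    Returns (pulls, successes, failures), one entry per treatment.
    """
    rng = random.Random(seed)
    n_arms = len(true_success_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_patients):
        # Draw one plausible success rate per treatment from its posterior,
        # Beta(successes + 1, failures + 1), i.e. a uniform Beta(1, 1) prior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Give the treatment whose sampled rate is highest. Over many draws,
        # each treatment is chosen with the posterior probability it is best.
        arm = max(range(n_arms), key=lambda a: samples[a])
        pulls[arm] += 1

        # Simulate the patient's outcome and update that treatment's counts.
        if rng.random() < true_success_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling([0.3, 0.7], n_patients=2000)
    print("patients per treatment:", pulls)
```

Early on both posteriors are wide, so both treatments get tried (exploration). As evidence accumulates, the worse treatment's samples rarely come out on top, so fewer patients receive it.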