Practice Thompson Sampling (9.9.3.3) - Reinforcement Learning and Bandits

Practice - Thompson Sampling - 9.9.3.3

Practice Questions

Test your understanding with targeted questions

Question 1 Easy

What is Thompson Sampling?

💡 Hint: Think of its relation to probabilities and rewards.

Question 2 Easy

Define regret in the context of decision-making.

💡 Hint: What do we lose by not having the best possible choice?
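
As a rough illustration of the idea behind this question, regret in a bandit setting is commonly measured as the gap between the expected reward of the best arm and the expected reward of the arm actually chosen, summed over rounds. The sketch below assumes the true mean rewards are known, which is only the case in a simulation; the function name and the example numbers are hypothetical.

```python
def cumulative_regret(true_means, chosen_arms):
    """Sum over rounds of (best arm's mean reward - chosen arm's mean reward)."""
    best = max(true_means)
    return sum(best - true_means[arm] for arm in chosen_arms)

# Hypothetical example: three arms and five rounds of choices.
# Rounds that pick arm 0 (the best arm) add no regret; the others add the gap.
regret = cumulative_regret([0.20, 0.15, 0.10], [0, 1, 2, 0, 0])
```

In a real experiment the true means are unknown, so regret can only be estimated or bounded rather than computed this way.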

Interactive Quizzes

Quick quizzes to reinforce your learning

Question 1

What does Thompson Sampling aim to optimize in Multi-Armed Bandit problems?

Exploration
Exploitation
Cumulative Reward

💡 Hint: What is the ultimate goal when selecting an arm?

Question 2

True or False: Thompson Sampling only focuses on exploration.

True
False

💡 Hint: Consider the definitions of both exploration and exploitation.

Challenge Problems

Push your limits with advanced challenges

Challenge 1 Hard

A website has three different layouts, A, B, and C. Design an experiment using Thompson Sampling to identify the best layout for user engagement. Outline how you would gather data and update your probabilities.

💡 Hint: Think about how you'd weigh past clicks when making new choices.
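
One possible way to approach this challenge is sketched below as a small simulation. It assumes engagement is a binary outcome (engaged or not), gives each layout a uniform Beta(1, 1) prior, and uses made-up engagement rates purely so that the simulation has something to draw from. The function name, the rates, and the number of rounds are all hypothetical.

```python
import random

def thompson_sampling(true_rates, n_rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson Sampling over len(true_rates) arms."""
    rng = random.Random(seed)
    k = len(true_rates)
    successes = [0] * k  # engaged visits observed per layout
    failures = [0] * k   # non-engaged visits observed per layout
    pulls = [0] * k      # how often each layout was shown

    for _ in range(n_rounds):
        # Draw one plausible engagement rate per layout from its Beta posterior.
        samples = [rng.betavariate(1 + successes[i], 1 + failures[i])
                   for i in range(k)]
        # Show the layout whose sampled rate is highest.
        arm = max(range(k), key=lambda i: samples[i])
        pulls[arm] += 1
        # Simulated user response (in a live test this is the observed outcome).
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls, successes, failures

# Hypothetical engagement rates for layouts A, B, and C.
pulls, successes, failures = thompson_sampling([0.10, 0.30, 0.60], n_rounds=5000)
```

Because each layout is chosen in proportion to how likely it currently looks to be the best, weaker layouts are shown less and less often as evidence accumulates, without ever being ruled out by a fixed cutoff.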

Challenge 2 Hard

Consider an online ad campaign using Thompson Sampling. If the current click-through rates of three ads are 20%, 15%, and 10%, explain how you would implement Thompson Sampling and justify your design choices.

💡 Hint: Reflect on how you would update your initial beliefs with new data.
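
A minimal sketch of one way to encode the stated click-through rates as initial beliefs is shown below. It assumes each ad's rate is modeled with a Beta distribution and that the historical rates are worth a fixed number of pseudo-impressions. That prior strength (20 here) is an arbitrary assumption, and the helper names are hypothetical.

```python
import random

PRIOR_STRENGTH = 20  # assumption: historical CTRs count as 20 pseudo-impressions

def make_priors(ctrs, strength=PRIOR_STRENGTH):
    """Turn each CTR into Beta parameters [alpha, beta] whose mean equals the CTR."""
    return [[ctr * strength, (1 - ctr) * strength] for ctr in ctrs]

def choose_ad(posteriors, rng):
    """Sample a CTR from each ad's Beta posterior and pick the largest sample."""
    samples = [rng.betavariate(a, b) for a, b in posteriors]
    return samples.index(max(samples))

def update(posteriors, ad, clicked):
    """Add one observed click to alpha, or one non-click to beta, for the shown ad."""
    posteriors[ad][0 if clicked else 1] += 1

posteriors = make_priors([0.20, 0.15, 0.10])
rng = random.Random(42)
ad = choose_ad(posteriors, rng)        # ad to display for this impression
update(posteriors, ad, clicked=True)   # pretend the user clicked
```

A smaller prior strength lets new data override the historical rates quickly, while a larger one makes the campaign trust the history longer; which is appropriate depends on how reliable the 20%, 15%, and 10% figures are.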
