Bibek


Multi-armed Bandit

Bibek

April 14, 2019

In probability theory, the multi-armed bandit problem (sometimes called the K-^[1] or N-armed bandit problem^[2]) is a problem in which a fixed, limited set of resources must be allocated among competing (alternative) choices so as to maximize the expected gain, when each choice’s properties are only partially known at the time of allocation and may become better understood as time passes or as resources are allocated to that choice.^[3]^[4] It is a classic reinforcement learning problem that exemplifies the exploration-exploitation tradeoff. The name comes from imagining a gambler at a row of slot machines (sometimes known as “one-armed bandits”) who has to decide which machines to play, how many times to play each machine, in which order to play them, and whether to continue with the current machine or try a different one.^[5] The multi-armed bandit problem also falls into the broad category of stochastic scheduling.

In the problem, each machine provides a random reward drawn from a probability distribution specific to that machine. The gambler’s objective is to maximize the sum of rewards earned through a sequence of lever pulls.^[3]^[4] The crucial tradeoff the gambler faces at each trial is between “exploitation” of the machine that currently appears to have the highest expected payoff and “exploration” to get more information about the expected payoffs of the other machines. The same tradeoff between exploration and exploitation arises in machine learning. In practice, multi-armed bandits have been used to model problems such as managing research projects in a large organization like a science foundation or a pharmaceutical company.^[3]^[4] In early versions of the problem, the gambler begins with no initial knowledge about the machines.
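To make the exploration-exploitation tradeoff concrete, here is a minimal Python sketch of one simple strategy, epsilon-greedy. The post itself does not name a strategy, so the choice of epsilon-greedy, the Bernoulli (win/lose) reward model, the function name `run_epsilon_greedy`, and all parameter values are illustrative assumptions rather than part of the problem statement.

```python
import random


def run_epsilon_greedy(true_means, n_pulls=10_000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy gambler on Bernoulli slot machines.

    true_means: hypothetical win probability of each machine (unknown to the gambler).
    Returns (estimates, counts, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per machine
    estimates = [0.0] * n_arms   # running average reward per machine
    total_reward = 0.0

    for _ in range(n_pulls):
        if rng.random() < epsilon:
            # Exploration: try a machine chosen uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: play the machine with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Assumed reward model: 1 with probability true_means[arm], else 0.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running average for the chosen machine.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = run_epsilon_greedy([0.2, 0.5, 0.75])
    print("pulls per machine:", counts)
    print("estimated payoffs:", [round(e, 3) for e in estimates])
    print("total reward:", total)
```

With `epsilon=0` the gambler never explores and can lock onto whichever machine happens to pay out first. With a larger `epsilon` it learns the payoffs more reliably but spends more pulls on machines it already believes are worse.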

A good read on the topic:

Multi-Armed Bandit (MAB) – A/B Testing Sans Regret
