{"id":98,"date":"2024-10-23T02:20:06","date_gmt":"2024-10-23T02:20:06","guid":{"rendered":"https:\/\/riespw.com\/blog\/?p=98"},"modified":"2024-10-23T02:20:06","modified_gmt":"2024-10-23T02:20:06","slug":"ml-bayesian-bandit","status":"publish","type":"post","link":"https:\/\/riespw.com\/blog\/2024\/10\/23\/ml-bayesian-bandit\/","title":{"rendered":"ML: Bayesian Bandit"},"content":{"rendered":"\n<p><a href=\"https:\/\/lazyprogrammer.me\/bayesian-bandit-tutorial\/\" target=\"_blank\" rel=\"noopener\" title=\"\">Reference<\/a><\/p>\n\n\n\n<p class=\"has-text-align-center\"><strong>the problem<\/strong><\/p>\n\n\n\n<p>Suppose you are at a casino and have a choice between N slot machines. Each of the N slot machines (bandits) has an unknown probability of letting you win. i.e. Bandit 1 may have P(win) = 0.9. Bandit 2 may have P(win) = 0.3. We wish to maximize our winnings by playing the machine which has the highest probability of winning. The problem is determining which machine this is without playing each machine million times so we can maximize the profit!<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p class=\"has-text-align-center\"><strong>the general idea<\/strong><\/p>\n\n\n\n<p>The idea: let\u2019s not pull each arm 1000 times to get an accurate estimate of its probability of winning. Instead, let\u2019s use the data we\u2019ve collected so far to determine which arm to pull. If an arm doesn\u2019t win that often, but we haven\u2019t sampled it too much, then its probability of winning is low, but our\u00a0<em>confidence<\/em>\u00a0in that estimate is also low \u2013 so let\u2019s give it a small chance in the future. 
However, if we\u2019ve pulled an arm many times and it wins often, then our estimate of its probability of winning is high, and our confidence in that estimate is also high \u2013 let\u2019s give that arm a higher chance of being pulled.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p class=\"has-text-align-center\"><strong>the result<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"193\" src=\"https:\/\/riespw.com\/blog\/wp-content\/uploads\/2024\/10\/bayesian-bandit-result-1024x193.png\" alt=\"\" class=\"wp-image-99\" srcset=\"https:\/\/riespw.com\/blog\/wp-content\/uploads\/2024\/10\/bayesian-bandit-result-1024x193.png 1024w, https:\/\/riespw.com\/blog\/wp-content\/uploads\/2024\/10\/bayesian-bandit-result-300x57.png 300w, https:\/\/riespw.com\/blog\/wp-content\/uploads\/2024\/10\/bayesian-bandit-result-768x145.png 768w, https:\/\/riespw.com\/blog\/wp-content\/uploads\/2024\/10\/bayesian-bandit-result-1536x290.png 1536w, https:\/\/riespw.com\/blog\/wp-content\/uploads\/2024\/10\/bayesian-bandit-result.png 2000w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p class=\"has-text-align-center\">~~~<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Reference the problem Suppose you are at a casino and have a choice between N slot machines. Each of the N slot machines (bandits) has an unknown probability of letting you win, e.g. Bandit 1 may have P(win) = 0.9 while Bandit 2 may have P(win) = 0.3. 
We wish to maximize our winnings by playing [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[7],"tags":[14],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/riespw.com\/blog\/wp-json\/wp\/v2\/posts\/98"}],"collection":[{"href":"https:\/\/riespw.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/riespw.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/riespw.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/riespw.com\/blog\/wp-json\/wp\/v2\/comments?post=98"}],"version-history":[{"count":1,"href":"https:\/\/riespw.com\/blog\/wp-json\/wp\/v2\/posts\/98\/revisions"}],"predecessor-version":[{"id":100,"href":"https:\/\/riespw.com\/blog\/wp-json\/wp\/v2\/posts\/98\/revisions\/100"}],"wp:attachment":[{"href":"https:\/\/riespw.com\/blog\/wp-json\/wp\/v2\/media?parent=98"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/riespw.com\/blog\/wp-json\/wp\/v2\/categories?post=98"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/riespw.com\/blog\/wp-json\/wp\/v2\/tags?post=98"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}