This week, Zephy McKanna sets out to explain Multi-Armed Bandits (MABs). Here at Wayfair, we are constantly testing new algorithms, product sort orders, and marketing and other messages using epsilon-first Multi-Armed Bandits, also known as A/B tests. We are also employing some more interesting MAB algorithms, like Thompson Sampling, to dynamically balance traffic during tests on low-traffic portions of the site (like holiday décor), as well as for some long-standing dynamic problems, like choosing the top N among the ever-changing sales events to send out in emails to default customers.

Zephy is a cognitive scientist whose quest to model human behavior has led him to selling furniture online. In his current role as a Senior Data Scientist on the Personalization & Recommendations team at Wayfair, Zephy primarily works with reinforcement learning techniques, which are satisfyingly adaptive to the quickly changing e-commerce environment. He also loves to do partner acrobatics, play with cats, tell stories, and explore edge conditions in board games.