Real World Reinforcement Learning with Vowpal Wabbit

NeurIPS 2019

Sunday December 8, 2019


Reinforcement learning is increasingly used to solve real-world personalization and optimization problems, often with online, sample-efficient algorithms such as Contextual Bandits. Companies such as Netflix and The New York Times use Contextual Bandits to personalize content and optimize engagement. Microsoft uses Contextual Bandits across multiple deployments and recently released the Personalizer Azure Cognitive Service, which is the world’s first real-world reinforcement learning service.

Vowpal Wabbit is an open-source machine learning library that is used extensively in industry and is the first public terascale learning system. It provides fast, scalable machine learning and has unique capabilities such as learning to search, active learning, contextual memory, and extreme multiclass learning. It focuses on reinforcement learning and provides production-ready implementations of Contextual Bandit algorithms. As a research-to-production vehicle for Microsoft Research, Vowpal Wabbit sees significant ongoing innovation.

Come and learn about reinforcement learning, Vowpal Wabbit, and how to apply contextual bandits to problems using Vowpal Wabbit.