Publication List

  1. Bietti, A., Agarwal, A., & Langford, J. (2018). A Contextual Bandit Bake-off. arXiv:1802.04064v3 [stat.ML].
  2. Cortes, D. (2018). Adapting multi-armed bandits policies to contextual bandits scenarios. CoRR, abs/1811.04383.
  3. Agarwal, A., Bird, S., Cozowicz, M., Hoang, L., Langford, J., Lee, S., … Slivkins, A. (2016). A Multiworld Testing Decision Service. CoRR, abs/1606.03966.
  4. Jiang, N., & Li, L. (2016). Doubly Robust Off-policy Value Evaluation for Reinforcement Learning. In Proceedings of the 33rd International Conference on Machine Learning, ICML 2016, New York City, NY, USA, June 19-24, 2016 (pp. 652–661).
  5. Osband, I., & Van Roy, B. (2015). Bootstrapped Thompson Sampling and Deep Exploration. CoRR, abs/1507.00300.
  6. Eckles, D., & Kaptein, M. (2014). Thompson sampling with the online bootstrap. CoRR, abs/1410.4009.
  7. Agarwal, A., Hsu, D. J., Kale, S., Langford, J., Li, L., & Schapire, R. E. (2014). Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits. CoRR, abs/1402.0555.
  8. Dudík, M., Langford, J., & Li, L. (2011). Doubly Robust Policy Evaluation and Learning. In Proceedings of the 28th International Conference on Machine Learning, ICML 2011, Bellevue, Washington, USA, June 28 - July 2, 2011 (pp. 1097–1104).
  9. Karampatziakis, N., & Langford, J. (2011). Online Importance Weight Aware Updates. In Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence (pp. 392–399). Arlington, Virginia, United States: AUAI Press.
  10. Agarwal, A., Chapelle, O., Dudík, M., & Langford, J. (2011). A Reliable Effective Terascale Linear Learning System. CoRR, abs/1110.4198.
  11. Li, L., Chu, W., Langford, J., & Schapire, R. E. (2010). A Contextual-Bandit Approach to Personalized News Article Recommendation. CoRR, abs/1003.0146.
  12. Shi, Q., Petterson, J., Dror, G., Langford, J., Smola, A., & Vishwanathan, S. V. N. (2009). Hash Kernels for Structured Data. J. Mach. Learn. Res., 10, 2615–2637.
  13. Weinberger, K. Q., Dasgupta, A., Attenberg, J., Langford, J., & Smola, A. J. (2009). Feature Hashing for Large Scale Multitask Learning. CoRR, abs/0902.2206.
  14. Horvitz, D. G., & Thompson, D. J. (1952). A Generalization of Sampling Without Replacement from a Finite Universe. Journal of the American Statistical Association, 47(260), 663–685.