Research

Publication List

  1. Agarwal, A., Bird, S., Cozowicz, M., Hoang, L., Langford, J., Lee, S., … Slivkins, A. (2016). A Multiworld Testing Decision Service. CoRR, abs/1606.03966. Retrieved from http://arxiv.org/abs/1606.03966
  2. Li, L., Chu, W., Langford, J., & Schapire, R. E. (2010). A Contextual-Bandit Approach to Personalized News Article Recommendation. CoRR, abs/1003.0146. Retrieved from http://arxiv.org/abs/1003.0146
  3. Horvitz, D. G., & Thompson, D. J. (1952). A Generalization of Sampling Without Replacement from a Finite Universe. Journal of the American Statistical Association, 47(260), 663–685. https://doi.org/10.1080/01621459.1952.10483446
  4. Jiang, N., & Li, L. (2016). Doubly Robust Off-policy Value Evaluation for Reinforcement Learning. In Proceedings of the 33rd International Conference on Machine Learning, ICML 2016, New York City, NY, USA, June 19-24, 2016 (pp. 652–661). Retrieved from http://proceedings.mlr.press/v48/jiang16.html
  5. Dudík, M., Langford, J., & Li, L. (2011). Doubly Robust Policy Evaluation and Learning. In Proceedings of the 28th International Conference on Machine Learning, ICML 2011, Bellevue, Washington, USA, June 28 - July 2, 2011 (pp. 1097–1104). Retrieved from https://icml.cc/2011/papers/554_icmlpaper.pdf
  6. Bietti, A., Agarwal, A., & Langford, J. (2018). A Contextual Bandit Bake-off. arXiv:1802.04064v3 [stat.ML]. Retrieved from https://www.microsoft.com/en-us/research/publication/a-contextual-bandit-bake-off-2/
  7. Karampatziakis, N., & Langford, J. (2011). Online Importance Weight Aware Updates. In Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence (pp. 392–399). Arlington, Virginia, United States: AUAI Press. Retrieved from http://dl.acm.org/citation.cfm?id=3020548.3020594
  8. Osband, I., & Van Roy, B. (2015). Bootstrapped Thompson Sampling and Deep Exploration. CoRR, abs/1507.00300. Retrieved from http://arxiv.org/abs/1507.00300
  9. Eckles, D., & Kaptein, M. (2014). Thompson sampling with the online bootstrap. CoRR, abs/1410.4009. Retrieved from http://arxiv.org/abs/1410.4009
  10. Agarwal, A., Hsu, D. J., Kale, S., Langford, J., Li, L., & Schapire, R. E. (2014). Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits. CoRR, abs/1402.0555. Retrieved from http://arxiv.org/abs/1402.0555
  11. Cortes, D. (2018). Adapting multi-armed bandits policies to contextual bandits scenarios. CoRR, abs/1811.04383. Retrieved from http://arxiv.org/abs/1811.04383
  12. Shi, Q., Petterson, J., Dror, G., Langford, J., Smola, A., & Vishwanathan, S. V. N. (2009). Hash Kernels for Structured Data. J. Mach. Learn. Res., 10, 2615–2637. Retrieved from http://dl.acm.org/citation.cfm?id=1577069.1755873
  13. Weinberger, K. Q., Dasgupta, A., Attenberg, J., Langford, J., & Smola, A. J. (2009). Feature Hashing for Large Scale Multitask Learning. CoRR, abs/0902.2206. Retrieved from http://arxiv.org/abs/0902.2206
  14. Agarwal, A., Chapelle, O., Dudík, M., & Langford, J. (2011). A Reliable Effective Terascale Linear Learning System. CoRR, abs/1110.4198. Retrieved from http://arxiv.org/abs/1110.4198