Reinforcement Learning
Introduction

Enterprises constantly face decisions that require picking an action from a set of alternatives based on contextual information. Real-world reinforcement learning techniques are effective aids to this kind of decision making: they "predict" and "learn" from the interaction data the application already generates, with no separate labeled training set. This library implements an interface for reinforcement-learning-based prediction built on contextual bandits.

[Figure: Reinforcement Learning Loop]

RL Inference API

This API allows the developer to perform inference (choosing an action from an action set) and to report the outcome of this decision. The inference library automatically sends the action set, the decision, and the outcome to an online trainer running in the Azure cloud. It also periodically refreshes its copy of the model produced by the online trainer.
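The action set itself travels inside the context string passed at inference time. As an illustration, the snippet below builds such a context in the VW JSON layout the library consumes, with shared features at the top level and one entry per action under "_multi"; the feature names ("User", "Action", "topic") are made up for this example.

    // Illustrative context for choose_rank(): shared features at the top level,
    // the action set under "_multi". All feature names here are hypothetical.
    const char* context = R"({
      "User": { "id": "rnc", "major": "engineering" },
      "_multi": [
        { "Action": { "topic": "sports" } },
        { "Action": { "topic": "politics" } }
      ]
    })";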

API Usage: Basic steps for using the RL Inference API

The reinforcement_learning::live_model class is the main interface of the RL Inference API. The basic steps are listed below; a complete sketch that combines them follows the list.

  • Instantiate and initialize live_model
    r::live_model rl(config);
    if (rl.init(&status) != err::success)
    {
      std::cout << status.get_error_msg() << std::endl;
      return -1;
    }
  • choose_rank() to choose an action from a list of actions (the action set)
    // ranking_response holds the chosen action for this event
    r::ranking_response response;
    if (rl.choose_rank(event_id, context, response, &status) != err::success)
    {
      std::cout << status.get_error_msg() << std::endl;
      return -1;
    }
  • report_outcome() to provide the outcome of the chosen action (if this call is not made, the default outcome is applied)
    // Report the received outcome (optional: if this call is not made, the
    // default missing outcome is applied)
    // A missing outcome can be thought of as negative reinforcement
    if (rl.report_outcome(event_id, outcome, &status) != err::success)
    {
      std::cout << status.get_error_msg() << std::endl;
      return -1;
    }