2020 Projects

RL Open Source Fest Projects

1. VW support for FlatBuffers and/or Protobuf

VW has several kinds of file input: examples, cache, and models. This project involves adding support for a modern serialization framework such as FlatBuffers or Protocol Buffers. This will enable easier interop, better stability, and potentially increased performance.

Goals

Produce wiki page outlining design and usage
Design schemas
Load and save a model
Load examples from a file. Start by keeping labels stored as a string.
Utilities to inspect and convert to/from VW’s file formats

Stretch Goals

Load and save the cache file
Schemas for structured label types
Benchmark and optimize performance

2. Contextual bandits data visualization with Jupyter notebooks

Build visualizations to help understand the behavior of Contextual Bandit policies and logs.

Goals

Visualizations for:
- Action distribution
- Action/reward distribution by feature(s) or model used
- Model comparison
- Feature importance
Produce a synthetic dataset that highlights the usefulness of the visualizations
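
As a rough illustration of the quantities behind the first two visualizations, the sketch below computes an action distribution and a mean reward per action from a logged dataset. The log layout (a list of dicts with "action" and "reward" keys) is an assumption made for this example and is not VW's log format; plotting is left out so that only the standard library is needed.

```python
from collections import Counter, defaultdict


def action_distribution(log):
    """Return the fraction of logged events in which each action was chosen."""
    counts = Counter(event["action"] for event in log)
    total = sum(counts.values())
    return {action: n / total for action, n in counts.items()}


def mean_reward_by_action(log):
    """Return the average observed reward for each logged action."""
    sums = defaultdict(float)
    counts = Counter()
    for event in log:
        sums[event["action"]] += event["reward"]
        counts[event["action"]] += 1
    return {action: sums[action] / counts[action] for action in counts}


# Hypothetical log records; a real notebook would read these from CB logs.
log = [
    {"action": 1, "reward": 1.0},
    {"action": 2, "reward": 0.0},
    {"action": 1, "reward": 0.0},
    {"action": 1, "reward": 1.0},
]
distribution = action_distribution(log)
rewards = mean_reward_by_action(log)
```

In a notebook, the two dictionaries could be passed directly to a bar chart, with one bar per action.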

3. Parallelized parsing

Modern machines often utilize many threads to achieve performance. VW currently uses a single parse thread and a single learner thread, and parsing is often the bottleneck. Extending the parser to support many threads will allow us to better utilize resources.

Goals

Produce wiki page outlining design and usage
Extract parser to standalone component
Spawn threads for parse jobs
Ensure original ordering in datafile is preserved

Stretch Goals

Lock-free synchronization of threads
Use AllReduce to support multi-threaded learning
Separate I/O threads from parse threads
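
The ordering requirement can be illustrated with a small sketch. It assumes a toy line format ("label | name:value ...", a simplified subset of VW's text format) and a hypothetical parse_line function. It is written in Python only to show the idea: parse jobs run on a pool of threads while results are collected in input order. The real parser is C++, and Python threads would not give the same speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_line(line):
    """Parse a simplified 'label | name:value ...' line into (label, features)."""
    label_part, _, feature_part = line.partition("|")
    features = {}
    for token in feature_part.split():
        name, _, value = token.partition(":")
        # A feature without an explicit value is given the value 1.0.
        features[name] = float(value) if value else 1.0
    return float(label_part), features


def parse_parallel(lines, num_threads=4):
    """Parse lines on a thread pool, returning results in the original order."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Executor.map yields results in the order of its inputs,
        # regardless of which worker finishes first.
        return list(pool.map(parse_line, lines))


examples = parse_parallel(["1 | a:2 b", "-1 | c:0.5"])
```

Here the ordering guarantee comes from the executor. A native implementation would need an equivalent mechanism, such as tagging each parse job with its sequence number and reordering before examples reach the learner.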

4. VW server mode revamp

VW currently has a daemon mode, which allows clients to send examples, train a model, and receive predictions. This uses raw sockets and a custom binary protocol. We want to provide a modern version of VW’s server mode built on a modern RPC technology.

Goals

Single model serving using gRPC with the following endpoints:
- Predict
- Learn
- Statistics (number of features, current loss, etc.)
- Management (download current model, number of features)
Packaging tools to create docker containers from VW params and model
Wiki page describing how to use it

Stretch Goals

Persistent model storage.
Multiple models from a single daemon.

5. Improve VW’s Python experience

VW’s Python integration can be improved in several areas to make it easier for users. Supporting Pandas as a first-class concept will make using VW in experimentation workflows much more streamlined. Implementing IPython HTML representations for some common types will improve the usability of these components.

Goals

Implement _repr_html_ for examples, models, and labels
Access to progressive validation and other model statistics
Pandas load and save from VW text format

Stretch Goals

Simplify example lifecycle
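
To make the first and third goals concrete, here is a minimal sketch of a rich-display wrapper. ExampleView is a hypothetical class invented for this illustration and is not part of VW's Python package. The only convention relied on is that IPython renders the string returned by an object's _repr_html_ method, and that VW's text format writes a label followed by "|namespace feature:value" groups.

```python
from html import escape


class ExampleView:
    """Hypothetical wrapper around one example: a label plus namespaced features."""

    def __init__(self, label, features):
        self.label = label
        self.features = features  # {namespace: {feature_name: value}}

    def to_vw_text(self):
        """Render the example as a line of VW text format."""
        parts = [str(self.label)]
        for namespace, feats in self.features.items():
            tokens = " ".join(f"{name}:{value}" for name, value in feats.items())
            parts.append(f"|{namespace} {tokens}")
        return " ".join(parts)

    def _repr_html_(self):
        """Return an HTML table; IPython calls this for rich notebook display."""
        rows = "".join(
            f"<tr><td>{escape(namespace)}</td><td>{escape(name)}</td><td>{value}</td></tr>"
            for namespace, feats in self.features.items()
            for name, value in feats.items()
        )
        header = "<tr><th>namespace</th><th>feature</th><th>value</th></tr>"
        return f"<table><caption>label: {self.label}</caption>{header}{rows}</table>"


view = ExampleView(1, {"user": {"age": 25}, "item": {"price": 9.5}})
line = view.to_vw_text()
```

A Pandas integration could apply the same to_vw_text logic row by row, with a mapping from columns to namespaces supplied by the user.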

6. End-to-end local loop for reinforcement learning

The reinforcement learning library has extension points that allow parts of the framework to be swapped out; however, there is currently no simple way to make it work end to end locally. Making RLClientLib support prediction, logging, joining, and training locally will make for a great prototyping tool.

Goals

In-memory joining and training
Extend configuration to enable local mode
Python and C# API support

Stretch Goals

Checkpointing - load and save model
Port some of our RLClientLib simulators to use the local loop

7. TensorWatch and TensorBoard integration

TensorBoard and TensorWatch are great tools for debugging and monitoring training, making them a great choice for integrating with VW and RLClientLib.

Goals

Integrate VW training with TensorWatch all within a notebook
Extend VW to output TensorBoard logs
Extend RLClientLib to support TensorBoard and TensorWatch

Stretch Goals

Add lazy logging mode to VW and RLClientLib

8. ONNX operator set and model format for VW models

VW has its own runtime for running inference from its own model files. However, ONNX is the emerging standard for defining models and supporting inference. This project enables VW models to interoperate with the ONNX runtime.

Goals

Define ONNX.vw operation set for the reductions needed for classification (CSOAA)
Define shape of VW example in tensor format
Converter tool from VW model to ONNX model
Implement the new opset with ONNX runtime
Sample app that runs inference

Stretch Goals

Extend opset to Contextual Bandits
Export ONNX model directly from VW

9. Support implementation of a VW reduction in Python

All reductions in VW are implemented in C++. Allowing a reduction to be written in Python would enable rapid prototyping and let it take advantage of the Python ecosystem.

Goals

Create interface that allows Python code to implement a base learner in VW
Implement a simple gradient descent base learner using scikit-learn

Stretch Goals

Allow for the Python implemented reduction to be used at a different level of the reduction stack
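
One possible shape for such an interface is sketched below. BaseLearner and SquaredLossSGD are hypothetical names chosen for this illustration; the actual binding between VW's reduction stack and Python is the subject of the project and is not shown. The learner is a plain stochastic gradient descent on squared loss, written without scikit-learn so that the example is self-contained.

```python
class BaseLearner:
    """Hypothetical interface a Python base learner might implement."""

    def predict(self, features):
        raise NotImplementedError

    def learn(self, features, label):
        raise NotImplementedError


class SquaredLossSGD(BaseLearner):
    """Linear model trained with stochastic gradient descent on squared loss."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = {}  # feature name -> weight

    def predict(self, features):
        # Prediction is the dot product of weights and feature values.
        return sum(self.weights.get(name, 0.0) * value for name, value in features.items())

    def learn(self, features, label):
        # Gradient of 0.5 * (prediction - label)^2 w.r.t. each weight is error * value.
        error = self.predict(features) - label
        for name, value in features.items():
            current = self.weights.get(name, 0.0)
            self.weights[name] = current - self.learning_rate * error * value


learner = SquaredLossSGD()
for _ in range(100):
    learner.learn({"x": 1.0}, 2.0)
```

Keeping the interface to predict and learn on a single example mirrors how a base learner sits at the bottom of the stack; using it at other levels would also require a way to call the next learner down.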

10. Support Python implementations of RLClientLib extensibility points

RLClientLib supports several points of extensibility, but these are only exposed in C++. When using RLClientLib in Python, it is important to be able to support these.

Goals

Support a custom model implementation in Python through the i_model interface
Create an example of using these locally

Stretch Goals

Support custom i_sender implementation for event logging
Support i_data_transport for retrieving updated models

11. Contextual bandit benchmark and competition

There exist many different contextual bandit algorithms, and a standard benchmark would make it easier to compare them. Use the contextual bandit bake-off paper as a base and build a set of standard CB benchmarks and supporting infrastructure to competitively evaluate CB algorithms. This is similar to the GLUE benchmark for NLP.

Goals

Design CB experiments - start off with CB bakeoff paper
Create infrastructure to obtain datasets
Upload predictions to evaluate performance of algorithm
Visualization, display results and compare to others

Stretch Goals

Abstract what it means to be a CB algo to provide a more structured evaluation workflow

12. Library of contextual bandit estimators

Estimators are used in off-policy evaluation. One common estimator is inverse propensity scoring (IPS); others include doubly robust (DR) and PseudoInverse. These estimators work better or worse in different settings. This project explores reference implementations of each and allows for comparison between them to aid in understanding. As a stretch goal, it involves using this common library of estimators in the existing counterfactual estimation module.

Goals

Add implementation of DR, and DR in episodic settings
Simulator interface that allows evaluation against logging policy and target policy
Generate a random logging policy and target policy to use for evaluation
Visualization of comparison

Stretch Goals

Pseudo inverse
Integrate into existing counterfactual evaluation framework
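
For orientation, the following is a minimal sketch of the IPS and DR estimators in their standard textbook forms. It is not the reference implementation this project would produce. The event layout (keys "action", "reward", "logged_prob", "target_probs", "predicted_rewards") is an assumption made for this example. IPS reweights each observed reward by the ratio of target to logging probability; DR adds a reward-model term and uses the importance weight only to correct the model's error on the logged action.

```python
def ips(events):
    """Inverse propensity scoring estimate of the target policy's value."""
    total = 0.0
    for e in events:
        weight = e["target_probs"][e["action"]] / e["logged_prob"]
        total += weight * e["reward"]
    return total / len(events)


def doubly_robust(events):
    """Doubly robust estimate: reward-model term plus importance-weighted correction."""
    total = 0.0
    for e in events:
        action = e["action"]
        # Direct-method term: expected model reward under the target policy.
        direct = sum(p * r for p, r in zip(e["target_probs"], e["predicted_rewards"]))
        # Correction: importance-weighted model error on the logged action.
        weight = e["target_probs"][action] / e["logged_prob"]
        total += direct + weight * (e["reward"] - e["predicted_rewards"][action])
    return total / len(events)


# Hypothetical logged data: two actions, uniform logging policy,
# target policy that always picks action 0.
events = [
    {"action": 0, "reward": 1.0, "logged_prob": 0.5,
     "target_probs": [1.0, 0.0], "predicted_rewards": [0.5, 0.5]},
    {"action": 1, "reward": 0.0, "logged_prob": 0.5,
     "target_probs": [1.0, 0.0], "predicted_rewards": [0.5, 0.5]},
]
ips_value = ips(events)
dr_value = doubly_robust(events)
```

A shared event representation like this one is what would let a simulator feed the same logged data to every estimator for comparison.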

13. Scriptable feature engineering with Python

VW supports example manipulation through its command line. It provides a lot of flexibility but it’s hard to express anything beyond the fixed set of options. The idea is to enable example manipulation to be scripted in Python as a series of hooks in the parsing pipeline.

Goals

Allow hooks to inject and rewrite features into a namespace
Allow hooks to rewrite and manipulate namespaces

Stretch Goals

Allow hooks that manipulate multi-line examples (such as Contextual Bandits input) by adding or removing whole examples
Extend support for attaching C++ code to the parse pipeline hooks
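
The hook idea can be sketched as a chain of plain Python callables. Everything here is hypothetical: the example representation (a dict mapping namespace to a dict of feature values), the hook signature, and the function names are assumptions made for illustration, not an existing VW API.

```python
import math


def add_log_price(example):
    """Hook: inject a log_price feature into the 'item' namespace when price is present."""
    item = example.get("item", {})
    if "price" not in item:
        return example
    new_item = dict(item, log_price=math.log(item["price"]))
    return {**example, "item": new_item}


def rename_namespace(old, new):
    """Build a hook that renames one namespace and leaves the others untouched."""
    def hook(example):
        return {(new if ns == old else ns): feats for ns, feats in example.items()}
    return hook


def run_hooks(example, hooks):
    """Apply each hook in order; every hook takes an example and returns a new one."""
    for hook in hooks:
        example = hook(example)
    return example


pipeline = [add_log_price, rename_namespace("u", "user")]
result = run_hooks({"item": {"price": 1.0}, "u": {"age": 3.0}}, pipeline)
```

Having hooks return a new example rather than mutate their input keeps each one independent of the others. Whether that copying cost is acceptable inside the parse pipeline is a design question for the project.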
