April 01, 2021

VowpalWabbit 8.10.0 Release Notes

Jack Gerrits

-q :: speedup

Using the wildcard (:) when doing quadratic interactions (-q ::) has been significantly sped up (#2807). This optimization only affects quadratic interactions for now, and not cubic or higher order interactions.

We saw a 35% speedup for one of the benchmarks which tests quadratic interactions. Additionally, the runtime for a CCB ADF run with a file of 347k examples and 3 interactions (Action, Slot, User) dropped from ~11m41s to ~1m50s, roughly a 6x reduction.
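For readers unfamiliar with the option, the following is a minimal Python sketch of what a wildcard quadratic interaction conceptually produces: one product feature for each pair of features drawn from each pair of namespaces. It is an illustration only, not VW's implementation. The function name, the dictionary layout, the feature naming scheme and the choice to keep each unordered pair once (including a namespace paired with itself) are all assumptions made for this example; VW hashes features into a weight table and its handling of duplicates is not described in these notes.

```python
from itertools import combinations_with_replacement


def quadratic_interactions(example):
    """Build product features for every unordered pair of namespaces.

    `example` maps namespace -> {feature name -> value}. This layout is
    hypothetical and chosen only to keep the sketch readable.
    """
    interactions = {}
    # Assumption: each unordered namespace pair is visited once,
    # and a namespace is also paired with itself.
    for ns_a, ns_b in combinations_with_replacement(sorted(example), 2):
        for feat_a, val_a in example[ns_a].items():
            for feat_b, val_b in example[ns_b].items():
                # Within a single namespace, skip the mirrored pair (b, a)
                # so that each unordered feature pair appears once.
                if ns_a == ns_b and feat_a > feat_b:
                    continue
                name = f"{ns_a}^{feat_a}*{ns_b}^{feat_b}"
                interactions[name] = val_a * val_b
    return interactions


if __name__ == "__main__":
    toy = {"a": {"x": 2.0}, "b": {"y": 3.0, "z": 1.0}}
    for name, value in quadratic_interactions(toy).items():
        print(name, value)
```

The nested loops make it clear why the number of generated features grows quickly with the number of namespaces, which is the cost the optimization targets.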

Initial ARM Support

We’ve added initial ARM support with this release. VW should now build on ARM platforms, although we do not yet provide binary Python wheels for ARM. The command line tool is now supported natively on Apple Silicon, and will be available once the Homebrew package is updated to 8.10.0.

Logging changes

We’ve taken some steps to improve logging in VW, starting with unifying the existing output under one system and adding log levels. This is still a work in progress, so not all output follows the new format yet.

  • Progressive validation remains the same
  • Other logging messages have a log level prepended to the line

Goals of the logging work

  • Easier to understand warnings, info, errors, etc
  • Ability to filter by level
  • Machine readable log format
  • More sensible output when VW is used as a library

Comparison

Old
Num weight bits = 18
learning rate = 0.5
initial_t = 0
power_t = 0.5
using no cache
Reading datafile = train-sets/malformed.dat
num sources = 1
Enabled reductions: gd, scorer
average  since         example        example  current  current  current
loss     last          counter         weight    label  predict features
malformed example! '|',space, or EOL expected after : "| x:0.7"in Example #0: "| x:0.7"
malformed example! '|' or EOL expected after : "| x:0.7"in Example #0: "| x:0.7"
New
Num weight bits = 18
learning rate = 0.5
initial_t = 0
power_t = 0.5
using no cache
Reading datafile = train-sets/malformed.dat
num sources = 1
Enabled reductions: gd, scorer
average  since         example        example  current  current  current
loss     last          counter         weight    label  predict features
[warning] malformed example! '|',space, or EOL expected after : "| x:0.7"in Example #0: "| x:0.7"
[warning] malformed example! '|' or EOL expected after : "| x:0.7"in Example #0: "| x:0.7"
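
Because the level is prepended in square brackets, as in the "New" output above, filtering by level can be done with simple text processing. The following is a minimal Python sketch under that assumption. The helper names are hypothetical, only the `[warning]` prefix shown above is taken from these notes, and lines without a prefix (such as the progressive validation table) are passed through with no level.

```python
import re

# Matches a leading "[level]" tag followed by the message text.
LEVEL_RE = re.compile(r"^\[(\w+)\]\s+(.*)$")


def split_level(line):
    """Return (level, message); level is None when no tag is present."""
    match = LEVEL_RE.match(line)
    if match:
        return match.group(1), match.group(2)
    return None, line


def filter_by_level(lines, level):
    """Keep only the messages whose tag equals `level`."""
    return [msg for lvl, msg in map(split_level, lines) if lvl == level]


if __name__ == "__main__":
    sample = [
        "Num weight bits = 18",
        "[warning] malformed example! '|' or EOL expected after : \"| x:0.7\"",
    ]
    for message in filter_by_level(sample, "warning"):
        print(message)
```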

Experimental: Flatbuffers

Experimental support for Flatbuffer schematized examples as an input format has been added. Flatbuffers are a schematized binary format and should provide efficiency and portability when used. This is still experimental for several reasons: we want to ensure the schema is complete for real-world use, documentation is currently limited, not all released binaries support it, and tooling to make it easier to work with does not exist yet. The schema for the example objects can be found here. The file itself contains a sequence of such objects, each size prefixed, to allow streamed input.

When building from source, Flatbuffers support is disabled by default but can be enabled by passing -DBUILD_FLATBUFFERS=ON to the CMake configure step.
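
To illustrate how a size-prefixed sequence allows streamed input, here is a minimal Python sketch that splits a byte stream into individual objects without decoding them. It assumes the standard FlatBuffers size prefix, a 4-byte little-endian unsigned integer giving the length of the object that follows; whether VW's files follow exactly this layout should be checked against the schema and source. The function name is hypothetical, and parsing each payload against the VW schema is out of scope here.

```python
import io
import struct


def iter_size_prefixed(stream):
    """Yield raw object payloads from a stream of size-prefixed buffers.

    Assumption: each object is preceded by a 4-byte little-endian
    unsigned length, the usual FlatBuffers size-prefix convention.
    """
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of stream
        if len(header) < 4:
            raise ValueError("truncated size prefix")
        (size,) = struct.unpack("<I", header)
        payload = stream.read(size)
        if len(payload) < size:
            raise ValueError("truncated object")
        yield payload


if __name__ == "__main__":
    # Build a toy stream of two opaque payloads to exercise the reader.
    raw = b"".join(struct.pack("<I", len(p)) + p for p in (b"abc", b"hello"))
    for obj in iter_size_prefixed(io.BytesIO(raw)):
        print(len(obj))
```

Reading one object at a time means a consumer never needs the whole file in memory, which is the property streamed input depends on.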

Contextual Bandit Zeroth Order Optimization

Contextual Bandit Zeroth Order Optimization (CBZO) is a new reduction (contributed by @ajay0). CBZO is a contextual bandit-style algorithm meant for a multi-dimensional, continuous action space. It can learn different policies based on Zeroth-Order Optimization – continuous optimization techniques which make use of gradient estimators that only require values of the function to make an estimate. The variant of CBZO currently implemented in VW works in the 1-dimensional action space setting and can learn either constant or linear policies. The algorithm has optimal bounded regret when the cost function is smooth and convex.

Learn more at the wiki page.
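
As a rough illustration of the zeroth-order idea, the Python sketch below learns a constant policy in a 1-dimensional action space using a one-point gradient estimator: it perturbs the current action, observes only the cost of the action actually played, and scales that cost into a gradient estimate. This is a textbook-style sketch, not the CBZO reduction as implemented in VW. The function name, parameter names and default values are assumptions chosen for the example.

```python
import random


def zeroth_order_minimize(cost, theta=0.0, radius=0.5, step=0.002,
                          rounds=20000, seed=0):
    """Learn a constant 1-D action using only observed cost values.

    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    for _ in range(rounds):
        # Explore by playing a randomly perturbed action.
        direction = rng.choice((-1.0, 1.0))
        action = theta + radius * direction
        # Only the cost of the played action is observed (bandit feedback).
        observed = cost(action)
        # One-point estimate of the gradient of a smoothed cost function.
        grad_estimate = observed / radius * direction
        theta -= step * grad_estimate
    return theta


if __name__ == "__main__":
    # Smooth, convex toy cost with its minimum at action 3.0.
    learned = zeroth_order_minimize(lambda a: (a - 3.0) ** 2)
    print(learned)
```

In expectation the estimate equals the slope of the cost between the two perturbed actions, so the update behaves like gradient descent even though no gradient is ever computed. A linear policy would replace the scalar `theta` with a weight vector applied to context features.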

Internal improvements

  • Progress towards label information not being required in predict calls and reducing the number of redundant predict calls done before a learn call takes place
  • New Python test runner script which supports parallelized tests
  • Overhaul v_array to be an RAII type
  • Enable prediction and label structures to be RAII types

Thank you

A huge thank you and welcome to all of the new contributors since the last release:

And of course thank you to existing contributors:

Full changelist