vowpalwabbit

The core functionality of the package is available in this root module. A small number of advanced usage classes are only available in vowpalwabbit.pyvw.

Example usage

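from vowpalwabbit import Workspace, Example

# Create a workspace with console output suppressed.
workspace = Workspace(quiet=True)
# An Example is owned by a workspace, which is passed as the first argument.
ex = Example(workspace, '1 | a b c')
workspace.learn(ex)
prediction = workspace.predict(ex)
# Return the example to the workspace once it is no longer needed.
workspace.finish_example(ex)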

Module contents

Python interfaces for VW

class vowpalwabbit.AbstractLabel

Bases: object

An abstract class for a VW label.

__init__()
static from_example(ex)

Extract a label from the given example.

Parameters:

ex (Example) – example from which label is to be extracted

Return type:

AbstractLabel

class vowpalwabbit.ActionScore(action, score)

Bases: object

__init__(action, score)
class vowpalwabbit.CBContinuousLabel(costs=[])

Bases: AbstractLabel

Class for cb_continuous VW label

__init__(costs=[])
static from_example(ex)

Extract a label from the given example.

Parameters:

ex (Example) – example from which label is to be extracted

class vowpalwabbit.CBContinuousLabelElement(action=None, cost=0.0, pdf_value=0.0)

Bases: object

__init__(action=None, cost=0.0, pdf_value=0.0)
class vowpalwabbit.CBEvalLabel(action, cb_label)

Bases: AbstractLabel

Class for contextual bandits eval VW label

__init__(action, cb_label)
static from_example(ex)

Extract a label from the given example.

Parameters:

ex (Example) – example from which label is to be extracted

class vowpalwabbit.CBLabel(costs=[], weight=1.0)

Bases: AbstractLabel

Class for contextual bandits VW label

__init__(costs=[], weight=1.0)
static from_example(ex)

Extract a label from the given example.

Parameters:

ex (Example) – example from which label is to be extracted

class vowpalwabbit.CBLabelElement(action=None, cost=0.0, partial_prediction=0.0, probability=0.0, **kwargs)

Bases: object

__init__(action=None, cost=0.0, partial_prediction=0.0, probability=0.0, **kwargs)
class vowpalwabbit.CCBLabel(type=CCBLabelType.UNSET, explicit_included_actions=[], weight=1, outcome=None)

Bases: AbstractLabel

Class for conditional contextual bandits VW label

__init__(type=CCBLabelType.UNSET, explicit_included_actions=[], weight=1, outcome=None)
static from_example(ex)

Extract a label from the given example.

Parameters:

ex (Example) – example from which label is to be extracted

class vowpalwabbit.CCBLabelType(value)

Bases: IntEnum

An enumeration.

ACTION = 1
SHARED = 0
SLOT = 2
UNSET = 3
class vowpalwabbit.CCBSlotOutcome(cost, action_probs)

Bases: object

__init__(cost, action_probs)
class vowpalwabbit.CostSensitiveElement(label, cost=0.0, partial_prediction=0.0, wap_value=0.0)

Bases: object

__init__(label, cost=0.0, partial_prediction=0.0, wap_value=0.0)
class vowpalwabbit.CostSensitiveLabel(costs=[], prediction=0.0)

Bases: AbstractLabel

Class for cost sensitive VW label

__init__(costs=[], prediction=0.0)
static from_example(ex)

Extract a label from the given example.

Parameters:

ex (Example) – example from which label is to be extracted

class vowpalwabbit.Example(vw, initStringOrDictOrRawExample=None, labelType=None)

Bases: example

The Example class is a wrapper around pylibvw.example; pylibvw.example should not be used directly. Most of the wrapping makes the interface easier to use (by making the types safer via NamespaceId) and adds Python-specific functionality.

__init__(vw, initStringOrDictOrRawExample=None, labelType=None)

Construct a new example from vw.

Parameters:
  • vw (Workspace) – Owning workspace of this example object

  • initStringOrDictOrRawExample (Union[str, Dict[str, List[Union[Tuple[Union[str, int], float], str, int]]], Dict[str, Dict[Union[str, int], float]], Any, example, None]) –

    Content to initialize the example with.

    • If None, created as an empty example

    • If a string, parsed as a VW example string

    • If a pylibvw.example, wraps the native example. This is advanced and should rarely be used.

    • If a callable object, it is called repeatedly until the result is no longer callable. At that point the result should be one of the other supported types.

      Deprecated since version 9.0.0: Using a callable object is no longer supported.

    • If a dict, the keys are the namespace names and the values are the namespace features. Namespace features can be represented as either a list or a dict. In a list, each item is either a key (i.e., an int or string), in which case the value is assumed to be 1, or a key-value tuple. In a dict, all features are represented as key-value pairs.

  • labelType (Union[int, LabelType, None]) –

    Which label type this example contains. If None (or 0), the label type is inferred from the workspace configuration.

    Deprecated since version 9.0.0: Supplying an integer is no longer supported. Use the LabelType enum instead.

See also

Workspace
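
To make the dict form of initStringOrDictOrRawExample concrete, the following plain-Python sketch renders such a dict as a VW text line. It does not call vowpalwabbit: dict_to_vw_line is a hypothetical helper written only for this illustration, and formatting values with ":g" is an arbitrary choice.

```python
from typing import Dict, List, Tuple, Union

Feature = Union[Tuple[Union[str, int], float], str, int]
Namespace = Union[List[Feature], Dict[Union[str, int], float]]


def dict_to_vw_line(namespaces: Dict[str, Namespace], label: str = "") -> str:
    """Render the dict form as a VW text line (illustration only)."""
    parts = [label] if label else []
    for ns_name, feats in namespaces.items():
        if isinstance(feats, dict):
            # Dict namespace: every feature is already a key-value pair.
            items = list(feats.items())
        else:
            # List namespace: a bare key gets the value 1, a tuple keeps its value.
            items = [f if isinstance(f, tuple) else (f, 1.0) for f in feats]
        rendered = " ".join(f"{key}:{value:g}" for key, value in items)
        parts.append(f"|{ns_name} {rendered}")
    return " ".join(parts)


line = dict_to_vw_line({"a": ["x", ("y", 0.5)], "b": {"z": 2.0}}, label="1")
print(line)  # 1 |a x:1 y:0.5 |b z:2
```

Passing the same dict directly to Example should describe the same features as the printed string, without the label.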

ensure_namespace_exists(ns)

Ensure that a namespace exists in this example, adding it if necessary.

Parameters:

ns (Union[NamespaceId, str, int]) – If the namespace already exists, do nothing. If it doesn’t, add it.

feature(ns, i)

Get the i-th hashed feature id in a given namespace

Parameters:
  • ns (Union[NamespaceId, str, int]) – namespace used to get the feature

  • i (int) – index of the hashed feature id to get in the given ns. It must range from 0 to self.num_features_in(ns)-1

Return type:

int

Returns:

i-th hashed feature-id in a given ns

feature_weight(ns, i)

Get the value(weight) associated with a given feature id

Parameters:
  • ns (Union[NamespaceId, str, int]) – namespace used to get the feature id

  • i (int) – index of the feature whose weight is to be returned in the given ns. It must range from 0 to self.num_features_in(ns)-1

Return type:

float

Returns:

weight(value) of the i-th feature of given ns

finished: bool
get_feature_id(ns, feature, ns_hash=None)

Get the hashed feature id for a given feature in a given namespace. feature can either be an integer (already a feature id) or a string, in which case it is hashed.

Parameters:
  • ns (Union[NamespaceId, str, int]) – namespace used to get the feature

  • feature (Union[int, str]) – If an integer, it is already a feature id; otherwise it will be hashed

  • ns_hash (Optional[int]) – The hash of the namespace. Optional.

Return type:

int

Returns:

Hashed feature id

Note

If --hash all is on, then get_feature_id(ns, "5") != get_feature_id(ns, 5). If you’ve already hashed the namespace, you can optionally provide that value to avoid re-hashing it.

get_label(label_class=None)

Get the label object of this example.

Parameters:

label_class (Union[int, LabelType, Type[AbstractLabel], None]) –

  • If None, self.labelType will be used.

  • If int then corresponding LabelType for the label type to be retrieved.

  • The ability to pass an AbstractLabel or an int are legacy requirements and are deprecated. All new usage of this function should pass a LabelType.

Return type:

Union[AbstractLabel, SimpleLabel, MulticlassLabel, CostSensitiveLabel, CBLabel, CCBLabel, SlatesLabel, CBContinuousLabel]

get_ns(id)

Construct a NamespaceId

Parameters:

id (Union[NamespaceId, str, int]) – id used to create the namespace

Return type:

NamespaceId

Returns:

NamespaceId created from the given id (if id is already a NamespaceId, it is returned directly)

get_prediction(prediction_type=None)

Get prediction object from this example.

Parameters:

prediction_type (Union[int, PredictionType, None]) –

  • If None, the prediction type of the example’s owning Workspace instance will be used.

  • If int then corresponding PredictionType for the prediction type to be retrieved.

  • Supplying an int is deprecated and will be removed in a future release.

Return type:

Union[float, List[float], int, List[int], List[List[Tuple[int, float]]], Tuple[int, float], List[Tuple[float, float, float]], Tuple[int, List[int]], None]

Returns:

Prediction according to parameter prediction_type

Examples

>>> from vowpalwabbit import Workspace, PredictionType
>>> vw = Workspace(quiet=True)
>>> ex = vw.example('1 |a two features |b more features here')
>>> ex.get_prediction()
0.0
iter_features()

Iterate over all feature/value pairs in this example (all namespaces included).

Return type:

Iterator[Tuple[int, float]]

labelType: LabelType
learn()

Learn on this example (before learning, setup_example is called automatically if the example hasn’t been set up yet).

num_features_in(ns)

Get the total number of features in a given namespace

Parameters:

ns (Union[NamespaceId, str, int]) – namespace whose features are to be counted

Return type:

int

Returns:

Total number of features in the given ns

pop_feature(ns)

Remove the top feature from a given namespace

Parameters:

ns (Union[NamespaceId, str, int]) – namespace from which feature is popped

Return type:

bool

Returns:

True if a feature was removed; False if there was no feature to pop

pop_namespace()

Remove the top namespace from an example

Return type:

bool

Returns:

True if a namespace was removed; False if there was no namespace to pop

push_feature(ns, feature, v=1.0, ns_hash=None)

Add an unhashed feature to a given namespace

Parameters:
  • ns (Union[NamespaceId, str, int]) – namespace in which the feature is to be pushed

  • feature – feature to be pushed

  • v (float) – The value of the feature, by default is 1.0

  • ns_hash (Optional[int]) – The hash of the namespace. Optional; by default is None

Return type:

None

push_features(ns, featureList)

Push a list of features to a given namespace.

Parameters:
  • ns (Union[NamespaceId, str, int]) – namespace in which the features are pushed

  • featureList (List[Union[Tuple[Union[str, int], float], str, int]]) – Each feature in the list can either be an integer (already hashed) or a string (to be hashed) and may be paired with a value or not (if not, the value is assumed to be 1.0).

Examples

>>> from vowpalwabbit import Workspace
>>> vw = Workspace(quiet=True)
>>> ex = vw.example('1 |a two features |b more features here')
>>> ex.push_features('x', ['a', 'b'])
>>> ex.push_features('y', [('c', 1.), 'd'])
>>> space_hash = vw.hash_space('x')
>>> feat_hash  = vw.hash_feature('a', space_hash)
>>> ex.push_features('x', [feat_hash]) #'x' should match the space_hash!
>>> ex.num_features_in('x')
3
>>> ex.num_features_in('y')
2
push_hashed_feature(ns, f, v=1.0)

Add a hashed feature to a given namespace.

Parameters:
  • ns (Union[NamespaceId, str, int]) – namespace in which the feature is to be pushed

  • f (int) – integer feature

  • v (float) – The value of the feature, by default is 1.0

Return type:

None

push_namespace(ns)

Push a new namespace onto this example. You should only do this if you’re sure that this example doesn’t already have the given namespace.

Parameters:

ns (Union[NamespaceId, str, int]) – namespace which is to be pushed onto example

Return type:

None

set_label_string(string)

Give this example a new label

Parameters:

string (str) – the new label for this example, formatted as a string (as in the VW data file format)

Return type:

None

setup_done: bool
setup_example()

If this example hasn’t already been set up (i.e., quadratic features constructed, etc.), do so.

stride: int
sum_feat_sq(ns)

Get the total sum feature-value squared for a given namespace

Parameters:

ns (Union[NamespaceId, str, int]) – namespace whose squared feature values are to be summed

Return type:

float

Returns:

Total sum feature-value squared of the given ns
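
As a worked illustration of the quantity described here, the sketch below computes the sum of squared feature values for one namespace in plain Python. The (feature id, value) pairs are hypothetical stand-ins for what iter_features() would yield, and the snippet does not call vowpalwabbit.

```python
# Hypothetical (hashed feature id, value) pairs for a single namespace.
pairs = [(101, 1.0), (202, 0.5), (303, 2.0)]

# Sum of squared feature values: 1.0**2 + 0.5**2 + 2.0**2
sum_sq = sum(value * value for _, value in pairs)
print(sum_sq)  # 5.25
```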

unsetup_example()

If this example has been set up, reverse that process so you can continue editing the example.

vw: Workspace
class vowpalwabbit.ExampleNamespace(ex, ns, ns_hash=None)

Bases: object

The ExampleNamespace class is a helper class that allows you to extract namespaces from examples and operate at a namespace level rather than an example level. Mainly this is done to enable indexing like ex['x'][0] to get the 0th feature in namespace 'x' in example ex.

__init__(ex, ns, ns_hash=None)

Construct an ExampleNamespace

Parameters:
  • ex (Example) – example from which the namespace is to be extracted

  • ns (NamespaceId) – Target namespace

  • ns_hash (Optional[int]) – The hash of the namespace

iter_features()

Iterate over all feature/value pairs in this namespace.

Return type:

Iterator[Tuple[int, float]]

num_features_in()

Return the total number of features in this namespace.

Return type:

int

pop_feature()

Remove the top feature from the current namespace; returns True if a feature was removed, returns False if there were no features to pop.

push_feature(feature, v=1.0)

Add an unhashed feature to the current namespace (fails if setup has already run on this example).

Parameters:
  • feature (Union[str, int]) – Feature to be pushed to current namespace

  • v (float) – Feature value, by default is 1.0

push_features(feature_list, feature_list_legacy=None)

Push a list of features to a given namespace.

Parameters:

feature_list (List[Union[Tuple[Union[str, int], float], Union[str, int]]]) – Each feature in the list can either be an integer (already hashed) or a string (to be hashed) and may be paired with a value or not (if not, the value is assumed to be 1.0).

Examples

See vowpalwabbit.Example.push_features() for examples.

This function used to have an ns argument that never did anything and a featureList argument. The ns argument has been removed, so only a feature list should be passed now. The function checks whether the old calling convention was used and issues a warning if so.
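
The compatibility check described above can be sketched in plain Python. This is only an illustration of the pattern, not the library's implementation: push_features_shim is a hypothetical stand-in, and the DeprecationWarning category and message are assumptions.

```python
import warnings
from typing import List, Optional


def push_features_shim(feature_list, feature_list_legacy: Optional[List] = None) -> List:
    """Accept both the new call style and the old (ns, featureList) style."""
    if feature_list_legacy is not None:
        # Old style: the first positional argument was the unused ns.
        warnings.warn(
            "Passing ns is deprecated; pass only the feature list.",
            DeprecationWarning,
            stacklevel=2,
        )
        feature_list = feature_list_legacy
    return list(feature_list)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    new_style = push_features_shim(["a", "b"])       # no warning
    old_style = push_features_shim("x", ["a", "b"])  # warns, ns is ignored

print(new_style == old_style, len(caught))  # True 1
```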

class vowpalwabbit.LabelType(value)

Bases: IntEnum

An enumeration.

CONDITIONAL_CONTEXTUAL_BANDIT = 6
CONTEXTUAL_BANDIT = 4
CONTEXTUAL_BANDIT_EVAL = 9
CONTINUOUS = 8
COST_SENSITIVE = 3
MULTICLASS = 2
MULTILABEL = 10
SIMPLE = 1
SLATES = 7
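
Since several methods deprecate raw integers in favor of LabelType, the sketch below shows how a legacy integer maps onto an enum member. To stay runnable without the package it defines a local mirror of the values listed above; in real code, import LabelType from vowpalwabbit instead.

```python
from enum import IntEnum


class LabelType(IntEnum):
    """Local mirror of the documented values, for illustration only."""
    SIMPLE = 1
    MULTICLASS = 2
    COST_SENSITIVE = 3
    CONTEXTUAL_BANDIT = 4
    CONDITIONAL_CONTEXTUAL_BANDIT = 6
    SLATES = 7
    CONTINUOUS = 8
    CONTEXTUAL_BANDIT_EVAL = 9
    MULTILABEL = 10


legacy_value = 4                      # an integer from older calling code
label_type = LabelType(legacy_value)  # look up the member by value
print(label_type.name)                # CONTEXTUAL_BANDIT

# Integers with no matching member are rejected.
try:
    LabelType(5)
    is_valid = True
except ValueError:
    is_valid = False
print(is_valid)  # False
```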
class vowpalwabbit.MulticlassLabel(label=1, weight=1.0, prediction=1)

Bases: AbstractLabel

Class for multiclass VW label with prediction

__init__(label=1, weight=1.0, prediction=1)
static from_example(ex)

Extract a label from the given example.

Parameters:

ex (Example) – example from which label is to be extracted

class vowpalwabbit.MulticlassProbabilitiesLabel(prediction=None)

Bases: AbstractLabel

Class for multiclass VW label with probabilities

__init__(prediction=None)
static from_example(ex)

Extract a label from the given example.

Parameters:

ex (Example) – example from which label is to be extracted

class vowpalwabbit.MultilabelLabel(labels)

Bases: AbstractLabel

Class for multilabel VW label

__init__(labels)
static from_example(ex)

Extract a label from the given example.

Parameters:

ex (Example) – example from which label is to be extracted

class vowpalwabbit.NamespaceId(ex, id)

Bases: object

The NamespaceId class is simply a wrapper to convert between hash spaces referred to by character (e.g. 'x') versus their index in a particular example. It is mostly used internally; you shouldn’t really need to touch it.

__init__(ex, id)

Given an example and an id, construct a NamespaceId.

Parameters:
  • ex (Example) – example used to create a namespace id

  • id (Union[int, str]) –

    • If int, uses that as an index into this Example’s list of feature groups to get the namespace id character

    • If str, uses the first character as the namespace id character

id: Optional[int]

Index into the list of example feature groups for this given namespace

ns: str

Single character representing the namespace index

ord_ns: int

Integer representation of the ns field
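
Going by the description above, a string id is reduced to its first character, and ord_ns is that character's integer code. The plain-Python sketch below illustrates the consequence; namespace_char is a hypothetical helper and the snippet does not call vowpalwabbit.

```python
def namespace_char(name: str) -> str:
    """Return the single character that identifies a namespace name."""
    return name[0]


ns = namespace_char("xray")
ord_ns = ord(ns)
print(ns, ord_ns)  # x 120

# Names that share a first character map to the same namespace character.
collide = namespace_char("xray") == namespace_char("xylophone")
print(collide)  # True
```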

class vowpalwabbit.PredictionType(value)

Bases: IntEnum

An enumeration.

ACTION_PDF_VALUE = 9
ACTION_PROBS = 3
ACTION_SCORES = 2
ACTIVE_MULTICLASS = 11
DECISION_SCORES = 8
MULTICLASS = 4
MULTICLASSPROBS = 7
MULTILABELS = 5
NOPRED = 12
PDF = 10
PROB = 6
SCALAR = 0
SCALARS = 1
class vowpalwabbit.SimpleLabel(label=0.0, weight=1.0, initial=0.0, prediction=0.0)

Bases: AbstractLabel

Class for simple VW label

__init__(label=0.0, weight=1.0, initial=0.0, prediction=0.0)
static from_example(ex)

Extract a label from the given example.

Parameters:

ex (Example) – example from which label is to be extracted

class vowpalwabbit.SlatesLabel(type=SlatesLabelType.UNSET, weight=1.0, labeled=False, cost=0.0, slot_id=0, probabilities=[])

Bases: AbstractLabel

Class for slates VW label

__init__(type=SlatesLabelType.UNSET, weight=1.0, labeled=False, cost=0.0, slot_id=0, probabilities=[])
static from_example(ex)

Extract a label from the given example.

Parameters:

ex (Example) – example from which label is to be extracted

class vowpalwabbit.SlatesLabelType(value)

Bases: IntEnum

An enumeration.

ACTION = 1
SHARED = 0
SLOT = 2
UNSET = 3
class vowpalwabbit.Workspace(arg_str=None, enable_logging=False, arg_list=None, **kw)

Bases: vw

Workspace exposes most of the library functionality. It wraps the native code. The Workspace Python class should always be used instead of the binding glue class.

__init__(arg_str=None, enable_logging=False, arg_list=None, **kw)

Initialize the Workspace object. arg_str, arg_list and the kwargs will be merged together. Duplicates will result in duplicate values in the command line.

Parameters:
  • arg_str (Optional[str]) – The command line arguments to initialize VW with, for example "--audit". This string is naively split by spaces. To control the splitting behavior, pass a list of strings to arg_list instead.

  • enable_logging (bool) – Enable captured logging. By default is False. This must be True to be able to call get_log()

  • arg_list (Optional[List[str]]) – List of tokens that comprise the command line.

  • **kw – Key/value pairs for the available options. Each pair appends an option to the command line in the form "--key value" or, for a bool that is True, "--key".

Examples

>>> from vowpalwabbit import Workspace
>>> vw1 = Workspace('--audit')
>>> vw2 = Workspace(audit=True, b=24, k=True, c=True, l2=0.001)
>>> vw3 = Workspace("--audit", b=26)
>>> vw4 = Workspace(q=["ab", "ac"])
>>> vw5 = Workspace(arg_list=["--audit", "--interactions", "ab"])
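
The way keyword arguments become command line tokens can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: kwargs_to_args is a hypothetical helper, and the choice of a single dash for one-letter keys and of repeating the flag for list values are assumptions.

```python
from typing import Any, List


def kwargs_to_args(**kw: Any) -> List[str]:
    """Turn key/value pairs into command line tokens (illustration only)."""
    tokens: List[str] = []
    for key, value in kw.items():
        # Assumption: one-letter keys take a single dash, others a double dash.
        flag = ("-" if len(key) == 1 else "--") + key
        # Assumption: a list value repeats the flag once per element.
        for item in value if isinstance(value, list) else [value]:
            if isinstance(item, bool):
                if item:  # a False bool contributes nothing
                    tokens.append(flag)
            else:
                tokens.extend([flag, str(item)])
    return tokens


print(kwargs_to_args(audit=True, b=24, l2=0.001))
# ['--audit', '-b', '24', '--l2', '0.001']
print(kwargs_to_args(q=["ab", "ac"]))
# ['-q', 'ab', '-q', 'ac']

# Why arg_list exists: naive splitting breaks values that contain spaces.
split_tokens = "--data my file.txt".split()
print(split_tokens)  # ['--data', 'my', 'file.txt']
```

With arg_list=["--data", "my file.txt"] the path stays a single token.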
example(stringOrDict=None, labelType=None)

Helper function to create an example object associated with this Workspace instance.

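Parameters:
  • stringOrDict – Content to initialize the example with; see the initStringOrDictOrRawExample parameter of Example

  • labelType – Which label type this example contains; see the labelType parameter of Example
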
Return type:

Example

Returns:

Constructed Example

finish()

Stop VW by calling finish (and, e.g., write weights to disk).

Return type:

None

finish_example(ex)

Every example that is created with parse(), example(), or Example, should be passed to this method when you are finished with them.

This will return them to the Workspace instance to be reused and it will update internal statistics. If you care about statistics of used Examples then you should only use them once before passing them to finish.

Parameters:

ex (Union[Example, List[Example]]) – example or examples to be finished

Return type:

None

finished: bool
get_config(filtered_enabled_reductions_only=True)
get_label_type()
Return type:

LabelType

get_log()

Get all log messages produced. One line per item in the list. To get the complete log including run results, this should be called after finish()

Raises:

Exception – Raised if this function is called when the Workspace was initialized without enable_logging set to True

Return type:

List[str]

Returns:

A list of strings, where each item is a line in the log

get_prediction_type()
Return type:

PredictionType

get_weight(index, offset=0)

Get the weight at a particular position in the (learned) weight vector.

Parameters:
  • index (int) – position in the learned weight vector

  • offset (int) – By default is 0

Returns:

Weight at the given index

Return type:

float

get_weight_from_name(feature_name, namespace_name=' ')

Get the weight based on the feature name and the namespace name.

Parameters:
  • feature_name – The name of the feature

  • namespace_name – The name of the namespace where the feature lives

Returns:

Weight for the given feature and namespace name

Return type:

float

init: bool
init_search_task(search_task, task_data=None)
learn(ec)

Perform an online update

Parameters:

ec (Union[Example, List[Example], str, List[str]]) – Examples on which the model gets updated. If using a single object the learner must be a single line learner. If using a list of objects the learner must be a multiline learner. If passing strings they are parsed using parse() before being learned from. If passing Example objects then they must be given to finish_example() at a later point.

Return type:

None

num_weights()

Get length of weight vector.

Return type:

int

parse(str_ex, labelType=None)

Returns a collection of examples for a multiline example learner or a single example for a single example learner.

Parameters:
  • str_ex (Union[str, List[str]]) – String or list of strings representing examples. If the string is multiline, each line is treated as an example. In a list, each string element is treated as an example.

  • labelType (Union[int, LabelType, None]) – The direct integer value of the LabelType enum can be used or the enum directly. Supplying 0 or None means to use the default label type based on the setup VW learner.

Examples

>>> from vowpalwabbit import Workspace
>>> model = Workspace(quiet=True)
>>> ex = model.parse("0:0.1:0.75 | a:0.5 b:1 c:2")
>>> type(ex)
<class 'vowpalwabbit.pyvw.Example'>
>>> model = Workspace(quiet=True, cb_adf=True)
>>> ex = model.parse(["| a:1 b:0.5", "0:0.1:0.75 | a:0.5 b:1 c:2"])
>>> type(ex)
<class 'list'>
>>> len(ex) # Shows the multiline example is parsed
2
Return type:

Union[Example, List[Example]]

Returns:

Either a single example or list of examples.
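
The two accepted shapes of str_ex can be pictured with a small plain-Python sketch. split_examples is a hypothetical helper that only shows how a multiline string and a list of strings describe the same lines; it does not call vowpalwabbit or reproduce its parser.

```python
from typing import List, Union


def split_examples(str_ex: Union[str, List[str]]) -> List[str]:
    """Return one text line per example (illustration only)."""
    if isinstance(str_ex, str):
        # A multiline string contributes one example per line.
        return str_ex.splitlines()
    # A list contributes one example per element.
    return list(str_ex)


as_string = split_examples("| a:1 b:0.5\n0:0.1:0.75 | a:0.5 b:1 c:2")
as_list = split_examples(["| a:1 b:0.5", "0:0.1:0.75 | a:0.5 b:1 c:2"])
print(as_string == as_list, len(as_list))  # True 2
```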

parser_ran: bool
predict(ec, prediction_type=None)

Make a prediction on the given example or examples.

Parameters:
  • ec (Union[Example, List[Example], str, List[str]]) – Examples to get a prediction for. If using a single object the learner must be a single line learner. If using a list of objects the learner must be a multiline learner. If passing strings they are parsed using parse() before being predicted on. If passing Example objects then they must be given to finish_example() at a later point.

  • prediction_type (Union[int, PredictionType, None]) – If None, use the prediction type of the example object. This is usually what is wanted. To request a specific type, supply a value here.

Returns:

Prediction based on the given example

save(filename)

Save the model to disk.

Return type:

None