vowpalwabbit

The core functionality of the package is available in this root module. A small number of advanced usage classes are only available in vowpalwabbit.pyvw.

Example usage

from vowpalwabbit import Workspace, Example
workspace = Workspace(quiet=True)
ex = Example(workspace, '1 | a b c')
workspace.learn(ex)
workspace.predict(ex)

Module contents

Python interfaces for VW

class vowpalwabbit.AbstractLabel

Bases: object

An abstract class for a VW label.

__init__()
static from_example(ex)

Extract a label from the given example.

Parameters

ex (Example) – example from which label is to be extracted

Return type

AbstractLabel

class vowpalwabbit.ActionScore(action, score)

Bases: object

__init__(action, score)
class vowpalwabbit.CBContinuousLabel(costs=[])

Bases: AbstractLabel

Class for cb_continuous VW label

__init__(costs=[])
static from_example(ex)

Extract a label from the given example.

Parameters

ex (Example) – example from which label is to be extracted

class vowpalwabbit.CBContinuousLabelElement(action=None, cost=0.0, pdf_value=0.0)

Bases: object

__init__(action=None, cost=0.0, pdf_value=0.0)
class vowpalwabbit.CBEvalLabel(action, cb_label)

Bases: AbstractLabel

Class for contextual bandits eval VW label

__init__(action, cb_label)
static from_example(ex)

Extract a label from the given example.

Parameters

ex (Example) – example from which label is to be extracted

class vowpalwabbit.CBLabel(costs=[], weight=1.0)

Bases: AbstractLabel

Class for contextual bandits VW label

__init__(costs=[], weight=1.0)
static from_example(ex)

Extract a label from the given example.

Parameters

ex (Example) – example from which label is to be extracted

class vowpalwabbit.CBLabelElement(action=None, cost=0.0, partial_prediction=0.0, probability=0.0, **kwargs)

Bases: object

__init__(action=None, cost=0.0, partial_prediction=0.0, probability=0.0, **kwargs)
class vowpalwabbit.CCBLabel(type=CCBLabelType.UNSET, explicit_included_actions=[], weight=1, outcome=None)

Bases: AbstractLabel

Class for conditional contextual bandits VW label

__init__(type=CCBLabelType.UNSET, explicit_included_actions=[], weight=1, outcome=None)
static from_example(ex)

Extract a label from the given example.

Parameters

ex (Example) – example from which label is to be extracted

class vowpalwabbit.CCBLabelType(value)

Bases: IntEnum

An enumeration.

ACTION = 1
SHARED = 0
SLOT = 2
UNSET = 3
class vowpalwabbit.CCBSlotOutcome(cost, action_probs)

Bases: object

__init__(cost, action_probs)
class vowpalwabbit.CostSensitiveElement(label, cost=0.0, partial_prediction=0.0, wap_value=0.0)

Bases: object

__init__(label, cost=0.0, partial_prediction=0.0, wap_value=0.0)
class vowpalwabbit.CostSensitiveLabel(costs=[], prediction=0.0)

Bases: AbstractLabel

Class for cost sensitive VW label

__init__(costs=[], prediction=0.0)
static from_example(ex)

Extract a label from the given example.

Parameters

ex (Example) – example from which label is to be extracted

class vowpalwabbit.Example(vw, initStringOrDictOrRawExample=None, labelType=None)

Bases: example

The Example class is a wrapper around pylibvw.example; pylibvw.example should not be used directly. Most of the wrapping makes the interface easier to use (by making the types safer via NamespaceId) and adds Python-specific functionality.

__init__(vw, initStringOrDictOrRawExample=None, labelType=None)

Construct a new example from vw.

Parameters
  • vw (Workspace) – Owning workspace of this example object

  • initStringOrDictOrRawExample (Union[str, Dict[str, List[Union[Tuple[Union[str, int], float], str, int]]], Any, example, None]) –

    Content to initialize the example with.

    • If None, created as an empty example

    • If a string, parsed as a VW example string

    • If a pylibvw.example, wraps the native example. This is advanced and should rarely be used.

    • If a callable object, it is called repeatedly until the result is no longer callable. At that point the result should be one of the other supported types.

      Deprecated since version 9.0.0: Using a callable object is no longer supported.

    • If a dict, the keys are the namespace names and the values are lists of features. Each feature can either be an integer (already hashed) or a string (to be hashed) and may be paired with a value or not (if not, the value is assumed to be 1.0).

  • labelType (Union[int, LabelType, None]) –

    Which label type this example contains. If None (or 0), the label type is inferred from the workspace configuration.

    Deprecated since version 9.0.0: Supplying an integer is no longer supported. Use the LabelType enum instead.

See also

Workspace

ensure_namespace_exists(ns)

Check to see if a namespace already exists.

Parameters

ns (Union[NamespaceId, str, int]) – If the namespace exists, do nothing. If it doesn’t, add it.

feature(ns, i)

Get the i-th hashed feature id in a given namespace

Parameters
  • ns (Union[NamespaceId, str, int]) – namespace used to get the feature

  • i (int) – index of the hashed feature id to get in the given ns. It must be in the range 0 to self.num_features_in(ns)-1

Return type

int

Returns

i-th hashed feature-id in a given ns

feature_weight(ns, i)

Get the value(weight) associated with a given feature id

Parameters
  • ns (Union[NamespaceId, str, int]) – namespace used to get the feature id

  • i (int) – index of the feature whose weight is returned in the given ns. It must be in the range 0 to self.num_features_in(ns)-1

Return type

float

Returns

weight(value) of the i-th feature of given ns

finished
get_feature_id(ns, feature, ns_hash=None)

Get the hashed feature id for a given feature in a given namespace. feature can either be an integer (already a feature id) or a string, in which case it is hashed.

Parameters
  • ns (Union[NamespaceId, str, int]) – namespace used to get the feature

  • feature (Union[int, str]) – If an integer, it is already a feature id; otherwise it will be hashed

  • ns_hash (Optional[int]) – The hash of the namespace. Optional.

Return type

int

Returns

Hashed feature id

Note

If --hash all is on, then get_feature_id(ns, "5") != get_feature_id(ns, 5). If you’ve already hashed the namespace, you can optionally provide that value to avoid re-hashing it.

get_label(label_class=None)

Get the label object of this example.

Parameters

label_class (Union[int, LabelType, Type[AbstractLabel], None]) –

  • If None, self.labelType will be used.

  • If an int, it is interpreted as the corresponding LabelType value.

  • Passing an AbstractLabel subclass or an int is legacy behavior and is deprecated. All new usage of this function should pass a LabelType.

Return type

Union[AbstractLabel, SimpleLabel, MulticlassLabel, CostSensitiveLabel, CBLabel, CCBLabel, SlatesLabel, CBContinuousLabel]

get_ns(id)

Construct a NamespaceId

Parameters

id (Union[NamespaceId, str, int]) – id used to create the namespace

Return type

NamespaceId

Returns

NamespaceId created from the given id (if id is already a NamespaceId, it is returned directly)

get_prediction(prediction_type=None)

Get prediction object from this example.

Parameters

prediction_type (Union[int, PredictionType, None]) –

  • If None, the prediction type of the example’s owning Workspace instance will be used.

  • If an int, it is interpreted as the corresponding PredictionType value.

  • Supplying an int is deprecated and will be removed in a future release.

Return type

Union[float, List[float], int, List[int], List[List[Tuple[int, float]]], Tuple[int, float], List[Tuple[float, float, float]], Tuple[int, List[int]], str]

Returns

Prediction according to parameter prediction_type

Examples

>>> from vowpalwabbit import Workspace, PredictionType
>>> vw = Workspace(quiet=True)
>>> ex = vw.example('1 |a two features |b more features here')
>>> ex.get_prediction()
0.0
iter_features()

Iterate over all feature/value pairs in this example (all namespaces included).

Return type

Iterator[Tuple[int, float]]

labelType
learn()

Learn on this example (and before learning, automatically call setup_example if the example hasn’t yet been set up).

num_features_in(ns)

Get the total number of features in a given namespace

Parameters

ns (Union[NamespaceId, str, int]) – namespace whose features are counted

Return type

int

Returns

Total number of features in the given ns

pop_feature(ns)

Remove the top feature from a given namespace

Parameters

ns (Union[NamespaceId, str, int]) – namespace from which feature is popped

Return type

bool

Returns

True if feature was removed else False as no feature was there to pop

pop_namespace()

Remove the top namespace from an example

Return type

bool

Returns

True if namespace was removed else False as no namespace was there to pop

push_feature(ns, feature, v=1.0, ns_hash=None)

Add an unhashed feature to a given namespace

Parameters
  • ns (Union[NamespaceId, str, int]) – namespace in which the feature is to be pushed

  • feature – feature to be pushed

  • v (float) – The value of the feature, by default 1.0

  • ns_hash (Optional[int]) – The hash of the namespace, by default None

Return type

None

push_features(ns, featureList)

Push a list of features to a given namespace.

Parameters
  • ns (Union[NamespaceId, str, int]) – namespace in which the features are pushed

  • featureList (List[Union[Tuple[Union[str, int], float], str, int]]) – Each feature in the list can either be an integer (already hashed) or a string (to be hashed) and may be paired with a value or not (if not, the value is assumed to be 1.0)

Examples

>>> from vowpalwabbit import Workspace
>>> vw = Workspace(quiet=True)
>>> ex = vw.example('1 |a two features |b more features here')
>>> ex.push_features('x', ['a', 'b'])
>>> ex.push_features('y', [('c', 1.), 'd'])
>>> space_hash = vw.hash_space('x')
>>> feat_hash  = vw.hash_feature('a', space_hash)
>>> ex.push_features('x', [feat_hash]) #'x' should match the space_hash!
>>> ex.num_features_in('x')
3
>>> ex.num_features_in('y')
2
push_hashed_feature(ns, f, v=1.0)

Add a hashed feature to a given namespace.

Parameters
  • ns (Union[NamespaceId, str, int]) – namespace in which the feature is to be pushed

  • f (int) – integer feature

  • v (float) – The value of the feature, by default 1.0

Return type

None

push_namespace(ns)

Push a new namespace onto this example. You should only do this if you’re sure that this example doesn’t already have the given namespace.

Parameters

ns (Union[NamespaceId, str, int]) – namespace which is to be pushed onto example

Return type

None

set_label_string(string)

Give this example a new label

Parameters

string (str) – the new label for this example, formatted as a string (as in the VW data file format)

Return type

None

setup_done
setup_example()

If this example hasn’t already been set up (i.e., quadratic features constructed, etc.), do so.

stride
sum_feat_sq(ns)

Get the sum of squared feature values for a given namespace

Parameters

ns (Union[NamespaceId, str, int]) – namespace whose sum of squared feature values is computed

Return type

float

Returns

Sum of squared feature values of the given ns

unsetup_example()

If this example has been set up, reverse that process so you can continue editing the example.

vw
class vowpalwabbit.ExampleNamespace(ex, ns, ns_hash=None)

Bases: object

The ExampleNamespace class is a helper class that allows you to extract namespaces from examples and operate at a namespace level rather than an example level. Mainly this is done to enable indexing like ex['x'][0] to get the 0th feature in namespace 'x' in example ex.

__init__(ex, ns, ns_hash=None)

Construct an ExampleNamespace

Parameters
  • ex (Example) – example from which the namespace is to be extracted

  • ns (NamespaceId) – Target namespace

  • ns_hash (Optional[int]) – The hash of the namespace

iter_features()

Iterate over all feature/value pairs in this namespace.

Return type

Iterator[Tuple[int, float]]

num_features_in()

Return the total number of features in this namespace.

Return type

int

pop_feature()

Remove the top feature from the current namespace; returns True if a feature was removed, returns False if there were no features to pop.

push_feature(feature, v=1.0)

Add an unhashed feature to the current namespace (fails if setup has already run on this example).

Parameters
  • feature (Union[str, int]) – Feature to be pushed to current namespace

  • v (float) – Feature value, by default is 1.0

push_features(feature_list, feature_list_legacy=None)

Push a list of features to a given namespace.

Parameters

feature_list (List[Union[Tuple[Union[str, int], float], Union[str, int]]]) – Each feature in the list can either be an integer (already hashed) or a string (to be hashed) and may be paired with a value or not (if not, the value is assumed to be 1.0).

Examples

See vowpalwabbit.Example.push_features() for examples.

This function used to have an ns argument that never did anything and a featureList argument. The ns argument has been removed, so only a feature list should be passed now. The function checks whether the old calling convention was used and issues a warning if so.

class vowpalwabbit.LabelType(value)

Bases: IntEnum

An enumeration.

CONDITIONAL_CONTEXTUAL_BANDIT = 6
CONTEXTUAL_BANDIT = 4
CONTEXTUAL_BANDIT_EVAL = 9
CONTINUOUS = 8
COST_SENSITIVE = 3
MULTICLASS = 2
MULTILABEL = 10
SIMPLE = 1
SLATES = 7
class vowpalwabbit.MulticlassLabel(label=1, weight=1.0, prediction=1)

Bases: AbstractLabel

Class for multiclass VW label with prediction

__init__(label=1, weight=1.0, prediction=1)
static from_example(ex)

Extract a label from the given example.

Parameters

ex (Example) – example from which label is to be extracted

class vowpalwabbit.MulticlassProbabilitiesLabel(prediction=None)

Bases: AbstractLabel

Class for multiclass VW label with probabilities

__init__(prediction=None)
static from_example(ex)

Extract a label from the given example.

Parameters

ex (Example) – example from which label is to be extracted

class vowpalwabbit.MultilabelLabel(labels)

Bases: AbstractLabel

Class for multilabel VW label

__init__(labels)
static from_example(ex)

Extract a label from the given example.

Parameters

ex (Example) – example from which label is to be extracted

class vowpalwabbit.NamespaceId(ex, id)

Bases: object

The NamespaceId class is simply a wrapper to convert between hash spaces referred to by character (eg ‘x’) versus their index in a particular example. Mostly used internally, you shouldn’t really need to touch this.

__init__(ex, id)

Given an example and an id, construct a NamespaceId.

Parameters
  • ex (Example) – example used to create a namespace id

  • id (Union[int, str]) –

    • If int, uses that as an index into this Example’s list of feature groups to get the namespace id character

    • If str, uses the first character as the namespace id character

id: Optional[int]

Index into the list of example feature groups for this given namespace

ns: str

Single character representing the namespace index

ord_ns: int

Integer representation of the ns field

class vowpalwabbit.PredictionType(value)

Bases: IntEnum

An enumeration.

ACTION_PDF_VALUE = 9
ACTION_PROBS = 3
ACTION_SCORES = 2
ACTIVE_MULTICLASS = 11
DECISION_SCORES = 8
MULTICLASS = 4
MULTICLASSPROBS = 7
MULTILABELS = 5
NOPRED = 12
PDF = 10
PROB = 6
SCALAR = 0
SCALARS = 1
class vowpalwabbit.SimpleLabel(label=0.0, weight=1.0, initial=0.0, prediction=0.0)

Bases: AbstractLabel

Class for simple VW label

__init__(label=0.0, weight=1.0, initial=0.0, prediction=0.0)
static from_example(ex)

Extract a label from the given example.

Parameters

ex (Example) – example from which label is to be extracted

class vowpalwabbit.SlatesLabel(type=SlatesLabelType.UNSET, weight=1.0, labeled=False, cost=0.0, slot_id=0, probabilities=[])

Bases: AbstractLabel

Class for slates VW label

__init__(type=SlatesLabelType.UNSET, weight=1.0, labeled=False, cost=0.0, slot_id=0, probabilities=[])
static from_example(ex)

Extract a label from the given example.

Parameters

ex (Example) – example from which label is to be extracted

class vowpalwabbit.SlatesLabelType(value)

Bases: IntEnum

An enumeration.

ACTION = 1
SHARED = 0
SLOT = 2
UNSET = 3
class vowpalwabbit.Workspace(arg_str=None, enable_logging=False, **kw)

Bases: vw

Workspace exposes most of the library functionality. It wraps the native code. The Workspace Python class should always be used instead of the binding glue class.

__init__(arg_str=None, enable_logging=False, **kw)

Initialize the Workspace object.

Parameters
  • arg_str (str) – The command line arguments to initialize VW with, for example "--audit". By default is None.

  • enable_logging (bool) – Enable captured logging. By default is False. This must be True to be able to call get_log()

  • **kw – Key/value pairs for the available options. Each pair appends an option to the command line in the form "--key value" or, for a bool, "--key" if true.

Examples

>>> from vowpalwabbit import Workspace
>>> vw1 = Workspace('--audit')
>>> vw2 = Workspace(audit=True, b=24, k=True, c=True, l2=0.001)
>>> vw3 = Workspace("--audit", b=26)
>>> vw4 = Workspace(q=["ab", "ac"])
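The keyword-to-command-line rule described for **kw can be sketched in plain Python. The helper kwargs_to_args below is hypothetical and not part of vowpalwabbit; it only mirrors the "--key value" and "--key" forms stated above. Repeating the option once per list element is an assumption, and the real constructor may format some keys differently.

```python
from typing import Any, List


def kwargs_to_args(**kw: Any) -> str:
    """Hypothetical helper: render keyword options as a VW-style argument string."""
    parts: List[str] = []
    for key, val in kw.items():
        # Assumption: a list value repeats the option once per element.
        values = val if isinstance(val, list) else [val]
        for v in values:
            if isinstance(v, bool):
                # A bool becomes a bare flag, emitted only when true.
                if v:
                    parts.append(f"--{key}")
            else:
                parts.append(f"--{key} {v}")
    return " ".join(parts)


print(kwargs_to_args(audit=True, quiet=False, l2=0.001))
# --audit --l2 0.001
```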
example(stringOrDict=None, labelType=None)

Helper function to create an example object associated with this Workspace instance.

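Parameters
  • stringOrDict – Content to initialize the example with; see Example.__init__() for the accepted forms

  • labelType – Which label type this example contains; see Example.__init__()
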
Return type

Example

Returns

Constructed Example

finish()

Stop VW by calling finish (and, e.g., write weights to disk)

Return type

None

finish_example(ex)

Every example that is created with parse(), example(), or Example, should be passed to this method when you are finished with them.

This will return them to the Workspace instance to be reused and it will update internal statistics. If you care about statistics of used Examples then you should only use them once before passing them to finish.

Parameters

ex (Union[Example, List[Example]]) – example or examples to be finished

Return type

None

finished = False
get_config(filtered_enabled_reductions_only=True)
get_label_type()
Return type

LabelType

get_log()

Get all log messages produced. One line per item in the list. To get the complete log including run results, this should be called after finish()

Raises

Exception – Raises an exception if this function is called but the init function was called without setting enable_logging to True

Return type

List[str]

Returns

A list of strings, where each item is a line in the log

get_prediction_type()
Return type

PredictionType

get_weight(index, offset=0)

Get the weight at a particular position in the (learned) weight vector.

Parameters
  • index (int) – position in the learned weight vector

  • offset (int) – By default is 0

Returns

Weight at the given index

Return type

float

get_weight_from_name(feature_name, namespace_name=' ')

Get the weight based on the feature name and the namespace name.

Parameters
  • feature_name – The name of the feature

  • namespace_name – The name of the namespace where the feature lives

Returns

Weight for the given feature and namespace name

Return type

float

init = False
init_search_task(search_task, task_data=None)
learn(ec)

Perform an online update

Parameters

ec (Union[Example, List[Example], str, List[str]]) – Examples on which the model gets updated. If using a single object the learner must be a single line learner. If using a list of objects the learner must be a multiline learner. If passing strings they are parsed using parse() before being learned from. If passing Example objects then they must be given to finish_example() at a later point.

Return type

None

num_weights()

Get length of weight vector.

Return type

int

parse(str_ex, labelType=None)

Returns a collection of examples for a multiline example learner or a single example for a single example learner.

Parameters
  • str_ex (str or list of str) – String or list of strings representing examples. If the string is multiline then each line is considered an example. In the case of a list, each string element is considered an example

  • labelType (Union[int, LabelType, None]) – The direct integer value of the LabelType enum can be used or the enum directly. Supplying 0 or None means to use the default label type based on the setup VW learner.

Examples

>>> from vowpalwabbit import Workspace
>>> model = Workspace(quiet=True)
>>> ex = model.parse("0:0.1:0.75 | a:0.5 b:1 c:2")
>>> type(ex)
<class 'vowpalwabbit.pyvw.Example'>
>>> model = Workspace(quiet=True, cb_adf=True)
>>> ex = model.parse(["| a:1 b:0.5", "0:0.1:0.75 | a:0.5 b:1 c:2"])
>>> type(ex)
<class 'list'>
>>> len(ex) # Shows the multiline example is parsed
2
Return type

Union[Example, List[Example]]

Returns

Either a single example or list of examples.

parser_ran = False
predict(ec, prediction_type=None)

Make a prediction on the example

Parameters
  • ec (Union[Example, List[Example], str, List[str]]) – Examples to get a prediction for. If using a single object the learner must be a single line learner. If using a list of objects the learner must be a multiline learner. If passing strings they are parsed using parse() before being predicted on. If passing Example objects then they must be given to finish_example() at a later point.

  • prediction_type (Union[int, PredictionType, None]) – If None, use the prediction type of the example object. This is usually what is wanted. To request a specific type a value can be supplied here.

Returns

Prediction based on the given example

save(filename)

Save model to disk

Return type

None