vowpalwabbit¶
The core functionality of the package is available in this root module. A small number of advanced usage classes are only available in vowpalwabbit.pyvw.
Example usage¶
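from vowpalwabbit import Workspace, Example
workspace = Workspace(quiet=True)
# Example takes the owning workspace as its first argument
ex = Example(workspace, '1 | a b c')
workspace.learn(ex)
workspace.predict(ex)
# Return the example to the workspace once it is no longer needed
workspace.finish_example(ex)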
Module contents¶
Python interfaces for VW
- class vowpalwabbit.AbstractLabel¶
Bases:
object
An abstract class for a VW label.
- __init__()¶
- class vowpalwabbit.CBContinuousLabel(costs=[])¶
Bases:
AbstractLabel
Class for cb_continuous VW label
- __init__(costs=[])¶
- class vowpalwabbit.CBContinuousLabelElement(action=None, cost=0.0, pdf_value=0.0)¶
Bases:
object
- __init__(action=None, cost=0.0, pdf_value=0.0)¶
- class vowpalwabbit.CBEvalLabel(action, cb_label)¶
Bases:
AbstractLabel
Class for contextual bandits eval VW label
- __init__(action, cb_label)¶
- class vowpalwabbit.CBLabel(costs=[], weight=1.0)¶
Bases:
AbstractLabel
Class for contextual bandits VW label
- __init__(costs=[], weight=1.0)¶
- class vowpalwabbit.CBLabelElement(action=None, cost=0.0, partial_prediction=0.0, probability=0.0, **kwargs)¶
Bases:
object
- __init__(action=None, cost=0.0, partial_prediction=0.0, probability=0.0, **kwargs)¶
- class vowpalwabbit.CCBLabel(type=CCBLabelType.UNSET, explicit_included_actions=[], weight=1, outcome=None)¶
Bases:
AbstractLabel
Class for conditional contextual bandits VW label
- __init__(type=CCBLabelType.UNSET, explicit_included_actions=[], weight=1, outcome=None)¶
- class vowpalwabbit.CCBLabelType(value)¶
Bases:
IntEnum
An enumeration.
- ACTION = 1¶
- SHARED = 0¶
- SLOT = 2¶
- UNSET = 3¶
- class vowpalwabbit.CostSensitiveElement(label, cost=0.0, partial_prediction=0.0, wap_value=0.0)¶
Bases:
object
- __init__(label, cost=0.0, partial_prediction=0.0, wap_value=0.0)¶
- class vowpalwabbit.CostSensitiveLabel(costs=[], prediction=0.0)¶
Bases:
AbstractLabel
Class for cost-sensitive VW label
- __init__(costs=[], prediction=0.0)¶
- class vowpalwabbit.Example(vw, initStringOrDictOrRawExample=None, labelType=None)¶
Bases:
example
The Example class is a wrapper around pylibvw.example, which should not be used directly. Most of the wrapping makes the interface easier to use (by making the types safer via NamespaceId) and adds Python-specific functionality.
- __init__(vw, initStringOrDictOrRawExample=None, labelType=None)¶
Construct a new example from vw.
- Parameters
vw (Workspace) – Owning workspace of this example object
initStringOrDictOrRawExample (Union[str, Dict[str, List[Union[Tuple[Union[str, int], float], str, int]]], Any, example, None]) – Content to initialize the example with.
If None, created as an empty example.
If a string, parsed as a VW example string.
If a pylibvw.example, wraps the native example. This is advanced and should rarely be used.
If a callable object, it is called until the result is no longer callable. At that point it should be another of the supported types.
Deprecated since version 9.0.0: Using a callable object is no longer supported.
If a dict, the keys are the namespace names and the values are lists of features. Each feature can either be an integer (already hashed) or a string (to be hashed) and may be paired with a value or not (if not, the value is assumed to be 1.0).
labelType (Union[int, LabelType, None]) – Which label type this example contains. If None (or 0), the label type is inferred from the workspace configuration.
Deprecated since version 9.0.0: Supplying an integer is no longer supported. Use the LabelType enum instead.
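The dict form described above can be pictured with a small pure-Python sketch. The helper normalize_namespaces is hypothetical and not part of vowpalwabbit; it only mirrors the documented rule that a feature given without a value is assumed to have the value 1.0.

```python
from typing import Dict, List, Tuple, Union

Feature = Union[str, int]
FeatureEntry = Union[Tuple[Feature, float], Feature]


def normalize_namespaces(
    namespaces: Dict[str, List[FeatureEntry]],
) -> Dict[str, List[Tuple[Feature, float]]]:
    """Hypothetical helper: expand every feature to an explicit (feature, value) pair."""
    normalized = {}
    for ns_name, features in namespaces.items():
        pairs = []
        for entry in features:
            if isinstance(entry, tuple):
                feature, value = entry
                pairs.append((feature, float(value)))
            else:
                # A bare str or int carries the documented default value of 1.0.
                pairs.append((entry, 1.0))
        normalized[ns_name] = pairs
    return normalized


example_dict = {"a": ["two", ("features", 0.5)], "b": [42]}
print(normalize_namespaces(example_dict))
```

A dict shaped like example_dict is the kind of value that would be passed as initStringOrDictOrRawExample; the normalization itself is done by the library, so this helper is for illustration only.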
- ensure_namespace_exists(ns)¶
Check whether a namespace already exists, adding it if it does not.
- Parameters
ns (Union[NamespaceId, str, int]) – Namespace to check. If it exists, do nothing. If it doesn’t, add it.
- feature(ns, i)¶
Get the i-th hashed feature id in a given namespace
- feature_weight(ns, i)¶
Get the value (weight) associated with a given feature id
- finished¶
- get_feature_id(ns, feature, ns_hash=None)¶
Get the hashed feature id for a given feature in a given namespace. feature can either be an integer (already a feature id) or a string, in which case it is hashed.
- Parameters
- Return type
- Returns
Hashed feature id
Note
If --hash all is on, then get_feature_id(ns, "5") != get_feature_id(ns, 5). If you’ve already hashed the namespace, you can optionally provide that value to avoid re-hashing it.
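The int-versus-string distinction in get_feature_id can be sketched with a toy model. This is not the library's implementation: toy_feature_id is a hypothetical name, and zlib.crc32 is only a stand-in for VW's real hash function, so the numbers it produces will not match real feature ids.

```python
import zlib


def toy_feature_id(feature, ns_hash=0):
    """Toy model only: zlib.crc32 stands in for VW's real hash function."""
    if isinstance(feature, int):
        # An integer is treated as an already-hashed feature id and returned unchanged.
        return feature
    if isinstance(feature, str):
        # A string is hashed, seeded with the (already computed) namespace hash.
        return zlib.crc32(feature.encode("utf-8"), ns_hash & 0xFFFFFFFF)
    raise TypeError(f"cannot extract feature of type {type(feature).__name__}")


print(toy_feature_id(5))    # the integer passes through
print(toy_feature_id("5"))  # the string is hashed
```

Passing ns_hash explicitly corresponds to the optional ns_hash argument above: the namespace hash is computed once and reused for every feature in that namespace.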
- get_label(label_class=None)¶
Get the label object of this example.
- Parameters
label_class (Union[int, LabelType, Type[AbstractLabel], None]) – If None, self.labelType will be used.
If an int, the corresponding LabelType is used as the label type to be retrieved.
The ability to pass an AbstractLabel subclass or an int is a legacy behavior and is deprecated. All new usage of this function should pass a LabelType.
- Return type
Union[AbstractLabel, SimpleLabel, MulticlassLabel, CostSensitiveLabel, CBLabel, CCBLabel, SlatesLabel, CBContinuousLabel]
- get_ns(id)¶
Construct a NamespaceId
- Parameters
id (Union[NamespaceId, str, int]) – id used to create the namespace
- Return type
- Returns
NamespaceId created from the given id (if id is already a NamespaceId, it is returned directly)
- get_prediction(prediction_type=None)¶
Get prediction object from this example.
- Parameters
prediction_type (Union[int, PredictionType, None]) – If None, the prediction type of the example’s owning Workspace instance will be used.
If an int, the corresponding PredictionType is used as the prediction type to be retrieved.
Supplying an int is deprecated and will be removed in a future release.
- Return type
Union[float, List[float], int, List[int], List[List[Tuple[int, float]]], Tuple[int, float], List[Tuple[float, float, float]], Tuple[int, List[int]], str]
- Returns
Prediction according to parameter prediction_type
- SCALAR: float
- SCALARS: List[float]
- ACTION_SCORES: List[float]
- ACTION_PROBS: List[float]
- MULTICLASS: int
- MULTILABELS: List[int]
- PROB: float
- MULTICLASSPROBS: List[float]
- DECISION_SCORES: List[List[Tuple[int, float]]]
- ACTION_PDF_VALUE: Tuple[int, float]
- PDF: List[Tuple[float, float, float]]
- ACTIVE_MULTICLASS: Tuple[int, List[int]]
- NOPRED: str
Examples
>>> from vowpalwabbit import Workspace, PredictionType
>>> vw = Workspace(quiet=True)
>>> ex = vw.example('1 |a two features |b more features here')
>>> ex.get_prediction()
0.0
- iter_features()¶
Iterate over all feature/value pairs in this example (all namespaces included).
- labelType¶
- learn()¶
Learn on this example (and before learning, automatically call setup_example if the example hasn’t yet been setup).
- num_features_in(ns)¶
Get the total number of features in a given namespace
- Parameters
ns (Union[NamespaceId, str, int]) – Namespace whose features are counted
- Return type
- Returns
Total number of features in the given ns
- pop_feature(ns)¶
Remove the top feature from a given namespace
- Parameters
ns (Union[NamespaceId, str, int]) – Namespace from which the feature is popped
- Return type
- Returns
True if a feature was removed, otherwise False because there was no feature to pop
- pop_namespace()¶
Remove the top namespace from an example
- Return type
- Returns
True if a namespace was removed, otherwise False because there was no namespace to pop
- push_feature(ns, feature, v=1.0, ns_hash=None)¶
Add an unhashed feature to a given namespace
- push_features(ns, featureList)¶
Push a list of features to a given namespace.
- Parameters
ns (Union[NamespaceId, str, int]) – Namespace in which the features are pushed
featureList (List[Union[Tuple[Union[str, int], float], str, int]]) – Each feature in the list can either be an integer (already hashed) or a string (to be hashed) and may be paired with a value or not (if not, the value is assumed to be 1.0).
Examples
>>> from vowpalwabbit import Workspace
>>> vw = Workspace(quiet=True)
>>> ex = vw.example('1 |a two features |b more features here')
>>> ex.push_features('x', ['a', 'b'])
>>> ex.push_features('y', [('c', 1.), 'd'])
>>> space_hash = vw.hash_space('x')
>>> feat_hash = vw.hash_feature('a', space_hash)
>>> ex.push_features('x', [feat_hash]) #'x' should match the space_hash!
>>> ex.num_features_in('x')
3
>>> ex.num_features_in('y')
2
- push_hashed_feature(ns, f, v=1.0)¶
Add a hashed feature to a given namespace.
- push_namespace(ns)¶
Push a new namespace onto this example. You should only do this if you’re sure that this example doesn’t already have the given namespace
- Parameters
ns (Union[NamespaceId, str, int]) – Namespace which is to be pushed onto the example
- Return type
- set_label_string(string)¶
Give this example a new label
- setup_done¶
- setup_example()¶
If this example hasn’t already been set up (i.e., quadratic features constructed, etc.), do so.
- stride¶
- sum_feat_sq(ns)¶
Get the sum of squared feature values for a given namespace
- Parameters
ns (Union[NamespaceId, str, int]) – Namespace over which the squared feature values are summed
- Return type
- Returns
Sum of squared feature values in the given ns
- unsetup_example()¶
If this example has been set up, reverse that process so you can continue editing the example.
- vw¶
- class vowpalwabbit.ExampleNamespace(ex, ns, ns_hash=None)¶
Bases:
object
The ExampleNamespace class is a helper class that allows you to extract namespaces from examples and operate at a namespace level rather than an example level. Mainly this is done to enable indexing like ex['x'][0] to get the 0th feature in namespace 'x' in example ex.
- __init__(ex, ns, ns_hash=None)¶
Construct an ExampleNamespace
- Parameters
ex (Example) – Example from which the namespace is to be extracted
ns (NamespaceId) – Target namespace
- iter_features()¶
Iterate over all feature/value pairs in this namespace.
- pop_feature()¶
Remove the top feature from the current namespace; returns True if a feature was removed, returns False if there were no features to pop.
- push_feature(feature, v=1.0)¶
Add an unhashed feature to the current namespace (fails if setup has already run on this example).
- push_features(feature_list, feature_list_legacy=None)¶
Push a list of features to a given namespace.
- Parameters
feature_list (List[Union[Tuple[Union[str, int], float], Union[str, int]]]) – Each feature in the list can either be an integer (already hashed) or a string (to be hashed) and may be paired with a value or not (if not, the value is assumed to be 1.0).
Examples
See vowpalwabbit.Example.push_features() for examples.
This function used to have a ns argument that never did anything and a featureList argument. The ns argument has been removed, so only a feature list should be passed now. The function checks if the old way of calling was used and issues a warning.
- class vowpalwabbit.LabelType(value)¶
Bases:
IntEnum
An enumeration.
- CONDITIONAL_CONTEXTUAL_BANDIT = 6¶
- CONTEXTUAL_BANDIT = 4¶
- CONTEXTUAL_BANDIT_EVAL = 9¶
- CONTINUOUS = 8¶
- COST_SENSITIVE = 3¶
- MULTICLASS = 2¶
- MULTILABEL = 10¶
- SIMPLE = 1¶
- SLATES = 7¶
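Because supplying a raw integer for labelType is deprecated since 9.0.0, legacy integer codes can be translated to enum members with the standard IntEnum lookup. The sketch below uses a local copy of the values listed above so that it runs without the library; LabelTypeMirror and to_label_type are hypothetical names, and real code would import vowpalwabbit.LabelType instead.

```python
from enum import IntEnum


class LabelTypeMirror(IntEnum):
    """Local copy of the values listed above; real code would import vowpalwabbit.LabelType."""

    SIMPLE = 1
    MULTICLASS = 2
    COST_SENSITIVE = 3
    CONTEXTUAL_BANDIT = 4
    CONDITIONAL_CONTEXTUAL_BANDIT = 6
    SLATES = 7
    CONTINUOUS = 8
    CONTEXTUAL_BANDIT_EVAL = 9
    MULTILABEL = 10


def to_label_type(code):
    """Hypothetical helper: turn a legacy integer code into an enum member.

    0 and None mean "infer from the workspace", so they map to None.
    """
    if code is None or code == 0:
        return None
    return LabelTypeMirror(code)  # raises ValueError for codes not in the enum


print(to_label_type(4).name)
```

Converting at the boundary keeps the rest of the calling code on enum members, which is the form the deprecation notes ask for.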
- class vowpalwabbit.MulticlassLabel(label=1, weight=1.0, prediction=1)¶
Bases:
AbstractLabel
Class for multiclass VW label with prediction
- __init__(label=1, weight=1.0, prediction=1)¶
- class vowpalwabbit.MulticlassProbabilitiesLabel(prediction=None)¶
Bases:
AbstractLabel
Class for multiclass VW label with probabilities
- __init__(prediction=None)¶
- class vowpalwabbit.MultilabelLabel(labels)¶
Bases:
AbstractLabel
Class for multilabel VW label
- __init__(labels)¶
- class vowpalwabbit.NamespaceId(ex, id)¶
Bases:
object
The NamespaceId class is simply a wrapper to convert between hash spaces referred to by character (e.g. ‘x’) versus their index in a particular example. It is mostly used internally, so you shouldn’t really need to touch it.
- __init__(ex, id)¶
Given an example and an id, construct a NamespaceId.
- class vowpalwabbit.PredictionType(value)¶
Bases:
IntEnum
An enumeration.
- ACTION_PDF_VALUE = 9¶
- ACTION_PROBS = 3¶
- ACTION_SCORES = 2¶
- ACTIVE_MULTICLASS = 11¶
- DECISION_SCORES = 8¶
- MULTICLASS = 4¶
- MULTICLASSPROBS = 7¶
- MULTILABELS = 5¶
- NOPRED = 12¶
- PDF = 10¶
- PROB = 6¶
- SCALAR = 0¶
- SCALARS = 1¶
- class vowpalwabbit.SimpleLabel(label=0.0, weight=1.0, initial=0.0, prediction=0.0)¶
Bases:
AbstractLabel
Class for simple VW label
- __init__(label=0.0, weight=1.0, initial=0.0, prediction=0.0)¶
- class vowpalwabbit.SlatesLabel(type=SlatesLabelType.UNSET, weight=1.0, labeled=False, cost=0.0, slot_id=0, probabilities=[])¶
Bases:
AbstractLabel
Class for slates VW label
- __init__(type=SlatesLabelType.UNSET, weight=1.0, labeled=False, cost=0.0, slot_id=0, probabilities=[])¶
- class vowpalwabbit.SlatesLabelType(value)¶
Bases:
IntEnum
An enumeration.
- ACTION = 1¶
- SHARED = 0¶
- SLOT = 2¶
- UNSET = 3¶
- class vowpalwabbit.Workspace(arg_str=None, enable_logging=False, **kw)¶
Bases:
vw
Workspace exposes most of the library functionality. It wraps the native code. The Workspace Python class should always be used instead of the binding glue class.
- __init__(arg_str=None, enable_logging=False, **kw)¶
Initialize the Workspace object.
- Parameters
arg_str (str) – The command line arguments to initialize VW with, for example "--audit". By default is None.
enable_logging (bool) – Enable captured logging. By default is False. This must be True to be able to call get_log().
**kw – Key/value pairs for the different options available. Each pair appends an option to the command line in the form of "--key value", or in the case of a bool "--key" if true.
Examples
>>> from vowpalwabbit import Workspace
>>> vw1 = Workspace('--audit')
>>> vw2 = Workspace(audit=True, b=24, k=True, c=True, l2=0.001)
>>> vw3 = Workspace("--audit", b=26)
>>> vw4 = Workspace(q=["ab", "ac"])
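The mapping from keyword arguments to command line options can be sketched in plain Python. The function kwargs_to_arg_str is hypothetical and is not the library's code. Two details are assumptions rather than documented behavior: one-letter keys are given a single dash, and a list value repeats the option once per element.

```python
def kwargs_to_arg_str(**kw):
    """Hypothetical helper sketching how keyword options could become a command line."""
    parts = []
    for key, value in kw.items():
        # Assumption: one-letter keys use a single dash, longer keys a double dash.
        prefix = "-" if len(key) == 1 else "--"
        # Assumption: a list value repeats the option once per element.
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, bool):
                if item:
                    parts.append(f"{prefix}{key}")  # bool flag, emitted only when True
            else:
                parts.append(f"{prefix}{key} {item}")
    return " ".join(parts)


print(kwargs_to_arg_str(audit=True, b=24, l2=0.001, q=["ab", "ac"]))
```

Under these assumptions the call prints `--audit -b 24 --l2 0.001 -q ab -q ac`, which is the kind of string that could equally be passed as arg_str.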
- example(stringOrDict=None, labelType=None)¶
Helper function to create an example object associated with this Workspace instance.
- Parameters
- Return type
- Returns
Constructed Example
- finish_example(ex)¶
Every example that is created with parse(), example(), or Example should be passed to this method when you are finished with them.
This will return them to the Workspace instance to be reused and it will update internal statistics. If you care about statistics of used Examples then you should only use them once before passing them to finish.
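One way to make sure an example is always handed back is a small context manager. This is a sketch, not part of the library: finishing is a hypothetical helper that relies only on the finish_example() method documented here, and FakeWorkspace is a stand-in so the snippet runs without vowpalwabbit installed.

```python
from contextlib import contextmanager


@contextmanager
def finishing(workspace, ex):
    """Hypothetical helper: always hand `ex` back via workspace.finish_example."""
    try:
        yield ex
    finally:
        # Runs even if learning or predicting raised an exception.
        workspace.finish_example(ex)


class FakeWorkspace:
    """Stand-in used only to make this sketch runnable without the library."""

    def __init__(self):
        self.finished_examples = []

    def finish_example(self, ex):
        self.finished_examples.append(ex)


fake = FakeWorkspace()
with finishing(fake, "1 | a b c") as ex:
    pass  # with a real Workspace: workspace.learn(ex) or workspace.predict(ex)
print(fake.finished_examples)
```

With a real Workspace, ex would be an Example object rather than a string, and the body of the with block would call learn() or predict().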
- finished = False¶
- get_config(filtered_enabled_reductions_only=True)¶
- get_log()¶
Get all log messages produced. One line per item in the list. To get the complete log including run results, this should be called after finish().
- get_prediction_type()¶
- Return type
- get_weight(index, offset=0)¶
Get the weight at a particular position in the (learned) weight vector.
- get_weight_from_name(feature_name, namespace_name=' ')¶
Get the weight based on the feature name and the namespace name.
- Args
feature_name: The name of the feature
namespace_name: The name of the namespace where the feature lives
- Returns
Weight for the given feature and namespace name
- Return type
- init = False¶
- init_search_task(search_task, task_data=None)¶
- learn(ec)¶
Perform an online update
- Parameters
ec (Union[Example, List[Example], str, List[str]]) – Examples on which the model gets updated. If using a single object the learner must be a single line learner. If using a list of objects the learner must be a multiline learner. If passing strings they are parsed using parse() before being learned from. If passing Example objects then they must be given to finish_example() at a later point.
- Return type
- parse(str_ex, labelType=None)¶
Returns a collection of examples for a multiline example learner or a single example for a single example learner.
- Parameters
str_ex (Union[str, List[str]]) – String or list of strings representing examples. If the string is multiline then each line is considered as an example. In the case of a list, each string element is considered as an example.
labelType (Union[int, LabelType, None]) – The direct integer value of the LabelType enum can be used or the enum directly. Supplying 0 or None means to use the default label type based on the setup VW learner.
Examples
>>> from vowpalwabbit import Workspace
>>> model = Workspace(quiet=True)
>>> ex = model.parse("0:0.1:0.75 | a:0.5 b:1 c:2")
>>> type(ex)
<class 'vowpalwabbit.pyvw.Example'>
>>> model = Workspace(quiet=True, cb_adf=True)
>>> ex = model.parse(["| a:1 b:0.5", "0:0.1:0.75 | a:0.5 b:1 c:2"])
>>> type(ex)
<class 'list'>
>>> len(ex) # Shows the multiline example is parsed
2
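The rule that a multiline string and a list of strings both describe one example per line can be sketched as follows. The helper split_examples is hypothetical and only models how the input is divided into lines; skipping blank lines in a multiline string is an assumption, and the real parse() goes on to build Example objects.

```python
def split_examples(str_ex):
    """Hypothetical helper: list the example lines that a parse() input contains."""
    if isinstance(str_ex, list):
        # Each list element is taken as one example.
        return [line.strip() for line in str_ex]
    # Assumption: a multiline string contributes one example per non-empty line.
    return [line.strip() for line in str_ex.splitlines() if line.strip()]


multiline = "| a:1 b:0.5\n0:0.1:0.75 | a:0.5 b:1 c:2"
print(split_examples(multiline))
print(split_examples(["| a:1 b:0.5", "0:0.1:0.75 | a:0.5 b:1 c:2"]))
```

Both calls yield the same two lines, matching the length of 2 reported for the multiline example above.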
- parser_ran = False¶
- predict(ec, prediction_type=None)¶
Make a prediction on the example
- Parameters
ec (Union[Example, List[Example], str, List[str]]) – Examples from which to get a prediction. If using a single object the learner must be a single line learner. If using a list of objects the learner must be a multiline learner. If passing strings they are parsed using parse() before being predicted on. If passing Example objects then they must be given to finish_example() at a later point.
prediction_type (Union[int, PredictionType, None]) – If None, use the prediction type of the example object. This is usually what is wanted. To request a specific type a value can be supplied here.
- Returns
Prediction based on the given example