vowpalwabbit¶
The core functionality of the package is available in this root module. A small number of advanced usage classes are only available in vowpalwabbit.pyvw.
Example usage¶
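from vowpalwabbit import Workspace, Example
workspace = Workspace(quiet=True)
# Example takes the owning workspace as its first argument
ex = Example(workspace, '1 | a b c')
workspace.learn(ex)
workspace.predict(ex)
# Return the example to the workspace once it is no longer needed
workspace.finish_example(ex)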
Module contents¶
Python interfaces for VW
- class vowpalwabbit.AbstractLabel¶
Bases:
object
An abstract class for a VW label.
- __init__()¶
- class vowpalwabbit.CBContinuousLabel(costs=[])¶
Bases:
AbstractLabel
Class for cb_continuous VW label
- __init__(costs=[])¶
- class vowpalwabbit.CBContinuousLabelElement(action=None, cost=0.0, pdf_value=0.0)¶
Bases:
object
- __init__(action=None, cost=0.0, pdf_value=0.0)¶
- class vowpalwabbit.CBEvalLabel(action, cb_label)¶
Bases:
AbstractLabel
Class for contextual bandits eval VW label
- __init__(action, cb_label)¶
- class vowpalwabbit.CBLabel(costs=[], weight=1.0)¶
Bases:
AbstractLabel
Class for contextual bandits VW label
- __init__(costs=[], weight=1.0)¶
- class vowpalwabbit.CBLabelElement(action=None, cost=0.0, partial_prediction=0.0, probability=0.0, **kwargs)¶
Bases:
object
- __init__(action=None, cost=0.0, partial_prediction=0.0, probability=0.0, **kwargs)¶
- class vowpalwabbit.CCBLabel(type=CCBLabelType.UNSET, explicit_included_actions=[], weight=1, outcome=None)¶
Bases:
AbstractLabel
Class for conditional contextual bandits VW label
- __init__(type=CCBLabelType.UNSET, explicit_included_actions=[], weight=1, outcome=None)¶
- class vowpalwabbit.CCBLabelType(value)¶
Bases:
IntEnum
An enumeration.
- ACTION = 1¶
- SHARED = 0¶
- SLOT = 2¶
- UNSET = 3¶
- class vowpalwabbit.CostSensitiveElement(label, cost=0.0, partial_prediction=0.0, wap_value=0.0)¶
Bases:
object
- __init__(label, cost=0.0, partial_prediction=0.0, wap_value=0.0)¶
- class vowpalwabbit.CostSensitiveLabel(costs=[], prediction=0.0)¶
Bases:
AbstractLabel
Class for cost sensitive VW label
- __init__(costs=[], prediction=0.0)¶
- class vowpalwabbit.Example(vw, initStringOrDictOrRawExample=None, labelType=None)¶
Bases:
example
The example class is a wrapper around pylibvw.example. pylibvw.example should not be used. Most of the wrapping is to make the interface easier to use (by making the types safer via NamespaceId) and also with added python-specific functionality.
- __init__(vw, initStringOrDictOrRawExample=None, labelType=None)¶
Construct a new example from vw.
- Parameters:
vw (Workspace) – Owning workspace of this example object
initStringOrDictOrRawExample (Union[str, Dict[str, List[Union[Tuple[Union[str, int], float], str, int]]], Dict[str, Dict[Union[str, int], float]], Any, example, None]) – Content to initialize the example with.
If None, created as an empty example.
If a string, parsed as a VW example string.
If a pylibvw.example, wraps the native example. This is advanced and should rarely be used.
If a callable object, it is called repeatedly until the result is no longer callable. At that point the result should be another of the supported types.
Deprecated since version 9.0.0: Using a callable object is no longer supported.
If a dict, the keys are the namespace names and the values are the namespace features. Namespace features can be represented as either a list or a dict. When using a list, each item is either a key (i.e., an int or string), in which case the value is assumed to be 1, or a key-value tuple. When using a dict, all features are represented as key-value pairs.
labelType (Union[int, LabelType, None]) – Which label type this example contains. If None (or 0), the label type is inferred from the workspace configuration.
Deprecated since version 9.0.0: Supplying an integer is no longer supported. Use the LabelType enum instead.
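The list and dict namespace forms described above carry the same information. As a rough illustration, the sketch below expands either form into explicit (key, value) pairs, applying the rule that a bare key has the value 1. normalize_features is a hypothetical helper written for this page, not part of vowpalwabbit, and it makes no claim about how the package processes the dict internally.

```python
from typing import Dict, List, Tuple, Union

FeatureKey = Union[str, int]
FeatureList = List[Union[Tuple[FeatureKey, float], FeatureKey]]
FeatureDict = Dict[FeatureKey, float]


def normalize_features(
    features: Union[FeatureList, FeatureDict],
) -> List[Tuple[FeatureKey, float]]:
    """Hypothetical helper: expand namespace features into (key, value) pairs."""
    if isinstance(features, dict):
        # Dict form: every feature is already an explicit key-value pair.
        return [(key, float(value)) for key, value in features.items()]
    pairs: List[Tuple[FeatureKey, float]] = []
    for item in features:
        if isinstance(item, tuple):
            key, value = item
            pairs.append((key, float(value)))
        else:
            # Bare key (str or int): the value is assumed to be 1.
            pairs.append((item, 1.0))
    return pairs


# Namespace name -> list of features (bare keys and key-value tuples mixed).
list_form = {"a": ["two", "features"], "c": [("weighted", 0.25), "plain", 7]}
# Namespace name -> dict of feature -> value.
dict_form = {"b": {"more": 0.5, "here": 2.0}}

normalized_list = {ns: normalize_features(f) for ns, f in list_form.items()}
normalized_dict = {ns: normalize_features(f) for ns, f in dict_form.items()}
```

Dicts shaped like list_form or dict_form are the kind of value that can be passed as initStringOrDictOrRawExample.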
See also
- ensure_namespace_exists(ns)¶
Check to see if a namespace already exists.
- Parameters:
ns (Union[NamespaceId, str, int]) – If the namespace already exists, do nothing. If it doesn’t, add it.
- feature(ns, i)¶
Get the i-th hashed feature id in a given namespace
- feature_weight(ns, i)¶
Get the value (weight) associated with a given feature id
- get_feature_id(ns, feature, ns_hash=None)¶
Get the hashed feature id for a given feature in a given namespace. feature can either be an integer (already a feature id) or a string, in which case it is hashed.
- Parameters:
- Return type:
- Returns:
Hashed feature id
Note
If --hash all is on, then get_feature_id(ns, "5") != get_feature_id(ns, 5). If you’ve already hashed the namespace, you can optionally provide that value as ns_hash to avoid re-hashing it.
- get_label(label_class=None)¶
Get the label object of this example.
- Parameters:
label_class (Union[int, LabelType, Type[AbstractLabel], None]) – If None, self.labelType will be used. If an int, the corresponding LabelType for the label type to be retrieved.
Passing an AbstractLabel subclass or an int is legacy behavior and is deprecated. All new usage of this function should pass a LabelType.
- Return type:
Union[AbstractLabel, SimpleLabel, MulticlassLabel, CostSensitiveLabel, CBLabel, CCBLabel, SlatesLabel, CBContinuousLabel]
- get_ns(id)¶
Construct a NamespaceId
- Parameters:
id (Union[NamespaceId, str, int]) – id used to create the namespace
- Return type:
- Returns:
NamespaceId created using the parameter passed (if id was already a NamespaceId, it is returned directly)
- get_prediction(prediction_type=None)¶
Get prediction object from this example.
- Parameters:
prediction_type (Union[int, PredictionType, None]) – If None, the prediction type of the example’s owning Workspace instance will be used. If an int, the corresponding PredictionType for the prediction type to be retrieved.
Supplying an int is deprecated and will be removed in a future release.
- Return type:
Union[float, List[float], int, List[int], List[List[Tuple[int, float]]], Tuple[int, float], List[Tuple[float, float, float]], Tuple[int, List[int]], None]
- Returns:
Prediction according to the prediction_type parameter:
SCALAR: float
SCALARS: List[float]
ACTION_SCORES: List[float]
ACTION_PROBS: List[float]
MULTICLASS: int
MULTILABELS: List[int]
PROB: float
MULTICLASSPROBS: List[float]
DECISION_SCORES: List[List[Tuple[int, float]]]
ACTION_PDF_VALUE: Tuple[int, float]
PDF: List[Tuple[float, float, float]]
ACTIVE_MULTICLASS: Tuple[int, List[int]]
NOPRED: None
Examples
>>> from vowpalwabbit import Workspace, PredictionType
>>> vw = Workspace(quiet=True)
>>> ex = vw.example('1 |a two features |b more features here')
>>> ex.get_prediction()
0.0
- iter_features()¶
Iterate over all feature/value pairs in this example (all namespaces included).
- learn()¶
Learn on this example (and before learning, automatically call setup_example if the example hasn’t yet been setup).
- num_features_in(ns)¶
Get the total number of features in a given namespace
- Parameters:
ns (Union[NamespaceId, str, int]) – Namespace whose features are counted
- Return type:
- Returns:
Total number of features in the given ns
- pop_feature(ns)¶
Remove the top feature from a given namespace
- Parameters:
ns (Union[NamespaceId, str, int]) – Namespace from which the feature is popped
- Return type:
- Returns:
True if a feature was removed, else False because there was no feature to pop
- pop_namespace()¶
Remove the top namespace from an example
- Return type:
- Returns:
True if a namespace was removed, else False because there was no namespace to pop
- push_feature(ns, feature, v=1.0, ns_hash=None)¶
Add an unhashed feature to a given namespace
- push_features(ns, featureList)¶
Push a list of features to a given namespace.
- Parameters:
ns (Union[NamespaceId, str, int]) – Namespace into which the features are pushed
featureList (List[Union[Tuple[Union[str, int], float], str, int]]) – Each feature in the list can either be an integer (already hashed) or a string (to be hashed) and may be paired with a value or not (if not, the value is assumed to be 1.0).
Examples
>>> from vowpalwabbit import Workspace
>>> vw = Workspace(quiet=True)
>>> ex = vw.example('1 |a two features |b more features here')
>>> ex.push_features('x', ['a', 'b'])
>>> ex.push_features('y', [('c', 1.), 'd'])
>>> space_hash = vw.hash_space('x')
>>> feat_hash = vw.hash_feature('a', space_hash)
>>> ex.push_features('x', [feat_hash]) #'x' should match the space_hash!
>>> ex.num_features_in('x')
3
>>> ex.num_features_in('y')
2
- push_hashed_feature(ns, f, v=1.0)¶
Add a hashed feature to a given namespace.
- push_namespace(ns)¶
Push a new namespace onto this example. You should only do this if you’re sure that this example doesn’t already have the given namespace
- Parameters:
ns (Union[NamespaceId, str, int]) – Namespace to be pushed onto the example
- Return type:
- set_label_string(string)¶
Give this example a new label
- setup_example()¶
If this example hasn’t already been set up (i.e., quadratic features constructed, etc.), do so.
- sum_feat_sq(ns)¶
Get the total sum feature-value squared for a given namespace
- Parameters:
ns (Union[NamespaceId, str, int]) – Namespace whose squared feature values are summed
- Return type:
- Returns:
Total sum feature-value squared of the given ns
- unsetup_example()¶
If this example has been set up, reverse that process so you can continue editing the example.
- class vowpalwabbit.ExampleNamespace(ex, ns, ns_hash=None)¶
Bases:
object
The ExampleNamespace class is a helper class that allows you to extract namespaces from examples and operate at a namespace level rather than an example level. Mainly this is done to enable indexing like ex['x'][0] to get the 0th feature in namespace 'x' in example ex.
- __init__(ex, ns, ns_hash=None)¶
Construct an ExampleNamespace
- Parameters:
ex (Example) – Example from which the namespace is to be extracted
ns (NamespaceId) – Target namespace
- iter_features()¶
Iterate over all feature/value pairs in this namespace.
- pop_feature()¶
Remove the top feature from the current namespace; returns True if a feature was removed, returns False if there were no features to pop.
- push_feature(feature, v=1.0)¶
Add an unhashed feature to the current namespace (fails if setup has already run on this example).
- push_features(feature_list, feature_list_legacy=None)¶
Push a list of features to a given namespace.
- Parameters:
feature_list (List[Union[Tuple[Union[str, int], float], Union[str, int]]]) – Each feature in the list can either be an integer (already hashed) or a string (to be hashed) and may be paired with a value or not (if not, the value is assumed to be 1.0).
Examples
See vowpalwabbit.Example.push_features() for examples.
This function used to have a ns argument that never did anything and a featureList argument. The ns argument has been removed, so only a feature list should be passed now. The function checks if the old way of calling was used and issues a warning.
- class vowpalwabbit.LabelType(value)¶
Bases:
IntEnum
An enumeration.
- CONDITIONAL_CONTEXTUAL_BANDIT = 6¶
- CONTEXTUAL_BANDIT = 4¶
- CONTEXTUAL_BANDIT_EVAL = 9¶
- CONTINUOUS = 8¶
- COST_SENSITIVE = 3¶
- MULTICLASS = 2¶
- MULTILABEL = 10¶
- SIMPLE = 1¶
- SLATES = 7¶
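Because integer label types are deprecated in favor of the enum, it can help to see which member an old integer corresponds to. The sketch below uses a local stand-in, LabelTypeMirror, that copies the values listed above so it runs without the package installed; with the package available, the equivalent lookup would be made on the LabelType class imported from vowpalwabbit. legacy_int_to_member is a hypothetical helper name.

```python
from enum import IntEnum


class LabelTypeMirror(IntEnum):
    """Stand-in copying the LabelType values listed above (not the real class)."""

    SIMPLE = 1
    MULTICLASS = 2
    COST_SENSITIVE = 3
    CONTEXTUAL_BANDIT = 4
    CONDITIONAL_CONTEXTUAL_BANDIT = 6
    SLATES = 7
    CONTINUOUS = 8
    CONTEXTUAL_BANDIT_EVAL = 9
    MULTILABEL = 10


def legacy_int_to_member(value: int) -> LabelTypeMirror:
    """Map a legacy integer to its enum member, rejecting unlisted values."""
    try:
        return LabelTypeMirror(value)
    except ValueError:
        raise ValueError(f"{value} is not a listed label type") from None


member = legacy_int_to_member(4)
```

Note that 0 is not a member: where labelType is accepted, 0 or None means the label type is taken from the workspace configuration.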
- class vowpalwabbit.MulticlassLabel(label=1, weight=1.0, prediction=1)¶
Bases:
AbstractLabel
Class for multiclass VW label with prediction
- __init__(label=1, weight=1.0, prediction=1)¶
- class vowpalwabbit.MulticlassProbabilitiesLabel(prediction=None)¶
Bases:
AbstractLabel
Class for multiclass VW label with probabilities
- __init__(prediction=None)¶
- class vowpalwabbit.MultilabelLabel(labels)¶
Bases:
AbstractLabel
Class for multilabel VW label
- __init__(labels)¶
- class vowpalwabbit.NamespaceId(ex, id)¶
Bases:
object
The NamespaceId class is simply a wrapper to convert between hash spaces referred to by character (eg ‘x’) versus their index in a particular example. Mostly used internally, you shouldn’t really need to touch this.
- __init__(ex, id)¶
Given an example and an id, construct a NamespaceId.
- class vowpalwabbit.PredictionType(value)¶
Bases:
IntEnum
An enumeration.
- ACTION_PDF_VALUE = 9¶
- ACTION_PROBS = 3¶
- ACTION_SCORES = 2¶
- ACTIVE_MULTICLASS = 11¶
- DECISION_SCORES = 8¶
- MULTICLASS = 4¶
- MULTICLASSPROBS = 7¶
- MULTILABELS = 5¶
- NOPRED = 12¶
- PDF = 10¶
- PROB = 6¶
- SCALAR = 0¶
- SCALARS = 1¶
- class vowpalwabbit.SimpleLabel(label=0.0, weight=1.0, initial=0.0, prediction=0.0)¶
Bases:
AbstractLabel
Class for simple VW label
- __init__(label=0.0, weight=1.0, initial=0.0, prediction=0.0)¶
- class vowpalwabbit.SlatesLabel(type=SlatesLabelType.UNSET, weight=1.0, labeled=False, cost=0.0, slot_id=0, probabilities=[])¶
Bases:
AbstractLabel
Class for slates VW label
- __init__(type=SlatesLabelType.UNSET, weight=1.0, labeled=False, cost=0.0, slot_id=0, probabilities=[])¶
- class vowpalwabbit.SlatesLabelType(value)¶
Bases:
IntEnum
An enumeration.
- ACTION = 1¶
- SHARED = 0¶
- SLOT = 2¶
- UNSET = 3¶
- class vowpalwabbit.Workspace(arg_str=None, enable_logging=False, arg_list=None, **kw)¶
Bases:
vw
Workspace exposes most of the library functionality. It wraps the native code. The Workspace Python class should always be used instead of the binding glue class.
- __init__(arg_str=None, enable_logging=False, arg_list=None, **kw)¶
Initialize the Workspace object. arg_str, arg_list and the kwargs will be merged together. Duplicates will result in duplicate values in the command line.
- Parameters:
arg_str (Optional[str]) – The command line arguments to initialize VW with, for example "--audit". This string is naively split by spaces. To control the splitting behavior, pass a list of strings to arg_list instead.
enable_logging (bool) – Enable captured logging. Defaults to False. This must be True to be able to call get_log().
arg_list (Optional[List[str]]) – List of tokens that comprise the command line.
**kw – Key/value pairs for the available options. Each pair appends an option to the command line in the form "--key value" or, for a bool, "--key" if it is True.
Examples
>>> from vowpalwabbit import Workspace
>>> vw1 = Workspace('--audit')
>>> vw2 = Workspace(audit=True, b=24, k=True, c=True, l2=0.001)
>>> vw3 = Workspace("--audit", b=26)
>>> vw4 = Workspace(q=["ab", "ac"])
>>> vw5 = Workspace(arg_list=["--audit", "--interactions", "ab"])
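To make the keyword-argument rule more concrete, here is a rough, package-independent sketch of how key/value pairs could be turned into command line tokens. kwargs_to_tokens is a hypothetical function, not the package's code. Two details are assumptions rather than documented behavior: one-letter keys are given a single dash, and a list value repeats the option once per item (suggested by the q=["ab", "ac"] example). The last line shows why naive splitting of arg_str is fragile for values that contain spaces.

```python
from typing import Any, List


def kwargs_to_tokens(**kw: Any) -> List[str]:
    """Hypothetical illustration of turning keyword options into tokens."""
    tokens: List[str] = []
    for key, value in kw.items():
        # Assumption: one-letter keys take a single dash, longer keys two.
        flag = f"-{key}" if len(key) == 1 else f"--{key}"
        # Assumption: a list value repeats the option once per item.
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, bool):
                # A bool only adds the bare flag, and only when it is True.
                if item:
                    tokens.append(flag)
            else:
                tokens.extend([flag, str(item)])
    return tokens


tokens = kwargs_to_tokens(audit=True, b=24, quiet=False, q=["ab", "ac"])

# Naive whitespace splitting breaks a quoted value that contains a space.
naive = '--data "my file.txt"'.split()
```

Passing arg_list=["--data", "my file.txt"] avoids the splitting problem because each token is supplied explicitly.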
- example(stringOrDict=None, labelType=None)¶
Helper function to create an example object associated with this Workspace instance.
- Parameters:
- Return type:
- Returns:
Constructed Example
- finish_example(ex)¶
Every example that is created with parse(), example(), or Example should be passed to this method when you are finished with it.
This will return it to the Workspace instance to be reused and it will update internal statistics. If you care about statistics of used Examples then you should only use them once before passing them to finish_example().
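One way to make sure an example is always handed back is to wrap its use in try/finally. The sketch below is a minimal pattern under stated assumptions, not the package's recommended idiom: learn_then_predict is a hypothetical helper, and the workspace argument is assumed to behave like a Workspace configured with a single line learner. It relies only on the example(), learn(), predict() and finish_example() methods documented on this page.

```python
from typing import Any


def learn_then_predict(workspace: Any, line: str) -> Any:
    """Learn from one text example, predict on it, and always release it.

    `workspace` is assumed to behave like a Workspace with a single line
    learner; `line` is a VW example string.
    """
    ex = workspace.example(line)
    try:
        workspace.learn(ex)
        return workspace.predict(ex)
    finally:
        # Hand the example back even if learn() or predict() raised.
        workspace.finish_example(ex)
```

With the package installed, this could be called as learn_then_predict(Workspace(quiet=True), '1 | a b c'). Because the example is used twice here (once to learn, once to predict), this pattern is a poor fit when the statistics of used examples matter.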
- get_config(filtered_enabled_reductions_only=True)¶
- get_log()¶
Get all log messages produced. One line per item in the list. To get the complete log including run results, this should be called after finish().
- get_prediction_type()¶
- Return type:
- get_weight(index, offset=0)¶
Get the weight at a particular position in the (learned) weight vector.
- get_weight_from_name(feature_name, namespace_name=' ')¶
Get the weight based on the feature name and the namespace name.
- Parameters:
feature_name – The name of the feature
namespace_name – The name of the namespace where the feature lives
- Returns:
Weight for the given feature and namespace name
- Return type:
- init_search_task(search_task, task_data=None)¶
- learn(ec)¶
Perform an online update
- Parameters:
ec (Union[Example, List[Example], str, List[str]]) – Examples on which the model gets updated. If using a single object the learner must be a single line learner. If using a list of objects the learner must be a multiline learner. If passing strings they are parsed using parse() before being learned from. If passing Example objects then they must be given to finish_example() at a later point.
- Return type:
- parse(str_ex, labelType=None)¶
Returns a collection of examples for a multiline example learner or a single example for a single example learner.
- Parameters:
str_ex (str or list of str) – String or strings representing examples. If the string is multiline then each line is considered as an example. In the case of a list, each string element is considered as an example.
labelType (Union[int, LabelType, None]) – Either a LabelType member or the direct integer value of the LabelType enum. Supplying 0 or None means to use the default label type based on the setup VW learner.
Examples
>>> from vowpalwabbit import Workspace
>>> model = Workspace(quiet=True)
>>> ex = model.parse("0:0.1:0.75 | a:0.5 b:1 c:2")
>>> type(ex)
<class 'vowpalwabbit.pyvw.Example'>
>>> model = Workspace(quiet=True, cb_adf=True)
>>> ex = model.parse(["| a:1 b:0.5", "0:0.1:0.75 | a:0.5 b:1 c:2"])
>>> type(ex)
<class 'list'>
>>> len(ex) # Shows the multiline example is parsed
2
- predict(ec, prediction_type=None)¶
Make a prediction on the example
- Parameters:
ec (Union[Example, List[Example], str, List[str]]) – Examples to get a prediction for. If using a single object the learner must be a single line learner. If using a list of objects the learner must be a multiline learner. If passing strings they are parsed using parse() before being predicted on. If passing Example objects then they must be given to finish_example() at a later point.
prediction_type (Union[int, PredictionType, None]) – If None, use the prediction type of the example object. This is usually what is wanted. To request a specific type a value can be supplied here.
- Returns:
Prediction based on the given example