vowpalwabbit.pyvw

Python binding for pylibvw class

class vowpalwabbit.pyvw.SearchTask(vw, sch, num_actions)

Search task class

Methods

example(self[, initStringOrDict, labelType]) Create an example. initStringOrDict can specify the example as a VW-formatted string or a dictionary; labelType can specify the desired label type.
learn(self, data_iterator) Train search task by providing an iterator of examples.
predict(self, my_example[, useOracle]) Predict on the example
__init__(self, vw, sch, num_actions)
Parameters:
vw : vw object
sch : search object
num_actions : integer

The number of actions with which the object can be initialized

Returns:
self : SearchTask

See also

pyvw.vw
example(self, initStringOrDict=None, labelType=0)

Create an example. initStringOrDict can specify the example as a VW-formatted string or a dictionary; labelType can specify the desired label type.

Parameters:
initStringOrDict : str/dict

Example in either string or dictionary form

labelType : integer
  • 0 : lDEFAULT
  • 1 : lBINARY
  • 2 : lMULTICLASS
  • 3 : lCOST_SENSITIVE
  • 4 : lCONTEXTUAL_BANDIT
  • 5 : lMAX
  • 6 : lCONDITIONAL_CONTEXTUAL_BANDIT

The integer selects the corresponding label type from the options listed above.

Returns:
out : Example
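
The labelType codes listed above can be kept in a small lookup table when logging or validating arguments. This is a minimal sketch: the integer-to-name pairs are copied from the list above, while LABEL_TYPES and label_type_name are hypothetical names that are not part of pyvw.

```python
# Label-type codes exactly as listed in this reference.
# The dict and the helper below are illustrative, not part of pyvw.
LABEL_TYPES = {
    0: "lDEFAULT",
    1: "lBINARY",
    2: "lMULTICLASS",
    3: "lCOST_SENSITIVE",
    4: "lCONTEXTUAL_BANDIT",
    5: "lMAX",
    6: "lCONDITIONAL_CONTEXTUAL_BANDIT",
}


def label_type_name(code):
    """Return the symbolic name for an integer labelType (hypothetical helper)."""
    if code not in LABEL_TYPES:
        raise ValueError(f"unknown labelType: {code!r}")
    return LABEL_TYPES[code]


print(label_type_name(4))  # prints lCONTEXTUAL_BANDIT
```

Validating the code before passing it on gives an early, readable error instead of a failure deeper in the call.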
learn(self, data_iterator)

Train search task by providing an iterator of examples.

Parameters:
data_iterator : iterable

Consists of examples to be learned

Returns:
self : SearchTask
predict(self, my_example, useOracle=False)

Predict on the example

Parameters:
my_example : Example

example used for prediction

useOracle : bool
Returns:
out : integer

Prediction on the example

class vowpalwabbit.pyvw.abstract_label

An abstract class for a VW label.

Methods

from_example(self, ex) grab a label from a given VW example
__init__(self)
from_example(self, ex)

grab a label from a given VW example

class vowpalwabbit.pyvw.cbandits_label(costs=[], prediction=0)

Bases: vowpalwabbit.pyvw.abstract_label

Class for contextual bandits VW label

Methods

from_example  
__init__(self, costs=[], prediction=0)
from_example(self, ex)
class vowpalwabbit.pyvw.cost_sensitive_label(costs=[], prediction=0)

Bases: vowpalwabbit.pyvw.abstract_label

Class for cost-sensitive VW label

Methods

from_example  
__init__(self, costs=[], prediction=0)
from_example(self, ex)
class vowpalwabbit.pyvw.example(vw, initStringOrDictOrRawExample=None, labelType=0)

Bases: pylibvw.example

The example class is a (non-trivial) wrapper around pylibvw.example. Most of the wrapping makes the interface easier to use (by making the types safer via namespace_id) and adds Python-specific functionality.

Methods

ensure_namespace_exists(self, ns) Check to see if a namespace already exists.
erase_namespace() Remove all the features from a given namespace
feature(self, ns, i) Get the i-th hashed feature id in a given namespace
feature_weight(self, ns, i) Get the value(weight) associated with a given feature id
get_action_scores() Get action scores from example prediction
get_cbandits_class((example)arg1, (int)arg2) Assuming a contextual_bandits label type, get the label for a given pair (i=0..
get_cbandits_cost((example)arg1, (int)arg2) Assuming a contextual_bandits label type, get the cost for a given pair (i=0..
get_cbandits_num_costs() Assuming a contextual_bandits label type, get the total number of label/cost pairs
get_cbandits_partial_prediction(…) Assuming a contextual_bandits label type, get the partial prediction for a given pair (i=0..
get_cbandits_prediction() Assuming a contextual_bandits label type, get the prediction
get_cbandits_probability((example)arg1, …) Assuming a contextual_bandits label type, get the bandits probability for a given pair (i=0..
get_costsensitive_class((example)arg1, (int)arg2) Assuming a cost_sensitive label type, get the label for a given pair (i=0..
get_costsensitive_cost((example)arg1, (int)arg2) Assuming a cost_sensitive label type, get the cost for a given pair (i=0..
get_costsensitive_num_costs() Assuming a cost_sensitive label type, get the total number of label/cost pairs
get_costsensitive_partial_prediction(…) Assuming a cost_sensitive label type, get the partial prediction for a given pair (i=0..
get_costsensitive_prediction() Assuming a cost_sensitive label type, get the prediction
get_costsensitive_wap_value((example)arg1, …) Assuming a cost_sensitive label type, get the weighted-all-pairs recomputed cost for a given pair (i=0..
get_decision_scores() Get decision scores from example prediction
get_example_counter() Returns the counter of total number of examples seen up to and including this one
get_feature_id(self, ns, feature[, ns_hash]) Get the hashed feature id for a given feature in a given namespace.
get_feature_number() Returns the total number of features for this example
get_ft_offset((example)arg1) Returns the feature offset for this example (used, eg, by multiclass classification to bulk offset all features)
get_label(self[, label_class]) Given a known label class (default is simple_label), get the corresponding label structure for this example.
get_loss() Returns the loss associated with this example
get_multiclass_label() Assuming a multiclass label type, get the true label
get_multiclass_prediction() Assuming a multiclass label type, get the prediction
get_multiclass_weight() Assuming a multiclass label type, get the importance weight
get_multilabel_predictions() Get multilabel predictions from example prediction
get_ns(self, id) Construct a namespace_id
get_partial_prediction() Returns the partial prediction associated with this example
get_prob() Get probability from example prediction
get_scalars() Get scalar values from example prediction
get_simplelabel_initial() Assuming a simple_label label type, return the initial (baseline) prediction
get_simplelabel_label((example)arg1) Assuming a simple_label label type, return the corresponding label (class/regression target/etc.)
get_simplelabel_prediction() Assuming a simple_label label type, return the final prediction
get_simplelabel_weight() Assuming a simple_label label type, return the importance weight
get_tag() Returns the tag associated with this example
get_topic_prediction() For LDA models, returns the topic prediction for the topic id given
get_total_sum_feat_sq() The total sum of feature-value squared for this example
get_updated_prediction() Returns the partial prediction as if we had updated it after learning
iter_features(self) Iterate over all feature/value pairs in this example (all namespaces included).
learn(self) Learn on this example (and before learning, automatically call setup_example if the example hasn’t yet been setup).
namespace() Get the namespace id for namespace i (for i = 0..
num_features_in(self, ns) Get the total number of features in a given namespace
num_namespaces() The total number of namespaces associated with this example
pop_feature(self, ns) Remove the top feature from a given namespace
pop_namespace(self) Remove the top namespace from an example
push_feature(self, ns, feature[, v, ns_hash]) Add an unhashed feature to a given namespace
push_feature_dict() Add a (Python) dictionary of namespace/feature-list pairs
push_feature_list() Add a (Python) list of features to a given namespace
push_features(self, ns, featureList) Push a list of features to a given namespace.
push_hashed_feature(self, ns, f[, v]) Add a hashed feature to a given namespace.
push_namespace(self, ns) Push a new namespace onto this example.
set_label_string(self, string) Give this example a new label
set_test_only() Change the test-only bit on an example
setup_example(self) If this example hasn’t already been setup (ie, quadratic features constructed, etc.), do so.
sum_feat_sq(self, ns) Get the total sum feature-value squared for a given namespace
unsetup_example(self) If this example has been setup, reverse that process so you can continue editing the example.
__init__(self, vw, initStringOrDictOrRawExample=None, labelType=0)

Construct a new example from vw.

Parameters:
vw : vw

vw model

initStringOrDictOrRawExample : dict/string/None

If initStringOrDictOrRawExample is None, you get an “empty” example which you can construct by hand (see, eg, example.push_features). If it is a string, the string is parsed into an example as it would be from a VW data file (and “setup_example” is run). If it is a dict, all features in that dictionary are added. Finally, if it is a function, it is called repeatedly until the result is no longer a function (for lazy feature computation). Defaults to None.

labelType : integer
  • 0 : lDEFAULT
  • 1 : lBINARY
  • 2 : lMULTICLASS
  • 3 : lCOST_SENSITIVE
  • 4 : lCONTEXTUAL_BANDIT
  • 5 : lMAX
  • 6 : lCONDITIONAL_CONTEXTUAL_BANDIT

The integer selects the corresponding label type from the options listed above.

Returns:
self : Example

See also

pyvw.vw
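
The lazy-function case described for initStringOrDictOrRawExample can be pictured with a short plain-Python sketch. It only models the "call until it is no longer a function" rule; resolve_lazy and lazy_features are hypothetical names, and the use of callable() is an assumption rather than the library's actual check.

```python
def resolve_lazy(init):
    """Call init repeatedly until the result is no longer callable.

    Hypothetical helper: models the documented behaviour, not pyvw's code.
    """
    while callable(init):
        init = init()
    return init


def lazy_features():
    # Returns another function, which finally returns the feature dict.
    return lambda: {"a": ["two", "features"]}


print(resolve_lazy(lazy_features))        # {'a': ['two', 'features']}
print(resolve_lazy("1 |a two features"))  # 1 |a two features
```

Strings, dicts and None pass through unchanged, so the same entry point can serve both eager and lazy inputs.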
ensure_namespace_exists(self, ns)

Check to see if a namespace already exists.

Parameters:
ns : namespace

If the namespace exists, do nothing. If it doesn’t, add it.

feature(self, ns, i)

Get the i-th hashed feature id in a given namespace

Parameters:
ns : namespace

namespace used to get the feature

i : integer

to get i-th hashed feature id in a given ns. It must range from 0 to self.num_features_in(ns)-1

Returns:
f : integer

i-th hashed feature-id in a given ns

feature_weight(self, ns, i)

Get the value(weight) associated with a given feature id

Parameters:
ns : namespace

namespace used to get the feature id

i : integer

to get the weight of i-th feature in the given ns. It must range from 0 to self.num_features_in(ns)-1

Returns:
out : float

weight(value) of the i-th feature of given ns

get_feature_id(self, ns, feature, ns_hash=None)

Get the hashed feature id for a given feature in a given namespace. feature can either be an integer (already a feature id) or a string, in which case it is hashed.

Parameters:
ns : namespace

namespace used to get the feature

feature : integer/string

If an integer, it is already a feature id; otherwise it will be hashed

ns_hash : Optional, by default is None

The hash of the namespace

Returns:
out : integer

Hashed feature id

Note

If --hash all is on, then get_feature_id(ns, "5") !=
get_feature_id(ns, 5). If you’ve already hashed the namespace,
you can optionally provide that value to avoid re-hashing it.
get_label(self, label_class=simple_label)

Given a known label class (default is simple_label), get the corresponding label structure for this example.

Parameters:
label_class : label classes

Get the label of the example of label_class type, by default is simple_label

get_ns(self, id)

Construct a namespace_id

Parameters:
id : namespace_id/str/integer

id used to create namespace

Returns:
out : namespace_id

namespace_id created from the given id (if id was already a namespace_id, it is returned directly)

iter_features(self)

Iterate over all feature/value pairs in this example (all namespaces included).

learn(self)

Learn on this example (and before learning, automatically call setup_example if the example hasn’t yet been setup).

num_features_in(self, ns)

Get the total number of features in a given namespace

Parameters:
ns : namespace

Get the total features of this namespace

Returns:
num_features : integer

Total number of features in the given ns

pop_feature(self, ns)

Remove the top feature from a given namespace

Parameters:
ns : namespace

namespace from which feature is popped

Returns:
out : bool

True if feature was removed else False as no feature was there to pop

pop_namespace(self)

Remove the top namespace from an example

Returns:
out : bool

True if namespace was removed else False as no namespace was there to pop

push_feature(self, ns, feature, v=1.0, ns_hash=None)

Add an unhashed feature to a given namespace

Parameters:
ns : namespace

namespace in which the feature is to be pushed

feature : integer/str

Feature to be pushed to the namespace

v : float

The value of the feature, by default is 1.0

ns_hash : Optional, by default is None

The hash of the namespace

push_features(self, ns, featureList)

Push a list of features to a given namespace.

Parameters:
ns : namespace

namespace in which the features are pushed

featureList : list

Each feature in the list can either be an integer (already hashed) or a string (to be hashed) and may be paired with a value or not (if not, the value is assumed to be 1.0).

Examples

>>> from vowpalwabbit import pyvw
>>> vw = pyvw.vw(quiet=True)
>>> ex = vw.example('1 |a two features |b more features here')
>>> ex.push_features('x', ['a', 'b'])
>>> ex.push_features('y', [('c', 1.), 'd'])
>>> space_hash = vw.hash_space('x')
>>> feat_hash  = vw.hash_feature('a', space_hash)
>>> ex.push_features('x', [feat_hash]) #'x' should match the space_hash!
>>> ex.num_features_in('x')
3
>>> ex.num_features_in('y')
2
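
The pairing rule for featureList (a bare feature gets the value 1.0, a tuple carries its own value) can be sketched without VW installed. normalize_features is a hypothetical helper written for illustration; it does no hashing and is not how pyvw stores features.

```python
def normalize_features(feature_list, default_value=1.0):
    """Turn a mixed feature list into explicit (feature, value) pairs.

    Hypothetical helper: integers and strings are kept as-is (no hashing),
    and unpaired features receive default_value.
    """
    pairs = []
    for item in feature_list:
        if isinstance(item, tuple):
            feature, value = item
        else:
            feature, value = item, default_value
        pairs.append((feature, float(value)))
    return pairs


print(normalize_features([("c", 1.0), "d", 42]))
# [('c', 1.0), ('d', 1.0), (42, 1.0)]
```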
push_hashed_feature(self, ns, f, v=1.0)

Add a hashed feature to a given namespace.

Parameters:
ns : namespace

namespace in which the feature is to be pushed

f : integer

feature

v : float

The value of the feature, by default is 1.0

push_namespace(self, ns)

Push a new namespace onto this example. You should only do this if you’re sure that this example doesn’t already have the given namespace

Parameters:
ns : namespace

namespace which is to be pushed onto example

set_label_string(self, string)

Give this example a new label

Parameters:
string : str

The new label for this example, formatted as a string (as in the VW data file format)

setup_example(self)

If this example hasn’t already been setup (ie, quadratic features constructed, etc.), do so.

sum_feat_sq(self, ns)

Get the total sum feature-value squared for a given namespace

Parameters:
ns : namespace

Get the total sum feature-value squared of this namespace

Returns:
sum_sq : float

Total sum feature-value squared of the given ns

unsetup_example(self)

If this example has been setup, reverse that process so you can continue editing the example.

class vowpalwabbit.pyvw.example_namespace(ex, ns, ns_hash=None)

The example_namespace class is a helper class that allows you to extract namespaces from examples and operate at a namespace level rather than an example level. Mainly this is done to enable indexing like ex[‘x’][0] to get the 0th feature in namespace ‘x’ in example ex.

Methods

iter_features(self) iterate over all feature/value pairs in this namespace.
num_features_in(self) Return the total number of features in this namespace.
pop_feature(self) Remove the top feature from the current namespace; returns True if a feature was removed, returns False if there were no features to pop.
push_feature(self, feature[, v]) Add an unhashed feature to the current namespace (fails if setup has already run on this example).
push_features(self, ns, featureList) Push a list of features to a given namespace.
__init__(self, ex, ns, ns_hash=None)

Construct an example_namespace

Parameters:
ex : Example

example from which the namespace is to be extracted

ns : namespace_id

Target namespace

ns_hash : Optional, by default is None

The hash of the namespace

Returns:
self : example_namespace
iter_features(self)

iterate over all feature/value pairs in this namespace.

num_features_in(self)

Return the total number of features in this namespace.

pop_feature(self)

Remove the top feature from the current namespace; returns True if a feature was removed, returns False if there were no features to pop.

push_feature(self, feature, v=1.0)

Add an unhashed feature to the current namespace (fails if setup has already run on this example).

Parameters:
feature : integer/str

Feature to be pushed to current namespace

v : float

Feature value, by default is 1.0

push_features(self, ns, featureList)

Push a list of features to a given namespace.

Parameters:
ns : namespace

namespace to which feature list is to be pushed

featureList : list

Each feature in the list can either be an integer (already hashed) or a string (to be hashed) and may be paired with a value or not (if not, the value is assumed to be 1.0).

See example.push_features for examples.
vowpalwabbit.pyvw.get_prediction(ec, prediction_type)

Get specified type of prediction from example

Parameters:
ec : Example
prediction_type : integer
  • 0: pSCALAR
  • 1: pSCALARS
  • 2: pACTION_SCORES
  • 3: pACTION_PROBS
  • 4: pMULTICLASS
  • 5: pMULTILABELS
  • 6: pPROB
  • 7: pMULTICLASSPROBS
  • 8: pDECISION_SCORES
Returns:
out : integer/list

Prediction according to parameter prediction_type

Examples

>>> from vowpalwabbit import pyvw
>>> import pylibvw
>>> vw = pyvw.vw(quiet=True)
>>> ex = vw.example('1 |a two features |b more features here')
>>> pyvw.get_prediction(ex, pylibvw.vw.pSCALAR)
0.0
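
For readable call sites, the prediction_type codes listed above can be mirrored in an enumeration. This is only a stand-in sketch: the names and values are copied from the list in this reference, while the PredictionType class itself is hypothetical and not part of pyvw or pylibvw.

```python
from enum import IntEnum


class PredictionType(IntEnum):
    """Hypothetical mirror of the prediction_type codes listed above."""
    pSCALAR = 0
    pSCALARS = 1
    pACTION_SCORES = 2
    pACTION_PROBS = 3
    pMULTICLASS = 4
    pMULTILABELS = 5
    pPROB = 6
    pMULTICLASSPROBS = 7
    pDECISION_SCORES = 8


print(PredictionType(2).name)     # pACTION_SCORES
print(int(PredictionType.pPROB))  # 6
```

Because IntEnum members compare equal to plain integers, such a value could be passed wherever an integer code is expected.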
class vowpalwabbit.pyvw.multiclass_label(label=1, weight=1.0, prediction=1)

Bases: vowpalwabbit.pyvw.abstract_label

Class for multiclass VW label with prediction

Methods

from_example  
__init__(self, label=1, weight=1.0, prediction=1)
from_example(self, ex)
class vowpalwabbit.pyvw.multiclass_probabilities_label(label, prediction=None)

Bases: vowpalwabbit.pyvw.abstract_label

Class for multiclass VW label with probabilities

Methods

from_example  
__init__(self, label, prediction=None)
from_example(self, ex)
class vowpalwabbit.pyvw.namespace_id(ex, id)

The namespace_id class is simply a wrapper to convert between hash spaces referred to by character (eg ‘x’) versus their index in a particular example. Mostly used internally, you shouldn’t really need to touch this.

__init__(self, ex, id)

Given an example and an id, construct a namespace_id.

Parameters:
ex : Example

example used to create a namespace id

id : integer/str

The id can either be an integer (in which case we take it to be an index into ex.indices[]) or a string (in which case we take the first character as the namespace id).

Returns:
self : namespace_id
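
The two accepted forms of id can be illustrated with a plain-Python sketch. It assumes, purely for illustration, that the example's namespaces are available as a list of one-character strings; resolve_namespace and that list are hypothetical and do not reflect the internal layout of ex.indices[].

```python
def resolve_namespace(indices, id):
    """Map an integer index or a string to a one-character namespace.

    Hypothetical helper. `indices` stands in for the example's namespace
    list and is assumed here to hold one-character strings.
    """
    if isinstance(id, int):
        if not 0 <= id < len(indices):
            raise IndexError(f"namespace index out of range: {id}")
        return indices[id]
    if isinstance(id, str) and id:
        # Only the first character of a string names the namespace.
        return id[0]
    raise TypeError("id must be an integer or a non-empty string")


print(resolve_namespace(["a", "b"], 1))      # b
print(resolve_namespace(["a", "b"], "xyz"))  # x
```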
class vowpalwabbit.pyvw.simple_label(label=0.0, weight=1.0, initial=0.0, prediction=0.0)

Bases: vowpalwabbit.pyvw.abstract_label

Class for simple VW label

Methods

from_example  
__init__(self, label=0.0, weight=1.0, initial=0.0, prediction=0.0)
from_example(self, ex)
class vowpalwabbit.pyvw.vw(arg_str=None, **kw)

Bases: pylibvw.vw

The pyvw.vw object is a (trivial) wrapper around the pylibvw.vw object; you’re probably best off using this directly and ignoring the pylibvw.vw structure entirely.

Methods

audit_example() print example audit information
example(self[, stringOrDict, labelType]) Create an example. stringOrDict can specify the example as a VW-formatted string or a dictionary; labelType can specify the desired label type.
finish(self) stop VW by calling finish (and, eg, write weights to disk)
finish_example(self, ex) Should only be used in conjunction with the parse method
get_arguments() return the arguments after resolving all dependencies
get_id() return the model id
get_label_type() return parse label type
get_prediction_type() return prediction type
get_search_ptr() return a pointer to the search data structure
get_stride() return the internal stride
get_sum_loss() return the total cumulative loss suffered so far
get_weight(self, index[, offset]) Get the weight at a particular position in the (learned) weight vector.
get_weighted_examples() return the total weight of examples so far
hash_feature() given a feature string (arg2) and a hashed namespace (arg3), hash that feature
hash_space() given a namespace (as a string), compute the hash of that namespace
learn(self, ec) Perform an online update
learn_multi() given a list pyvw examples, learn (and predict) on those examples
num_weights(self) Get length of weight vector.
parse(self, str_ex[, labelType]) Returns a collection of examples for a multiline example learner or a single example for a single example learner.
predict(self, ec[, prediction_type]) Just make a prediction on the example
predict_multi() given a list of pyvw examples, predict on those examples
run_parser() parse external data file
save(self, filename) save model to disk
set_weight() set the weight for a particular index
setup_example((vw)arg1, (example)arg2) given an example that you’ve created by hand, prepare it for learning (eg, compute quadratic feature)
unsetup_example() reverse the process of setup, so that you can go back and modify this example
init_search_task  
__init__(self, arg_str=None, **kw)

Initialize the vw object.

Parameters:
arg_str : str

The command-line arguments to initialize VW with, for example “--audit”. Defaults to None.

**kw : Using key/value pairs for different options available
Returns:
self : vw

Examples

>>> from vowpalwabbit import pyvw
>>> vw1 = pyvw.vw('--audit')
>>> vw2 = pyvw.vw(audit=True, b=24, k=True, c=True, l2=0.001)
>>> vw3 = pyvw.vw("--audit", b=26)
>>> vw4 = pyvw.vw(q=["ab", "ac"])
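
One way to picture how keyword options such as audit=True or b=24 relate to a command-line string is the sketch below. The mapping rules (single-letter keys take one dash, longer keys two, True emits a bare flag, False is dropped, a list repeats the flag) are assumptions made for illustration, and kwargs_to_arg_str is a hypothetical helper rather than pyvw's own conversion.

```python
def kwargs_to_arg_str(**kw):
    """Build a VW-style argument string from keyword options.

    Hypothetical helper; the dash and list rules are assumptions.
    """
    parts = []
    for key, value in kw.items():
        prefix = "-" if len(key) == 1 else "--"
        values = value if isinstance(value, list) else [value]
        for v in values:
            if v is True:
                parts.append(prefix + key)          # bare flag
            elif v is False or v is None:
                continue                            # option switched off
            else:
                parts.append(f"{prefix}{key} {v}")  # flag with a value
    return " ".join(parts)


print(kwargs_to_arg_str(audit=True, b=24, l2=0.001))
# --audit -b 24 --l2 0.001
```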
example(self, stringOrDict=None, labelType=0)

Create an example. stringOrDict can specify the example as a VW-formatted string or a dictionary; labelType can specify the desired label type.

Parameters:
stringOrDict : str/dict

Example in either string or dictionary form

labelType : integer
  • 0 : lDEFAULT
  • 1 : lBINARY
  • 2 : lMULTICLASS
  • 3 : lCOST_SENSITIVE
  • 4 : lCONTEXTUAL_BANDIT
  • 5 : lMAX
  • 6 : lCONDITIONAL_CONTEXTUAL_BANDIT

The integer selects the corresponding label type from the options listed above.

Returns:
out : Example
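
To see how the dictionary form relates to the string form, the sketch below renders a namespace-to-features dictionary as a VW-formatted line. dict_to_vw_line is a hypothetical helper for illustration only; the dictionary layout (namespace mapped to a list of features, optionally paired with values) is assumed from the featureList convention described for push_features.

```python
def dict_to_vw_line(features, label=None):
    """Render {namespace: [feature or (feature, value), ...]} as VW text.

    Hypothetical helper; it does not call VW and performs no escaping.
    """
    parts = [] if label is None else [str(label)]
    for ns, feats in features.items():
        tokens = []
        for f in feats:
            if isinstance(f, tuple):
                name, value = f
                tokens.append(f"{name}:{value}")
            else:
                tokens.append(str(f))
        parts.append("|" + ns + " " + " ".join(tokens))
    return " ".join(parts)


line = dict_to_vw_line(
    {"a": ["two", "features"], "b": ["more", ("here", 0.5)]}, label=1
)
print(line)  # 1 |a two features |b more here:0.5
```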
finish(self)

stop VW by calling finish (and, eg, write weights to disk)

finish_example(self, ex)

Should only be used in conjunction with the parse method

Parameters:
ex : Example

example to be finished

get_weight(self, index, offset=0)

Get the weight at a particular position in the (learned) weight vector.

Parameters:
index : integer

position in the learned weight vector

offset : integer

By default is 0

Returns:
weight : float

Weight at the given index

init_search_task(self, search_task, task_data=None)
learn(self, ec)

Perform an online update

Parameters:
ec : example/str/list

examples on which the model gets updated

num_weights(self)

Get length of weight vector.

parse(self, str_ex, labelType=0)

Returns a collection of examples for a multiline example learner or a single example for a single example learner.

Parameters:
str_ex : str/list of str

String representing examples. If the string is multiline, each line is treated as an example. If a list is given, each string element is treated as an example.

labelType : integer
  • 0 : lDEFAULT
  • 1 : lBINARY
  • 2 : lMULTICLASS
  • 3 : lCOST_SENSITIVE
  • 4 : lCONTEXTUAL_BANDIT
  • 5 : lMAX
  • 6 : lCONDITIONAL_CONTEXTUAL_BANDIT

The integer selects the corresponding label type from the options listed above.

Returns:
ec : list

list of examples parsed

Examples

>>> from vowpalwabbit import pyvw
>>> model = pyvw.vw(quiet=True)
>>> ex = model.parse("0:0.1:0.75 | a:0.5 b:1 c:2")
>>> len(ex)
1
>>> model = pyvw.vw(quiet=True, cb_adf=True)
>>> ex = model.parse(["| a:1 b:0.5", "0:0.1:0.75 | a:0.5 b:1 c:2"])
>>> len(ex) # Shows the multiline example is parsed
2
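
The rule for str_ex (one example per line of a multiline string, or one per list element) can be sketched in plain Python. split_examples is a hypothetical helper; whether the library drops blank lines or validates input in the same way is not stated here, so those details are assumptions.

```python
def split_examples(str_ex):
    """Split the parse() input into one string per example.

    Hypothetical helper: models the documented rule only.
    """
    if isinstance(str_ex, str):
        # A multiline string holds one example per line.
        return str_ex.splitlines()
    if isinstance(str_ex, list) and all(isinstance(s, str) for s in str_ex):
        # A list already holds one example per element.
        return list(str_ex)
    raise TypeError("str_ex must be a string or a list of strings")


print(len(split_examples("| a:1 b:0.5\n0:0.1:0.75 | a:0.5 b:1 c:2")))  # 2
print(len(split_examples(["| a:1 b:0.5", "0:0.1:0.75 | a:0.5 b:1 c:2"])))  # 2
```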
predict(self, ec, prediction_type=None)

Just make a prediction on the example

Parameters:
ec : Example/list/str

examples to be predicted

prediction_type : optional, by default is None

If provided, the matching return type is used; otherwise the learner’s prediction type determines the output.

Returns:
prediction : Prediction made on each example
save(self, filename)

save model to disk