Namespaces
namespace	details

Functions
template<typename It >
int	generate_epsilon_greedy (float epsilon, uint32_t top_action, It pmf_first, It pmf_last)
	Experimental: Generates epsilon-greedy style exploration distribution.

template<typename InputIt , typename OutputIt >
int	generate_softmax (float lambda, InputIt scores_first, InputIt scores_last, OutputIt pmf_first, OutputIt pmf_last)
	Generates softmax style exploration distribution.

template<typename InputIt , typename OutputIt >
int	generate_bag (InputIt top_actions_first, InputIt top_actions_last, OutputIt pmf_first, OutputIt pmf_last)
	Generates an exploration distribution according to votes on actions.

template<typename It >
int	enforce_minimum_probability (float uniform_epsilon, bool consider_zero_valued_elements, It pmf_first, It pmf_last)
	Updates the pmf to ensure each action is explored with at least uniform_epsilon/num_actions.

template<typename It >
int	mix_with_uniform (float uniform_epsilon, It pmf_first, It pmf_last)
	Mix original PMF with uniform distribution.

template<typename It >
int	sample_after_normalizing (uint64_t seed, It pmf_first, It pmf_last, uint32_t &chosen_index)
	Sample an index from the provided pmf. If the pmf is not normalized it will be updated in-place.

template<typename It >
int	sample_after_normalizing (const char *seed, It pmf_first, It pmf_last, uint32_t &chosen_index)
	Sample an index from the provided pmf. If the pmf is not normalized it will be updated in-place.

template<typename ActionIt >
int	swap_chosen (ActionIt action_first, ActionIt action_last, uint32_t chosen_index)
	Swap the first value with the chosen index.

template<typename It >
int	sample_pdf (uint64_t *p_seed, It pdf_first, It pdf_last, float &chosen_value, float &pdf_value)
	Sample a continuous value from the provided pdf.

template<typename ActionsIt >
int	swap_chosen (ActionsIt action_first, ActionsIt action_last, uint32_t chosen_index)

Function Documentation

◆ enforce_minimum_probability()

template<typename It >

int VW::explore::enforce_minimum_probability	(	float	uniform_epsilon,
		bool	consider_zero_valued_elements,
		It	pmf_first,
		It	pmf_last
	)

Updates the pmf to ensure each action is explored with at least uniform_epsilon/num_actions.

Template Parameters

It	Iterator type of the pmf. Must be a RandomAccessIterator.

Parameters

uniform_epsilon	The minimum amount of uniform distribution to impose on the pmf.
consider_zero_valued_elements	If true elements with zero probability are updated, otherwise those actions will be unchanged.
pmf_first	Iterator pointing to the beginning of the pmf to be updated by this function.
pmf_last	Iterator pointing to the end of the pmf to be updated by this function.

Returns: int returns 0 on success, otherwise an error code as defined by E_EXPLORATION_*.

◆ generate_bag()

template<typename InputIt , typename OutputIt >

int VW::explore::generate_bag	(	InputIt	top_actions_first,
		InputIt	top_actions_last,
		OutputIt	pmf_first,
		OutputIt	pmf_last
	)

Generates an exploration distribution according to votes on actions.

Template Parameters

InputIt	Iterator type of the input actions. Must be an InputIterator.
OutputIt	Iterator type of the pre-allocated pmf. Must be a RandomAccessIterator.

Parameters

top_actions_first	Iterator pointing to the beginning of the top actions.
top_actions_last	Iterator pointing to the end of the top actions.
pmf_first	Iterator pointing to the pre-allocated beginning of the pmf to be generated by this function.
pmf_last	Iterator pointing to the pre-allocated end of the pmf to be generated by this function.

Returns: int returns 0 on success, otherwise an error code as defined by E_EXPLORATION_*.

◆ generate_epsilon_greedy()

template<typename It >

int VW::explore::generate_epsilon_greedy	(	float	epsilon,
		uint32_t	top_action,
		It	pmf_first,
		It	pmf_last
	)

Experimental: Generates epsilon-greedy style exploration distribution.

Template Parameters

It	Iterator type of the pre-allocated pmf. Must be a RandomAccessIterator.

Parameters

epsilon	Minimum probability used to explore among options. Each action is explored with at least epsilon/num_actions.
top_action	Index of the exploit action. This action will get probability mass of 1-epsilon + (epsilon/num_actions).
pmf_first	Iterator pointing to the pre-allocated beginning of the pmf to be generated by this function.
pmf_last	Iterator pointing to the pre-allocated end of the pmf to be generated by this function.

Returns: int returns 0 on success, otherwise an error code as defined by E_EXPLORATION_*.

◆ generate_softmax()

template<typename InputIt , typename OutputIt >

int VW::explore::generate_softmax	(	float	lambda,
		InputIt	scores_first,
		InputIt	scores_last,
		OutputIt	pmf_first,
		OutputIt	pmf_last
	)

Generates softmax style exploration distribution.

Template Parameters

InputIt	Iterator type of the input scores. Must be an InputIterator.
OutputIt	Iterator type of the pre-allocated pmf. Must be a RandomAccessIterator.

Parameters

lambda	Lambda parameter of softmax.
scores_first	Iterator pointing to the beginning of the scores.
scores_last	Iterator pointing to the end of the scores.
pmf_first	Iterator pointing to the pre-allocated beginning of the pmf to be generated by this function.
pmf_last	Iterator pointing to the pre-allocated end of the pmf to be generated by this function.

Returns: int returns 0 on success, otherwise an error code as defined by E_EXPLORATION_*.
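
The reference text only calls lambda the "Lambda parameter of softmax", so the following sketch assumes the common form in which each probability is proportional to exp(lambda * score). The name sketch_generate_softmax, the return value 1, and the requirement that both ranges have equal length are assumptions of this example. Scores are shifted by a pivot before exponentiation purely for numeric stability; the shift cancels out in the normalization.

```cpp
#include <algorithm>
#include <cmath>
#include <iterator>
#include <vector>

// Hypothetical stand-in for generate_softmax; assumes p_i is proportional to exp(lambda * s_i).
template <typename InputIt, typename OutputIt>
int sketch_generate_softmax(float lambda, InputIt scores_first, InputIt scores_last, OutputIt pmf_first, OutputIt pmf_last)
{
  if (scores_first == scores_last) { return 1; }  // placeholder error code
  if (std::distance(scores_first, scores_last) != (pmf_last - pmf_first)) { return 1; }

  // Pick the pivot so that every exponent is <= 0 and exp() cannot overflow.
  const float pivot = lambda >= 0.f ? *std::max_element(scores_first, scores_last)
                                    : *std::min_element(scores_first, scores_last);

  float norm = 0.f;
  OutputIt d = pmf_first;
  for (InputIt s = scores_first; s != scores_last; ++s, ++d)
  {
    *d = std::exp(lambda * (*s - pivot));
    norm += *d;
  }

  // norm >= 1 because the pivot element contributes exp(0).
  for (d = pmf_first; d != pmf_last; ++d) { *d /= norm; }
  return 0;
}
```

In this form lambda = 0 yields the uniform distribution, and larger positive values concentrate mass on the highest-scoring action.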

◆ mix_with_uniform()

template<typename It >

int VW::explore::mix_with_uniform	(	float	uniform_epsilon,
		It	pmf_first,
		It	pmf_last
	)

Mix original PMF with uniform distribution.

Template Parameters

It	Iterator type of the pmf. Must be a RandomAccessIterator.

Parameters

uniform_epsilon	The minimum amount of uniform distribution to be mixed with the pmf.
pmf_first	Iterator pointing to the beginning of the pmf to be updated.
pmf_last	Iterator pointing to the end of the pmf to be updated.

Returns: int returns 0 on success, otherwise an error code as defined by E_EXPLORATION_*.

◆ sample_after_normalizing() [1/2]

template<typename It >

int VW::explore::sample_after_normalizing	(	const char *	seed,
		It	pmf_first,
		It	pmf_last,
		uint32_t &	chosen_index
	)

Sample an index from the provided pmf. If the pmf is not normalized it will be updated in-place.

Template Parameters

It	Iterator type of the pmf. Must be a RandomAccessIterator.

Parameters

seed	The seed for the pseudo-random generator. Will be hashed using MURMUR hash.
pmf_first	Iterator pointing to the beginning of the pmf.
pmf_last	Iterator pointing to the end of the pmf.
chosen_index	returns the chosen index.

Returns: int returns 0 on success, otherwise an error code as defined by E_EXPLORATION_*.

◆ sample_after_normalizing() [2/2]

template<typename It >

int VW::explore::sample_after_normalizing	(	uint64_t	seed,
		It	pmf_first,
		It	pmf_last,
		uint32_t &	chosen_index
	)

Sample an index from the provided pmf. If the pmf is not normalized it will be updated in-place.

Template Parameters

It	Iterator type of the pmf. Must be a RandomAccessIterator.

Parameters

seed	The seed for the pseudo-random generator.
pmf_first	Iterator pointing to the beginning of the pmf.
pmf_last	Iterator pointing to the end of the pmf.
chosen_index	returns the chosen index.

Returns: int returns 0 on success, otherwise an error code as defined by E_EXPLORATION_*.

◆ sample_pdf()

template<typename It >

int VW::explore::sample_pdf	(	uint64_t *	p_seed,
		It	pdf_first,
		It	pdf_last,
		float &	chosen_value,
		float &	pdf_value
	)

Sample a continuous value from the provided pdf.

Warning: The seed must be sufficiently random for the PRNG to produce uniform random values. Using sequential seeds will result in a very biased distribution. If unsure how to update the seed between calls, merand48 (in random_details.h) can be used to mutate it in place.

Template Parameters

It	Iterator type of the pdf. Must be a RandomAccessIterator.

Parameters

p_seed	The seed for the pseudo-random generator. Will be hashed using MURMUR hash. The seed state will be advanced.
pdf_first	Iterator pointing to the beginning of the pdf.
pdf_last	Iterator pointing to the end of the pdf.
chosen_value	returns the sampled continuous value.
pdf_value	returns the probability density at the sampled location.

Returns: int returns 0 on success, otherwise an error code as defined by E_EXPLORATION_*.

◆ swap_chosen() [1/2]

template<typename ActionIt >

int VW::explore::swap_chosen	(	ActionIt	action_first,
		ActionIt	action_last,
		uint32_t	chosen_index
	)

Swap the first value with the chosen index.

Template Parameters

ActionIt	Iterator type of the action. Must be a ForwardIterator.

Parameters

action_first	Iterator pointing to the beginning of the actions.
action_last	Iterator pointing to the end of the actions.
chosen_index	The index of the element that should be swapped with the first element.

Returns: int returns 0 on success, otherwise an error code as defined by E_EXPLORATION_*.
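
Moving the chosen action to the front of the range amounts to a single element swap. The sketch below shows one way to do that with standard algorithms; the name sketch_swap_chosen, the bounds check, and the return value 1 are hypothetical and not taken from the library.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical stand-in for swap_chosen; not the library code.
template <typename ActionIt>
int sketch_swap_chosen(ActionIt action_first, ActionIt action_last, std::uint32_t chosen_index)
{
  const auto size = std::distance(action_first, action_last);
  if (size <= 0 || chosen_index >= static_cast<std::uint64_t>(size))
  {
    return 1;  // placeholder error code for an empty range or a bad index
  }

  // Swap only when the chosen element is not already in front.
  if (chosen_index != 0) { std::iter_swap(action_first, std::next(action_first, chosen_index)); }
  return 0;
}
```

Applied to {10, 20, 30} with chosen_index = 2, the range becomes {30, 20, 10}; the other elements keep their positions.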

◆ swap_chosen() [2/2]

template<typename ActionsIt >

int VW::explore::swap_chosen	(	ActionsIt	action_first,
		ActionsIt	action_last,
		uint32_t	chosen_index
	)
