Vowpal Wabbit
#include <kskip_ngram_transformer.h>
Public Member Functions
void generate_grams (example *ex)
std::vector< std::string > get_initial_ngram_definitions () const
std::vector< std::string > get_initial_skip_definitions () const
kskip_ngram_transformer (const kskip_ngram_transformer &other)=default
kskip_ngram_transformer & operator= (const kskip_ngram_transformer &other)=default
kskip_ngram_transformer (kskip_ngram_transformer &&other)=default
kskip_ngram_transformer & operator= (kskip_ngram_transformer &&other)=default
Static Public Member Functions
static kskip_ngram_transformer build (const std::vector< std::string > &grams, const std::vector< std::string > &skips, bool quiet)
void VW::kskip_ngram_transformer::generate_grams (example *ex)
This function adds k-skip-n-grams to the feature vector by appending them to it.

Definition of k-skip-n-grams: consider the feature vector a, b, c, d, e, f.
2-skip-2-grams would be: ab, ac, ad, bc, bd, be, cd, ce, cf, de, df, ef
1-skip-3-grams would be: abc, abd, acd, ace, bcd, bce, bde, bdf, cde, cdf, cef, def
As these examples show, up to k features may be skipped between each pair of adjacent elements of a gram.

Note that for an n-gram, the (n-1)-grams, (n-2)-grams, ..., 2-grams are also appended.

The hash is evaluated using the principle h(a, b) = h(a)*X + h(b), where X is a random number. 32 random numbers are maintained in an array and are used in the hashing.
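The definition above can be illustrated with a small standalone sketch. It is not the Vowpal Wabbit implementation: the names `kskip_ngrams`, `extend` and `combine_hash` are hypothetical, features are modeled as plain strings rather than hashed feature indices, and the multiplier `x` is passed in by the caller because the way a multiplier is chosen from the 32-entry array is not described here. The sketch only assumes what the examples show, namely that up to k features may be skipped between adjacent elements of a gram.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: extend a partial gram that currently ends at index
// `last` by `remaining` more tokens, skipping at most `k` tokens per step.
void extend(const std::vector<std::string>& tokens, std::size_t last, std::size_t remaining,
            std::size_t k, const std::string& prefix, std::vector<std::string>& out)
{
  if (remaining == 0)
  {
    out.push_back(prefix);
    return;
  }
  for (std::size_t skip = 0; skip <= k; ++skip)
  {
    const std::size_t next = last + 1 + skip;
    if (next >= tokens.size()) { break; }
    extend(tokens, next, remaining - 1, k, prefix + tokens[next], out);
  }
}

// Hypothetical helper: all grams of exactly length n with at most k skips
// between adjacent elements. Lower-order grams are not included here.
std::vector<std::string> kskip_ngrams(const std::vector<std::string>& tokens, std::size_t n, std::size_t k)
{
  std::vector<std::string> out;
  if (n == 0) { return out; }
  for (std::size_t start = 0; start < tokens.size(); ++start)
  {
    extend(tokens, start, n - 1, k, tokens[start], out);
  }
  return out;
}

// The combining principle h(a, b) = h(a)*X + h(b). Unsigned arithmetic wraps
// on overflow, which is well defined in C++.
std::uint64_t combine_hash(std::uint64_t ha, std::uint64_t hb, std::uint64_t x)
{
  return ha * x + hb;
}
```

Calling `kskip_ngrams` on a, b, c, d, e, f with n = 2, k = 2 and with n = 3, k = 1 reproduces the two lists above, in the same order. To mirror the note about lower-order grams, a caller would invoke it once for each length from 2 up to n.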