11 Commits (cc7fa5bffa7f5c71efb674eb57fd3c88e739d774)

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Your Name | 4b4d860235 | class_update: remove dfm_gen multicore option | 4 years ago |
| Your Name | 9eae486a80 | separated data preprocessing routines | 5 years ago |
| Your Name | a01a53f105 | class_update: added cores parameter for multicore processing of sources when using lemmas | 5 years ago |
| Your Name | d9f936c566 | modelizer: tf-idf application updated, final model now also includes idf values from training set, explicitly setting positive category in binary classification for confusion matrices, minor code fixes | 5 years ago |
| Erik de Vries | 9b0ac775af | class_update: add ver variable to set version for class updated articles | 6 years ago |
| Erik de Vries | 85306007f4 | class_update: added words and clean parameters, in addition to text parameter, to be able to set data preprocessing exactly the same as in the trained model | 6 years ago |
| Erik de Vries | 9f3418ef37 | class_update; dfm_gen; merger: updated functions to accept text parameter for both old style 'lemmas' and new style 'ud' | 6 years ago |
| Erik de Vries | 0e8c127b86 | bulk_writer: fixes for JSON generation and added exception for use of 'tokens' varname | 6 years ago |
| Erik de Vries | 6bb8f9b635 | class_update: added explicit httr::: references | 6 years ago |
| Erik de Vries | f543d658bd | Major overhaul to ES bulk update integration. Added support for both setting and appending to variables | 6 years ago |
| Erik de Vries | d203de0b2a | Updated elasticizer docs, created modelizer and class_update functions | 6 years ago |