15 Commits (2944039f73a2d0e22663e9641205a8eee6b59307)

Author SHA1 Message Date
Your Name 4fd9222a2d lemma_writer: updated to write metadata csv when dumping documents in ud format
4 years ago
Your Name 3cdb68b196 out_parser: updated fncols function
4 years ago
Your Name 77eb51a1bf actorizer: totally revamped way of finding actors
4 years ago
Your Name 5d99ec9509 elasticizer: added option to dump data frames to rds files
4 years ago
Your Name a3b6e19646 revised modeling pipeline:
5 years ago
Erik de Vries 060751237b actorizer, out_parser: switched from mclapply to future_lapply and removed windows-specific code from out_parser
5 years ago
Erik de Vries 8051a81b66 actorizer, dfm_gen, modelizer, out_parser: replaced all instances of detectCores by cores parameter (which defaults to detectCores)
5 years ago
Erik de Vries 90fdbcc982 out_parser: parallelized when not on Windows
6 years ago
Erik de Vries 522c872dba out_parser: moved cleaning regex to end of pipeline, to prevent collisions with other (mandatory) regex cleaning
6 years ago
Erik de Vries e70b6ccf7a actorizer: fixed sentence_count and out_parser calls
6 years ago
Erik de Vries ce5f812252 dfm_gen, merger: Added option for generating lemma_upos hybrids for merged field
6 years ago
Erik de Vries 1955692346 dfm_gen, out_parser: updated documentation
6 years ago
Erik de Vries 34531b0da8 out_parser: added option to clean output using regex to remove numbers and non-words
6 years ago
Erik de Vries d0e9bf565b dupe_detect: Reset the _delete value to 1
6 years ago
Erik de Vries 0a3bdb630b actorizer, dfm_gen, ud_update: unified output parsing from _source and highlight fields into a single function (out_parser)
6 years ago