Erik de Vries | 522c872dba | out_parser: moved cleaning regex to end of pipeline, to prevent collisions with other (mandatory) regex cleaning | 6 years ago
actorizer.R | actorizer: removed nested mclapply | 6 years ago
bulk_writer.R | actorizer: Removed udmodel dependencies, commented code, changed nested lists to flat lists | 6 years ago
class_update.R | class_update: add ver variable to set version for class updated articles | 6 years ago
dfm_gen.R | dfm_gen, merger: Added option for generating lemma_upos hybrids for merged field | 6 years ago
dupe_detect.R | dupe_detect: fixed error on no duplicates | 6 years ago
elastic_update.R | actorizer: Removed udmodel dependencies, commented code, changed nested lists to flat lists | 6 years ago
elasticizer.R | elasticizer: renamed size parameter to batch_size, created max_batch parameter to limit the number of results returned | 6 years ago
lemma_writer.R | lemma_writer: new function to write raw lemmas (without punctuation) to a text file. Structured as an elasticizer update function (despite not updating anything on the server) | 6 years ago
merger.R | merger: idiotic fix for a non-problem, see comment on line 32 | 6 years ago
modelizer.R | actorizer, ud_update: Updated merging of document fields to properly deal with missing punctuation at the end of fields (e.g. a title without punctuation at the end of the string) | 6 years ago
out_parser.R | out_parser: moved cleaning regex to end of pipeline, to prevent collisions with other (mandatory) regex cleaning | 6 years ago
query_gen_actors.R | elasticizer: Updated bulk size to 1024 (a power of 2) and set a timeout of 900s every 500000 updates | 6 years ago
query_string.R | query_string: updated check for fields value | 6 years ago
ud_update.R | actorizer: Removed udmodel dependencies, commented code, changed nested lists to flat lists | 6 years ago