Erik de Vries
54dfb6a8ca
actorizer: major fix to ud parsing, changed regex to remove html tags to only include tags with a maximum of 20 characters in them
ud_update: major fix to ud parsing, changed regex to remove html tags to only include tags with a maximum of 20 characters in them
elastic_update: set the minimum break between retries from 10 to 30 seconds
elasticizer: implementation of retries for elasticizer function, 10 retries with a break of 30 seconds in between
6 years ago
| File | Last commit | Updated |
| --- | --- | --- |
| actorizer.R | actorizer: major fix to ud parsing, changed regex to remove html tags to only include tags with a maximum of 20 characters in them | 6 years ago |
| bulk_writer.R | changed udpipe output variable from tokens to ud | 6 years ago |
| class_update.R | bulk_writer: fixes for JSON generation and added exception for use of 'tokens' varname | 6 years ago |
| dfm_gen.R | dfm_gen: word cutoff now as final step in script, caused bugs with mutating code variables | 6 years ago |
| dupe_detect.R | Fixed dupe_detect error on documents with one sentence or less, and a maximum # of words in dfm_gen | 6 years ago |
| elastic_update.R | actorizer: major fix to ud parsing, changed regex to remove html tags to only include tags with a maximum of 20 characters in them | 6 years ago |
| elasticizer.R | actorizer: major fix to ud parsing, changed regex to remove html tags to only include tags with a maximum of 20 characters in them | 6 years ago |
| merger.R | dfm_gen & merger: Changed word cutoff point to be a general setting in dfm_gen. Cuts off at the last [.?!] before the cutoff point (so returns documents at a sentence, shorter than cutoff). | 6 years ago |
| modelizer.R | modelizer: fixed error when only one class is predicted for junk classification (borderline case) | 6 years ago |
| query_gen_actors.R | elasticizer: Updated bulk size to 1024 (a power of 2) and set a timeout of 900s every 500000 updates | 6 years ago |
| query_string.R | Add query_string function for generating query_string queries | 6 years ago |
| ud_update.R | actorizer: major fix to ud parsing, changed regex to remove html tags to only include tags with a maximum of 20 characters in them | 6 years ago |