Erik de Vries
889e7e92af
lemma_writer: updated to support writing raw documents to individual files using UTF-8 encoding
5 years ago
Erik de Vries
6f5ace8c52
actor_fetcher: elasticizer batch function to fetch actorsDetail fields from all relevant documents
5 years ago
Erik de Vries
7a01a7f18d
query_gen_actors: temporary update to fix broken functionality
6 years ago
Erik de Vries
e3b26c0be3
actor_aggregation: Added function to generate aggregate actor measures at the daily, weekly, monthly and yearly levels
...
query_string: Added default_operator parameter to define whether whitespace is interpreted as AND or OR (defaults to AND)
6 years ago
Erik de Vries
a1b6c6a7cb
actorizer, query_gen_actors: revamped actor searches entirely
...
elasticizer: updated script for use with ES 7.x
6 years ago
Erik de Vries
39005c7518
elasticizer: Updated bulk size to 1024 (a power of 2) and set a timeout of 900s every 500000 updates
...
query_gen_actors: Added an additional generator for the "Institution" type (for EU support)
actorizer: Created an updater function to search for actors and use UDPipe to parse the results
6 years ago
Erik de Vries
061da17c2a
ud_update: Added function to lemmatize documents
6 years ago
Erik de Vries
11d8b31c60
Added generic actor search query generator. Updated elasticizer and elastic_update to connect to either the remote server or a local ES instance
6 years ago
Erik de Vries
db418d7396
Add query_string function for generating query_string queries
6 years ago
Erik de Vries
d203de0b2a
Updated elasticizer docs, created modelizer and class_update functions
6 years ago
Erik de Vries
c815dc7f2b
Duplicate detection first commit
6 years ago
Erik de Vries
4bbe84ab83
First release of mamlr package
6 years ago