121 Commits (176a8f6de46325ed079d017a7c361e82a06244d9)

Author SHA1 Message Date
Erik de Vries 176a8f6de4 elasticizer: added additional verbosity on errors
5 years ago
Erik de Vries d420b02c20 elasticizer: Added more verbosity to investigate error handling
5 years ago
Erik de Vries 48b589dda0 query_gen_actors: reset to original state
5 years ago
Erik de Vries 7a01a7f18d query_gen_actors: temporary update for fixing broken shit
5 years ago
Erik de Vries 45da9dd929 aggregator_elastic: revert to single-core lapply, due to sendMaster errors
5 years ago
Erik de Vries f8e4111e70 aggregator_elastic: correct partyid implementation
5 years ago
Erik de Vries c047a4a1db aggregator_elastic: explicit reference to aggregator function
5 years ago
Erik de Vries 0d81d6fc7a added aggregator and aggregator_elastic functions for aggregating and storing article level actor aggregations
5 years ago
Erik de Vries 2281d11a68 actor_aggregation: fixed filenaming of .Rds files
5 years ago
Erik de Vries d9f28a46d8 actor_aggregation: small fixes to code
5 years ago
Erik de Vries a29d04dacd actorizer: fixed handling of empty results due to regex filtering
5 years ago
Erik de Vries 8e920f5f37 elasticizer: removed idiotic 15min sleep time after 500 batches
5 years ago
Erik de Vries a11d7728ea actor_aggregation: only aggregate scores on non-junk articles
5 years ago
Erik de Vries 54a70c47a0 actor_aggregation: removed timeout for parallel processing, requires fix in elasticizer (cannot recycle the same connection)
5 years ago
Erik de Vries 58fce4d560 actor_aggregation: added randomized short sleep, to allow for parallel execution
5 years ago
Erik de Vries e3b26c0be3 actor_aggregation: Added function to generate aggregate actor measures at daily, weekly, monthly and yearly level
5 years ago
Erik de Vries 28989f2bc4 dfm_gen: yet another fix for codes
5 years ago
Erik de Vries 0757b6bf8b dfm_gen: re-added codes variable
5 years ago
Erik de Vries 2fc48cc2f7 dfm_gen: fixed absence of out$codes field
5 years ago
Erik de Vries b249ff22de dfm_gen.R: fixed junk mutation
5 years ago
Erik de Vries 0d05765ca7 dfm_gen: removed last remains of summer sample exceptions
5 years ago
Erik de Vries e199b23227 dfm_gen: removed exceptions for NO summer codes
5 years ago
Erik de Vries fbd525dc2e modelizer: updated outer cross validation procedure to output raw prediction and true values, instead of processed and aggregated confusion matrix results
5 years ago
Erik de Vries 6a94bc3ed8 query_gen_actors: removed quotation marks from Minister search part
5 years ago
Erik de Vries 8d19333e59 query_gen_actors: changed script order for belgium exceptions
5 years ago
Erik de Vries 3bfe61e425 query_gen_actors: fixed implementation of Belgian exceptions
5 years ago
Erik de Vries 81697345cb modelizer: removed breaking code
5 years ago
Erik de Vries 9ca952ca89 elastic_update: removed wait_for from url
5 years ago
Erik de Vries 8051a81b66 actorizer, dfm_gen, modelizer, out_parser: replaced all instances of detectCores by cores parameter (which defaults to detectCores)
5 years ago
Erik de Vries ac37d836f5 elasticizer: added scroll_clear to null hits as well
5 years ago
Erik de Vries 75623856f7 elasticizer: updated scroll_clear to use conn object
5 years ago
Erik de Vries c2d666c81d bogus commit
5 years ago
Erik de Vries e34460bf0f elasticizer: clear scroll context when finishing query
5 years ago
Erik de Vries 9bd526fee0 elasticizer: fixed compatibility issues with elastic v1.0.0
5 years ago
Erik de Vries f2312f65d5 elasticizer: update to account for syntax change in newer package versions
5 years ago
Erik de Vries f6006eb9ba actorizer: simplified pre/postfix check, only for NA, replace empty strings by NA beforehand
5 years ago
Erik de Vries 298099a4e6 actorizer: fix to deal with empty updates (i.e. don't do an update)
5 years ago
Erik de Vries 6961c0b866 query_gen_actors: updated actorid filter to use the keyword subfield
5 years ago
Erik de Vries 703b5e59a4 actorizer: fixed exceptionizer by adding whitespace before and after sentence, which is necessary because of negative regex (match anything before or after the highlight string that is NOT x actually requires something to be in front or after)
5 years ago
Erik de Vries 593d2de6e2 actorizer: add pre_tags and post_tags to argument list
5 years ago
Erik de Vries a1b6c6a7cb actorizer, query_gen_actors: revamped actor searches entirely
5 years ago
Erik de Vries 88fc4ec53c dfm_gen: changed out_parser call to mamlr:::out_parser
6 years ago
Erik de Vries 90fdbcc982 out_parser: parallelized when not in windoze
6 years ago
Erik de Vries 6414f759bd actorizer: parallelized calculation of marker positions
6 years ago
Erik de Vries 522c872dba out_parser: moved cleaning regex to end of pipeline, to prevent collisions with other (mandatory) regex cleaning
6 years ago
Erik de Vries 5b9793cd8c actorizer: removed nested mclapply
6 years ago
Erik de Vries 1a4ba19546 actorizer: Removed udmodel dependencies, commented code, changed nested lists to flat lists
6 years ago
Erik de Vries 3abc3056e0 actorizer: fix to columns selected for actors variable, removed udmodel requirement
6 years ago
Erik de Vries 41c86ea116 actorizer, ud_update: Updated ud parsing and actorizer to work based on character positions. This code is used for local testing
6 years ago
Erik de Vries eae1a22609 actorizer: update to use '|||' as highlight indicator, and set up ud output merging accordingly
6 years ago