Erik de Vries
|
9e433ecf9e
|
actor_fetcher: added handling of exception where all actorsids related to a party are individual actors
|
5 years ago |
Erik de Vries
|
526270900c
|
actor_fetcher: integrated party merging into actor_fetcher in what hopefully is the most efficient way
|
5 years ago |
Erik de Vries
|
84df9658ff
|
actor_fetcher: added lemma output when validating, to detect most problematic lemmas
|
5 years ago |
Erik de Vries
|
499ee74f0d
|
actor_fetcher: fixed code error
|
5 years ago |
Erik de Vries
|
a3e8dcf96e
|
actor_fetcher: switched from binary word sentiment scores to proximity scores (cosine similarity)
|
5 years ago |
Erik de Vries
|
6f5ace8c52
|
actor_fetcher: elasticizer batch function to fetch actorsDetail fields from all relevant documents
|
5 years ago |
Erik de Vries
|
edd4b785a5
|
actor_aggregation: updated to use future package for parallel processing as beta test for switching all parallel processing to future. Also disabled some of the aggregator output to save computation time
|
5 years ago |
Erik de Vries
|
f8bc53006d
|
actor_aggregation: added sentiment analysis support for generating aggregations
|
5 years ago |
Erik de Vries
|
d3d4045f1c
|
actor_aggregation: added sentence count to output, changed occurrences to count instead of mean, changed prom and rel_first to prom_art and rel_first_art, changed output filename to include function
|
5 years ago |
Erik de Vries
|
176a8f6de4
|
elasticizer: added additional verbosity on errors
|
6 years ago |
Erik de Vries
|
d420b02c20
|
elasticizer: Added more verbosity to investigate error handling
|
6 years ago |
Erik de Vries
|
48b589dda0
|
query_gen_actors: reset to original state
|
6 years ago |
Erik de Vries
|
7a01a7f18d
|
query_gen_actors: temporary update for fixing broken shit
|
6 years ago |
Erik de Vries
|
45da9dd929
|
aggregator_elastic: revert to single-core lapply, due to sendMaster errors
|
6 years ago |
Erik de Vries
|
f8e4111e70
|
aggregator_elastic: correct partyid implementation
|
6 years ago |
Erik de Vries
|
c047a4a1db
|
aggregator_elastic: explicit reference to aggregator function
|
6 years ago |
Erik de Vries
|
0d81d6fc7a
|
added aggregator and aggregator_elastic functions for aggregating and storing article level actor aggregations
|
6 years ago |
Erik de Vries
|
2281d11a68
|
actor_aggregation: fixed file naming of .Rds files
|
6 years ago |
Erik de Vries
|
d9f28a46d8
|
actor_aggregation: small fixes to code
|
6 years ago |
Erik de Vries
|
a29d04dacd
|
actorizer: fixed handling of empty results due to regex filtering
|
6 years ago |
Erik de Vries
|
8e920f5f37
|
elasticizer: removed idiotic 15min sleep time after 500 batches
|
6 years ago |
Erik de Vries
|
a11d7728ea
|
actor_aggregation: only aggregate scores on non-junk articles
|
6 years ago |
Erik de Vries
|
54a70c47a0
|
actor_aggregation: removed timeout for parallel processing, requires fix in elasticizer (cannot recycle the same connection)
|
6 years ago |
Erik de Vries
|
58fce4d560
|
actor_aggregation: added randomized short sleep, to allow for parallel execution
|
6 years ago |
Erik de Vries
|
e3b26c0be3
|
actor_aggregation: Added function to generate aggregate actor measures at daily, weekly, monthly and yearly level
query_string: Added default_operator parameter, to define whether whitespace should be interpreted as AND or OR, defaults to AND
|
6 years ago |
Erik de Vries
|
28989f2bc4
|
dfm_gen: yet another fix for codes
|
6 years ago |
Erik de Vries
|
0757b6bf8b
|
dfm_gen: re-added codes variable
|
6 years ago |
Erik de Vries
|
2fc48cc2f7
|
dfm_gen: fixed absence of out$codes field
|
6 years ago |
Erik de Vries
|
b249ff22de
|
dfm_gen.R: fixed junk mutation
|
6 years ago |
Erik de Vries
|
0d05765ca7
|
dfm_gen: removed last remains of summer sample exceptions
|
6 years ago |
Erik de Vries
|
e199b23227
|
dfm_gen: removed exceptions for NO summer codes
modelizer: created exception for outer_folds = 1
query_string: added parameter for default_operator
|
6 years ago |
Erik de Vries
|
fbd525dc2e
|
modelizer: updated outer cross validation procedure to output raw prediction and true values, instead of processed and aggregated confusion matrix results
|
6 years ago |
Erik de Vries
|
6a94bc3ed8
|
query_gen_actors: removed quotation marks from Minister search part
|
6 years ago |
Erik de Vries
|
8d19333e59
|
query_gen_actors: changed script order for Belgian exceptions
|
6 years ago |
Erik de Vries
|
3bfe61e425
|
query_gen_actors: fixed implementation of Belgian exceptions
|
6 years ago |
Erik de Vries
|
81697345cb
|
modelizer: removed breaking code
|
6 years ago |
Erik de Vries
|
9ca952ca89
|
elastic_update: removed wait_for from url
|
6 years ago |
Erik de Vries
|
8051a81b66
|
actorizer, dfm_gen, modelizer, out_parser: replaced all instances of detectCores by cores parameter (which defaults to detectCores)
|
6 years ago |
Erik de Vries
|
ac37d836f5
|
elasticizer: added scroll_clear to null hits as well
|
6 years ago |
Erik de Vries
|
75623856f7
|
elasticizer: updated scroll_clear to use conn object
|
6 years ago |
Erik de Vries
|
c2d666c81d
|
bogus commit
|
6 years ago |
Erik de Vries
|
e34460bf0f
|
elasticizer: clear scroll context when finishing query
|
6 years ago |
Erik de Vries
|
9bd526fee0
|
elasticizer: fixed compatibility issues with elastic v1.0.0
|
6 years ago |
Erik de Vries
|
f2312f65d5
|
elasticizer: update to account for syntax change in newer package versions
|
6 years ago |
Erik de Vries
|
f6006eb9ba
|
actorizer: simplified pre/postfix check, only for NA, replace empty strings by NA beforehand
|
6 years ago |
Erik de Vries
|
298099a4e6
|
actorizer: fix to deal with empty updates (i.e. don't do an update)
|
6 years ago |
Erik de Vries
|
6961c0b866
|
query_gen_actors: updated actorid filter to use the keyword subfield
|
6 years ago |
Erik de Vries
|
703b5e59a4
|
actorizer: fixed exceptionizer by adding whitespace before and after the sentence. This is necessary because of the negative regex: matching anything before or after the highlight string that is NOT x requires at least one character to be present before or after that string
|
6 years ago |
Erik de Vries
|
593d2de6e2
|
actorizer: add pre_tags and post_tags to argument list
bulk_writer: updated to use _doc doctype
query_gen_actors: added NA for all searches that don't have pre- or postfixes
|
6 years ago |
Erik de Vries
|
a1b6c6a7cb
|
actorizer, query_gen_actors: revamped actor searches entirely
elasticizer: updated script for use with ES 7.x
|
6 years ago |