Erik de Vries
|
06bfec71bc
|
lemma_writer: unlist lemmas before writing
|
5 years ago |
Erik de Vries
|
a83ee5dfd0
|
lemma_writer: update to write lemma instead of full document text
|
5 years ago |
Erik de Vries
|
e594185719
|
dfm_gen: set default cores to 1
|
5 years ago |
Erik de Vries
|
889e7e92af
|
lemma_writer: updated to provide support for writing raw documents to individual files using utf-8 encoding
|
5 years ago |
Erik de Vries
|
115297f597
|
actor_aggregation,aggregator,aggregator_elastic: moved out of package directory to Old
actor_fetcher: moved sentiment validation code block
|
5 years ago |
Erik de Vries
|
3fcbbd1f1f
|
actor_fetch: fixed error where source.ud would not exist
|
5 years ago |
Erik de Vries
|
674ef09e10
|
query_gen_actors: added junior minister check to if statement
|
5 years ago |
Erik de Vries
|
853c117daf
|
actor_fetcher: change in code to keep original actorid lists in output
query_gen_actors: added code for junior ministers in BE and NL
|
5 years ago |
Erik de Vries
|
bf3d11ffe0
|
query_gen_actors: various bugfixes and changes
|
5 years ago |
Erik de Vries
|
99af1427f0
|
query_gen_actors: fixed Scandinavian query generation
|
5 years ago |
Erik de Vries
|
e49a4ae93e
|
query_gen_actors: fixed problem with too many brackets in query
|
5 years ago |
Erik de Vries
|
060751237b
|
actorizer, out_parser: switched from mclapply to future_lapply and removed Windows-specific code from out_parser
query_gen_actors: rewritten minister queries to only use proximity queries
|
5 years ago |
Erik de Vries
|
d0601d2aa7
|
actor_fetcher: added minimum verbosity to identify cases in which an actor is present without a party mention
|
5 years ago |
Erik de Vries
|
82ef165c5f
|
actor_fetcher: quick fix
|
5 years ago |
Erik de Vries
|
9e433ecf9e
|
actor_fetcher: added handling of exception where all actorsids related to a party are individual actors
|
5 years ago |
Erik de Vries
|
526270900c
|
actor_fetcher: integrated party merging into actor_fetcher in what hopefully is the most efficient way
|
5 years ago |
Erik de Vries
|
84df9658ff
|
actor_fetcher: added lemma output when validating, to detect most problematic lemmas
|
5 years ago |
Erik de Vries
|
499ee74f0d
|
actor_fetcher: fixed code error
|
5 years ago |
Erik de Vries
|
a3e8dcf96e
|
actor_fetcher: switched from binary word sentiment scores to proximity scores (cosine similarity)
|
6 years ago |
Erik de Vries
|
6f5ace8c52
|
actor_fetcher: elasticizer batch function to fetch actorsDetail fields from all relevant documents
|
6 years ago |
Erik de Vries
|
edd4b785a5
|
actor_aggregation: updated to use future package for parallel processing as beta test for switching all parallel processing to future. Also disabled some of the aggregator output to save computation time
|
6 years ago |
Erik de Vries
|
f8bc53006d
|
actor_aggregation: added sentiment analysis support for generating aggregations
|
6 years ago |
Erik de Vries
|
d3d4045f1c
|
actor_aggregation: added sentence count to output, and changed occurrences to count instead of mean, changed prom and rel_first to prom_art and rel_first_art, changed output filename to include function
|
6 years ago |
Erik de Vries
|
176a8f6de4
|
elasticizer: added additional verbosity on errors
|
6 years ago |
Erik de Vries
|
d420b02c20
|
elasticizer: Added more verbosity to investigate error handling
|
6 years ago |
Erik de Vries
|
48b589dda0
|
query_gen_actors: reset to original state
|
6 years ago |
Erik de Vries
|
7a01a7f18d
|
query_gen_actors: temporary update for fixing broken shit
|
6 years ago |
Erik de Vries
|
45da9dd929
|
aggregator_elastic: revert to single-core lapply, due to sendMaster errors
|
6 years ago |
Erik de Vries
|
f8e4111e70
|
aggregator_elastic: correct partyid implementation
|
6 years ago |
Erik de Vries
|
c047a4a1db
|
aggregator_elastic: explicit reference to aggregator function
|
6 years ago |
Erik de Vries
|
0d81d6fc7a
|
added aggregator and aggregator_elastic functions for aggregating and storing article level actor aggregations
|
6 years ago |
Erik de Vries
|
2281d11a68
|
actor_aggregation: fixed filenaming of .Rds files
|
6 years ago |
Erik de Vries
|
d9f28a46d8
|
actor_aggregation: small fixes to code
|
6 years ago |
Erik de Vries
|
a29d04dacd
|
actorizer: fixed handling of empty results due to regex filtering
|
6 years ago |
Erik de Vries
|
8e920f5f37
|
elasticizer: removed idiotic 15min sleep time after 500 batches
|
6 years ago |
Erik de Vries
|
a11d7728ea
|
actor_aggregation: only aggregate scores on non-junk articles
|
6 years ago |
Erik de Vries
|
54a70c47a0
|
actor_aggregation: removed timeout for parallel processing, requires fix in elasticizer (cannot recycle the same connection)
|
6 years ago |
Erik de Vries
|
58fce4d560
|
actor_aggregation: added randomized short sleep, to allow for parallel execution
|
6 years ago |
Erik de Vries
|
e3b26c0be3
|
actor_aggregation: Added function to generate aggregate actor measures at daily, weekly, monthly and yearly level
query_string: Added default_operator parameter, to define whether whitespaces should be interpreted as AND or OR, defaults to AND
|
6 years ago |
Erik de Vries
|
28989f2bc4
|
dfm_gen: yet another fix for codes
|
6 years ago |
Erik de Vries
|
0757b6bf8b
|
dfm_gen: re-added codes variable
|
6 years ago |
Erik de Vries
|
2fc48cc2f7
|
dfm_gen: fixed absence of out$codes field
|
6 years ago |
Erik de Vries
|
b249ff22de
|
dfm_gen.R: fixed junk mutation
|
6 years ago |
Erik de Vries
|
0d05765ca7
|
dfm_gen: removed last remains of summer sample exceptions
|
6 years ago |
Erik de Vries
|
e199b23227
|
dfm_gen: removed exceptions for NO summer codes
modelizer: created exception for outer_folds = 1
query_string: added parameter for default_operator
|
6 years ago |
Erik de Vries
|
fbd525dc2e
|
modelizer: updated outer cross validation procedure to output raw prediction and true values, instead of processed and aggregated confusion matrix results
|
6 years ago |
Erik de Vries
|
6a94bc3ed8
|
query_gen_actors: removed quotation marks from Minister search part
|
6 years ago |
Erik de Vries
|
8d19333e59
|
query_gen_actors: changed script order for Belgian exceptions
|
6 years ago |
Erik de Vries
|
3bfe61e425
|
query_gen_actors: fixed implementation of Belgian exceptions
|
6 years ago |
Erik de Vries
|
81697345cb
|
modelizer: removed breaking code
|
6 years ago |