152 Commits (06bfec71bc29109303e5488a957c143b3cf40dc0)
 

Author SHA1 Message Date
Erik de Vries 06bfec71bc lemma_writer: unlist lemmas before writing
5 years ago
Erik de Vries a83ee5dfd0 lemma_writer: update to write lemma instead of full document text
5 years ago
Erik de Vries e594185719 dfm_gen: set default cores to 1
5 years ago
Erik de Vries 889e7e92af lemma_writer: updated to provide support for writing raw documents to individual files using utf-8 encoding
5 years ago
Erik de Vries 115297f597 actor_aggregation,aggregator,aggregator_elastic: moved out of package directory to Old
5 years ago
Erik de Vries 3fcbbd1f1f actor_fetch: fixed error where source.ud would not exist
5 years ago
Erik de Vries 674ef09e10 query_gen_actors: added junior minister check to if statement
5 years ago
Erik de Vries 853c117daf actor_fetcher: change in code to keep original actorid lists in output
5 years ago
Erik de Vries bf3d11ffe0 query_gen_actors: various bugfixes and changes
5 years ago
Erik de Vries 99af1427f0 query_gen_actors: fixed scandinavian query generation
5 years ago
Erik de Vries e49a4ae93e query_gen_actors: fixed problem with too many brackets in query
5 years ago
Erik de Vries 060751237b actorizer, out_parser: switched from mclapply to future_lapply and removed windows-specific code from out_parser
5 years ago
Erik de Vries d0601d2aa7 actor_fetcher: added minimum verbosity to identify cases in which an actor is present without a party mention
5 years ago
Erik de Vries 82ef165c5f actor_fetcher: quick fix
5 years ago
Erik de Vries 9e433ecf9e actor_fetcher: added handling of exception where all actorsids related to a party are individual actors
5 years ago
Erik de Vries 526270900c actor_fetcher: integrated party merging into actor_fetcher in what hopefully is the most efficient way
5 years ago
Erik de Vries 84df9658ff actor_fetcher: added lemma output when validating, to detect most problematic lemmas
5 years ago
Erik de Vries 499ee74f0d actor_fetcher: fixed code error
5 years ago
Erik de Vries a3e8dcf96e actor_fetcher: switched from binary word sentiment scores to proximity scores (cosine similarity)
6 years ago
Erik de Vries 6f5ace8c52 actor_fetcher: elasticizer batch function to fetch actorsDetail fields from all relevant documents
6 years ago
Erik de Vries edd4b785a5 actor_aggregation: updated to use future package for parallel processing as beta test for switching all parallel processing to future. Also disabled some of the aggregator output to save computation time
6 years ago
Erik de Vries f8bc53006d actor_aggregation: added sentiment analysis support for generating aggregations
6 years ago
Erik de Vries d3d4045f1c actor_aggregation: added sentence count to output, and changed occurrences to count instead of mean, changed prom and rel_first to prom_art and rel_first_art, changed output filename to include function
6 years ago
Erik de Vries 176a8f6de4 elasticizer: added additional verbosity on errors
6 years ago
Erik de Vries d420b02c20 elasticizer: Added more verbosity to investigate error handling
6 years ago
Erik de Vries 48b589dda0 query_gen_actors: reset to original state
6 years ago
Erik de Vries 7a01a7f18d query_gen_actors: temporary update for fixing broken shit
6 years ago
Erik de Vries 45da9dd929 aggregator_elastic: revert to single-core lapply, due to sendMaster errors
6 years ago
Erik de Vries f8e4111e70 aggregator_elastic: correct partyid implementation
6 years ago
Erik de Vries c047a4a1db aggregator_elastic: explicit reference to aggregator function
6 years ago
Erik de Vries 0d81d6fc7a added aggregator and aggregator_elastic functions for aggregating and storing article level actor aggregations
6 years ago
Erik de Vries 2281d11a68 actor_aggregation: fixed filenaming of .Rds files
6 years ago
Erik de Vries d9f28a46d8 actor_aggregation: small fixes to code
6 years ago
Erik de Vries a29d04dacd actorizer: fixed handling of empty results due to regex filtering
6 years ago
Erik de Vries 8e920f5f37 elasticizer: removed idiotic 15min sleep time after 500 batches
6 years ago
Erik de Vries a11d7728ea actor_aggregation: only aggregate scores on non-junk articles
6 years ago
Erik de Vries 54a70c47a0 actor_aggregation: removed timeout for parallel processing, requires fix in elasticizer (cannot recycle the same connection)
6 years ago
Erik de Vries 58fce4d560 actor_aggregation: added randomized short sleep, to allow for parallel execution
6 years ago
Erik de Vries e3b26c0be3 actor_aggregation: Added function to generate aggregate actor measures at daily, weekly, monthly and yearly level
6 years ago
Erik de Vries 28989f2bc4 dfm_gen: yet another fix for codes
6 years ago
Erik de Vries 0757b6bf8b dfm_gen: re-added codes variable
6 years ago
Erik de Vries 2fc48cc2f7 dfm_gen: fixed absence of out$codes field
6 years ago
Erik de Vries b249ff22de dfm_gen.R: fixed junk mutation
6 years ago
Erik de Vries 0d05765ca7 dfm_gen: removed last remains of summer sample exceptions
6 years ago
Erik de Vries e199b23227 dfm_gen: removed exceptions for NO summer codes
6 years ago
Erik de Vries fbd525dc2e modelizer: updated outer cross validation procedure to output raw prediction and true values, instead of processed and aggregated confusion matrix results
6 years ago
Erik de Vries 6a94bc3ed8 query_gen_actors: removed quotation marks from Minister search part
6 years ago
Erik de Vries 8d19333e59 query_gen_actors: changed script order for belgium exceptions
6 years ago
Erik de Vries 3bfe61e425 query_gen_actors: fixed implementation of Belgian exceptions
6 years ago
Erik de Vries 81697345cb modelizer: removed breaking code
6 years ago