200 Commits (a37fc0410d895f28fb0172632e9497fdad5d0936)

Author SHA1 Message Date
Erik de Vries a37fc0410d removed sent_sum_pos/neg 5 years ago
Erik de Vries 153c54b376 reintroduced arousal, but should be warned that arousal performance is not directly evaluated 5 years ago
Erik de Vries cdc78039ed removing text-level output from sentencizer, and optimizing storage by using factors 5 years ago
Erik de Vries 523d86799c removed arousal measures 5 years ago
Erik de Vries 4a0f2206fd removed multicore support, added parameters for dfm_gen 5 years ago
Your Name 274c9179cb remove meta_file argument 5 years ago
Your Name 6e0e693d4e lemma_writer: removed meta csv code 5 years ago
Your Name 4fd9222a2d lemma_writer: updated to write metadata csv when dumping documents in ud format 5 years ago
Your Name 955f034e6a actor_merger: changed computation of arousal, and removed uninformative variables 5 years ago
Your Name 3cdb68b196 out_parser: updated fncols function 5 years ago
Your Name dc40fbbb19 elasticizer: update rbindlist implementation 5 years ago
Your Name 18d47762d2 actor_merger: overhaul to include cutoffs at sentence level as intended, also included options to generate sentiment for text only (don't provide actors_meta or actor_groups) 5 years ago
Your Name 74909ca3a0 sentencizer: removed text sentiment computation from script, because of incorrect implementation 5 years ago
Your Name c99ac23bb5 actor_merger: fixed absence of publication_date in some cases 5 years ago
Your Name cc7fa5bffa actor_merger: added aggregations of all individual actors and all party mentions in an article 5 years ago
Your Name d9d578c06a actor_merger: mult fix 5 years ago
Your Name 771145faf7 actor_merger: added mult='first' to metadata join for parties_actors to deal with duplicate partyIds (see 50Plus, Conservatives and Labour) 5 years ago
Your Name 1c14646e8f actor_merger: don't deselect sent_words and sent_sum columns 5 years ago
Your Name 9bd382f955 actor_merger: fix to generate bogus sentiment columns 5 years ago
Your Name b7f1afddd1 actor_merger: total rewrite based on data.table for performance reasons. Added some exceptions due to non-existing partyIds that some individual actors have in the actor database 5 years ago
Your Name 2c8a88f9a0 elasticizer: switched from bind_rows to rbindlist for composing result 5 years ago
Your Name 559199bb97 sentencizer: totally removed sent_lemmas field 5 years ago
Your Name 36f2b341a8 sentencizer: removed derived output from function 5 years ago
Your Name 80ec0be1f8 actorizer: updated to account for token start offset in udpipe output. Sometimes, the first token in an article doesn't start at character position 1 (or 2 if the article starts with a whitespace), but at position 16 and possibly other positions. 5 years ago
Your Name 336567732c elastic_update: added more debug output 5 years ago
Your Name df7631b9f1 sentencizer: Changed output, removed lemma list and added separate positive and negative sentiment sums 5 years ago
Your Name ecdb5be3b4 actorizer: moved some code 5 years ago
Your Name 69d4b6f5b0 actorizer: updated to data.table for conditional joins 5 years ago
Your Name 085855908c query_gen_actors: switched from Minister to Min 5 years ago
Your Name b406304c80 actorizer: Removed nested parallelization function 5 years ago
Your Name 5de4e1488c estimator, modelizer, preproc: Removed experimental we-vector support, and disabled inefficiently implemented preproc.R 5 years ago
Your Name 77eb51a1bf actorizer: totally revamped way of finding actors 5 years ago
Your Name 0e593075ee query_gen_actors: only retrieve ud field from source 5 years ago
Your Name 6eb405f8bd merger: selecting only relevant columns 5 years ago
Your Name 38ff4dcbf0 ud_update: small fix to file naming 5 years ago
Your Name 4b4d860235 class_update: remove dfm_gen multicore option 5 years ago
Your Name 5d99ec9509 elasticizer: added option to dump data frames to rds files 5 years ago
Your Name aa6587b204 dupe_detect: fix for quotation marks 5 years ago
Your Name 2a220ded5d dupe_detect: fix to query string for multi-word doctype names 5 years ago
Your Name 5bd36dcb44 dupe_detect: Changed query from json to query_string style, and added filter for already detected duplicates 5 years ago
Your Name e499d70671 actor_merger: added ungroup() calls at the start and end of function, to speed up processing 5 years ago
Your Name 8634d549a3 sentencizer: updates to collect sentence word counts and number of sentences also when no sent_dict is provided 5 years ago
Your Name 61e0581595 actor_merger: removed debug line 5 years ago
Your Name f022312485 actor_merger: added function for generating actor-document data frames 5 years ago
Your Name 4e867214dd sentencizer: commented code 5 years ago
Your Name ec8afc4990 sentencizer: fixed actorsDetail coding error 5 years ago
Your Name 9ccfd2952e sentencizer: minor updates 5 years ago
Your Name 98325bde8f sentencizer: added new function for sentiment coding and actor collection 5 years ago
Your Name 7f958bbc11 actor_fetcher: small fixes 5 years ago
Your Name 8eedec8bb5 actor_fetcher: added option for using dictionaries with just lemmas, besides the option of using lemma_upos dictionaries 5 years ago