Erik de Vries
|
bbec8f5547
|
fix in package version check
|
2 months ago |
Erik de Vries
|
e3c8d04984
|
update
|
1 year ago |
Erik de Vries
|
0f7b1ee537
|
Add single_party param
Fix actor.first to use min() instead of first()
|
2 years ago |
Erik de Vries
|
5c80d82828
|
reintroduced certificate checks, linux01 certs work again
|
2 years ago |
Erik de Vries
|
fcdffb6f58
|
removed default_field, so that all text fields are queried by default (this also includes any coder comments!)
|
2 years ago |
Erik de Vries
|
9ae2866c41
|
remove default user
|
2 years ago |
Erik de Vries
|
b130f9c313
|
added es_user parameter
|
2 years ago |
Erik de Vries
|
3f268bbf06
|
Temporarily disable SSL verification
|
2 years ago |
Erik de Vries
|
2944039f73
|
test
|
3 years ago |
Erik de Vries
|
0b17555d99
|
sent_merger: Correctly add party metadata for _mfsa aggregations
|
3 years ago |
Erik de Vries
|
108372452c
|
sent_merger: Correctly add party metadata for _mfsa aggregations
|
3 years ago |
Erik de Vries
|
16d02a055d
|
sent_merger: Updated sentiment aggregation procedure. Now a dedicated actors_final.csv file is used as source of partyIds for individual actors, instead of the (deprecated) [partyId]_a ids that were previously provided as a result of the actor searches, or the (also deprecated) actor metadata provided in the ES actors database.
|
3 years ago |
Erik de Vries
|
8875630235
|
fixed actor metadata generation as well, because the same actorId might occur multiple times in a sentence, if that actor has multiple functions during the same period.
|
4 years ago |
Erik de Vries
|
9419d6dc08
|
Fixed incorrect mfs and mfsa aggregations. Previously, multiple party/actor mentions in the same sentence (e.g. both a *_f and a *_s mention) were each counted separately, whereas the sentence should only be considered once.
|
4 years ago |
Erik de Vries
|
7703a8cd5b
|
query_gen_actors: removed country argument, now reading country directly from actor data
|
4 years ago |
Erik de Vries
|
64a48e5977
|
sent_merger: fixed bug with publication_date and grouper()
|
4 years ago |
Erik de Vries
|
f6dfc6711b
|
minor fix
|
4 years ago |
Erik de Vries
|
09fd8d0cb2
|
removed some unused aggregations
|
4 years ago |
Erik de Vries
|
17d49f07c0
|
updated namespace and docs
|
4 years ago |
Erik de Vries
|
8ff4097304
|
renamed actor_merger to sent_merger and implemented fixes to work with sentiment data frames without actor ids
|
4 years ago |
Erik de Vries
|
a37fc0410d
|
removed sent_sum_pos/neg
|
4 years ago |
Erik de Vries
|
153c54b376
|
reintroduced arousal; be warned that arousal performance is not directly evaluated
|
4 years ago |
Erik de Vries
|
cdc78039ed
|
removing text-level output from sentencizer, and optimizing storage by using factors
|
4 years ago |
Erik de Vries
|
523d86799c
|
removed arousal measures
|
4 years ago |
Erik de Vries
|
4a0f2206fd
|
removed multicore support, added parameters for dfm_gen
|
4 years ago |
Your Name
|
274c9179cb
|
remove meta_file argument
|
4 years ago |
Your Name
|
6e0e693d4e
|
lemma_writer: removed meta csv code
|
4 years ago |
Your Name
|
4fd9222a2d
|
lemma_writer: updated to write metadata csv when dumping documents in ud format
out_parser: fix for generating empty columns
|
4 years ago |
Your Name
|
955f034e6a
|
actor_merger: changed computation of arousal, and removed uninformative variables
|
4 years ago |
Your Name
|
3cdb68b196
|
out_parser: updated fncols function
|
4 years ago |
Your Name
|
dc40fbbb19
|
elasticizer: update rbindlist implementation
|
4 years ago |
Your Name
|
18d47762d2
|
actor_merger: overhaul to include cutoffs at sentence level as intended, also included options to generate sentiment for text only (don't provide actors_meta or actor_groups)
|
4 years ago |
Your Name
|
74909ca3a0
|
sentencizer: removed text sentiment computation from script, because of incorrect implementation
|
4 years ago |
Your Name
|
c99ac23bb5
|
actor_merger: fixed absence of publication_date in some cases
|
4 years ago |
Your Name
|
cc7fa5bffa
|
actor_merger: added aggregations of all individual actors and all party mentions in an article
|
4 years ago |
Your Name
|
d9d578c06a
|
actor_merger: mult fix
|
4 years ago |
Your Name
|
771145faf7
|
actor_merger: added mult='first' to metadata join for parties_actors to deal with duplicate partyIds (see 50Plus, Conservatives and Labour)
|
4 years ago |
Your Name
|
1c14646e8f
|
actor_merger: don't deselect sent_words and sent_sum columns
|
4 years ago |
Your Name
|
9bd382f955
|
actor_merger: fix to generate bogus sentiment columns
|
4 years ago |
Your Name
|
b7f1afddd1
|
actor_merger: total rewrite based on data.table for performance reasons. Added some exceptions due to non-existing partyIds that some individual actors have in the actor database
|
4 years ago |
Your Name
|
2c8a88f9a0
|
elasticizer: switched from bind_rows to rbindlist for composing result
actor_merger: added noactor.* sentiment columns, and switched to data.table for matching actor metadata with articles
|
4 years ago |
Your Name
|
559199bb97
|
sentencizer: totally removed sent_lemmas field
|
4 years ago |
Your Name
|
36f2b341a8
|
sentencizer: removed derived output from function
|
4 years ago |
Your Name
|
80ec0be1f8
|
actorizer: updated to account for token start offset in udpipe output. Sometimes, the first token in an article doesn't start at character position 1 (or 2 if the article starts with a whitespace), but at position 16 and possibly other positions.
|
4 years ago |
Your Name
|
336567732c
|
elastic_update: added more debug output
|
4 years ago |
Your Name
|
df7631b9f1
|
sentencizer: Changed output, removed lemma list and added separate positive and negative sentiment sums
|
4 years ago |
Your Name
|
ecdb5be3b4
|
actorizer: moved some code
|
4 years ago |
Your Name
|
50f33e78d7
|
DESCRIPTION: updated
|
4 years ago |
Your Name
|
69d4b6f5b0
|
actorizer: updated to data.table for conditional joins
DESCRIPTION: added data.table dependency
|
4 years ago |
Your Name
|
085855908c
|
query_gen_actors: switched from Minister to Min
|
4 years ago |