#' @param grid A cross-table of all possible combinations of doctypes and dates
#' @param cutoff Cutoff value for cosine similarity above which documents are considered duplicates
#' @param es_pwd Password for Elasticsearch read access
#' @return dupe_objects.json and a data frame, each containing every id and all its duplicates; remove_ids.txt and a character vector, each listing the ids to be removed. Both files are saved in the current working directory.
\item{es_pwd}{Password for Elasticsearch read access}
}
}
\value{
dupe_objects.json and a data frame, each containing every id and all its duplicates; remove_ids.txt and a character vector, each listing the ids to be removed. Both files are saved in the current working directory.
}
\description{
\description{
Get ids of duplicate documents whose cosine similarity score is higher than the \code{cutoff} value