Import search results — import_search

This is a wrapper around the synthesisr function read_refs that processes the hits in a set of subdirectories in a directory, recording for each hit the original file and directory and, if the files were named using a convention, the date, database, and interface (see the search_metadataRegex argument).

import_search_results(
  path,
  dirRegex = ".*",
  fileRegex = "\\.ris$",
  dirsToIgnoreRegex = NULL,
  filesToIgnoreRegex = NULL,
  recursive = TRUE,
  perl = TRUE,
  fieldsToCopy = list(doi = c("L3", "DO"), title = "T1"),
  preparatoryReplacements = NULL,
  synthesisr_tag_naming = "best_guess",
  copySep = " || ",
  idFieldName = "original_id",
  parallel = FALSE,
  search_metadataRegex = metabefor::opts$get("search_metadataRegex"),
  silent = metabefor::opts$get("silent")
)

Arguments

path: The path to the files with the search results.
dirRegex: The regular expression to match subdirectories against.
fileRegex: The regular expression to match filenames against.
dirsToIgnoreRegex, filesToIgnoreRegex: Regular expressions specifying which directories and which files, respectively, should be ignored.
recursive: Whether to recursively read subdirectories.
perl: Whether to use Perl regular expressions.
fieldsToCopy: Fields to copy over to other fields (e.g. L3 sometimes contains DOIs, instead of DO). This takes the form of a list of character vectors, with each vector's name being the new field name (to copy to), and each character vector listing the fields to copy (to that new field name). Set to NULL or pass an empty list to not copy anything over.
preparatoryReplacements: Specify any replacements to be made in the file(s) to import, as a named character vector, where each element's name is the replacement, and the corresponding value is a Perl regular expression which will (case insensitively) be searched in the file (e.g. to replace all RIS tags T1 with TI, pass c("TI" = "^T1") as value of preparatoryReplacements).
synthesisr_tag_naming: The value to pass to the synthesisr::read_refs() function as argument tag_naming.
copySep: When copying fields over (see fieldsToCopy), the separator to use when the new field is not empty (in which case the contents to copy over will be appended).
idFieldName: New name to use for the id field (column), a reserved name in JabRef (if NULL, the field is not renamed).
parallel: Whether to use multiple cores for parallel processing.
search_metadataRegex: A regular expression to match against the filenames. If it matches, metadata will be extracted in three capturing groups, in the order date (using ISO standard 8601 format, i.e. 2022-03-05), interface, and database, separated by underscores (_), with an optional fourth element, again separated with an underscore, that can be used to specify which query was run (in case multiple queries are used in the same database / interface combination and on the same date).
silent: Whether to be silent or chatty.
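
The descriptions of preparatoryReplacements, fieldsToCopy, and copySep above can be illustrated with a small base R sketch. This is only an approximation of the documented behaviour, not the package's actual code: the RIS lines, the toy data frame hits, and the loop logic are all assumptions made for illustration.

```r
# --- preparatoryReplacements: names are replacements, values are patterns ---
# Hypothetical lines from a RIS file (illustrative only)
ris_lines <- c("TY  - JOUR", "T1  - Some title", "ER  - ")
preparatoryReplacements <- c("TI" = "^T1")

for (i in seq_along(preparatoryReplacements)) {
  ris_lines <- gsub(
    preparatoryReplacements[i],          # Perl regex to search for
    names(preparatoryReplacements)[i],   # replacement text
    ris_lines,
    perl = TRUE,
    ignore.case = TRUE                   # the search is case insensitive
  )
}

# --- fieldsToCopy and copySep on a toy data frame (hypothetical values) ---
hits <- data.frame(
  L3  = c("10.1000/aaa", NA, NA, "10.1000/ddd"),
  DO  = c(NA, "10.1000/bbb", NA, NA),
  doi = c(NA, NA, "10.1000/ccc", "10.1000/eee"),
  stringsAsFactors = FALSE
)
fieldsToCopy <- list(doi = c("L3", "DO"))
copySep <- " || "

for (target in names(fieldsToCopy)) {
  # Create the target column if it does not exist yet
  if (!target %in% names(hits)) hits[[target]] <- NA_character_
  for (src in fieldsToCopy[[target]]) {
    if (!src %in% names(hits)) next
    has_src <- !is.na(hits[[src]]) & nzchar(hits[[src]])
    target_empty <- is.na(hits[[target]]) | !nzchar(hits[[target]])
    hits[[target]] <- ifelse(
      has_src & target_empty,
      hits[[src]],                                   # fill an empty target
      ifelse(
        has_src,
        paste(hits[[target]], hits[[src]], sep = copySep),  # append
        hits[[target]]                               # nothing to copy
      )
    )
  }
}

print(ris_lines)
print(hits$doi)
```

In this sketch an empty target field is simply filled, while a target field that already has content gets the copied value appended after copySep.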

Value

An object with all the imported information, including, most importantly, the data frame bibHitDf with all results.

Examples

### Path to extra files in {metabefor} package
metabefor_files_path <-
  system.file(
    "extdata",
    package = "metabefor"
  ); 

### Path with OpenAlex exports
OpenAlexExport_path <-
  file.path(
    metabefor_files_path,
    "openalex-exports"
  ); 

bibHits_OpenAlex <-
  metabefor::import_search_results(
    OpenAlexExport_path
  );
#> Error in metabefor::import_search_results(OpenAlexExport_path): To use this function, you need to have the `synthesisr` package installed. To install it, run:
#> 
#>   install.packages('synthesisr');

### Look at the first five titles
bibHits_OpenAlex$bibHitDf$title[1:5]
#> Error: object 'bibHits_OpenAlex' not found

### Another example, using the filenames to
### provide metadata about the date, database,
### interface, and query specification

### Path with Ebsco exports
EbscoExport_path <-
  file.path(
    metabefor_files_path,
    "ebsco-exports"
  ); 
  
bibHits_Ebsco <-
  metabefor::import_search_results(
    EbscoExport_path
  );
#> Error in metabefor::import_search_results(EbscoExport_path): To use this function, you need to have the `synthesisr` package installed. To install it, run:
#> 
#>   install.packages('synthesisr');
  
### Show the databases
metabefor::show_search_hits_by_database(
  bibHits_Ebsco
);
#> Error: object 'bibHits_Ebsco' not found
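
To make the filename convention used in the second example more concrete, the following base R sketch extracts the date, interface, database, and optional query specification from filenames. The pattern metadata_regex, the helper extract_search_metadata, and the filenames are hypothetical; the pattern is not the package's default search_metadataRegex, which is stored in metabefor::opts.

```r
# Hypothetical pattern: date, interface, database, and an optional query
# element, separated by underscores. NOT the package default.
metadata_regex <-
  "^(\\d{4}-\\d{2}-\\d{2})_([^_]+)_([^_.]+)(?:_([^_.]+))?\\.ris$"

# Hypothetical helper: returns a named character vector, with NA for
# elements that are absent or when the filename does not match at all
extract_search_metadata <- function(filename, regex = metadata_regex) {
  m <- regmatches(filename, regexec(regex, filename, perl = TRUE))[[1]]
  fields <- c("date", "interface", "database", "query")
  if (length(m) == 0) {
    res <- rep(NA_character_, length(fields))
  } else {
    res <- m[2:5]                         # drop the full match, keep groups
    res[!nzchar(res)] <- NA_character_    # unmatched optional group -> NA
  }
  stats::setNames(res, fields)
}

# Hypothetical filenames (illustrative only)
filenames <- c(
  "2022-03-05_ebsco_psycinfo.ris",
  "2022-03-05_ebsco_psycinfo_query2.ris",
  "unstructured-name.ris"
)

# One row per file, one column per metadata element
print(t(sapply(filenames, extract_search_metadata)))
```

Files whose names do not follow the convention yield only missing values here, which mirrors the documented behaviour that metadata is extracted only if the regular expression matches.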