A high-precision rule-based extraction system for expanding geospatial metadata in GenBank records.
The metadata describing the location of the infected host (LOIH) of virus sequences in GenBank often lacks specificity. This work seeks to enhance that metadata by extracting more specific geographic information from related full-text articles and mapping each extracted location to its latitude and longitude using knowledge derived from external geographical databases.
Author(s): Tahsin, Tasnia; Weissenbacher, Davy; Rivera, Robert; Beard, Rachel; Firago, Mari; Wallstrom, Garrick; Scotch, Matthew; Gonzalez, Graciela
DOI: 10.1093/jamia/ocv172
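
The abstract describes two steps: rule-based extraction of place names from article text, followed by a lookup that maps each name to coordinates. The Python sketch below illustrates that general idea only; it is not the authors' implementation. The trigger-phrase rule, the `GAZETTEER` dictionary, and the function name `extract_locations` are all hypothetical, and the coordinates are approximate values standing in for an external geographical database.

```python
import re

# Hypothetical toy gazetteer with approximate coordinates (lat, lon).
# A real system would query an external geographical database instead.
GAZETTEER = {
    "phoenix": (33.4484, -112.0740),
    "arizona": (34.0489, -111.0937),
    "hanoi": (21.0278, 105.8342),
}

# Illustrative rule (an assumption, not taken from the paper): a trigger
# phrase such as "collected in" followed by one or more capitalized words.
LOCATION_RULE = re.compile(
    r"\b(?:collected|isolated|sampled)\s+(?:in|from|at)\s+"
    r"((?:[A-Z][a-z]+)(?:\s+[A-Z][a-z]+)*)"
)


def extract_locations(text):
    """Return (place name, (lat, lon)) pairs for rule matches found in the gazetteer."""
    results = []
    for match in LOCATION_RULE.finditer(text):
        name = match.group(1)
        coords = GAZETTEER.get(name.lower())
        # Keep only names the gazetteer can resolve; this favors precision
        # over recall, at the cost of missing unlisted places.
        if coords is not None:
            results.append((name, coords))
    return results


if __name__ == "__main__":
    sentence = "Samples were collected in Hanoi during the 2005 outbreak."
    print(extract_locations(sentence))
```

Discarding matches that the gazetteer cannot resolve is one simple way to trade recall for precision in a sketch like this.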