About the NLM Dataset Catalog

The NLM Dataset Catalog is a freely available catalog of biomedical datasets available from various repositories. Adhering to Findable, Accessible, Interoperable, and Reusable (FAIR) data management principles, the NLM Dataset Catalog allows users to search across multiple data repositories for specific biomedical datasets. The tool is designed to help improve the discoverability and reuse of research data by making it easier for users to find and connect biomedical datasets in disparate source repositories. This functionality aligns with the National Institutes of Health's (NIH) efforts to make available to the public the results of research it supports and conducts. Bringing disparate metadata into a standardized format empowers researchers to share and discover data in a broader environment and create relationships that might otherwise not be apparent.

Descriptive dataset metadata from the various repositories are converted to the DATMM linked data standard and added to the NLM Dataset Catalog. By harmonizing and standardizing the structure of descriptive data, the NLM Dataset Catalog facilitates discovery and reuse of biomedical datasets and will make it easier to find and connect datasets to related objects on the Semantic Web.
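As a rough illustration of what harmonizing descriptive metadata can look like, the sketch below maps records from two made-up repositories onto one common structure so they can be searched together. It is not the DATMM standard or NLM's actual pipeline: the `CatalogRecord` fields, the `FIELD_MAPS` table, the repository names, and the sample records are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogRecord:
    """Hypothetical common record; NOT the actual DATMM schema."""
    title: str
    description: str
    source_repository: str
    dataset_url: str  # link back to the host repository, where the data stay


# Hypothetical per-repository mappings: common field name -> source field name.
FIELD_MAPS = {
    "repo_a": {"title": "name", "description": "summary", "dataset_url": "landing_page"},
    "repo_b": {"title": "dc_title", "description": "abstract", "dataset_url": "doi_url"},
}


def harmonize(repository: str, raw: dict) -> CatalogRecord:
    """Map one repository-specific metadata record onto the common structure."""
    mapping = FIELD_MAPS[repository]
    fields = {common: str(raw.get(source, "")).strip() for common, source in mapping.items()}
    return CatalogRecord(source_repository=repository, **fields)


def search(records: list, term: str) -> list:
    """Case-insensitive search over title and description across all sources."""
    term = term.lower()
    return [r for r in records if term in r.title.lower() or term in r.description.lower()]


# Illustrative records only; these are not real datasets.
records = [
    harmonize("repo_a", {"name": "Example cohort", "summary": "Illustrative entry",
                         "landing_page": "https://repo-a.example/ds/1"}),
    harmonize("repo_b", {"dc_title": "Example assay", "abstract": "Illustrative entry",
                         "doi_url": "https://repo-b.example/ds/2"}),
]

for hit in search(records, "assay"):
    print(hit.source_repository, hit.dataset_url)
```

Once every source is expressed in the same fields, one query can span repositories that originally used different field names, and each result still points back to the host repository rather than holding the data itself.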

The Dataset Catalog is designed to:

Governance

NLM Dataset Catalog Content and Inclusion Policy

The NLM Dataset Catalog is a freely available catalog of biomedical datasets available from various domain-specific and generalist data sharing repositories. Data sharing repositories that currently participate in the NIH Generalist Repository Ecosystem Initiative (GREI) or are included in the Trans-NIH BioMedical Informatics Coordinating Committee (BMIC) Data Sharing Repositories are candidates for inclusion in the Dataset Catalog. NIH or an Institute, Center, or Office (ICO) may fund the repository in whole or in part, or the repository may house data related to the ICO's research focus. The Dataset Catalog does not store any datasets; it is a collection of bibliographic metadata that describes datasets available from host dataset repositories. The Dataset Catalog provides links to datasets at the host repository, where the datasets themselves can be directly accessed.

The NLM Dataset Catalog provides a standardized description of biomedical dataset information from relevant dataset repositories. Some datasets may be available from multiple repositories.

The NLM Dataset Catalog is not a dataset repository and does not hold the datasets that are available in host repositories. In keeping with the NLM Collection Development Guidelines, information about datasets that are removed from host repositories will continue to be discoverable in the Dataset Catalog.

Technical Criteria

Biomedical data repositories are reviewed by NLM for potential and continued inclusion in the Dataset Catalog utilizing the technical criteria specified below:

Are the data and datasets freely and easily accessed by satisfying at least one of the following?

Additional Technical Requirements:

Desirable Characteristics for All Data Repositories

Dataset Catalog Content

NLM does not review, evaluate, or judge the quality of individual datasets. The organization that manages the host repository is responsible for maintaining the currency of the scientific record within its repository. Questions regarding the datasets or the data they contain should be addressed to the host repository, consistent with its policies.

Contact

datasetcatalog@nlm.nih.gov