
Research Data Management at UF: Metadata & Standards

This is a guide on resources available at the University of Florida and beyond on research data management. It includes information about tools for data management planning, data and file sharing, metadata and data standards, and data storage.

Standards

Use standardized taxonomies, controlled vocabularies, and ontologies, including domain, national, and international standards, when capturing, managing, and archiving data.
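As a rough illustration of what a controlled vocabulary buys you at data-capture time, the sketch below flags entries that fall outside an agreed list of terms. The field, the vocabulary, and the function names are all hypothetical and are not taken from any published standard; a real project would load its terms from the relevant domain standard.

```python
# Hypothetical controlled vocabulary for a "specimen_type" field.
# The terms are illustrative only, not drawn from a published standard.
SPECIMEN_TYPES = {"blood", "saliva", "tissue", "urine"}


def normalize(value: str) -> str:
    """Trim whitespace and lowercase so trivial variants still match."""
    return value.strip().lower()


def find_invalid_terms(values, vocabulary):
    """Return the entries, as originally typed, that are not in the vocabulary."""
    return [v for v in values if normalize(v) not in vocabulary]


# Values as they might be typed into a spreadsheet column.
recorded = ["Blood", "saliva ", "serum", "Tissue", "bld"]
invalid = find_invalid_terms(recorded, SPECIMEN_TYPES)
print(invalid)  # entries that need correcting before archiving
```

Catching free-text variants such as "bld" when the data is captured is usually cheaper than reconciling them when the data is shared or archived.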

 

FAIRsharing is a registry of metadata standards and their related databases and policies. Originally named BioSharing (with a focus on the life sciences), it now serves all disciplines.

 

Disciplinary Metadata: a list developed by the UK's Digital Curation Centre (DCC).

 

Familiar clinical ontologies: ICD-9, ICD-10 and SNOMED-CT.

NIH Common Data Elements (CDE)

The NIH's Common Data Element (CDE) Resource Portal provides access to NIH-supported CDE initiatives and tools. 

Common Data Elements (CDEs) are data elements common to multiple data sets across different studies. Use of CDEs can help improve data quality and the opportunity for comparison and combination of data from multiple studies.
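The following minimal sketch shows why shared element names make data from separate studies easier to combine. Both studies, their local column names, and the common element names (participant_id, age_years, systolic_bp_mmhg) are invented for illustration; actual CDE definitions should be taken from the NIH CDE resources.

```python
# Two hypothetical studies that recorded the same concepts under local names.
study_a = [{"pid": "A1", "yrs": 54, "sbp_mmhg": 128}]
study_b = [{"subject": "B7", "age_at_visit": 61, "systolic": 135}]

# Hypothetical mappings from each study's local names to shared element names.
MAP_A = {"pid": "participant_id", "yrs": "age_years", "sbp_mmhg": "systolic_bp_mmhg"}
MAP_B = {"subject": "participant_id", "age_at_visit": "age_years", "systolic": "systolic_bp_mmhg"}


def to_common(rows, mapping, study_label):
    """Rename mapped fields to the shared names and tag each row with its study."""
    converted = []
    for row in rows:
        record = {mapping[key]: value for key, value in row.items() if key in mapping}
        record["study"] = study_label
        converted.append(record)
    return converted


# Once both studies use the same element names, they can be pooled directly.
combined = to_common(study_a, MAP_A, "A") + to_common(study_b, MAP_B, "B")
mean_age = sum(r["age_years"] for r in combined) / len(combined)
print(combined)
print(mean_age)
```

If both studies had collected their data using the shared element names from the start, the mapping step would not be needed at all.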

Clinical Data Interchange Standards Consortium (CDISC)

CDISC is a global nonprofit whose goal is to develop standards for clinical research data collection. Standards areas include protocol representation, clinical data acquisition, and pharmacogenomics/genetics testing and description. You must create a free account to access the standards.

Add value to your research data and collections

Librarians use discipline-specific metadata and standards to find relevant information on a daily basis. Consult your liaison librarian or subject specialist on how to increase the visibility of your research data, add value to it, and comply with discipline-specific standards.

What is metadata?

Information about data: the information required to understand data, context, quality, structure, and accessibility (Michener et al., 1997)

 

"Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource." (NISO, 2004)

 

Descriptive information that helps you and others discover and understand your data

 

“Data about data” that acts as a surrogate for your data when you or others are trying to:

  • Find the data later
  • Know what the data is later
  • Share the data later

Benefits of good metadata:

  • Re-use and data sharing are facilitated
  • Data is more easily discovered
  • May expand the scale of study
  • Can address unanticipated questions about the data
  • Diverse data may be integrated

FAIR Data

The FAIR Data Principles state that data should be:

  • Findable - with persistent identifiers and basic machine-actionable metadata
  • Accessible - to be read by machines and humans
  • Interoperable - through the use of shared vocabularies/ontologies, machine-accessible data and metadata
  • Re-Usable - with sufficient description to link with other data sources

Metadata across the disciplines

Basic information to keep:

Descriptive

  • What is it about?
  • Title, time, author, keywords
  • Relations to other data objects

Administrative

  • Ownership and use permissions

Provenance

  • Where does it come from?
  • History of changes to the data, versions

More specific information varies by discipline
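One way to keep this basic information with a dataset is a small machine-readable record stored alongside the data files. The sketch below groups fields under the three categories above and checks that none of the basics are missing. The field names, the example values, and the choice of JSON are assumptions made for illustration, not a prescribed schema; a discipline-specific standard should take precedence where one exists.

```python
import json

# Illustrative record; every value below is made up for the example.
record = {
    "descriptive": {
        "title": "Stream temperature readings, site 3",
        "date_collected": "2021-06-14",
        "authors": ["A. Researcher"],
        "keywords": ["stream temperature", "hydrology"],
        "related_objects": ["site3_calibration_log.csv"],
    },
    "administrative": {
        "owner": "Example Lab",
        "use_permissions": "CC BY 4.0",
    },
    "provenance": {
        "source": "Field logger download",
        "version": "2",
        "change_history": ["v1: raw download", "v2: removed duplicate rows"],
    },
}

# Hypothetical minimum set of fields to require in each category.
REQUIRED = {
    "descriptive": ["title", "date_collected", "authors", "keywords"],
    "administrative": ["owner", "use_permissions"],
    "provenance": ["source", "version"],
}


def missing_fields(rec, required=REQUIRED):
    """List 'category.field' names that are absent or empty in the record."""
    missing = []
    for category, fields in required.items():
        for field in fields:
            if not rec.get(category, {}).get(field):
                missing.append(f"{category}.{field}")
    return missing


print(missing_fields(record))  # an empty list means the basics are covered

# Serialize the record; the text could be saved next to the data file.
sidecar_text = json.dumps(record, indent=2)
print(sidecar_text)
```

Keeping the record in a plain, widely supported format such as JSON means it stays readable by both people and software, even if the tools used to create the data change.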

Annotation and metadata examples

Here are examples of metadata requirements for submitting array data to the NCBI's Gene Expression Omnibus repository.

The presentation below shows some common pitfalls in spreadsheet organization.

Hazards of Not Planning to Share Data

NYU Health Sciences Libraries animation showing the importance of proper data annotation or documentation

How NOT to Label Your Files

xkcd comic - Documents


This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.