Responsible Conduct of STEM Research: Data management

Tips and sources to help you conduct sci-tech research in an ethical and responsible manner.

Better Data = Better Research

Why manage data?

  • to preserve the integrity of the research
  • to allow reuse by others
  • to reduce risk of data loss

Why make data discoverable?

  • to enable work to be reproduced
  • to establish credibility and maintain trust
  • to enable faster progress in research, within or across disciplines
  • to meet requirements of funders, journal publishers, etc.

Why reuse data?

  • to verify research claims
  • to permit new discoveries from existing data
  • to foster integration of data sets for new analysis
  • to reduce duplication of effort

Data Management Plan - outline

A successful Data Management Plan (DMP) answers these questions:

Creating your data

  • What types of data will be produced for your project?
  • What identifiers will you use for your data?
  • How will you document your data?
  • How much data will the project produce?
  • How often will the data change or be updated, and will versions need to be tracked?

Organizing your data

  • What file formats will be produced for your project and what kinds of data management risks do they present?
  • How will you organize your files into directories and what naming conventions will you apply to both?
  • Have you included project and data documentation?
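As one illustration of a naming convention, the sketch below builds file names from a project name, a short description, an ISO date, and a zero-padded version number. The pattern and the function name `build_filename` are assumptions made for this example, not a standard; any consistent scheme that your group documents will serve the same purpose.

```python
import re
from datetime import date


def build_filename(project: str, description: str, run_date: date,
                   version: int, ext: str) -> str:
    """Build a name of the form project_description_YYYY-MM-DD_vNN.ext."""

    def slug(text: str) -> str:
        # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    extension = ext.lstrip(".").lower()
    return (f"{slug(project)}_{slug(description)}_"
            f"{run_date.isoformat()}_v{version:02d}.{extension}")


if __name__ == "__main__":
    print(build_filename("Soil Study", "pH readings", date(2024, 3, 15), 2, ".CSV"))
```

ISO dates (year first) sort chronologically in a directory listing, and a zero-padded version keeps v02 ahead of v10.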

Managing your data

  • Who is responsible for managing and controlling the data?
  • For what or whom are the data intended?
  • How long must the data be retained?
  • How secure are the data? Do you have a procedure for backing up the data?
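One way to check that a backup procedure actually works is to compare checksums of the original files against the backup copies. The sketch below is a minimal example using only the Python standard library; the helper names (`sha256_of`, `build_manifest`, `find_mismatches`) and the directory layout are hypothetical, and it assumes the backup mirrors the relative paths of the original.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def find_mismatches(original: dict[str, str], backup: dict[str, str]) -> list[str]:
    """List files that are missing from the backup or whose checksum differs."""
    return sorted(name for name, digest in original.items()
                  if backup.get(name) != digest)
```

Calling `find_mismatches(build_manifest(Path("data")), build_manifest(Path("backup/data")))` returns an empty list when every original file has an identical copy in the backup; the two paths here are placeholders.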

Sharing your data

  • Does project funding require your data to be shared or publicly accessible?
  • When and where do you intend to publish or distribute your data?
  • How should your data be cited?
  • Are there issues with privacy or intellectual property?

Ethical issues in data management

Ownership of data

When should you NOT share data?

  • To protect ongoing research from premature disclosure
  • To protect priority in publication
  • To protect private information about research subjects, trade secrets, or classified government information

When may you exclude data?

  • Distinguish human or technical error from genuine outliers
  • Distinguish cleaning or editing from falsifying
  • Note any excluded data in your discussion
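The points above can be illustrated with a short sketch that flags suspect values instead of deleting them, then produces a note for the write-up. The rule used here (Tukey's fences at 1.5 times the interquartile range) is an assumption chosen for illustration; the guide does not prescribe any particular rule, and a flagged value still needs a documented reason before it is excluded. The function names are hypothetical.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Split values into (kept, flagged) using Tukey's IQR fences.

    Nothing is discarded: flagged values are returned so they can be
    reviewed and reported.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    kept = [v for v in values if low <= v <= high]
    flagged = [v for v in values if not (low <= v <= high)]
    return kept, flagged


def exclusion_note(total, flagged, reason):
    """Build a sentence recording what was excluded and why."""
    return f"Excluded {len(flagged)} of {total} readings ({reason}): {flagged}"


if __name__ == "__main__":
    readings = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 25.0]
    kept, flagged = flag_outliers(readings)
    print(exclusion_note(len(readings), flagged, "sensor fault logged in notebook"))
```

Keeping the raw readings untouched and recording the exclusion separately is what separates cleaning from falsifying: a reader can see exactly what was left out and judge the reason.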

Supervisor's responsibilities

  • Review data-keeping regularly
  • Watch for inconsistent notation, misinterpreted instructions, malfunctioning equipment

Managing your lab notebook

Why keep a lab notebook?

  • To have a complete record of the work you have done and to maintain your rights to your findings.  Keep all your procedures, data, and comments in one place.
  • To prove that you did the work, and on what date -- especially if your work is novel, article-worthy, or potentially patentable.
  • To leave a trail for someone who may be interested in completing your work.
  • To serve multiple audiences: yourself, your colleagues, your funding sources.

Best practices:

Storing your data

Hazards of Not Planning to Share Data

From NYU Health Sciences Libraries: "a mini series showing a data management horror story."

Guidelines, Training, and Tools for Data Management

Guidelines for Image Processing

Data Repositories

Looking for other researchers' data?  Try these data repositories and registries:

Data cherry picking

University of Florida Home Page

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.