
Genomics: gentle introduction to genomic data analysis with Galaxy

introduction to this guide

Galaxy is a free, web-based, point-and-click platform for genomic data analysis. It is a good starting point for beginners who want to become familiar with the steps of genomic data analysis. Once proficient, you can move on to more advanced and flexible methods, such as writing your own analysis pipelines in Unix.

This set of videos gives a very brief overview of the steps in DNA sequencing data analysis, including:

  • Uploading FASTQ genomic data files to Galaxy
  • An introduction to the content of FASTQ files
  • Mapping FASTQ genomic data to a reference genome
  • Discovering genetic variants in genomic data

You can get access to Galaxy in two ways:

  • Register for a free account with the public Galaxy server at usegalaxy.org. The public account provides users with 250 GB of storage space.
  • Use the Galaxy instance provided by your institution's high-performance computing group. University of Florida (UF) Research Computing hosts a Galaxy instance that is available to UF researchers who have purchased computing resources from Research Computing. Storage space and computing power depend on the amount of resources purchased.
  • Note that you can also run Galaxy on commercial cloud services such as Amazon Web Services. These platforms are pay-as-you-go (i.e., you pay by the minutes used).

 

introduction to galaxy

logging onto galaxy from UF Research Computing

layout of galaxy

uploading your data to galaxy workspace

content of a fastq genomic data file
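
As a rough illustration of what a FASTQ file contains, the sketch below reads records made of four lines each: a header starting with "@", the read sequence, a separator starting with "+", and a quality string with one character per base. It assumes the common four-line layout (no wrapped sequences) and Phred+33 quality encoding, which may not match every file. The names parse_fastq and FastqRecord are hypothetical and are not part of Galaxy.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One read from a FASTQ file (hypothetical helper type)."""
    read_id: str
    sequence: str
    qualities: List[int]


def parse_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-read FASTQ stream.

    Assumes Phred+33 encoding: quality score = ASCII code - 33.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # skip blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")

        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near: {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in: {header!r}")

        scores = [ord(char) - 33 for char in quality]
        yield FastqRecord(header[1:], sequence, scores)


# A made-up single-read example, not real sequencing data.
example = "@read1\nGATTACA\n+\nIIIIII#\n"
for record in parse_fastq(io.StringIO(example)):
    print(record.read_id, record.sequence, record.qualities)
```

Under these assumptions, a higher score means a more confident base call: "I" decodes to 40 and "#" decodes to 2.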

mapping reads to reference genome with bowtie2

variant discovery with freebayes

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.