
TCGA RNA-Seq Data Download and Analysis, All on Your Laptop

The Cancer Genome Atlas (TCGA) is a large-scale study that has cataloged genomic data for many different types of cancer, including mutations, copy number variation, mRNA and miRNA gene expression, and DNA methylation. Because the data are publicly distributed, TCGA has become a major resource for cancer researchers in target discovery and in interpreting the biology and assessing the clinical impact of genes of interest.

Read more about the TCGA project.

In this session you will learn:

  • Navigate the new TCGA data portal and get an introduction to its analysis pipelines
  • Filter and download the high-level RNA-Seq data, including FPKM values and clinical annotations, for a specific cohort
  • Compile the downloaded files into an FPKM matrix of genes by samples, and convert the annotation file (JSON) to a simple sheet with the variables you are interested in
  • Prepare a file ready for input to Qlucore or Partek, for which you can obtain a license
  • Get an introduction to analyzing and visualizing the data in RStudio
  • Get an introduction to analyzing the gene expression data with Firehose
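
As a rough illustration of the compile-and-convert step listed above, here is a minimal Python sketch. It is not the workflow taught in the session. It assumes each downloaded expression file is a two-column, tab-separated table (gene ID, FPKM value) with no header, optionally gzipped, and that the annotation JSON is a list of records with nested fields. The function names, the sample-name mapping, and the dotted field paths are all hypothetical; adjust them to match the files you actually download.

```python
import csv
import gzip
import json


def read_fpkm(path):
    """Read one two-column, tab-separated file (gene ID, FPKM) into a dict.

    Assumption: no header row; files ending in .gz are gzip-compressed.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    values = {}
    with opener(path, "rt", newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) >= 2:
                values[row[0]] = float(row[1])
    return values


def build_matrix(sample_files):
    """Combine {sample_name: path} into (genes, samples, rows).

    Rows are genes, columns are samples; a gene missing from a
    sample is stored as None.
    """
    samples = sorted(sample_files)
    per_sample = {name: read_fpkm(sample_files[name]) for name in samples}
    genes = sorted(set().union(*(d.keys() for d in per_sample.values())))
    rows = [[per_sample[s].get(g) for s in samples] for g in genes]
    return genes, samples, rows


def write_matrix(genes, samples, rows, out_path):
    """Write the matrix as a tab-separated sheet (None becomes an empty cell)."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["gene_id"] + samples)
        for gene, row in zip(genes, rows):
            writer.writerow([gene] + row)


def pick(record, dotted_key):
    """Follow a dotted path (hypothetical, e.g. 'demographic.gender').

    When a list is met along the path, its first element is used.
    """
    value = record
    for part in dotted_key.split("."):
        if isinstance(value, list):
            value = value[0] if value else None
        if not isinstance(value, dict):
            return None
        value = value.get(part)
    return value


def annotation_rows(json_path, fields):
    """Flatten a JSON list of records into one row per record."""
    with open(json_path) as handle:
        records = json.load(handle)
    return [[pick(rec, field) for field in fields] for rec in records]
```

A typical use would be to build the `{sample_name: path}` mapping from your download folder, call `build_matrix` and `write_matrix` to produce the gene-by-sample sheet, and call `annotation_rows` with the handful of annotation fields you care about. The resulting tab-separated file can be opened in a spreadsheet or read into RStudio.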


  • Little or no programming skill is required, but you should be eager to learn and not afraid to type in and run a few commands
  • Access to a bash shell: Linux, OS X, Windows 10 with the Windows Subsystem for Linux, or Windows 7 with Babun

GDC client and RStudio installed on Linux, OS X, or Windows


Related LibGuide: Basic Science Resources and Collections by Rolando Garcia-Milian

Monday, April 24, 2017
2:00pm - 4:00pm
Cushing/Whitney Medical Library, Simbonis Conference Room 101A, 333 Cedar St, New
Medical School
Registration has closed.

Event Organizer

Rolando Garcia-Milian