The Cancer Genome Atlas (TCGA) is a large-scale study that has cataloged genomic data for many different types of cancer, including mutations, copy number variation, mRNA and miRNA expression, and DNA methylation. Because the data are publicly available, TCGA has become a major resource for cancer researchers, both for target discovery and for interpreting the biology and assessing the clinical impact of genes of interest.
In this session you will learn:
- How to navigate the new TCGA data portal, with an introduction to its analysis pipelines
- How to filter and download high-level RNA-Seq data, including FPKM values and clinical annotations, for a specific cohort
- How to compile the downloaded files into a gene-by-sample FPKM matrix and convert the annotation file (JSON) into a simple sheet containing the variables you are interested in
- How to prepare a file ready for input to Qlucore or Partek, for which you can obtain a license
- How to get started analyzing and visualizing the data in RStudio
- How to get started analyzing the gene expression data with Firehose
Prerequisites:
- Little or no programming experience is required, but you should be eager to learn and not afraid to type in and run a few commands
- Access to a bash shell: Linux, OS X, Windows 10 with the Windows Subsystem for Linux, or Windows 7 with Babun
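
To give a feel for the compilation step listed among the learning goals (per-sample FPKM files into a gene-by-sample matrix, and the JSON annotation into a simple sheet), here is a minimal Python sketch. It is not the method used in the session. It assumes each downloaded expression file is a headerless, two-column, tab-separated file (gene ID, FPKM value) and that the annotation JSON is a flat list of records. The files exported by the portal may be compressed or nested differently, so the readers would need adapting. All function names, file names and field names are hypothetical.

```python
import csv
import json


def read_fpkm(path):
    """Read one tab-separated file with lines of the form: gene_id <tab> FPKM."""
    values = {}
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 2 or row[0].startswith("#"):
                continue  # skip blank, short, or comment lines
            values[row[0]] = float(row[1])
    return values


def build_fpkm_matrix(sample_files):
    """Combine {sample_name: path} into (genes, samples, rows).

    rows[i][j] is the FPKM of genes[i] in samples[j], or None when the
    gene is absent from that sample's file.
    """
    samples = sorted(sample_files)
    per_sample = {s: read_fpkm(sample_files[s]) for s in samples}
    genes = sorted(set().union(*(d.keys() for d in per_sample.values())))
    rows = [[per_sample[s].get(g) for s in samples] for g in genes]
    return genes, samples, rows


def write_matrix(path, genes, samples, rows):
    """Write the matrix as a tab-separated sheet, using NA for missing values."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["gene_id"] + samples)
        for gene, row in zip(genes, rows):
            writer.writerow([gene] + ["NA" if v is None else v for v in row])


def flatten_annotations(json_path, fields):
    """Keep only the chosen fields from a JSON list of flat records.

    Assumption: the file holds a list of objects with top-level keys.
    Nested exports would need an extra lookup step.
    """
    with open(json_path) as handle:
        records = json.load(handle)
    return [[record.get(f, "NA") for f in fields] for record in records]
```

With two hypothetical files, `build_fpkm_matrix({"S1": "s1.FPKM.txt", "S2": "s2.FPKM.txt"})` returns the sorted gene IDs, the sorted sample names and one row of values per gene, which `write_matrix` can then save as a tab-separated sheet. Only the standard library is used here, so nothing needs to be installed; larger cohorts are usually handled more comfortably with a data-frame library.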