Choosing a study/cohort
Last updated
We recommend the TCGA Pan-Cancer (PANCAN) study for most analyses, unless you need one of the specific data types or analyses listed below.
Why do we recommend this study?
We recommend it because it contains the data from The Cancer Genome Atlas (TCGA) Research Network, which generated the most comprehensive cross-cancer analysis to date: the Pan-Cancer Atlas. Xena displays the curated genomics and clinical data generated by the Pan-Cancer Atlas consortium working groups.
Note that if you use the TCGA Pan-Cancer (PANCAN) study to examine a specific cancer type, you will need to filter down to just that cancer type.
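If you work with a downloaded phenotype table outside the browser, the same filtering idea can be sketched in a few lines of Python. This is only an illustration: the column names `sample` and `cancer_type`, the sample IDs, and the helper `samples_for_cancer_type` are all hypothetical and are not taken from an actual Xena file.

```python
import csv
import io

# Hypothetical phenotype table. The column names ("sample", "cancer_type")
# and the rows are made up for illustration, not copied from a Xena download.
PHENOTYPE_TSV = (
    "sample\tcancer_type\n"
    "SAMPLE-01\tBRCA\n"
    "SAMPLE-02\tLUAD\n"
    "SAMPLE-03\tBRCA\n"
    "SAMPLE-04\tGBM\n"
)


def samples_for_cancer_type(tsv_text, cancer_type,
                            type_column="cancer_type", id_column="sample"):
    """Return the sample IDs whose cancer type label equals `cancer_type`."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row[id_column] for row in reader if row[type_column] == cancer_type]


brca_samples = samples_for_cancer_type(PHENOTYPE_TSV, "BRCA")
print(brca_samples)
```

The resulting list of sample IDs can then be used to subset any other pan-cancer table that is keyed by the same identifiers.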
If you don't want to filter ...
Our second most recommended datasets are the cancer-specific GDC TCGA studies. These avoid the need to filter down to a single cancer type and contain harmonized data from the Genomic Data Commons.
More information comparing the data in the GDC to the legacy TCGA data can be found here:
The tables below assume that you are interested in TCGA data. These data types may also appear in other studies, but the studies listed here are the ones we recommend.
| Data type | Study | Dataset name | Menu |
| --- | --- | --- | --- |
| Transcript expression | TCGA Pan-Cancer (PANCAN) | TOIL Transcript expression | Advanced |
| lncRNA expression | TCGA Pan-Cancer (PANCAN) | TOIL Gene expression | Advanced |
| Exon expression | legacy TCGA datasets (per cancer type) | Exon expression | Advanced |
| miRNA expression | TCGA Pan-Cancer (PANCAN) | Batch Effects normalized miRNA data | Advanced |
| DNA methylation | Any | DNA methylation | Advanced |
| ATAC-seq | GDC Pan-Cancer (PANCAN) | ATAC-seq | Advanced |
| Varied Survival endpoints | TCGA Pan-Cancer (PANCAN) | NA (run KM plot) | -- |
| Analysis | Study |
| --- | --- |
| Compare Tumor vs Normal | TCGA, TARGET, GTEx |
| GRCh38 coordinates | Any GDC study |
| Cell Line | CCLE |
| Disease specific survival, disease free survival, progression free survival | TCGA Pan-Cancer (PANCAN) |
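The decision rule on this page (use PANCAN unless a listed exception applies) can be written down as a small lookup. The sketch below copies its entries from the recommendations above. The dictionary layout and the function name `recommended_study` are illustrative assumptions, not part of Xena.

```python
# Exceptions copied from the recommendations on this page. Anything not
# listed falls back to the default study. The names here are illustrative.
DEFAULT_STUDY = "TCGA Pan-Cancer (PANCAN)"

STUDY_BY_NEED = {
    "exon expression": "legacy TCGA datasets (per cancer type)",
    "dna methylation": "Any",
    "atac-seq": "GDC Pan-Cancer (PANCAN)",
    "compare tumor vs normal": "TCGA, TARGET, GTEx",
    "grch38 coordinates": "Any GDC study",
    "cell line": "CCLE",
}


def recommended_study(need):
    """Return the study recommended for a data type or analysis.

    Matching is case-insensitive; unlisted needs get the default study.
    """
    return STUDY_BY_NEED.get(need.strip().lower(), DEFAULT_STUDY)


print(recommended_study("ATAC-seq"))
print(recommended_study("Transcript expression"))
```

Keeping the default separate from the exceptions mirrors how the recommendation is stated: only the cases that deviate from PANCAN need to be listed.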