TCGA, TARGET, and GTEx RNA-seq data are uniformly re-aligned to the hg38 genome and re-processed using RSEM and Kallisto with GENCODE v23 annotations to generate expression estimates for ~60,000 genes and ~200,000 transcripts, including many lncRNAs. Xena hosts and displays the gene and transcript expression results of this analysis.
The goal of the International Cancer Genome Consortium (ICGC) is to obtain a comprehensive description of genomic, transcriptomic and epigenomic changes in 50 different tumor types and/or subtypes that are of clinical and societal importance across the globe. It includes TCGA data (U.S.A.) plus data contributed by groups from other countries in the consortium. The resource has publicly accessible non-coding somatic mutation data from non-TCGA samples.
The Pan-Cancer Analysis of Whole Genomes (PCAWG) study is an international collaboration to identify common patterns of mutation in more than 2,600 cancer whole genomes from the International Cancer Genome Consortium. Building upon previous work which examined cancer coding regions, this project explored the nature and consequences of somatic and germline variations in both coding and non-coding regions, with specific emphasis on cis-regulatory sites, non-coding RNAs, and large-scale structural alterations.
The NCI's Genomic Data Commons (GDC) provides the cancer research community with a unified data repository that enables data sharing across cancer genomic studies in support of precision medicine.
The GDC supports several cancer genome programs at the NCI Center for Cancer Genomics (CCG), including The Cancer Genome Atlas (TCGA), Therapeutically Applicable Research to Generate Effective Treatments (TARGET), and many more.
Xena displays gene expression data from the metastatic cancer study published in Robinson et al. (2017), "Integrative clinical genomics of metastatic cancer."
The Cancer Cell Line Encyclopedia (CCLE) provides detailed genetic and pharmacologic characterization of a large panel (~1,100) of human cancer cell lines.
We have a number of sources of pediatric data.
The goal of the Gabriella Miller Kids First Pediatric Research Program (Kids First) is to develop a large-scale data resource to help researchers uncover new insights into the biology of childhood cancer and structural birth defects, including the discovery of shared genetic pathways between these disorders. From 2015 to 2018, the program selected 26 patient cohorts for whole genome sequencing through a peer-review process.
TARGET data is intended exclusively for biomedical research using pediatric data (i.e., the research objectives cannot be accomplished using data from adults) that focuses on the development of more effective treatments, diagnostic tests, or prognostic markers for childhood cancers. Moreover, TARGET data can be used for research relevant to the biology, causes, treatment and late complications of treatment of pediatric cancers, but is not intended for the sole purpose of methods and/or tool development (please see the Using TARGET Data section of the OCG website). If you are interested in using TARGET data for publication or other research purposes, you must follow the TARGET Publication Guidelines.
The goal of the Treehouse Childhood Cancer Initiative (Treehouse) is to evaluate the utility of comparative gene expression analysis for difficult-to-treat pediatric cancer patients. Treehouse has now assembled a large collection of pediatric cancer RNA-Seq data, approaching 2,000 pediatric tumor samples, which, combined with adult data, forms a compendium of over 11,000 adult and pediatric tumor-derived gene expression profiles. Pediatric cancer expression data are from public repository samples and from clinical samples at partner institutions, including UC San Francisco, Stanford, Children’s Hospital of Orange County and British Columbia Cancer Agency. In line with UC Santa Cruz Genomics Institute’s commitment to sharing data and to furthering research everywhere, we have made this data available for all to download and use.
Don't see a study or dataset that you are interested in? Set up a hub for yourself or your group with the data you need.