GDC

Information on Xena data from GDC release v41.0

This help page is for the Genomic Data Commons (GDC) data we host from GDC Data Release 41.0 - August 28, 2024. We display all GDC open access genomic data and its accompanying phenotype/clinical data. Explore the GDC data on Xena.

In addition to the data from the GDC, we added two new phenotype/clinical fields to all GDC cohorts: age_at_earliest_diagnosis.diagnoses.xena_derived and age_at_earliest_diagnosis_in_years.diagnoses.xena_derived. We did this because some GDC cohorts have cases with multiple diagnoses, each with its own age_at_diagnosis.diagnoses value. When a case had multiple ages, the Xena Visual Spreadsheet displayed the field as a category. To provide a field that can always be displayed as a continuous feature, we created age_at_earliest_diagnosis.diagnoses.xena_derived, which holds the smallest value when there are multiple entries. age_at_earliest_diagnosis_in_years.diagnoses.xena_derived was created the same way, with the number of days then divided by 365 to give the age in years.
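As a rough illustration of the derivation described above, the sketch below takes the smallest age in days across a case's diagnoses and converts it to years by dividing by 365. The function names are hypothetical, and the handling of missing values (skipping them) is an assumption; this is not the actual Xena pipeline code.

```python
from typing import Optional, Sequence

DAYS_PER_YEAR = 365  # the divisor stated on this page


def earliest_diagnosis_age_days(
    ages_in_days: Sequence[Optional[float]],
) -> Optional[float]:
    """Return the smallest age_at_diagnosis (in days) across all diagnoses.

    Missing values (None) are skipped; this is an assumption, not
    documented Xena behavior.
    """
    known = [age for age in ages_in_days if age is not None]
    return min(known) if known else None


def earliest_diagnosis_age_years(
    ages_in_days: Sequence[Optional[float]],
) -> Optional[float]:
    """Return the earliest age at diagnosis converted from days to years."""
    days = earliest_diagnosis_age_days(ages_in_days)
    return None if days is None else days / DAYS_PER_YEAR


# A made-up case with two diagnoses, at 21900 and 18250 days of age.
example_ages = [21900, 18250]
print(earliest_diagnosis_age_days(example_ages))
print(earliest_diagnosis_age_years(example_ages))
```

Because each case reduces to a single number, the resulting field can be shown as a continuous feature even when the underlying case has several diagnoses.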

For this release, we worked to exclude samples that have only phenotype/clinical data and no genomic data. This should make visualizing data in our Visual Spreadsheet easier.

You can still view data from the older GDC Data Release v18.0 (August 28, 2019). This data will be available for viewing until October 2025. After October 2025, the data from this release will only be available for download.

CPTAC-3

For the CPTAC-3 cohort, we noted that samples were occasionally pooled into the same aliquot before sequencing was performed. Xena's visualizations work at the sample level, so every sample in a pooled aliquot is shown with the same, duplicated data. One example is case C3N-03011, where samples C3N-03011-04, C3N-03011-02, and C3N-03011-01 were all pooled into the aliquot CPT0226250007 before sequencing was performed.
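If you download CPTAC-3 data and want to account for this duplication, one possible approach is to group samples by aliquot and flag aliquots that contain more than one sample. The sketch below assumes you already have a sample-to-aliquot mapping; the function name and the mapping's structure are hypothetical, and the sample C3N-00000-01 with aliquot CPT0000000000 is a made-up unpooled entry added for contrast.

```python
from collections import defaultdict
from typing import Dict, List


def find_pooled_aliquots(sample_to_aliquot: Dict[str, str]) -> Dict[str, List[str]]:
    """Return aliquots shared by more than one sample, with their sample IDs."""
    samples_by_aliquot: Dict[str, List[str]] = defaultdict(list)
    for sample_id, aliquot_id in sample_to_aliquot.items():
        samples_by_aliquot[aliquot_id].append(sample_id)
    # Keep only aliquots that pool two or more samples.
    return {
        aliquot_id: sorted(sample_ids)
        for aliquot_id, sample_ids in samples_by_aliquot.items()
        if len(sample_ids) > 1
    }


# The three pooled samples named on this page, plus one hypothetical
# unpooled sample.
mapping = {
    "C3N-03011-04": "CPT0226250007",
    "C3N-03011-02": "CPT0226250007",
    "C3N-03011-01": "CPT0226250007",
    "C3N-00000-01": "CPT0000000000",  # hypothetical, for illustration only
}
pooled = find_pooled_aliquots(mapping)
print(pooled)
```

Samples listed under the same aliquot carry identical data, so keeping one representative per aliquot avoids counting the same measurement several times in downstream analysis.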
