# Choosing a study/cohort

## General recommendations

We recommend the TCGA Pan-Cancer (PANCAN) study for most analyses, unless you need a specific type of data or a type of analysis listed below.

{% hint style="info" %}
[TCGA Pan-Cancer (PANCAN) study](https://xenabrowser.net/?bookmark=282d192d37dff30390bfb9d78a668975)
{% endhint %}

**Why do we recommend this study?**

This study contains data from The Cancer Genome Atlas (TCGA) Research Network, which produced the most comprehensive cross-cancer analysis to date: the Pan-Cancer Atlas. Xena displays the curated genomics and clinical data generated by the Pan-Cancer Atlas consortium working groups.

Note that if you use the TCGA Pan-Cancer (PANCAN) study to examine a specific cancer type, you will need to filter the samples down to just that cancer type.
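If you work with downloaded data outside the browser, the same filtering idea can be sketched in a few lines of Python. This is only an illustration: the sample IDs, the `sample_cancer_type` mapping, the expression values, and the `filter_to_cancer_type` helper are all hypothetical and do not reflect Xena's actual file layout or column names.

```python
# Hypothetical annotations: sample ID -> cancer type abbreviation.
# Real column names and sample IDs in Xena downloads will differ.
sample_cancer_type = {
    "S1": "BRCA",
    "S2": "LUAD",
    "S3": "BRCA",
    "S4": "COAD",
}

# Hypothetical expression matrix: gene -> {sample ID -> value}.
expression = {
    "GENE_A": {"S1": 1.0, "S2": 3.0, "S3": 5.0, "S4": 7.0},
    "GENE_B": {"S1": 2.0, "S2": 4.0, "S3": 6.0, "S4": 8.0},
}


def filter_to_cancer_type(expression, sample_cancer_type, cancer_type):
    """Return a copy of the matrix restricted to samples of one cancer type."""
    keep = {s for s, t in sample_cancer_type.items() if t == cancer_type}
    return {
        gene: {s: v for s, v in values.items() if s in keep}
        for gene, values in expression.items()
    }


# Restrict the pan-cancer matrix to the (hypothetical) BRCA samples.
brca_expression = filter_to_cancer_type(expression, sample_cancer_type, "BRCA")
```

The original matrix is left untouched, so the same pan-cancer data can be filtered again for a different cancer type.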

**If you don't want to filter ...**

Our second recommendation is the cancer-specific GDC TCGA studies. Because each one is specific to a cancer type, you do not need to filter, and they contain harmonized data from the Genomic Data Commons (GDC).

{% hint style="info" %}
[GDC Data Hub](https://xenabrowser.net/datapages/?host=https%3A%2F%2Fgdc.xenahubs.net\&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443)
{% endhint %}

### Differences between the GDC and the legacy TCGA data

More information comparing the data in the GDC to the legacy TCGA data can be found here:

{% embed url="<https://gdc.cancer.gov/about-data/publications/HG38QC>" %}

## Choosing a study by type of data

The table below assumes that you are interested in TCGA data. These data types may also appear in other studies, but the studies listed here are the ones we recommend.

| Data type                 | Study                                  | Dataset name                        | Menu     |
| ------------------------- | -------------------------------------- | ----------------------------------- | -------- |
| Transcript expression     | TCGA Pan-Cancer (PANCAN)               | TOIL Transcript expression          | Advanced |
| lncRNA expression         | TCGA Pan-Cancer (PANCAN)               | TOIL Gene expression                | Advanced |
| Exon expression           | legacy TCGA datasets (per cancer type) | Exon expression                     | Advanced |
| miRNA expression          | TCGA Pan-Cancer (PANCAN)               | Batch Effects normalized miRNA data | Advanced |
| DNA methylation           | Any                                    | DNA methylation                     | Advanced |
| ATAC-seq                  | GDC Pan-Cancer (PANCAN)                | ATAC-seq                            | Advanced |
| Varied Survival endpoints | TCGA Pan-Cancer (PANCAN)               | NA (run KM plot)                    | --       |

## Choosing a study based on a specific analysis or sample type

| Analysis                                                                    | Study                    |
| --------------------------------------------------------------------------- | ------------------------ |
| Compare Tumor vs Normal                                                     | TCGA, TARGET, GTEx       |
| GRCh38 coordinates                                                          | Any GDC study            |
| Cell Line                                                                   | CCLE                     |
| Disease specific survival, disease free survival, progression free survival | TCGA Pan-Cancer (PANCAN) |

