Basic Tutorial: Section 2

Learn how to remove samples with no data, subgroup samples, and make Kaplan Meier plots

Last updated 7 months ago

Description

This tutorial is for those who are new to Xena and have completed Section 1 of the Basic Tutorial. We will cover how to filter to just the samples you are interested in, how to create subgroups, and how to run a Kaplan Meier survival analysis.

Prerequisites

This tutorial assumes completion of Basic Tutorial: Section 1 and begins where that section ends.

Estimated time needed

Part A: 7 min

Part B: 15 min

Part C: 5 min

Learning goals

Part A

  • Search for samples of interest

  • Remove samples with no data

Part B

  • Make subgroups

  • Rename subgroups

Part C

  • Run a Kaplan Meier survival analysis

  • Use a custom time endpoint

Tutorial

In Basic Tutorial: Section 1 we found that samples from patients who have aberrations in EGFR have relatively higher expression. These aberrations could be mutations or copy number amplifications.

Now we are going to look at whether those patients with aberrations in their samples also have a worse survival prognosis.

Part A

Our goal is to remove patient samples with no data (i.e. null) from the view. This will make the view look cleaner and remove irrelevant samples from our Kaplan Meier survival analysis.

Steps

  1. Type 'null' into the samples search bar. This will highlight samples that have 'null' values in any column on the screen. Null means that there is no data for that sample for that column.

  2. Click the filter menu and select 'Remove samples'.

  3. Delete the search term.

Video of steps

More information

Shortcut for Part A

Instead of typing 'null' and removing those samples from the view, you can also use the 'Remove samples with nulls' shortcut in the filter menu.
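
For readers who think in code, the null filter in Part A can be sketched as below. This is only an illustration of the logic, not how Xena implements it: the sample names, field names, and values are all hypothetical, and `None` stands in for a 'null' cell.

```python
# Hypothetical sample table: every name and value here is made up.
# None stands in for a 'null' cell (no data for that sample in that column).
samples = {
    "sample-1": {"mutation": "missense", "copy_number": 0.8, "expression": 5.1},
    "sample-2": {"mutation": None, "copy_number": 0.1, "expression": 3.9},
    "sample-3": {"mutation": "no mutation", "copy_number": None, "expression": 4.4},
    "sample-4": {"mutation": "inframe deletion", "copy_number": 0.0, "expression": 4.0},
}


def has_null(row):
    """Return True if any column has no data for this sample."""
    return any(value is None for value in row.values())


# Step 1: searching for 'null' highlights every sample with a null in any column.
highlighted = [name for name, row in samples.items() if has_null(row)]

# Step 2: 'Remove samples' keeps only the samples that were not highlighted.
kept = {name: row for name, row in samples.items() if not has_null(row)}

print("highlighted:", highlighted)
print("kept:", sorted(kept))
```

Note that a sample is dropped if it is null in any one column, even when its other columns have data.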

Part B

Our goal is to create two subgroups: patient samples with aberrations in EGFR and patient samples without aberrations in EGFR. We will then name the subgroups.

Steps

  1. Type '(mis OR inframe) OR B:>0.5' into the samples search bar. This will select samples that either have a missense mutation or an inframe deletion '(mis OR inframe)', or where copy number variation (column B) is greater than 0.5. Note that the cutoff of 0.5 was chosen arbitrarily.

You must have the copy number variation column as column B for the search term '(mis OR inframe) OR B:>0.5' to work. The 'B' in 'B:>0.5' is instructing Xena to search in column B for values that are greater than 0.5.

  1. Click the filter menu and select 'New subgroup column'. This will create a new column that has samples that met our search term marked as 'true' (i.e. those that have an EGFR aberration) and those that did not meet our search term as 'false' (i.e. those that do not have an EGFR aberration).

  2. Click the column menu for the column we just created (column B) and choose 'Display'.

  3. Rename the display so that samples that are 'true' are instead labeled as 'EGFR Aberrations' and the samples that are 'false' are instead labeled as 'No EGFR Aberrations'. Click 'Done'.

  4. Delete the search term. This will remove the black tick marks for matching samples.
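
The logic of the search term and the renamed subgroup column can be sketched in code as follows. This is a rough illustration under stated assumptions, not Xena's search implementation: the sample data is invented, the field names are hypothetical, and the text terms are assumed to behave like substring matches (so 'mis' matches 'missense').

```python
# Hypothetical rows after null samples were removed; all values are invented.
samples = {
    "sample-1": {"mutation": "missense", "copy_number": 0.1},
    "sample-2": {"mutation": "no mutation", "copy_number": 2.3},
    "sample-3": {"mutation": "no mutation", "copy_number": 0.2},
    "sample-4": {"mutation": "inframe deletion", "copy_number": 0.9},
}

CUTOFF = 0.5  # the tutorial's arbitrarily chosen copy number cutoff


def matches(row):
    """Mirror '(mis OR inframe) OR B:>0.5' as a plain boolean expression.

    Assumption: text terms are treated as substring matches.
    """
    text_hit = "mis" in row["mutation"] or "inframe" in row["mutation"]
    number_hit = row["copy_number"] > CUTOFF
    return text_hit or number_hit


# 'New subgroup column' marks each sample true/false; 'Display' renames the labels.
labels = {True: "EGFR Aberrations", False: "No EGFR Aberrations"}
subgroup = {name: labels[matches(row)] for name, row in samples.items()}

for name, label in subgroup.items():
    print(name, "->", label)
```

A sample only needs to satisfy one side of the outer OR, so a sample with no mutation but a copy number above the cutoff still lands in 'EGFR Aberrations'.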

Video of steps 1

Video of steps 2-4

More information

Part C

Now that we have our subgroups we will run a Kaplan Meier survival analysis. Note that TCGA survival data is in days, hence the x-axis will be in days.

Once the plot is made, we can see that there is no difference in survival between patients with EGFR aberrations and those without.

Steps

  1. Click the column menu at the top of column B.

  2. Choose 'Kaplan Meier Plot'.

  3. Click 'Custom survival time cutoff' at the bottom of the Kaplan Meier plot.

  4. Enter 3650, as 3650 days is approximately 10 years.
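
To make the effect of a custom time cutoff concrete, here is a minimal sketch of a Kaplan Meier estimate in plain Python. It assumes that a cutoff censors all follow-up beyond it (a patient with an event after day 3650 is treated as event-free at day 3650). Xena's own implementation may differ, and the survival records below are invented.

```python
CUTOFF_DAYS = 10 * 365  # 3650 days, roughly 10 years (leap days ignored)


def apply_cutoff(records, cutoff):
    """Censor follow-up at `cutoff` days (assumed behavior, see lead-in)."""
    censored = []
    for days, event in records:
        if days > cutoff:
            censored.append((cutoff, False))  # event-free as of the cutoff
        else:
            censored.append((days, event))
    return censored


def kaplan_meier(records):
    """Return [(time, survival_probability)] at each time with at least one event."""
    at_risk = len(records)
    survival = 1.0
    curve = []
    for t in sorted({days for days, _ in records}):
        deaths = sum(1 for d, e in records if d == t and e)
        leaving = sum(1 for d, _ in records if d == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored samples both leave the risk set
    return curve


# Hypothetical (days, event_observed) records for one subgroup.
records = [(200, True), (900, True), (1500, False), (4000, True), (5000, False)]

full_curve = kaplan_meier(records)
cut_curve = kaplan_meier(apply_cutoff(records, CUTOFF_DAYS))

print("event times without cutoff:", [t for t, _ in full_curve])
print("event times with cutoff:", [t for t, _ in cut_curve])
```

With the cutoff applied, the event at day 4000 no longer produces a step in the curve, because that sample is counted as event-free at day 3650.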

Video of steps

More information

Test your knowledge

Starting at the end of Part A, filter down to only those patient samples that have a missense mutation.

Search term: "missense"

Starting at the end of Part A, create two subgroups: patient samples with EGFR expression greater than 4 and those with EGFR expression less than 4.

Search term: "C:>4"

Starting at the end of Part A, run a Kaplan Meier analysis on the EGFR expression column.

To ensure your columns are sorted the same as those in this tutorial, please start at this link.