Basic Tutorial: Section 2

Covers filtering, subgrouping, and making Kaplan Meier plots


This tutorial is intended for those who have never used Xena before but who have completed Section 1 of the Basic Tutorial. We will cover how to filter to just the samples you are interested in, how to create subgroups, and how to run a Kaplan Meier survival analysis.


This tutorial begins where the Basic Tutorial: Section 1 ends.

Estimated time needed

Part A: 15 min

Part B: 15 min

Part C: 5 min

Learning goals

Part A

  • Search for samples of interest

  • Filter to keep or remove those samples of interest

Part B

  • Make subgroups

  • Rename subgroups

Part C

  • Run a Kaplan Meier survival analysis

  • Use a custom time endpoint


In the Basic Tutorial: Section 1 we found that samples that have aberrations in EGFR (mutations or amplifications) have higher EGFR expression.

Now we are going to investigate whether those samples with aberrations have a worse survival prognosis.

To ensure your columns are sorted the same as those in this tutorial, please start at this link:

Part A

Our goal is to remove samples with no data (i.e. null) from the view. This will make the view look cleaner and remove irrelevant samples from our Kaplan Meier survival analysis.


  1. Type 'null' into the samples search bar. This will highlight samples that have 'null' values in any column on the screen. Null means that there is no data for that sample for that column.

  2. Click the filter menu and select 'Remove samples'.

  3. Delete the search term.

Video of steps
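
For readers who think in code, the filtering logic in the steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Xena is implemented: the sample IDs, column names, and values are made up, and None stands in for Xena's 'null'.

```python
# Hypothetical miniature of the view: one dict per sample, one key per column.
# None stands in for 'null' (no data for that sample in that column).
samples = [
    {"sample": "S1", "mutation": "missense", "copy_number": 0.9, "expression": 18.2},
    {"sample": "S2", "mutation": None, "copy_number": 0.1, "expression": 15.0},
    {"sample": "S3", "mutation": "", "copy_number": None, "expression": 16.4},
    {"sample": "S4", "mutation": "", "copy_number": 0.0, "expression": 14.7},
]


def has_null(row):
    """Return True if any column has no data for this sample."""
    return any(value is None for value in row.values())


# Equivalent of searching for 'null' and then choosing 'Remove samples'.
kept = [row for row in samples if not has_null(row)]
kept_ids = [row["sample"] for row in kept]
print(kept_ids)
```

Note that an empty mutation value (data exists, no mutation found) is kept, while a missing value is removed. Only the latter counts as 'null' in this sketch.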

Part B

Our goal is to create two subgroups, those samples with aberrations in EGFR and those samples without aberrations in EGFR. We will then name the subgroups.


  1. Type '(mis OR infra) OR C:>0.5' into the samples search bar. This will select samples that either have a missense mutation or an inframe deletion '(mis OR infra)', or where copy number variation (column C) is greater than 0.5. Note that I arbitrarily chose a cutoff of 0.5.

  2. Click the filter menu and select 'New column subgroup'. This will create a new column in which samples that met our search term are marked as 'true' (i.e. those that have an EGFR aberration) and samples that did not meet our search term are marked as 'false' (i.e. those that do not have an EGFR aberration).

  3. Click the column menu and choose 'Display'.

  4. Rename the display so that samples that are 'true' are instead labeled as 'EGFR Aberrations' and the samples that are 'false' are instead labeled as 'No EGFR Aberrations'. Click 'Done'.

  5. Delete the search term. This will remove the black tick marks for matching samples.

Video of steps 1-2

Video of steps 2-5
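
The search term and the resulting true/false column can also be read as a small boolean rule. The Python sketch below mirrors that rule under stated assumptions: the sample data and the mutation strings ('missense', 'inframe deletion') are hypothetical, and the substring matching is a simplification of how the search bar behaves.

```python
# Hypothetical samples after the null filter from Part A.
samples = [
    {"sample": "S1", "mutation": "missense", "copy_number": 0.1},
    {"sample": "S2", "mutation": "", "copy_number": 0.8},
    {"sample": "S3", "mutation": "inframe deletion", "copy_number": 0.0},
    {"sample": "S4", "mutation": "", "copy_number": 0.5},
]

CUTOFF = 0.5  # arbitrary copy number cutoff, as in the tutorial


def matches_search(row):
    """Mirror '(mis OR infra) OR C:>0.5' for one sample."""
    mutation = row["mutation"].lower()
    text_hit = "mis" in mutation or "infra" in mutation  # (mis OR infra)
    cnv_hit = row["copy_number"] > CUTOFF                # C:>0.5
    return text_hit or cnv_hit


# 'New column subgroup' yields true/false; 'Display' renames the two values.
labels = {True: "EGFR Aberrations", False: "No EGFR Aberrations"}
subgroup = {row["sample"]: labels[matches_search(row)] for row in samples}

for sample_id, label in subgroup.items():
    print(sample_id, label)
```

Sample S4 sits exactly at 0.5 and therefore lands in 'No EGFR Aberrations', because the rule is strictly greater than the cutoff.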

Part C

Now that we have our subgroups we will run a Kaplan Meier survival analysis. Note that TCGA survival data is in days, hence the x-axis will be in days.

More information


  1. Click the column menu at the top of column B.

  2. Choose 'Kaplan Meier Plot'.

  3. Click 'Custom survival time cutoff' at the bottom of the Kaplan Meier plot.

  4. Enter 3650, as 3650 days is approximately 10 years.

Video of steps
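
To make the plot less of a black box, here is a minimal Kaplan Meier (product-limit) estimator in plain Python. It is a sketch for intuition only, not Xena's code. The follow-up times and events are invented, and the way the cutoff is applied (follow-up beyond the cutoff is truncated and treated as censored) is an assumption about how a custom time endpoint typically works.

```python
def kaplan_meier(times, events, cutoff=None):
    """Return [(time, survival_probability)] at each observed event time.

    times:  follow-up in days for each sample
    events: 1 if the event (death) was observed, 0 if censored
    cutoff: optional custom endpoint in days; follow-up beyond it is
            truncated and treated as censored (assumption, see lead-in)
    """
    if cutoff is not None:
        events = [e if t <= cutoff else 0 for t, e in zip(times, events)]
        times = [min(t, cutoff) for t in times]

    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve


cutoff_days = 10 * 365  # 3650 days, roughly 10 years

# Hypothetical follow-up data for one subgroup.
times = [100, 400, 400, 1200, 4000, 5000]
events = [1, 1, 0, 1, 1, 0]

full_curve = kaplan_meier(times, events)
cut_curve = kaplan_meier(times, events, cutoff=cutoff_days)

for day, prob in cut_curve:
    print(day, round(prob, 3))
```

With the cutoff, the death at day 4000 no longer appears as a drop in the curve, so the truncated curve has one fewer step than the full one. In the tutorial you would compare two such curves, one per subgroup.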

Test your knowledge

Question 1

Starting at the end of Part A, filter down to only those samples that have a missense mutation.

Question 2

Starting at the end of Part A, create two subgroups: those with EGFR expression greater than 17 and those with EGFR expression less than 17.

Question 3

Starting at the end of Part A, run a Kaplan Meier analysis on the EGFR expression column.