
Overview of features

More details about all the features we have on Xena

Visual Spreadsheet

This dynamic, powerful, and flexible view is our default view into the data.

The Visual Spreadsheet allows you to add an arbitrary number of columns of any data type (mutation, copy number, expression, protein, phenotype, methylation, etc.) on any number of patients' samples into a spreadsheet-like view. We line up all columns so that each row is the same sample, allowing you to easily see trends in the data. Data is always sorted left to right and sub-sorted on columns thereafter.

Get started by going to the Xena Browser and following the wizard to enter your data of interest.

Making a Visual Spreadsheet

The wizard on the screen will guide you to choose a study to view and TWO columns of data to view on those samples. Note that if you do not choose at least two columns, the wizard will not exit and let you interact with the data.

Selecting a cohort

You can select a cohort either by choosing 'Help me select a cohort' and searching our cohorts for your cancer type, etc., or by choosing 'I know the study I want to use' and searching for the partial or full name of the cohort you are interested in.

Adding a Gene or Position

Enter a HUGO gene name or a dataset-specific probe name (e.g. a CpG island). You can enter one gene or multiple genes. Separate multiple genes with a space, comma, tab, or new line.

To display a genomic region, enter the genomic region, choose your dataset and click 'done'. We recognize chromosomes (e.g. chr1), arms of chromosomes (e.g. chr19q), and chromosome coordinates (e.g. chr1:100-4,000).
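The accepted input formats can be illustrated with a small parser. This is only a sketch of the formats described above, not the Xena Browser's own code; the function names and the regular expression are hypothetical.

```python
import re

# Hypothetical pattern for the three region forms described above:
# chr1, chr19q, and chr1:100-4,000 (commas in coordinates are allowed).
REGION = re.compile(
    r"^(chr(?:\d+|[XYM]))([pq])?(?::([\d,]+)-([\d,]+))?$", re.IGNORECASE
)


def split_gene_list(text):
    """Split gene input on spaces, commas, tabs, or new lines."""
    return [token for token in re.split(r"[\s,]+", text.strip()) if token]


def _to_int(text):
    """Convert '4,000' to 4000; pass None through."""
    return int(text.replace(",", "")) if text else None


def parse_region(text):
    """Return a dict describing the region, or None if it is not a region."""
    match = REGION.match(text.strip())
    if not match:
        return None
    chrom, arm, start, end = match.groups()
    return {"chrom": chrom, "arm": arm, "start": _to_int(start), "end": _to_int(end)}


genes = split_gene_list("TP53, ERBB2\tESR1\nGRB7")
region = parse_region("chr1:100-4,000")
```

Input that does not look like a region, such as a gene name, makes parse_region return None, so a caller could fall back to treating it as a gene list.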

Selecting a Dataset

After entering a gene or probe name, you will need to select one or more datasets.

Basic Datasets

We have pre-selected default datasets for most cohorts. These datasets are selected because they are the most used datasets. Typically there is a default mutation, copy number, and expression dataset.

Advanced Datasets

Xena also has more datasets than those listed in the Basic Menu. Depending on the cohort, these can include DNA methylation, exon expression, thresholded CNV data and more. To access them, click on 'Show Advanced' below:

More information on basic datasets

We annotate datasets used in the basic Visual Spreadsheet wizard with a red asterisk in our datasets pages. For an example see:

Video of making a Visual Spreadsheet

After you made a Visual Spreadsheet

Overview

Patient samples are on the y-axis and your columns of data are on the x-axis. We line up all columns so that each row is the same sample, allowing you to easily see trends in the data. Data is always sorted left to right and sub-sorted on columns thereafter.

If you entered a single gene

If you entered a single gene, that gene will be listed at the top of the column. If there are multiple probes mapped to that gene in the dataset you selected they will be displayed as subcolumns ordered left to right in the direction of transcription.

If you selected a positional dataset, such as segmented copy number variation or mutation, the gene model will be displayed at the top of the column. The gene model is a composite of all transcripts of the gene. Boxes show different exons, with UTR regions being short and CDS regions being tall. We display 2 kb upstream to show the promoter region. Use the column menu to toggle the display of intronic regions.

If you entered multiple genes

If you entered multiple genes, each gene will be listed as a subcolumn for that dataset. If there are multiple probes mapped to that gene in the dataset (i.e. if you entered a single gene then you would see the probes as subcolumns), then the probes are averaged for a single value per gene.
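A minimal sketch of the per-gene averaging, assuming a simple arithmetic mean over the probes that have data for a sample. The function name and the probe-to-gene mapping are hypothetical, and the probe IDs are made up for illustration.

```python
from statistics import mean


def average_probes(probe_values, probe_to_gene):
    """Collapse one sample's probe values to a single value per gene.

    probe_values: {probe_id: value or None} for one sample.
    probe_to_gene: {probe_id: gene_name} (hypothetical mapping).
    """
    by_gene = {}
    for probe, value in probe_values.items():
        if value is None:
            continue  # assumption: probes without data are skipped
        by_gene.setdefault(probe_to_gene[probe], []).append(value)
    return {gene: mean(values) for gene, values in by_gene.items()}


per_gene = average_probes(
    {"probe_a": 1.0, "probe_b": 3.0, "probe_c": 0.5},
    {"probe_a": "TP53", "probe_b": "TP53", "probe_c": "ESR1"},
)
```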

Note that if you entered more than one gene and selected a mutation dataset, we will only show the first gene. If you wish to see multiple mutation columns, please enter each gene individually and click 'done'.

If you entered a chromosome or chromosome position

When displaying a chromosome range, genes will be shown at the top of the column, with dark blue genes being on the forward strand and red genes being on the reverse strand. Hovering over a gene will display the gene name in the tooltip. Note that introns are always shown in this mode.

Data values

Individual values vary by dataset. The legend at the bottom of the dataset will tell you the units for your particular dataset, including any normalization that was performed. If a sample does not have data for a column, it will show as gray and be labeled as 'null'.

If the entire column is gray this means we did not recognize the gene, probe, or position. If you believe this to be in error, please try an alternate name.

More information about a dataset can be found in the dataset details page. To get there, click on the column menu and choose 'About'.

Sample sorting

The Xena Browser uses the y-axis for samples and the x-axis/columns for genomic/phenotypic features. Data from a single sample is always on the same horizontal line across all columns, allowing you to see screen-wide trends. The Xena Browser orders samples by the first (leftmost) column, then by the second, and so on. If there are multiple genes, identifiers, or probes within a column, samples are ordered by the 1st sub-column, then the 2nd sub-column, and so on.

Numerical data are ordered in descending order (e.g. 3.5, 1.2, ...). Categorical data (e.g. stage, tumor type, etc.) are ordered by categories. CNV data is sorted by the average of the entire column. Positional mutation data is ordered by genomic coordinates (from 5'->3') and then by the predicted impact of the mutation. Both CNV and positional mutation data have the option to instead sort by the zoomed region. Click the column menu at the top of the column and choose 'Sort by zoom region avg'.

To reverse the ordering, click the column menu at the top of the column and choose 'Reverse sort'.
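The left-to-right ordering can be sketched as a multi-key sort. This covers numerical columns only and assumes that samples without data sort last, which the text above does not specify; the function name is hypothetical.

```python
def sort_samples(samples, columns):
    """Order samples by the first column, then the second, and so on.

    samples: list of sample IDs.
    columns: list of {sample: number or None} dicts, leftmost column first.
    """
    def key(sample):
        parts = []
        for column in columns:
            value = column.get(sample)
            # Assumption: missing values go last; numbers sort descending.
            parts.append((value is None, -value if value is not None else 0))
        return tuple(parts)

    return sorted(samples, key=key)


column_a = {"s1": 1.0, "s2": 3.5, "s3": 3.5, "s4": None}
column_b = {"s1": 0.0, "s2": 1.0, "s3": 2.0, "s4": 5.0}
ordered = sort_samples(["s1", "s2", "s3", "s4"], [column_a, column_b])
```

Samples s2 and s3 tie in the first column, so the second column decides their order. Moving a different column to the left corresponds to changing the order of the columns list.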

Move a column/change the sample sorting

As the sample sort order is controlled by the leftmost columns, it can be useful to explore the data by moving a different column to the left.

To move a column click on the column header and drag a column to the right or left.

Zooming

Click and drag anywhere in any column to zoom in either direction. Zoom out to all samples by clicking 'Clear Zoom' at the top. Zoom out to the whole column by clicking the red 'x' at the top of a column.

The Tooltip at the top of the Visual Spreadsheet shows more information about the data under the mouse. Links in the tooltip go to the UCSC Genome Browser, where you can learn more about that gene or genomic position. Alt-click to freeze and unfreeze the tooltip so that you can click on the links.

Resize a column

You can change the size of a column by clicking on the bottom right corner of a column and dragging to a new size.

Add another column

You can add another column of data by clicking on 'Click to add column' on the right edge of the Visual Spreadsheet, or by hovering between columns until 'Click to insert column' displays.

Coloring for Mutation Columns

More information about how we color mutation columns

Samples that have mutation data are white, with a dot or line marking where the mutation falls in relation to the gene model at the top of the column. Mutation data is colored by functional impact:

Red - Deleterious
Blue - Missense
Orange - Splice site mutation
Green - Silent
Gray - Unknown

Samples for which there is no mutation data are gray with no dot or line, and are marked as 'null'.

More details for 'Somatic mutation (SNP and INDEL)' datasets

Red --> Nonsense_Mutation, frameshift_variant, stop_gained, splice_acceptor_variant, splice_acceptor_variant&intron_variant, splice_donor_variant, splice_donor_variant&intron_variant, Splice_Site, Frame_Shift_Del, Frame_Shift_Ins

Blue --> splice_region_variant, splice_region_variant&intron_variant, missense, non_coding_exon_variant, missense_variant, Missense_Mutation, exon_variant, RNA, Indel, start_lost, start_gained, De_novo_Start_OutOfFrame, Translation_Start_Site, De_novo_Start_InFrame, stop_lost, Nonstop_Mutation, initiator_codon_variant, 5_prime_UTR_premature_start_codon_gain_variant, disruptive_inframe_deletion, inframe_deletion, inframe_insertion, In_Frame_Del, In_Frame_Ins

Green --> synonymous_variant, 5_prime_UTR_variant, 3_prime_UTR_variant, 5'Flank, 3'Flank, 3'UTR, 5'UTR, Silent, stop_retained_variant

Orange --> others, SV, upstream_gene_variant, downstream_gene_variant, intron_variant, intergenic_region

Note that we are case insensitive when we color for these terms.
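The term-to-color mapping above amounts to a case-insensitive lookup with orange as the fallback for other terms. The sketch below uses only a subset of the terms listed above; the dictionary and function names are hypothetical.

```python
# A subset of the terms listed above, grouped by display color.
EFFECTS_BY_COLOR = {
    "red": ["Nonsense_Mutation", "frameshift_variant", "stop_gained",
            "Splice_Site", "Frame_Shift_Del", "Frame_Shift_Ins"],
    "blue": ["missense_variant", "Missense_Mutation", "inframe_deletion",
             "In_Frame_Ins", "stop_lost"],
    "green": ["synonymous_variant", "Silent", "3'UTR", "5'UTR",
              "stop_retained_variant"],
}

# Lower-case every term once so that lookups ignore case.
COLOR_BY_EFFECT = {
    effect.lower(): color
    for color, effects in EFFECTS_BY_COLOR.items()
    for effect in effects
}


def mutation_color(effect):
    """Return the display color for an effect term; other terms are orange."""
    return COLOR_BY_EFFECT.get(effect.lower(), "orange")
```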

For the gene-level mutation datasets (Somatic gene-level non-silent mutation):

Red (=1) --> indicates that a non-silent somatic mutation (nonsense, missense, frame-shift indels, splice site mutations, stop codon readthroughs, change of start codon, inframe indels) was identified in the protein coding region of a gene, or any mutation identified in a non-coding gene

White (=0) --> indicates that none of the above mutation calls were made in this gene for the specific sample

Pink (=0.5) --> some samples have two aliquots. In the event that in one aliquot a mutation was called and in the other no mutation was called, we assign a value of 0.5.

Coloring for Segmented Copy Number Columns

When there are no overlapping segments, Xena displays the value and color of the copy number segment as indicated in the column legend at the bottom of the column.

When there are overlapping segments, Xena follows these steps:

1. Compute overlaps by slicing segments that overlap with other segments. For example, if there was one segment from chr1:10000-20000 and a second segment from chr1:10050-10100, then the resulting segments from this step would be chr1:10000-10050, chr1:10050-10100, and chr1:10100-20000.
2. For each segment defined in step 1, determine which segments in the original data overlap with this segment.
3. Divide data segments into those that are greater than copy number neutral (i.e. are amplifications) and those that are less than copy number neutral (i.e. are deletions). Average the segments for each of these two groups.
4. Find the colors corresponding to the two averages from step 3. Then pick a color that is in between those two colors on the color wheel. For example, if the amplifications are red and deletions are blue, the resulting color from a strong amplification and a strong deletion would be purple. Note that copy number neutral in this example would be white.
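The slicing and averaging steps can be sketched as follows. This is an illustration under stated assumptions rather than the Xena Browser's implementation: segments are treated as half-open intervals on a single chromosome, copy number neutral is taken to be 0, and the function name is hypothetical. The final color-blending step is left out.

```python
from statistics import mean


def slice_and_average(segments, neutral=0.0):
    """Slice overlapping segments and average gains and losses separately.

    segments: (start, end, value) tuples for one sample on one chromosome.
    Returns (start, end, amp_mean, del_mean) per non-overlapping piece;
    a mean is None when no segment of that kind covers the piece.
    """
    # Every segment boundary becomes a cut point.
    cuts = sorted({pos for start, end, _ in segments for pos in (start, end)})
    pieces = []
    for left, right in zip(cuts, cuts[1:]):
        # Original segments that cover this piece.
        covering = [value for start, end, value in segments
                    if start < right and end > left]
        if not covering:
            continue  # gap between segments, so no data here
        # Average amplifications and deletions separately.
        amps = [value for value in covering if value > neutral]
        dels = [value for value in covering if value < neutral]
        pieces.append((left, right,
                       mean(amps) if amps else None,
                       mean(dels) if dels else None))
    return pieces


pieces = slice_and_average([(10000, 20000, 0.5), (10050, 10100, -1.0)])
```

With the two example segments, the middle piece (10050 to 10100) carries both an amplification average and a deletion average, which is the case where a blended color would be needed.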

Kaplan Meier Plots

Kaplan Meier Survival Analyses are a way of comparing the survival of groups of patients. More information on what a Kaplan Meier analysis is can be found in this article

Generating a plot

To generate a KM plot, click on the column menu at the top of a column and choose 'Kaplan Meier Plot'.

Features

Sample groups

For numerical or continuous features, you will have the option of having 2 groups of samples, 3 groups of samples, or viewing the upper vs lower quartile. For 2 groups, we divide the samples on the median. For 3 groups, we divide samples into the upper third, middle third, and lower third.

When viewing the upper vs lower quartile, note that we only include samples that are greater than (not greater than or equal to) the upper quartile, and the same for the lower quartile.

Note that all samples are used to calculate the median and other dividing values, whether or not they have survival data. To see which samples have survival data, add the column 'OS' from the phenotype data.

If more than one sample has the same value, we put the samples in a group together, even if this means the groups end up being unequal in size.
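The upper vs lower quartile rule can be sketched as below. The strict comparisons follow the description above; how the quartile cut points themselves are computed is not stated, so the use of Python's statistics.quantiles with the "inclusive" method is an assumption, as is the function name.

```python
from statistics import quantiles


def quartile_groups(values):
    """Split samples into upper and lower quartile groups.

    values: {sample: number or None}. All samples with a value are used to
    find the cut points, whether or not they have survival data.
    """
    observed = [v for v in values.values() if v is not None]
    # Assumption: the quartile method is illustrative only.
    q1, _, q3 = quantiles(observed, n=4, method="inclusive")
    # Strictly greater than / less than the cut points, as described above.
    upper = [s for s, v in values.items() if v is not None and v > q3]
    lower = [s for s, v in values.items() if v is not None and v < q1]
    return upper, lower


expression = {name: float(i) for i, name in enumerate("abcdefgh", start=1)}
upper, lower = quartile_groups(expression)
```

Because tied samples always land on the same side of a cut point, groups can end up unequal in size or even empty when many samples share a value.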

For categorical features, we only show the first 10 categories.

For mutation features, we divide samples into those with any mutation and those without. To make different groups (e.g. samples with nonsense mutations vs those without), create a new subgroup column (see Filtering and subgrouping) and run a KM plot on the new column.

We remove samples with 'null' data for all plots.

Type of survival

We default to Overall Survival. Users can select different end points if they are available. An example of this is in the .

Survival time cutoff

We default to the last time any individual in the plot was known to be alive. You can change this to be 1-year or 5-year survival by changing the time cutoff at the bottom of the screen. The statistics will automatically recalculate. TCGA data uses days as their measurement of time.

PDF

You can generate a high quality PDF by clicking the PDF icon.

Download

You can download the data used to generate the KM plot using the download icon. It will download the , in addition to the sample ID, patient ID, groups, and underlying data.

Statistics used

When there are multiple curves or lines in a KM plot, Xena Browser compares the different Kaplan–Meier curves using the log-rank test. The Browser reports the test statistic (χ²) and p-value (χ² distribution). Data is retrieved in real-time from Xena Hub(s) to a user's web browser and the test is performed in the browser to maintain your data privacy.

The statistics the Xena Browser reports are equivalent to R's survival package, , with rho=0 (default in R).
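For readers who want to see what the log-rank test computes, here is a minimal sketch for the two-group case only (one degree of freedom). It is a textbook formulation written for illustration, not the Browser's code, and the function name is hypothetical. Times are in any consistent unit and an event flag of 1 means the event was observed, 0 means censored.

```python
import math


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Return (chi2, p_value) for the two-group log-rank test."""
    data = [(t, e, "a") for t, e in zip(times_a, events_a)]
    data += [(t, e, "b") for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [row for row in data if row[0] >= t]
        n = len(at_risk)                                  # at risk overall
        n_a = sum(1 for row in at_risk if row[2] == "a")  # at risk in group a
        d = sum(1 for tt, e, _ in at_risk if e and tt == t)
        d_a = sum(1 for tt, e, g in at_risk if e and tt == t and g == "a")
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group a.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("log-rank statistic is undefined for these data")
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = logrank_two_groups([1, 2], [1, 1], [3, 4], [1, 1])
```

Comparing more than two groups needs the multivariate form of the statistic, which this sketch does not cover.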

Exceptions

If all patients in a particular group (i.e. line) are censored before any event happens for the whole population (including all the groups), we exclude this group from the statistical analysis and perform the log-rank test on the remaining groups. We do this because we have no way to know the number of people at risk for this particular group at any of the event times, and therefore cannot compute any statistics for this group. R handles this exception in the same way. Although this group is removed from the statistical analysis, we still display the group in the KM plot.

Duplicate samples

Note that we do not automatically remove duplicate patients (for instance if there is a tumor and a normal sample from the same patient). You can determine if there are duplicate patients by looking for the "!" icon next to the p value.
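If you have downloaded the sample IDs, duplicate patients can also be checked offline. The sketch below assumes TCGA-style barcodes, where the first three dash-separated fields identify the patient; other cohorts may use different ID schemes, and the function name is hypothetical.

```python
from collections import Counter


def duplicate_patients(sample_ids):
    """Return patient IDs that appear in more than one sample."""
    # Assumption: TCGA-style barcodes such as TCGA-DB-A4XH-01.
    patients = Counter("-".join(sample.split("-")[:3]) for sample in sample_ids)
    return sorted(patient for patient, count in patients.items() if count > 1)


duplicates = duplicate_patients(
    ["TCGA-DB-A4XH-01", "TCGA-DB-A4XH-11", "TCGA-02-0001-01"]
)
```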

More information on how to load your own survival data into Xena

Chart & Statistics View

Chart View will generate bar plots, box plots, violin plots, scatter plots, and distribution graphs using any of the columns in a Visual Spreadsheet. Statistics, such as Welch's t-test, Pearson's and Spearman's rank correlation, and ANOVA will be calculated automatically.

Enter Chart View

To get to the chart view click on the icon indicated below by the red box or use the column menu and select 'Chart & Statistics'.

Build a chart

Once you enter Chart View, it will ask you a series of questions about what type of graph you are trying to make.

Compare subgroups will allow you to compare groups of patients' samples, either those that you have made or via a categorical feature, such as sample type. It will build the appropriate graph depending on whether you have selected a continuous numerical or categorical column. This option will let you make box plots, violin plots, bar charts, and dot plots.

See a distribution will let you see a histogram distribution of the data in a single column. You can view the mean, median, and various standard deviations on the distribution. The column can have sub-columns, either multiple probes or multiple genes, which will instead create a plot with multiple box plots.

Make a scatterplot will make a scatterplot from two continuous numerical columns. The second column can have multiple sub-columns, either multiple probes or multiple genes, which will create overlapping scatterplots

If an option is grayed out, this means that you do not have enough or the right type of data on the screen. Return to the Visual Spreadsheet and add more data.

After building a chart

If you are viewing a distribution of a continuous feature, such as gene expression for a single gene, you can add lines to the graph that indicate the mean/median or percentiles.

If you are viewing a scatterplot, you can color the points by a third column of data.

If you are viewing a dot plot, you can select if you would like to view the data as 'continuous value' where the size of the dot reflects the mean, same as the intensity of the color, or if you would like to view the data as 'single cell count data' where the size of the dot reflects the percent of cells/samples that have a non-zero value.

Advanced options available under the graph will allow you to change the scales of the axes.

We show statistics in the bottom right corner of the screen for most graphs. If we detect that the statistics will take some time to run, we may instead show a button with 'run stats', so that you can decide if you would like to run the statistical test.

Note that for violin plots, the width of each plot does not relate to the number of samples in the plot.

Return to the Visual Spreadsheet

To return to the Visual Spreadsheet, click either the icon in the upper left, or the 'x' close button.

Filtering and subgrouping

How to find samples that you want to remove or keep in the view. How to make subgroups.

Use the search box at the top of the screen to first pick/find your samples of interest. Then filter to keep or remove these samples, create a new subgroup column, or zoom.

The bar highlighted above allows you to search all data on the screen for your search term. Note that it will not search data that is not on the screen. Samples that match your criteria are marked with a black bar in the Visual Spreadsheet.

Searching for samples

You can search for samples by either typing in the search bar or by clicking on the dropper icon to enter the pick samples mode. The pick samples mode will allow you to click on a column to select samples. The search term for your picked samples will appear in the search bar. To exit the pick samples mode, click on the dropper icon again.

Note the pick samples mode tends to work best if the column you are selecting from is the first column.

More information on

Example of pick samples mode

Once you have your sample(s) of interest, click on the filter + subgroup menu and choose to:

Keep samples: Keep only the samples which match your criteria.

Remove samples: Remove the samples which match your criteria.

Clear sample filter: Remove ALL filters currently applied.

Remove Samples with nulls: Removes samples that have no data for one or more columns. Equivalent to typing 'null' in the search bar and choosing 'Remove samples'.

Zoom: Zoom to the samples that meet your criteria. Shift-click to zoom out.

New subgroup column: Create a new column where samples that meet your criteria are annotated as 'true' and samples that don't meet your criteria are annotated as 'false'. This new column can then be used for or in the .

To create more than 2 subgroups, please see our guide.

Search bar history

Once you have either filtered, created a subgroup column, or zoomed to samples, your search term will be added to the search history. Access the search history by clicking the downward facing arrow at the upper right of the search bar.

Note this search history will be preserved in bookmarks.

Changing subgroup labels

Once the subgroup column is created, users can change the labels from "true" or "false" to, for example, "wild type" or "EGFR mutant" by adjusting the column display settings. To access these, select the three dot menu at the top of the column and choose 'Display'.

Supported search terms for finding samples

Categorical features

Our search is a 'contains' search, meaning the term you enter can be at the beginning, end or in the middle of a matched term. Our search is case-independent. An example is

IIA

will match 'Stage IIIA' and 'Stage IIA'. To specify a specific string, use quotes

"Stage IIA"

Numerical and Continuous features

You can specify a certain column and mathematical expression such as

A:>2

which will find all values greater than 2 in the first column. We support the following operators

= (equal)
<= (less than or equal)
>= (greater than or equal)
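A column expression such as A:>2 can be read as "column letter, operator, number". The sketch below illustrates that reading for the operators shown in this section; it is not the Browser's parser, and the pattern and function names are hypothetical.

```python
import operator
import re

# Operators that appear in this section.
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<=": operator.le,
    "=": operator.eq,
}
EXPRESSION = re.compile(r"^([A-Z]):(>=|<=|>|=)(-?\d+(?:\.\d+)?)$")


def find_samples(expression, columns):
    """Return samples whose value in the named column satisfies the test.

    columns: list of {sample: number or None} dicts, leftmost column first.
    """
    match = EXPRESSION.match(expression)
    if not match:
        raise ValueError("unsupported expression: " + expression)
    letter, symbol, number = match.groups()
    column = columns[ord(letter) - ord("A")]  # 'A' is the first column
    compare = OPERATORS[symbol]
    return [sample for sample, value in column.items()
            if value is not None and compare(value, float(number))]


first_column = {"s1": 3.5, "s2": 1.2, "s3": None, "s4": 2.0}
greater_than_two = find_samples("A:>2", [first_column])
```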

Mutation data

You can search any annotation on a mutation, such as the functional impact, protein position, or gene name itself.

To find all samples with mutations with the protein change, enter:

V600E

To find all samples where the functional impact has the text 'frame' or 'nonsense' in it:

frame OR nonsense

To find all samples that have a mutation, search the gene annotation:

TP53

To find all samples that do not have a mutation, use the negation of the gene annotation:

!=TP53

No data or 'null'

To find all samples that do not have data in one or more columns, use:

null

and choose 'Remove samples'. To find all samples that do not have data for just one column, use:

B:null

Sample IDs

Enter a sample ID to find a sample of interest. An example:

TCGA-DB-A4XH

If you are searching for multiple sample IDs, you will need to separate each by an 'OR'. You can copy and paste a list of sample IDs into the search bar as long as they are separated by a space, tab, or return (new line).

TCGA-DB-A4XH OR TCGA-2F-A9KO-01 OR TCGA-02-0001

for copying a sample ID from the tooltip.

Search a specific column

To make it easy to search a specific column, we use shorthand to annotate the first column as 'A:', the second as 'B:', etc. An example is

A:YES

This will search ONLY the first column for the word 'YES'. Note that we will retain your original search if you move the columns around.
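The 'contains' matching and the column shorthand can be sketched together. This covers unquoted, single-term searches on categorical data only, and does not model quoted strings or Boolean operators; the function name is hypothetical.

```python
import re

COLUMN_PREFIX = re.compile(r"^([A-Z]):(.*)$")


def search_samples(term, columns):
    """Case-insensitive 'contains' search, optionally limited to one column.

    columns: list of {sample: text or None} dicts, leftmost column first.
    """
    match = COLUMN_PREFIX.match(term)
    if match:
        # 'A:' is the first column, 'B:' the second, and so on.
        searched = [columns[ord(match.group(1)) - ord("A")]]
        text = match.group(2).lower()
    else:
        searched = columns
        text = term.lower()
    return [
        sample for sample in columns[0]
        if any(text in (column.get(sample) or "").lower() for column in searched)
    ]


columns = [
    {"s1": "YES", "s2": "NO", "s3": "NO"},
    {"s1": "Stage IIIA", "s2": "Stage IIA", "s3": "Stage IB"},
]
first_column_yes = search_samples("A:YES", columns)
any_column_iia = search_samples("IIA", columns)
```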

Boolean operators: OR, AND, and !=

You can enter multiple search terms and we will match all of them with an implicit 'AND'. We also support 'OR'.

Use parentheses to group search terms. For example:

"Stage II" (B:Negative OR C:Negative)

will search for samples that match 'Stage II' in any column and are 'Negative' for either the second or third column.

You can also use '!=' to negate a term such as:

!=null

which will match all samples that have data across all columns.

Differential Gene Expression

Run a genome-wide differential gene expression analysis to compare groups of samples

To run a differential gene expression analysis, click on the 3 dot column menu at the top of a categorical column (not a numerical column) and choose 'Differential Expression'.

This will take you to a new page where you will define the sample subgroups you would like to compare (note that you can select multiple categories for a single subgroup).

After you have your subgroups, scroll to the bottom and click 'submit'.

Due to compute limitations you can only run a total of 2000 samples through the analysis pipeline.

This will start the analysis, which may take a while to run depending on the size of the dataset. As the results are completed, the web page will update. Scroll to see more results. Once the analysis is finished it will say 'Done' at the top of the page.

More details

The gene expression dataset chosen for a specific study/cohort is the same gene expression dataset as the one in the .

The Advanced Visualization parameters only apply to the PCA or t-SNE plot. They do not apply to any other analyses.

Running it on your own data

We disable running our differential gene expression analysis on your own data since we send the data in the analysis to various websites, which may not be secure. There are 3 options to run our analysis on your own data:

Upload your data to BioJupies to run a somewhat similar analysis. BioJupies, by the Ma'ayan lab, has a very user-friendly interface.
Upload your data to the to run a very similar analysis. This pipeline is what our analysis is based off of and will require a bit more familiarity with running differential gene expression analyses. Our modifications to this analysis are just to automatically pick the best normalization, etc options based on our public data. You will need to know which options are best given your own data.

GSEA

Run a genome-wide differential GSEA analysis to compare groups of samples

To run a GSEA analysis, click on the 3 dot column menu at the top of a categorical column (not a numerical column) and choose 'GSEA'.

This will take you to a new page where you will define the sample subgroups you would like to compare (note that you can select multiple categories for a single subgroup).

After you have your subgroups, choose a gene set library, scroll to the bottom and click 'submit'.

Due to compute limitations you can only run a total of 2000 samples through the analysis pipeline.

More details

The gene expression dataset chosen for a specific study/cohort is the same gene expression dataset as the one in the .

The Advanced Visualization parameters apply to the PCA or t-SNE plot, as well as the blitzGSEA analysis itself.

Note that the GSEA analysis runs blitzGSEA, a faster implementation of a traditional GSEA analysis.

Running it on your own data

We disable running our GSEA analysis on your own data since we send the data in the analysis to various websites, which may not be secure. Currently we only offer a as a method for running this pipeline on your own data. Please contact us if you need help setting this up.

Genomic Signatures

Enter a genomic signature over a set of genes for a particular dataset

Genomic signatures, sometimes expressed as a weighted sum of genes, are an algebra over genes, such as "ESR1 + 0.5*ERBB2 - GRB7". Once a signature is entered, the value of each gene in each sample is substituted and the algebraic expression is evaluated.

Entering a signature

Open the Add column menu
Enter '=' and then your signature into the gene entry box
Select 'gene expression' as the dataset
Click 'Done'

There must be a space on both sides of the "+" and "-".

Alternatively enter a list of genes and we will automatically add a '+' in between each gene when evaluating the signature

If we cannot find a gene that is part of the signature, the missing gene will be included as a zero in the expression calculation and the label will list the gene as missing.
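The evaluation of a signature for one sample can be sketched as a weighted sum. This illustrates the rules above (spaces around "+" and "-", a bare gene list treated as a sum, missing genes counted as zero) and is not the Browser's own evaluator; the function names are hypothetical.

```python
def parse_signature(text):
    """Turn '=ESR1 + 0.5*ERBB2 - GRB7' into [(weight, gene), ...]."""
    terms = []
    sign = 1.0
    # Operators must be surrounded by spaces, so whitespace splitting works.
    for token in text.lstrip("=").split():
        if token == "+":
            sign = 1.0
        elif token == "-":
            sign = -1.0
        else:
            weight, _, gene = token.rpartition("*")
            terms.append((sign * (float(weight) if weight else 1.0), gene))
            sign = 1.0  # a bare list of genes is summed
    return terms


def evaluate_signature(text, expression):
    """Return (score, missing_genes) for one sample.

    expression: {gene: value} for that sample; missing genes count as zero.
    """
    terms = parse_signature(text)
    missing = [gene for _, gene in terms if gene not in expression]
    score = sum(weight * expression.get(gene, 0.0) for weight, gene in terms)
    return score, missing


score, missing = evaluate_signature(
    "=ESR1 + 0.5*ERBB2 - GRB7", {"ESR1": 2.0, "ERBB2": 4.0}
)
```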

Example: TFAC30 Gene Signature

Hess et al. identified 30 genes whose gene expression profile is predictive of complete pathologic response to chemotherapy treatment in breast cancer.

Gene signature

=E2F3 + MELK + RRM2 + BTG3 - CTNND2 - GAMT - METRN - ERBB4 - ZNF552 - CA12 - KDM4B - NKAIN1 - SCUBE2 - KIAA1467 - MAPT - FLJ10916 - BECN1 - RAMP1 - GFRA1 - IGFBP4 - FGFR1OP - MDM2 - KIF3A - AMFR - MED13L - BBS4

Here we can see that the predicted chemo response signature is high in the basal subtype and low in luminal subtype. Additionally, the signature is high for ER negative samples and low for ER positive samples.

Bookmark:

Signatures datasets

We also have a number of signature datasets under the from the PanCan Atlas project:

To use these signatures, go to the dataset pages (links above) to see what the names of the specific signatures are (under Identifiers). Then in the visualization enter the name of the specific signature as a gene, click 'Advanced', choose the appropriate dataset, and click 'Done'

Bookmarks

Bookmarks are a great way to save a particular view in Xena, either for yourself or to share with others.

Creating a bookmark

To bookmark a view, click on 'Bookmark' in the top navigation bar. From here you can either click 'Bookmark' to create a bookmark URL or click 'Export' to export a file that can then be imported back to the browser.

When you click 'Bookmark' you will then need to click 'Copy Bookmark' to copy the bookmark URL to your copy buffer. Large views may take a second or two to generate a URL.

Note that your filter and subgroup history, as well as the last Chart View you created, if any, will be saved as part of the bookmark.

Bookmarks are only guaranteed for 3 months

More information: Bookmark vs. Export/Import

The 'Bookmark' option will store all the data in view on our servers and provide you a link. This is the easiest way to share a view. Note that if you have any private data in view, this option will be disabled to preserve your privacy. Please also note that if you lose the link there is no way to get it back.

If you choose Export, it will give you a file with everything Xena needs to recreate your view. You can then save this file and import it back into Xena. While this option can be a bit cumbersome, it will allow you to share private data. Note that these files are still only guaranteed for 3 months, though they may last for longer.

Recent Bookmarks

The 'Recent Bookmarks' option will temporarily show the 15 most recent bookmarks you have created. This can be useful if you're constructing many bookmarks. Note that this menu is frequently reset so do not use this as permanent storage for a bookmark.

FAQ

How do I make a bookmark with private data in view?

When you create a bookmark link, we save the data in view on our servers. To protect user data privacy, we have disabled this option when private data is in view. Please use the Export/Import option instead.

Download Data

There are 4 ways to download data

The four ways to download data

1. Download data in a single column of a Visual Spreadsheet: In a Visual Spreadsheet, click on the column Hamburger menu, then "Download" to download just the data from the column.

2. Download data in an entire Visual Spreadsheet: In a Visual Spreadsheet, click on the download icon in the upper right corner of the spreadsheet.

3. Bulk download a whole dataset file: Click "Data Sets" in the top banner and navigate to the dataset of interest, where a download URL is shown on the page. You can also reach the dataset page by clicking on the column Hamburger menu, then "About". Click on the download URL to download the entire dataset, or use "wget" or "curl" to download it from the command line.

4. Via our APIs:

How do I open the download files?

Our files are tab-delimited or '.tsv'. We recommend opening them on the command line if you are able.

If you are not able to use the command line, then we recommend using your favorite spreadsheet program, such as Microsoft Excel, which will automatically convert the tabs into new columns. Please note that if you have many thousands of samples, Microsoft Excel will likely have difficulty opening the file.

Please be careful when using Microsoft Excel to open files with gene names as Microsoft Excel will automatically convert some gene names into dates. For more information see:
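One way to avoid the gene-name problem is to read the file with a script, which keeps every field as plain text. The sketch below uses Python's standard csv module; the file layout shown (samples as rows, genes as columns) and the values are made up for illustration, and an in-memory string stands in for a downloaded file.

```python
import csv
import io


def read_tsv(handle):
    """Read a tab-delimited file into a list of rows (lists of strings)."""
    return list(csv.reader(handle, delimiter="\t"))


# Stand-in for a downloaded file; with a real download you would use
# open("your_file.tsv", newline="") instead (the filename is hypothetical).
example = io.StringIO("sample\tMARCH1\tSEPT2\nTCGA-02-0001\t1.5\t0.25\n")
rows = read_tsv(example)
header, first_sample = rows[0], rows[1]
```

Gene names such as MARCH1 and SEPT2 stay exactly as written, and numeric fields can be converted with float() when needed.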

Xena Single Cell

Overview of how to view single cell data

Entering Xena Single Cell

To enter Xena Single Cell click on 'SINGLECELL' in the top navigation bar. This will welcome you to where you can then click 'Enter'.

TumorMap

A tool developed by the Stuart Lab to view samples in a 2D layout

UCSC TumorMap is a separate project developed by the Stuart Lab at UCSC. We link to them to help users gain another perspective on the data they are seeing in Xena. From their Overview page:

TumorMap is a tool that enables grouping samples based on their omic signatures in a visually accessible way. Similar to dimensionality reduction methods, Tumor Map method takes a high-dimensional omics space and produces a two dimensional visualization. Unlike most dimensionality reduction methods, the TumorMap method is able to combine multiple types of omics data (e.g. mRNA expression and methylation data types in a single map). Furthermore, TumorMap is an interactive tool that allows navigating through a tumor landscape that represents a heterogeneous multi-dimensional and multi-platform omic space of oncogenic signatures.
In the TumorMap, each node is a sample and clusters of samples indicate groups with similar oncogenic signatures and genomic alteration events. The samples in a map may be colored by various molecular, clinical, diagnostic, prognostic, and phenotypic annotations (e.g. tumor type, molecular subtype, etc.) to visualize associations with the data type used in clustering.

MuPIT

A 3D protein viewer developed by Rachel Karchin's lab

We use the MuPIT 3D protein viewer from Rachel Karchin's lab at Johns Hopkins to provide this visualization to our users. From their Help Page:

MuPIT interactive is an online tool that allows you to map sequence variants from their genomic position onto protein structures. Viewing a variant on protein structure can be useful in interpreting its potential biological consequences. After mapping, the variants are displayed on an interactive 3d structure. The user may turn variants on and off, and display annotations on the protein structure.

Access this tool by going to our Visualization tab and following the wizard to select samples. Next, enter your gene of interest, click 'somatic mutation' and then click 'Done'. You may need to choose another variable such as 'gene expression'.

Once you have the mutation data you're interested in, click the menu at the top of the column and choose 'MuPIT View'. This will send your mutation data to MuPIT and open their viewer in a new tab.

MuPIT Help:

Example

On the left of the figure is the Xena mutation column view of ERBB2 somatic mutations from the TCGA breast cancer cohort. Users click on the MuPIT link from the caret menu at the top of the column, which sends all the mutations' genomic positions, as well as their recurrence p-values, to the MuPIT display. On the right side of the figure, MuPIT displays the mutations as bright green spheres of various sizes. The size of each sphere is determined by the mutation's recurrence p-value, with larger spheres for recurrent mutations. The MuPIT display shows that these ERBB2 somatic mutations cluster around the ERBB2 active site (ATP binding site in blue and proton acceptor site in teal).

Accessing data through python

You can use the python API, xenaPython, to programmatically access data in the public Xena Data Hubs.

Installation

Usage

Example
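A minimal sketch of a typical session follows. It assumes the package has been installed with `pip install xenaPython` and that your machine can reach the public hubs. The function names `dataset_samples` and `dataset_probe_values`, and the shape of what they return, are recalled from the xenaPython README rather than taken from this page, so treat them as assumptions and confirm them with `help(xenaPython)`. `values_by_sample` and `fetch_probe` are hypothetical helpers added only for illustration.

```python
def values_by_sample(samples, values):
    """Pair sample IDs with one probe's values (hypothetical helper)."""
    if len(samples) != len(values):
        raise ValueError("samples and values must have the same length")
    return dict(zip(samples, values))


def fetch_probe(hub, dataset, probe, limit=10):
    """Fetch one probe's values for the first `limit` samples of a dataset."""
    # Imported here so values_by_sample can be used without the package.
    import xenaPython as xena

    # Assumed call: list sample IDs in the dataset (limit=None for all).
    samples = xena.dataset_samples(hub, dataset, limit)
    # Assumed call and return shape: [positions, rows], one row per probe.
    _positions, rows = xena.dataset_probe_values(hub, dataset, samples, [probe])
    return values_by_sample(samples, rows[0])
```

For example, `fetch_probe("https://tcga.xenahubs.net", "TCGA.BRCA.sampleMap/HiSeqV2", "ERBB2")` would request ERBB2 values for ten samples. The hub URL and dataset ID here are also assumptions; substitute the ones shown on the dataset page you are interested in.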

Help

More Information

Transcript View

About

Xena's Transcript View shows transcript-specific expression or isoform percentage for 'tumor' TCGA data and 'normal' GTEx data. It allows you to compare the distribution of these values for two groups of patient samples.

This tool was created by Akhil Kamath as part of . Akhil was advised by and . Thank you Akhil for all your work!

Xena Gene Set Viewer

The Xena Gene Set Viewer https://xenagoweb.xenahubs.net/xena compares the gene expression, somatic mutation, and copy number variation profiles of cancer-related gene sets across cancer cohorts. It queries genomics data hosted on public Xena Hubs, in the same way as other tools in the Xena Visualization suite, and then generates gene set visualizations of those data.

Source code:

Xena Gene Set Viewer

Overview

The Gene Set Viewer allows comparison of individual gene sets or pathways and their genes across two cancer tumor sample cohorts as well as comparison within the same sub cohorts.

As an overview, Figure 1 shows two cohorts: the left (olive background, TCGA Ovarian Cancer) and the right (tan background, TCGA Prostate Cancer). Figure 1A shows the selections for the analysis: Gene Set, view limit, and filter (differential versus similar). Figure 1B shows the view comparing the Mean Gene Set Score in the center and individual samples on the right. Figure 1C shows the individual samples, with the hover result showing the sample and score in Figure 1E. Figure 1D provides a link directly into Xena for the given gene set. Figure 1F provides a sharable URL link. Figure 1G provides a login for use in uploading.

Figure 8 shows analysis of a GMT file using the BPA method [: thanks to Verena Friedl]. This is only available to logged-in users, who may only see their own analyses and are limited to 100 pathways. Any valid Google login can be used. Several public pathway sets are available, including those curated from the Gene Ontology Consortium (thanks to Laurent-Philippe Albou) as well as those from the Hallmark [cite] and Pancan [cite] analyses.
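If you are preparing a GMT file for upload, it can help to check it first. The sketch below parses a GMT file and checks it against the 100-pathway limit described above. It assumes the common GMT layout (one gene set per line, tab-separated: name, description, then gene symbols). The names `parse_gmt` and `MAX_PATHWAYS` are hypothetical and are not part of Xena.

```python
MAX_PATHWAYS = 100  # limit stated in the text above


def parse_gmt(lines):
    """Parse GMT lines into {gene_set_name: [gene, ...]}.

    Assumes each non-empty line is tab-separated: name, description,
    then one or more gene symbols.
    """
    gene_sets = {}
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(f"line {number}: expected name, description, genes")
        # fields[1] is the description, which is not needed here.
        gene_sets[fields[0]] = [gene for gene in fields[2:] if gene]
    if len(gene_sets) > MAX_PATHWAYS:
        raise ValueError(
            f"{len(gene_sets)} pathways exceeds the limit of {MAX_PATHWAYS}"
        )
    return gene_sets
```

An open file handle can be passed directly, for example `with open("my_sets.gmt") as handle: parse_gmt(handle)`, where the file name is a placeholder.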

Analysis

Gene Expression

BPA GENE EXPRESSION
PARADIGM IPL
REGULON ACTIVITY (only available for the LUAD Cohort)

Mutation / CNV

CNV ∩ MUTATION
COPY NUMBER
MUTATION

Sources for the somatic mutation and copy number variation data




