It is often helpful to visualize point mutations in their spatial context along a gene, producing an image in which both the regions of the gene and its variants are indicated.
We can make these plots with the g3viz package, which, thanks to its dependencies, also allows us to retrieve data directly from cBioPortal.
The cBioPortal for Cancer Genomics is an open-access, open-source resource for interactive exploration of multidimensional cancer genomics data sets. The goal of cBioPortal is to significantly lower the barriers between complex genomic data and cancer researchers by providing rapid, intuitive, and high-quality access to molecular profiles and clinical attributes from large-scale cancer genomics projects, and therefore to empower researchers to translate these rich data sets into biologic insights and clinical applications.
Let’s explore it at cBioPortal for Cancer Genomics (https://www.cbioportal.org).
A complete user guide can also be found at https://docs.cbioportal.org
The g3viz package allows us to create lollipop diagrams annotated with the detailed translational effect of genetic mutations.
Using g3viz we can:
make interactive plots
label positional mutations
use different themes
save charts in PNG or high-quality SVG format
retrieve protein domain information and resolve gene isoforms
map variant classification to mutation class
You can find the tutorial at https://g3viz.github.io/g3viz/ and a more in-depth guide to themes at https://g3viz.github.io/g3viz/chart_themes.html
install.packages("g3viz", repos = "http://cran.us.r-project.org")
Once the installation has completed, you can load the package as usual:
library(g3viz)
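A guarded install avoids downloading the package again on every run. The sketch below is a common base R idiom, not something provided by g3viz; the helper name `ensure_package` and the CRAN mirror URL are assumptions chosen for illustration.

```r
# Hypothetical helper: install a package only if it is not already available.
# The mirror URL is an assumption; any CRAN mirror should work.
ensure_package <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # Report whether the package can be loaded after the (possible) install
  requireNamespace(pkg, quietly = TRUE)
}

# Demonstrated on "stats", which ships with R, so nothing is downloaded here
ensure_package("stats")
```

For this tutorial the call would be `ensure_package("g3viz")`, followed by `library(g3viz)`.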
In the next steps we will walk through the package examples, explaining each step.
Mutation Annotation Format (MAF) is a commonly used tab-delimited text file format for storing aggregated mutation information. It can be generated from a VCF file using tools like vcf2maf.
The translational effect of variant alleles in MAF files is usually found in the column named Variant_Classification or Mutation_Type (e.g., Frame_Shift_Del, Splice_Site). In this example, the somatic mutation data of the TCGA-BRCA study was originally downloaded from the GDC Data Portal.
maf.file <- system.file("extdata", "TCGA.BRCA.varscan.somatic.maf.gz", package = "g3viz")
mutation.dat <- readMAF(maf.file)
head(mutation.dat)
## Hugo_Symbol Chromosome Start_Position End_Position Strand
## 1 TP53 chr17 7676564 7676564 +
## 2 TP53 chr17 7676399 7676399 +
## 3 TP53 chr17 7676267 7676280 +
## 4 TP53 chr17 7676273 7676273 +
## 5 TP53 chr17 7676215 7676215 +
## 6 TP53 chr17 7676203 7676203 +
## Variant_Classification Variant_Type Reference_Allele Tumor_Seq_Allele1
## 1 Missense_Mutation SNP C C
## 2 Missense_Mutation SNP G G
## 3 Splice_Site DEL GGGGGACTGTAGAT GGGGGACTGTAGAT
## 4 Splice_Site SNP C G
## 5 Nonsense_Mutation SNP G G
## 6 Nonsense_Mutation SNP C C
## Tumor_Seq_Allele2 HGVSp HGVSp_Short
## 1 T p.Glu11Lys p.E11K
## 2 A p.Pro27Ser p.P27S
## 3 - p.X33_splice
## 4 G p.X33_splice
## 5 A p.Gln52Ter p.Q52*
## 6 A p.Glu56Ter p.E56*
## COSMIC
## 1 COSM3820734;COSM3820735;COSM3820736;COSM3820737;COSM3820738;COSM3820739
## 2 COSM1167900;COSM1167901;COSM1167902;COSM1167903;COSM3522716;COSM3522717
## 3
## 4 COSM2745171;COSM29761;COSM4272163;COSM437642;COSM437643
## 5 COSM1750375;COSM3932748;COSM44041;COSM99948;COSM99949
## 6 COSM12168;COSM126989;COSM126990;COSM2745104;COSM4272098
## Mutation_Class AA_Position
## 1 Missense 11
## 2 Missense 27
## 3 Truncating 33
## 4 Truncating 33
## 5 Truncating 52
## 6 Truncating 56
str(mutation.dat)
## 'data.frame': 828 obs. of 15 variables:
## $ Hugo_Symbol : chr "TP53" "TP53" "TP53" "TP53" ...
## $ Chromosome : chr "chr17" "chr17" "chr17" "chr17" ...
## $ Start_Position : int 7676564 7676399 7676267 7676273 7676215 7676203 98890391 179199035 15872946 7676140 ...
## $ End_Position : int 7676564 7676399 7676280 7676273 7676215 7676203 98890391 179199035 15872947 7676141 ...
## $ Strand : chr "+" "+" "+" "+" ...
## $ Variant_Classification: chr "Missense_Mutation" "Missense_Mutation" "Splice_Site" "Splice_Site" ...
## $ Variant_Type : chr "SNP" "SNP" "DEL" "SNP" ...
## $ Reference_Allele : chr "C" "G" "GGGGGACTGTAGAT" "C" ...
## $ Tumor_Seq_Allele1 : chr "C" "G" "GGGGGACTGTAGAT" "G" ...
## $ Tumor_Seq_Allele2 : chr "T" "A" "-" "G" ...
## $ HGVSp : chr "p.Glu11Lys" "p.Pro27Ser" "" "" ...
## $ HGVSp_Short : chr "p.E11K" "p.P27S" "p.X33_splice" "p.X33_splice" ...
## $ COSMIC : chr "COSM3820734;COSM3820735;COSM3820736;COSM3820737;COSM3820738;COSM3820739" "COSM1167900;COSM1167901;COSM1167902;COSM1167903;COSM3522716;COSM3522717" "" "COSM2745171;COSM29761;COSM4272163;COSM437642;COSM437643" ...
## $ Mutation_Class : chr "Missense" "Missense" "Truncating" "Truncating" ...
## $ AA_Position : num 11 27 33 33 52 56 69 70 73 77 ...
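The last two columns, Mutation_Class and AA_Position, appear to be derived from the variant classification and the protein change. The sketch below reproduces that idea in base R on the six rows printed above. The lookup table covers only the classes visible in this output, and the position parser simply takes the first run of digits; both are illustrative assumptions, not g3viz's actual implementation.

```r
# Toy copy of the rows printed by head(mutation.dat) above
toy <- data.frame(
  Variant_Classification = c("Missense_Mutation", "Missense_Mutation",
                             "Splice_Site", "Splice_Site",
                             "Nonsense_Mutation", "Nonsense_Mutation"),
  HGVSp_Short = c("p.E11K", "p.P27S", "p.X33_splice",
                  "p.X33_splice", "p.Q52*", "p.E56*"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: only the classes seen in the output above
class_map <- c(Missense_Mutation = "Missense",
               Splice_Site       = "Truncating",
               Nonsense_Mutation = "Truncating")
toy$Mutation_Class <- unname(class_map[toy$Variant_Classification])

# Simplified parser: the first run of digits in the protein change string
first_digits <- regmatches(toy$HGVSp_Short,
                           regexpr("[0-9]+", toy$HGVSp_Short))
toy$AA_Position <- as.numeric(first_digits)

toy
```

Real MAF files contain many more classification values and protein change notations than this toy table handles, which is why the package function is preferable in practice.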
chart.options <- g3Lollipop.theme(theme.name = "default",
title.text = "PIK3CA gene (default theme)")
g3Lollipop(mutation.dat,
gene.symbol = "PIK3CA",
plot.options = chart.options,
output.filename = "default_theme")
## Factor is set to Mutation_Class
## legend title is set to Mutation_Class
# Notice that this is an interactive plot. You have to save it directly from the "Viewer" panel.
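Each lollipop summarizes the mutations that fall on one amino-acid position. As a rough sketch of that aggregation (an assumption about what the plot summarizes, not g3viz's internal code), the counts per position and mutation class can be tabulated in base R on a toy data frame:

```r
# Toy data: positions and classes taken from the TP53 rows shown earlier
toy <- data.frame(
  AA_Position    = c(11, 27, 33, 33, 52, 56),
  Mutation_Class = c("Missense", "Missense", "Truncating",
                     "Truncating", "Truncating", "Truncating"),
  stringsAsFactors = FALSE
)

# Cross-tabulate position by class, then keep only combinations that occur
counts <- as.data.frame(table(AA_Position    = toy$AA_Position,
                              Mutation_Class = toy$Mutation_Class),
                        stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]

# Sort by position along the protein
counts <- counts[order(as.numeric(counts$AA_Position)), ]
counts
```

Position 33 carries two mutations here, so in a plot of this toy data it would be the tallest lollipop.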
In this example, we read genetic mutation data from a CSV or TSV file and visualize it using customized chart options. Note that the options used here are equivalent to the dark chart theme.
mutation.csv <- system.file("extdata", "ccle.csv", package = "g3viz")
head(mutation.csv)
## [1] "/Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library/g3viz/extdata/ccle.csv"
# "gene.symbol.col"    : column containing the gene symbol
# "variant.class.col"  : column containing the variant class
# "protein.change.col" : column containing the protein change
mutation.dat <- readMAF(mutation.csv,
                        gene.symbol.col = "Hugo_Symbol", # names of the columns in which each piece of information is stored
                        variant.class.col = "Variant_Classification",
                        protein.change.col = "amino_acid_change",
                        sep = ",") # column separator of the CSV file
head(mutation.dat)
## Hugo_Symbol Chromosome Start_Position End_Position Strand
## 1 APC 5 112090592 112090592 +
## 2 TP53 17 7579882 7579882 +
## 3 TP53 17 7579882 7579882 +
## 4 TP53 17 7579882 7579882 +
## 5 TP53 17 7579882 7579882 +
## 6 APC 5 112090633 112090633 +
## Variant_Classification Variant_Type Reference_Allele Tumor_Seq_Allele1
## 1 Missense_Mutation SNP C C
## 2 Missense_Mutation SNP C C
## 3 Missense_Mutation SNP C C
## 4 Missense_Mutation SNP C C
## 5 Missense_Mutation SNP C C
## 6 Silent SNP C C
## Tumor_Seq_Allele2 amino_acid_change Mutation_Class AA_Position
## 1 T p.A2V Missense 2
## 2 G p.E11Q Missense 11
## 3 G p.E11Q Missense 11
## 4 G p.E11Q Missense 11
## 5 G p.E11Q Missense 11
## 6 T p.L16L Inframe 16
str(mutation.dat)
## 'data.frame': 1535 obs. of 13 variables:
## $ Hugo_Symbol : chr "APC" "TP53" "TP53" "TP53" ...
## $ Chromosome : int 5 17 17 17 17 5 3 17 5 3 ...
## $ Start_Position : int 112090592 7579882 7579882 7579882 7579882 112090633 178916661 7579866 112090638 178916673 ...
## $ End_Position : int 112090592 7579882 7579882 7579882 7579882 112090633 178916661 7579866 112090638 178916673 ...
## $ Strand : chr "+" "+" "+" "+" ...
## $ Variant_Classification: chr "Missense_Mutation" "Missense_Mutation" "Missense_Mutation" "Missense_Mutation" ...
## $ Variant_Type : chr "SNP" "SNP" "SNP" "SNP" ...
## $ Reference_Allele : chr "C" "C" "C" "C" ...
## $ Tumor_Seq_Allele1 : chr "C" "C" "C" "C" ...
## $ Tumor_Seq_Allele2 : chr "T" "G" "G" "G" ...
## $ amino_acid_change : chr "p.A2V" "p.E11Q" "p.E11Q" "p.E11Q" ...
## $ Mutation_Class : chr "Missense" "Missense" "Missense" "Missense" ...
## $ AA_Position : num 2 11 11 11 11 16 16 16 17 20 ...
# set up chart options
plot.options <- g3Lollipop.options(
# Chart settings
chart.width = 600,
chart.type = "pie",
chart.margin = list(left = 30, right = 20, top = 20, bottom = 30),
chart.background = "#d3d3d3",
transition.time = 300,
# Lollipop track settings
lollipop.track.height = 200,
lollipop.track.background = "#d3d3d3",
lollipop.pop.min.size = 1,
lollipop.pop.max.size = 8,
lollipop.pop.info.limit = 5.5,
lollipop.pop.info.dy = "0.24em",
lollipop.pop.info.color = "white",
lollipop.line.color = "#a9A9A9",
lollipop.line.width = 3,
lollipop.circle.color = "#ffdead",
lollipop.circle.width = 0.4,
lollipop.label.ratio = 2,
lollipop.label.min.font.size = 12,
lollipop.color.scheme = "dark2",
highlight.text.angle = 60,
# Domain annotation track settings
anno.height = 16,
anno.margin = list(top = 0, bottom = 0),
anno.background = "#d3d3d3",
anno.bar.fill = "#a9a9a9",
anno.bar.margin = list(top = 4, bottom = 4),
domain.color.scheme = "pie5",
domain.margin = list(top = 2, bottom = 2),
domain.text.color = "white",
domain.text.font = "italic 8px Serif",
# Y-axis label
y.axis.label = "# of TP53 gene mutations",
axis.label.color = "#303030",
axis.label.alignment = "end",
axis.label.font = "italic 12px Serif",
axis.label.dy = "-1.5em",
y.axis.line.color = "#303030",
y.axis.line.width = 0.5,
y.axis.line.style = "line",
y.max.range.ratio = 1.1,
# Chart title settings
title.color = "#303030",
title.text = "TP53 gene (customized chart options)",
title.font = "bold 12px monospace",
title.alignment = "start",
# Chart legend settings
legend = TRUE,
legend.margin = list(left=20, right = 0, top = 10, bottom = 5),
legend.interactive = TRUE,
legend.title = "Variant classification",
# Brush selection tool
brush = TRUE,
brush.selection.background = "#F8F8FF",
brush.selection.opacity = 0.3,
brush.border.color = "#a9a9a9",
brush.border.width = 1,
brush.handler.color = "#303030",
# tooltip and zoom
tooltip = TRUE,
zoom = TRUE
)
g3Lollipop(mutation.dat,
gene.symbol = "TP53",
protein.change.col = "amino_acid_change",
btn.style = "blue", # blue-style chart download buttons
plot.options = plot.options,
output.filename = "customized_plot")
## Factor is set to Mutation_Class
cBioPortal offers downloadable data for numerous cancer genomics datasets. g3viz has a convenient way to retrieve data directly from this portal.
In this example, we first retrieve genetic mutation data of the TP53 gene for the msk_impact_2017 study, and then visualize the data using the built-in cbioportal theme, to mimic cBioPortal’s mutation_mapper.
# Retrieve mutation data of "msk_impact_2017" from cBioPortal
mutation.dat <- getMutationsFromCbioportal("msk_impact_2017", "TP53")
## The Entrez Gene ID for TP53 is: 7157
## Found mutation dataset for msk_impact_2017: msk_impact_2017_mutations
plot.options <- g3Lollipop.theme(theme.name = "cbioportal",
title.text = "TP53 gene (cbioportal theme)",
y.axis.label = "# of TP53 Mutations")
g3Lollipop(mutation.dat,
gene.symbol = "TP53",
btn.style = "gray", # gray-style chart download buttons
plot.options = plot.options,
output.filename = "cbioportal_theme")
## Factor is set to Mutation_Class
## legend title is set to Mutation_Class
But how can we know which data are available in cBioPortal?
g3viz has the cBioPortalData package from Bioconductor as a dependency.
You can list all available datasets with:
library(cBioPortalData)
cbio <- cBioPortal()
studies <- getStudies(cbio)
studies <- as.data.frame(studies)
head(studies)
## name
## 1 Acute Lymphoblastic Leukemia (St Jude, Nat Genet 2015)
## 2 Hypodiploid Acute Lymphoid Leukemia (St Jude, Nat Genet 2013)
## 3 Adenoid Cystic Carcinoma (FMI, Am J Surg Pathl. 2014)
## 4 Adenoid Cystic Carcinoma (JHU, Cancer Prev Res 2016)
## 5 Adenoid Cystic Carcinoma (MDA, Clin Cancer Res 2015)
## 6 Adenoid Cystic Carcinoma (MGH, Nat Gen 2016)
## description
## 1 Comprehensive profiling of infant MLL-rearranged acute lymphoblastic leukemia (MLL-R ALL)
## 2 Whole genome or exome sequencing of 44 (20 whole genome, 20 exome) ALL tumor/normal pairs.
## 3 Targeted Sequencing of 28 metastatic Adenoid Cystic Carcinoma samples.
## 4 Whole-genome or whole-exome sequencing of 25 adenoid cystic carcinoma tumor/normal pairs.
## 5 WGS of 21 salivary ACCs and targeted molecular analyses of a validation set (81 patients).
## 6 Whole-genome/exome sequencing of 10 ACC PDX models.
## publicStudy pmid citation groups status
## 1 TRUE 25730765 Andersson et al. Nat Genet 2015 PUBLIC 0
## 2 TRUE 23334668 Holmfeldt et al. Nat Genet 2013 0
## 3 TRUE 24418857 Ross et al. Am J Surg Pathl 2014 ACYC;PUBLIC 0
## 4 TRUE 26862087 Rettig et al, Cancer Prev Res 2016 ACYC;PUBLIC 0
## 5 TRUE 26631609 Mitani et al. Clin Cancer Res 2015 ACYC;PUBLIC 0
## 6 TRUE 26829750 Drier et al. Nature Genetics 2016 ACYC 0
## importDate allSampleCount readPermission studyId
## 1 2024-12-03 11:48:34 93 TRUE all_stjude_2015
## 2 2024-12-03 11:50:01 44 TRUE all_stjude_2013
## 3 2024-12-03 11:50:37 28 TRUE acyc_fmi_2014
## 4 2024-12-03 11:50:39 25 TRUE acyc_jhu_2016
## 5 2024-12-03 11:50:44 102 TRUE acyc_mda_2015
## 6 2024-12-03 11:50:49 10 TRUE acyc_mgh_2016
## cancerTypeId referenceGenome
## 1 bll hg19
## 2 myeloid hg19
## 3 acyc hg19
## 4 acyc hg19
## 5 acyc hg19
## 6 acyc hg19
unique(studies$referenceGenome)
## [1] "hg19" "hg38"
unique(studies$studyId)
## [1] "all_stjude_2015" "all_stjude_2013"
## [3] "acyc_fmi_2014" "acyc_jhu_2016"
## [5] "acyc_mda_2015" "acyc_mgh_2016"
## [7] "acyc_sanger_2013" "all_stjude_2016"
## [9] "appendiceal_msk_2022" "blca_plasmacytoid_mskcc_2016"
## [11] "bcc_unige_2016" "brca_broad"
## [13] "blca_mskcc_solit_2014" "blca_nmibc_2017"
## [15] "bfn_duke_nus_2015" "brca_jup_msk_2020"
## [17] "brca_mapk_hp_msk_2021" "brca_hta9_htan_2022"
## [19] "biliary_tract_summit_2022" "bowel_colitis_msk_2022"
## [21] "bladder_columbia_msk_2018" "bladder_msk_2023"
## [23] "bm_nsclc_mskcc_2023" "breast_msk_2018"
## [25] "brca_mskcc_2019" "breast_alpelisib_2020"
## [27] "cfdna_msk_2019" "ccrcc_dfci_2019"
## [29] "breast_ink4_msk_2021" "brca_pareja_msk_2020"
## [31] "cervix_msk_2023" "chol_jhu_2013"
## [33] "chol_nccs_2013" "chol_nus_2012"
## [35] "coadread_mskcc" "cllsll_icgc_2011"
## [37] "coad_caseccc_2015" "chol_msk_2018"
## [39] "chol_icgc_2017" "coadread_mskresistance_2022"
## [41] "cscc_dfarber_2015" "ctcl_columbia_2015"
## [43] "crc_msk_2017" "crc_eo_2020"
## [45] "crc_apc_impact_2020" "crc_nigerian_2020"
## [47] "crc_dd_2022" "difg_msk_2023"
## [49] "escc_ucla_2014" "esca_broad"
## [51] "gct_msk_2016" "egc_msk_2017"
## [53] "hcc_mskimpact_2018" "hcc_msk_venturaa_2018"
## [55] "dlbcl_duke_2017" "glioma_mskcc_2019"
## [57] "glioma_msk_2018" "gbc_msk_2018"
## [59] "gbm_columbia_2019" "gct_msk_2020"
## [61] "egc_mskcc_2020" "egc_msk_tp53_ccr_2022"
## [63] "gbc_mskcc_2022" "gist_msk_2022"
## [65] "egc_msk_2023" "hcc_jcopo_msk_2023"
## [67] "es_dsrct_msk_2023" "kirc_bgi"
## [69] "hnsc_jhu" "hnsc_tcga_pub"
## [71] "hnc_mskcc_2016" "ihch_smmu_2014"
## [73] "histiocytosis_cobi_msk_2019" "ihch_mskcc_2020"
## [75] "ihch_ismms_2015" "ihch_msk_2021"
## [77] "hgsoc_msk_2021" "ilc_msk_2023"
## [79] "lihc_riken" "liad_inserm_fr_2014"
## [81] "lcll_broad_2013" "lgsoc_mapk_msk_2022"
## [83] "luad_tsp" "lung_msk_2017"
## [85] "lung_msk_pdx" "lymphoma_cellline_msk_2020"
## [87] "lung_msk_mind_2020" "mbc_msk_2021"
## [89] "luad_mskimpact_2021" "lung_pdx_msk_2021"
## [91] "lung_nci_2022" "mbl_broad_2012"
## [93] "mbl_icgc" "mbl_pcgp"
## [95] "mcl_idibips_2013" "mds_tokyo_2011"
## [97] "mbl_dkfz_2017" "mds_iwg_2022"
## [99] "alal_target_gdc" "aml_target_gdc"
## [101] "bll_target_gdc" "nbl_target_gdc"
## [103] "os_target_gdc" "wt_target_gdc"
## [105] "mpn_cimr_2013" "mnm_washu_2016"
## [107] "metastatic_solid_tumors_mich_2017" "mixed_selpercatinib_2020"
## [109] "mixed_cfdna_msk_2020" "mel_mskimpact_2020"
## [111] "mixed_kunga_msk_2022" "mixed_impact_subset_2022"
## [113] "nbl_amc_2012" "msk_ch_2020"
## [115] "msk_access_2021" "msk_ch_ped_2021"
## [117] "msk_spectrum_tme_2022" "mtnn_msk_2022"
## [119] "msk_ch_2023" "npc_nusingapore"
## [121] "odg_msk_2017" "nsclc_unito_2016"
## [123] "nsclc_pd1_msk_2018" "nsclc_mskcc_2015"
## [125] "nsclc_ctdx_msk_2022" "paac_msk_jco_2023"
## [127] "panet_jhu_2011" "pcnsl_mayo_2015"
## [129] "panet_shanghai_2013" "plmeso_nyu_2015"
## [131] "past_dkfz_heidelberg_2013" "pediatric_dkfz_2017"
## [133] "paired_bladder_2022" "panet_msk_erc_2023"
## [135] "prad_cpcg_2017" "prad_mskcc_2017"
## [137] "prad_msk_2019" "prad_mcspc_mskcc_2020"
## [139] "scco_mskcc" "rms_nih_2014"
## [141] "sarc_mskcc" "rectal_msk_2019"
## [143] "sarcoma_mskcc_2022" "rbl_cfdna_msk_2020"
## [145] "rbl_mskcc_2020" "prostate_pcbm_swiss_2019"
## [147] "rms_msk_2023" "sarcoma_msk_2023"
## [149] "skcm_tcga" "skcm_yale"
## [151] "skcm_vanderbilt_mskcc_2015" "soft_tissue_msk_2023"
## [153] "thyroid_mskcc_2016" "summit_2018"
## [155] "ucec_msk_2018" "uccc_nih_2017"
## [157] "tmb_mskcc_2018" "ucec_ccr_msk_2022"
## [159] "ucec_ccr_cfdna_msk_2022" "ucec_ancestry_cds_msk_2023"
## [161] "ucec_msk_2024" "um_qimr_2016"
## [163] "urcc_mskcc_2016" "utuc_mskcc_2015"
## [165] "utuc_msk_2019" "utuc_pdx_msk_2019"
## [167] "usarc_msk_2020" "utuc_igbmc_2021"
## [169] "plmeso_msk_2024" "acc_tcga_gdc"
## [171] "blca_tcga_gdc" "brca_tcga_gdc"
## [173] "cesc_tcga_gdc" "chol_tcga_gdc"
## [175] "dlbclnos_tcga_gdc" "esca_tcga_gdc"
## [177] "gbm_tcga_gdc" "hnsc_tcga_gdc"
## [179] "chrcc_tcga_gdc" "ccrcc_tcga_gdc"
## [181] "prcc_tcga_gdc" "aml_tcga_gdc"
## [183] "difg_tcga_gdc" "hcc_tcga_gdc"
## [185] "luad_tcga_gdc" "lusc_tcga_gdc"
## [187] "plmeso_tcga_gdc" "hgsoc_tcga_gdc"
## [189] "paad_tcga_gdc" "mnet_tcga_gdc"
## [191] "prad_tcga_gdc" "read_tcga_gdc"
## [193] "soft_tissue_tcga_gdc" "skcm_tcga_gdc"
## [195] "stad_tcga_gdc" "nsgct_tcga_gdc"
## [197] "thpa_tcga_gdc" "thym_tcga_gdc"
## [199] "ucec_tcga_gdc" "ucs_tcga_gdc"
## [201] "um_tcga_gdc" "brain_cptac_gdc"
## [203] "breast_cptac_gdc" "coad_cptac_gdc"
## [205] "luad_cptac_gdc" "lusc_cptac_gdc"
## [207] "ohnca_cptac_gdc" "ovary_cptac_gdc"
## [209] "pancreas_cptac_gdc" "rcc_cptac_gdc"
## [211] "uec_cptac_gdc" "msk_impact_2017"
## [213] "coad_tcga_gdc" "pancreas_msk_2024"
## [215] "brca_aurora_2023" "crc_orion_2024"
## [217] "lms_msk_2024" "prostate_msk_2024"
## [219] "ucs_msk_2024" "panet_msk_2018"
## [221] "kirp_tcga" "ntrk_msk_2019"
## [223] "heme_msk_impact_2022" "makeanimpact_ccr_2023"
## [225] "pancan_mimsi_msk_2024" "acc_tcga"
## [227] "blca_tcga" "ampca_bcm_2016"
## [229] "blca_dfarber_mskcc_2014" "blca_mskcc_solit_2012"
## [231] "brca_bccrc_xenograft_2014" "blca_bgi"
## [233] "blca_tcga_pub" "brca_bccrc"
## [235] "brca_igr_2015" "acbc_mskcc_2015"
## [237] "acyc_mskcc_2013" "angs_project_painter_2018"
## [239] "all_phase2_target_2018_pub" "aml_target_2018_pub"
## [241] "blca_tcga_pub_2017" "blca_tcga_pan_can_atlas_2018"
## [243] "blca_cornell_2016" "aml_ohsu_2018"
## [245] "acc_2019" "blca_bcan_hcrn_2022"
## [247] "angs_painter_2020" "blca_msk_tcga_2020"
## [249] "brain_cptac_2020" "brca_cptac_2020"
## [251] "brca_mbcproject_2022" "aml_ohsu_2022"
## [253] "asclc_msk_2024" "pcnsl_msk_2024"
## [255] "msk_ctdna_vte_2024" "brca_dfci_2020"
## [257] "brca_tcga" "cesc_tcga"
## [259] "chol_tcga" "brca_sanger"
## [261] "brca_tcga_pub2015" "brca_tcga_pub"
## [263] "cellline_ccle_broad" "ccrcc_irc_2014"
## [265] "ccrcc_utokyo_2013" "coadread_genentech"
## [267] "cellline_nci60" "cll_iuopa_2015"
## [269] "brca_metabric" "coadread_dfci_2016"
## [271] "cll_broad_2015" "brca_mbcproject_wagle_2017"
## [273] "brca_tcga_pan_can_atlas_2018" "cesc_tcga_pan_can_atlas_2018"
## [275] "chol_tcga_pan_can_atlas_2018" "ccle_broad_2019"
## [277] "coad_cptac_2019" "brca_smc_2018"
## [279] "coadread_cass_2020" "cll_broad_2022"
## [281] "coadread_tcga" "dlbc_tcga"
## [283] "coadread_tcga_pub" "desm_broad_2015"
## [285] "dlbc_broad_2012" "cscc_hgsc_bcm_2014"
## [287] "coadread_tcga_pan_can_atlas_2018" "dlbc_tcga_pan_can_atlas_2018"
## [289] "dlbcl_dfci_2018" "difg_glass_2019"
## [291] "cscc_ucsf_2021" "crc_hta11_htan_2021"
## [293] "cscc_ranson_2022" "difg_glass"
## [295] "esca_tcga" "escc_icgc"
## [297] "es_dfarber_broad_2014" "es_iocurie_2014"
## [299] "egc_tmucih_2015" "esca_tcga_pan_can_atlas_2018"
## [301] "egc_trap_msk_2020" "egc_trap_ccr_msk_2023"
## [303] "gbm_tcga" "gbc_shanghai_2014"
## [305] "gbm_tcga_pub" "gbm_tcga_pub2013"
## [307] "gbm_tcga_pan_can_atlas_2018" "gbm_mayo_pdx_sarkaria_2019"
## [309] "gbm_cptac_2021" "gist_msk_2023"
## [311] "kirc_tcga" "kich_tcga"
## [313] "hnsc_tcga" "kirc_tcga_pub"
## [315] "kich_tcga_pub" "hnsc_broad"
## [317] "hnsc_mdanderson_2013" "hcc_inserm_fr_2015"
## [319] "hccihch_pku_2019" "hcc_meric_2021"
## [321] "hcc_clca_2024" "hcc_msk_2024"
## [323] "laml_tcga" "lgg_tcga"
## [325] "lihc_tcga" "luad_tcga"
## [327] "laml_tcga_pub" "lgg_ucsf_2014"
## [329] "lgggbm_tcga_pub" "lihc_amc_prv"
## [331] "luad_mskcc_2015" "luad_broad"
## [333] "luad_mskcc_2020" "luad_mskcc_2023_met_organotropism"
## [335] "luad_oncosg_2020" "luad_msk_npjpo_2021"
## [337] "luad_cptac_2020" "lusc_tcga"
## [339] "meso_tcga" "luad_tcga_pub"
## [341] "lusc_tcga_pub" "mm_broad"
## [343] "mpnst_mskcc" "mbl_sickkids_2016"
## [345] "mrt_bcgsc_2016" "mel_tsam_liang_2017"
## [347] "mel_ucla_2016" "mixed_pipseq_2017"
## [349] "mixed_allen_2018" "mbn_mdacc_2013"
## [351] "mds_mskcc_2020" "mel_dfci_2019"
## [353] "lung_smc_2016" "mixed_msk_tcga_2021"
## [355] "mng_utoronto_2021" "mpcproject_broad_2021"
## [357] "lusc_cptac_2021" "mbn_sfu_2023"
## [359] "mbn_msk_2024" "ov_tcga"
## [361] "paad_tcga" "nccrcc_genentech_2014"
## [363] "ov_tcga_pub" "paac_jhu_2014"
## [365] "paad_icgc" "paad_utsw_2015"
## [367] "nepc_wcm_2016" "nbl_ucologne_2015"
## [369] "nsclc_tcga_broad_2016" "paad_qcmg_uq_2016"
## [371] "pact_jhu_2011" "nbl_broad_2013"
## [373] "nhl_bcgsc_2011" "nhl_bcgsc_2013"
## [375] "nsclc_tracerx_2017" "nbl_target_2018_pub"
## [377] "nsclc_mskcc_2018" "paad_cptac_2021"
## [379] "pcpg_tcga" "prad_tcga"
## [381] "prad_fhcrc" "prad_broad"
## [383] "prad_broad_2013" "prad_mich"
## [385] "prad_mskcc" "prad_mskcc_2014"
## [387] "prad_su2c_2015" "prad_tcga_pub"
## [389] "panet_arcnet_2017" "pcpg_tcga_pub"
## [391] "prad_eururol_2017" "prad_p1000"
## [393] "prad_su2c_2019" "prostate_dkfz_2018"
## [395] "pptc_2019" "prad_cdk12_mskcc_2020"
## [397] "prad_mskcc_cheny1_organoids_2014" "pan_origimed_2020"
## [399] "prad_msk_stopsack_2021" "pancan_pcawg_2020"
## [401] "prad_pik3r1_msk_2021" "pog570_bcgsc_2020"
## [403] "prad_organoids_msk_2022" "ptad_msk_2024"
## [405] "prad_msk_mdanderson_2023" "sarc_tcga"
## [407] "stad_tcga" "tgct_tcga"
## [409] "thym_tcga" "thca_tcga"
## [411] "ucec_tcga" "ucs_tcga"
## [413] "uvm_tcga" "sclc_clcgp"
## [415] "sclc_jhu" "skcm_broad"
## [417] "skcm_broad_dfarber" "stad_pfizer_uhongkong"
## [419] "stad_tcga_pub" "stad_uhongkong"
## [421] "stad_utokyo" "tet_nci_2014"
## [423] "thca_tcga_pub" "ucs_jhu_2014"
## [425] "ucec_tcga_pub" "stes_tcga_pub"
## [427] "skcm_broad_brafresist_2012" "sarc_tcga_pub"
## [429] "skcm_mskcc_2014" "sclc_cancercell_gardner_2017"
## [431] "rt_target_2018_pub" "wt_target_2018_pub"
## [433] "skcm_tcga_pub_2015" "vsc_cuk_2018"
## [435] "utuc_cornell_baylor_mdacc_2019" "skcm_dfci_2015"
## [437] "sclc_ucologne_2015" "stad_oncosg_2018"
## [439] "rectal_msk_2022" "ucec_cptac_2020"
## [441] "stmyec_wcm_2022" "sarcoma_msk_2022"
## [443] "hnsc_tcga_pan_can_atlas_2018" "kich_tcga_pan_can_atlas_2018"
## [445] "kirc_tcga_pan_can_atlas_2018" "kirp_tcga_pan_can_atlas_2018"
## [447] "laml_tcga_pan_can_atlas_2018" "lgg_tcga_pan_can_atlas_2018"
## [449] "lihc_tcga_pan_can_atlas_2018" "luad_tcga_pan_can_atlas_2018"
## [451] "lusc_tcga_pan_can_atlas_2018" "meso_tcga_pan_can_atlas_2018"
## [453] "ov_tcga_pan_can_atlas_2018" "paad_tcga_pan_can_atlas_2018"
## [455] "pcpg_tcga_pan_can_atlas_2018" "prad_tcga_pan_can_atlas_2018"
## [457] "sarc_tcga_pan_can_atlas_2018" "nst_nfosi_ntap"
## [459] "skcm_tcga_pan_can_atlas_2018" "stad_tcga_pan_can_atlas_2018"
## [461] "tgct_tcga_pan_can_atlas_2018" "thca_tcga_pan_can_atlas_2018"
## [463] "thym_tcga_pan_can_atlas_2018" "ucec_tcga_pan_can_atlas_2018"
## [465] "ucs_tcga_pan_can_atlas_2018" "uvm_tcga_pan_can_atlas_2018"
## [467] "coad_silu_2022" "acc_tcga_pan_can_atlas_2018"
## [469] "msk_chord_2024" "pancan_mappyacts_2022"
## [471] "msk_met_2021" "pdac_msk_2024"
## [473] "rectal_radiation_msk_2024" "normal_skin_fibroblast_2024"
## [475] "normal_skin_keratinocytes_2024" "breast_msk_2025"
## [477] "blca_msk_2024" "normal_skin_melanocytes_2024"
## [479] "ovary_geomx_gray_foundation_2024" "brca_fuscc_2020"
## [481] "braf_msk_archer_2024" "braf_msk_impact_2024"
## [483] "thyroid_gatci_2024"
sort(unique(studies$cancerTypeId))
## [1] "acbc" "acc" "acyc" "alal"
## [5] "aml" "ampca" "angs" "apad"
## [9] "bcc" "bfn" "biliary_tract" "bladder"
## [13] "blca" "bll" "bowel" "brain"
## [17] "brca" "breast" "ccrcc" "cervix"
## [21] "cesc" "chol" "chrcc" "cllsll"
## [25] "coad" "coadread" "cscc" "desm"
## [29] "difg" "dlbclnos" "egc" "es"
## [33] "esca" "escc" "gbc" "gbm"
## [37] "gist" "hcc" "hccihch" "hdcn"
## [41] "head_neck" "hgsoc" "hnsc" "ihch"
## [45] "lgsoc" "liad" "luad" "lung"
## [49] "lusc" "lymph" "mbc" "mbl"
## [53] "mbn" "mcl" "mds" "mel"
## [57] "mixed" "mnet" "mng" "mnm"
## [61] "mpn" "mpnst" "mrt" "mtnn"
## [65] "myeloid" "nbl" "nccrcc" "nhl"
## [69] "npc" "nsclc" "nsgct" "nst"
## [73] "odg" "ohnca" "os" "ovary"
## [77] "paac" "paad" "pact" "pancreas"
## [81] "panet" "past" "pcm" "pcnsl"
## [85] "plmeso" "prad" "prcc" "prostate"
## [89] "ptad" "rbl" "rcc" "read"
## [93] "rms" "scco" "sclc" "skcm"
## [97] "skin" "soft_tissue" "stad" "stmyec"
## [101] "stomach" "testis" "tet" "thpa"
## [105] "thym" "thyroid" "uccc" "ucec"
## [109] "ucs" "uec" "um" "urcc"
## [113] "usarc" "utuc" "vsc" "wt"
# Suppose we are interested in prostate cancer; we can list the available datasets:
prad <- subset(studies, cancerTypeId == "prad")
# To find out which genes have mutation data, query cBioPortal directly by searching for the experiment "name".
prad2 <- subset(prad, name == "Prostate Adenocarcinoma (MSK, Clin Cancer Res. 2022)")
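When the exact study name is not known, a keyword search over the name column can narrow the list. The sketch below uses a small hand-made data frame, with a few rows copied from the head(studies) output above, so that it runs without a network connection. The object names `toy_studies` and `hits` are placeholders for illustration.

```r
# Toy stand-in for the studies table (rows copied from the output above)
toy_studies <- data.frame(
  name = c("Acute Lymphoblastic Leukemia (St Jude, Nat Genet 2015)",
           "Adenoid Cystic Carcinoma (FMI, Am J Surg Pathl. 2014)",
           "Adenoid Cystic Carcinoma (JHU, Cancer Prev Res 2016)"),
  studyId        = c("all_stjude_2015", "acyc_fmi_2014", "acyc_jhu_2016"),
  cancerTypeId   = c("bll", "acyc", "acyc"),
  allSampleCount = c(93, 28, 25),
  stringsAsFactors = FALSE
)

# Case-insensitive keyword match on the study name
hits <- subset(toy_studies,
               grepl("adenoid cystic", name, ignore.case = TRUE))

# Largest studies first, keeping only the columns needed to pick one
hits <- hits[order(-hits$allSampleCount),
             c("studyId", "name", "allSampleCount")]
hits
```

With the real table, the same `subset()` call can be applied to `studies`, for example with the keyword "prostate".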
By searching for this experiment on cBioPortal and clicking the Explore Selected Study button, we obtain:
https://www.cbioportal.org/study/summary?id=prad_pik3r1_msk_2021
In the Mutated Genes panel we can see mutation statistics, so we can choose our gene of interest.
Notice that each experiment is linked to a scientific publication (you can search for it!).
Conversely, if we are interested in a specific study we found on cBioPortal, we can check whether it is available for download and retrieve its studyId, which we then pass to the getMutationsFromCbioportal() function.
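Before requesting mutations, it can help to confirm that the study ID actually appears in the studies table. A minimal sketch, where `check_study_id` is a hypothetical helper (not part of g3viz or cBioPortalData) and the ID vector is a short excerpt of the list printed earlier:

```r
# Hypothetical helper: fail early with a clear message if the ID is unknown
check_study_id <- function(study_id, available_ids) {
  if (!study_id %in% available_ids) {
    stop("Study '", study_id, "' not found among available study IDs")
  }
  invisible(study_id)
}

# Short excerpt of the study IDs listed above
available_ids <- c("msk_impact_2017", "prad_pik3r1_msk_2021", "prad_tcga")

# A known ID passes silently
check_study_id("prad_pik3r1_msk_2021", available_ids)

# An unknown ID raises an error, captured here as a message
result <- tryCatch(check_study_id("not_a_study", available_ids),
                   error = function(e) conditionMessage(e))
result
```

With the real table, `studies$studyId` would take the place of `available_ids`.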
mutation.dat <- getMutationsFromCbioportal("prad_pik3r1_msk_2021", "FOXA1")
## The Entrez Gene ID for FOXA1 is: 3169
## Found mutation dataset for prad_pik3r1_msk_2021: prad_pik3r1_msk_2021_mutations
# "nature2" chart theme
plot.options <- g3Lollipop.theme(theme.name = "nature2",
title.text = "FOXA1 mutations",
y.axis.label = "# of FOXA1 Mutations")
g3Lollipop(mutation.dat,
gene.symbol = "FOXA1",
btn.style = "gray", # gray-style chart download buttons
plot.options = plot.options)
## Factor is set to Mutation_Class
## legend title is set to Mutation_Class
Now, we move directly to the tutorial.