# Introduction
A circular layout is very useful for representing complicated information:
- it represents information with long axes or a large number of categories;
- it intuitively shows data with multiple tracks focusing on the same object;
- it easily demonstrates relations between elements.
Gu, Z. (2014) circlize implements and enhances circular visualization in R. Bioinformatics. DOI: 10.1093/bioinformatics/btu393
A circular layout is composed of sectors and tracks.
The rule for making a circular plot is rather simple. It follows the sequence initialize layout -> create track -> add graphics -> create track -> add graphics -> … -> clear. Graphics can be added at any time as long as the tracks have been created. Details are shown in the figure and as follows:
Data in different categories are allocated to different sectors, and multiple measurements of the same category are represented as stacked tracks from the outside of the circle to the inside.
The intersection of a sector and a track is called a cell (or a grid, a panel), which is the basic unit in a circular layout. It is an imaginary plotting region for data points in a certain category.
circlize implements low-level graphic functions for adding graphics in the circular plotting regions, so that more complicated graphics can be easily generated by different combinations of low-level graphic functions.
Function | Action |
---|---|
`circos.points()` | adds points in a cell. |
`circos.lines()` | adds lines in a cell. |
`circos.segments()` | adds segments in a cell. |
`circos.rect()` | adds rectangles in a cell. |
`circos.polygon()` | adds polygons in a cell. |
`circos.text()` | adds text in a cell. |
`circos.axis()`, `circos.yaxis()` | add axes in a cell. |
The following functions arrange the circular layout.
Function | Action |
---|---|
`circos.initialize()` | allocates sectors on the circle. |
`circos.track()` | creates plotting regions for cells in one single track. |
`circos.update()` | updates an existing cell. |
`circos.par()` | graphic parameters. |
`circos.info()` | prints general parameters of the current circular plot. |
`circos.clear()` | resets graphic parameters and internal variables. |
Let’s generate some random data. We need a character vector to represent categories, a numeric vector of x values and a numeric vector of y values.
set.seed(123)
n = 1000
df = data.frame(
factors = sample(letters[1:8], n, replace = TRUE)
, x = rnorm(n)
, y = runif(n)
)
str(df)
## 'data.frame': 1000 obs. of 3 variables:
## $ factors: chr "g" "g" "c" "f" ...
## $ x : num -0.602 -0.994 1.027 0.751 -1.509 ...
## $ y : num 0.206 0.943 0.379 0.626 0.184 ...
First we initialize the circular layout.
The circle is split into sectors based on the data range on the x-axis in each category. `df$x` is split by `df$factors`, and the widths of the sectors are automatically calculated from the data ranges in each category.
library(circlize)
circos.par("track.height" = 0.1)
circos.par("points.overflow.warning" = FALSE)
circos.initialize(factors = df$factors, x = df$x)
After initialization, graphics can be added to the plot in a track-by-track manner. All tracks must first be created by `circos.trackPlotRegion()` or, for short, `circos.track()`; the low-level functions can then be used to add graphics. This is analogous to the base R graphics engine, where you first call `plot()` and then use functions such as `points()` and `lines()` to add graphics. Here we only need to specify the y ranges for each cell.
circos.track(
factors = df$factors
, y = df$y,
panel.fun = function(x, y) {
circos.text(CELL_META$xcenter,
CELL_META$cell.ylim[2] + uy(5, "mm"),
CELL_META$sector.index)
# Adding Axes
circos.axis(labels.cex = 0.6)
}
)
col = rep(c("#FF0000", "#00FF00"), 4)
# Adding points
circos.trackPoints(df$factors, df$x, df$y, col = col, pch = 16, cex = 0.5)
# Add the label "text" to the cell in sector "a" of track 1
circos.text(-1, 0.5, "text", sector.index = "a", track.index = 1)
## ========================================
## circlize version 0.4.15
## CRAN page: https://cran.r-project.org/package=circlize
## Github page: https://github.com/jokergoo/circlize
## Documentation: https://jokergoo.github.io/circlize_book/book/
##
## If you use it in published research, please cite:
## Gu, Z. circlize implements and enhances circular visualization
## in R. Bioinformatics 2014.
##
## This message can be suppressed by:
## suppressPackageStartupMessages(library(circlize))
## ========================================
- `circos.track()` creates plotting regions in a cell-by-cell manner. Thus, `panel.fun` actually means adding graphics in the “current cell”.
- `circos.axis()` draws x-axes on the top of each cell (or the outside of each cell).
- `circos.text()` adds the sector name outside the first track.
- `CELL_META` provides “meta information” for the current cell. There are several parameters which can be retrieved by `CELL_META`.
- When specifying the position of text in the y direction, an offset of `uy(5, "mm")` is added to the y position of the text. In `circos.text()`, x and y values are measured in the data coordinate (the coordinate in the cell), and the `uy()` function (or `ux()`, which is measured in the x direction) converts absolute units to corresponding values in the data coordinate.
- `circos.trackPoints()` simply adds points in all cells simultaneously.
`circos.trackHist()` is a high-level function, which means it creates a new track. `bin.size` is explicitly set so that the bin size for histograms is the same in all cells.
bgcol = rep(c("#EFEFEF", "#CCCCCC"), 4)
circos.trackHist(df$factors, df$x, bin.size = 0.2, bg.col = bgcol, col = NA)
In the third track, inside `panel.fun`, we randomly pick 10 data points in each cell, sort them and connect them with lines. In the following code, when the `factors`, `x` and `y` arguments are set in `circos.track()`, the x and y values are split by `df$factors`, and the corresponding subsets of x and y values are sent to `panel.fun` through its `x` and `y` arguments. Thus, `x` and `y` in `panel.fun` are exactly the values in the “current” cell.
circos.track(factors = df$factors, x = df$x, y = df$y,
panel.fun = function(x, y) {
ind = sample(length(x), 10)
x2 = x[ind]
y2 = y[ind]
od = order(x2)
circos.lines(x2[od], y2[od])
})
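The splitting described above can be mimicked in base R. The following is only a rough sketch of the idea, not circlize's internal code: the `toy` data frame and the `per_cell` name are made up for illustration, and `Map()` stands in for the per-cell call to `panel.fun`.

```r
# Toy data, made up for illustration (not the df used in the text)
toy <- data.frame(
  factors = c("a", "b", "a", "c", "b", "a"),
  x = c(0.1, 0.5, 0.9, 0.3, 0.7, 0.2),
  y = c(1, 2, 3, 4, 5, 6)
)

# Split x and y by category: roughly what each panel.fun call receives
x_by_cell <- split(toy$x, toy$factors)
y_by_cell <- split(toy$y, toy$factors)

# Apply a per-cell function, analogous to panel.fun(x, y)
per_cell <- Map(function(x, y) {
  od <- order(x)                    # sort the points by x within the cell
  data.frame(x = x[od], y = y[od])  # the values a line would connect
}, x_by_cell, y_by_cell)

print(per_cell[["a"]])
```

Each element of `per_cell` only ever sees the values of its own category, which is the property the text relies on.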
Now we go back to the second track and update the cell in sector “d”. `circos.update()` erases graphics which have already been added. It cannot modify the `xlim` and `ylim` of the cell, nor other settings related to the position of the cell. `circos.update()` needs the sector index and track index to be specified explicitly unless the “current” cell is the one you want to update. After calling `circos.update()`, the “current” cell is redirected to the cell you just specified, and you can use low-level graphic functions to add graphics directly into it.
circos.update(sector.index = "d"
, track.index = 2
, bg.col = "#FF8080"
, bg.border = "black")
circos.points(x = -2:2, y = rep(0.5, 5), col = "white")
circos.text(CELL_META$xcenter, CELL_META$ycenter, "UPDATED", col = "white")
Although we have gone back to the second track, a newly created track is still placed inside the innermost existing track.
circos.track(ylim = c(0, 1), panel.fun = function(x, y) {
xlim = CELL_META$xlim
ylim = CELL_META$ylim
breaks = seq(xlim[1], xlim[2], by = 0.1)
n_breaks = length(breaks)
circos.rect(breaks[-n_breaks], rep(ylim[1], n_breaks - 1),
breaks[-1], rep(ylim[2], n_breaks - 1),
col = rand_color(n_breaks), border = NA)
})
In the innermost part of the circle, links or ribbons are added. Links can go from a single point to a point, from a point to an interval, or from an interval to an interval.
circos.link("a", 0, "b", 0, h = 0.4)
circos.link("c", c(-0.5, 0.5), "d", c(-0.5,0.5), col = "red", border = "blue", h = 0.2)
circos.link("e", 0, "g", c(-1,1), col = "green", border = "black", lwd = 2, lty = 2)
Finally, we need to reset the graphic parameters and internal variables so that they do not interfere with your next plot.
circos.clear()
The circlize package also provides functions which focus on genomic plots. These functions are analogous to the basic graphic functions but expect a special format of input data:
Function | Action |
---|---|
`circos.genomicTrack()` | create a new track and add graphics. |
`circos.genomicPoints()` | low-level function, add points. |
`circos.genomicLines()` | low-level function, add lines or segments. |
`circos.genomicRect()` | low-level function, add rectangles. |
`circos.genomicText()` | low-level function, add text. |
`circos.genomicLink()` | add links. |
Genomic data is usually stored in BED format. circlize provides a simple function, `generateRandomBed()`, which generates random genomic data. In the function, `nr` and `nc` control the number of rows and numeric columns that users need. Please note that `nr` is not exactly the same as the number of rows returned by the function. The `fun` argument is a self-defined function to generate random values.
set.seed(999)
bed = generateRandomBed(nr = 200, nc = 4)
head(bed)
## chr start end value1 value2 value3 value4
## 1 chr1 2660551 16303823 0.5402657 -0.01736319 -0.57347883 0.49264928
## 2 chr1 22482472 25730719 -0.1234061 -0.05833207 -0.70408976 -0.61428666
## 3 chr1 26350059 37256898 -1.0568685 -0.32249104 -0.14116436 0.04261233
## 4 chr1 39575364 51959170 -0.1852637 0.87220580 -0.20888501 -0.60204692
## 5 chr1 58315212 60462404 0.2614339 0.18304724 0.49833176 -0.18842388
## 6 chr1 67739286 78935415 0.2589028 -0.03340496 -0.05314289 0.68182429
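A BED-like data frame with the same column layout as the output above can also be assembled by hand. This is a minimal sketch with made-up values; the name `bed_like` is hypothetical and the only assumption is the layout visible in the printed output (category first, then start and end positions, then numeric columns).

```r
# Hand-made BED-like data frame; all values are invented for illustration
bed_like <- data.frame(
  chr    = c("chr1", "chr1", "chr2"),  # genomic category
  start  = c(1000, 5000, 200),         # start position
  end    = c(2000, 7500, 900),         # end position
  value1 = c(0.5, -1.2, 0.3),          # one numeric value column
  stringsAsFactors = FALSE
)

# Basic sanity check: every region must start before it ends
stopifnot(all(bed_like$start <= bed_like$end))
str(bed_like)
```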
Cytoband data is an ideal data source to initialize genomic plots. It contains the lengths of chromosomes as well as the so-called “chromosome band” annotation, which helps to identify positions on chromosomes.
If you work on the human genome, the most straightforward way is to directly use `circos.initializeWithIdeogram()`.
circos.initializeWithIdeogram()
text(0, 0, "default", cex = 1)
circos.info()
## All your sectors:
## [1] "chr1" "chr2" "chr3" "chr4" "chr5" "chr6" "chr7" "chr8" "chr9"
## [10] "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17" "chr18"
## [19] "chr19" "chr20" "chr21" "chr22" "chrX" "chrY"
##
## All your tracks:
## [1] 1 2
##
## Your current sector.index is chrY
## Your current track.index is 2
circos.clear()
By default, `circos.initializeWithIdeogram()` initializes the plot with cytoband data of the human genome `hg19`. Users can also initialize with other species by specifying the `species` argument, and it will automatically download cytoband files for the corresponding species.
circos.initializeWithIdeogram(species = "hg19")
circos.initializeWithIdeogram(species = "mm10")
When you are dealing with rare species and there is no cytoband data available yet, `circos.initializeWithIdeogram()` will try to download the “chromInfo” file from UCSC instead, which also contains the lengths of chromosomes, but of course there is then no ideogram track on the plot.
In some cases there is no internet connection for downloading, or there is no corresponding data available on UCSC yet. You can then manually construct a data frame which contains the ranges of chromosomes, or give a file path if the data is stored in a file, and send it to `circos.initializeWithIdeogram()`.
cytoband.file = system.file(package = "circlize", "extdata", "cytoBand.txt")
circos.initializeWithIdeogram(cytoband.file)
cytoband.df = read.table(cytoband.file, colClasses = c("character", "numeric",
"numeric", "character", "character"), sep = "\t")
str(cytoband.df)
circos.initializeWithIdeogram(cytoband.df)
If you read cytoband data directly from a file, please explicitly specify the `colClasses` argument and set the class of the position columns to `numeric`. The reason is that positions are represented as integers, so `read.table` would treat those numbers as `integer` by default. When initializing the circular plot, circlize needs to calculate the sum of all chromosome lengths, and summing such large integers would cause an integer overflow.
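The overflow itself is easy to reproduce in base R, independently of circlize. The sketch below uses two made-up integer values whose sum exceeds the 32-bit integer range; in plain R the symptom is an `NA` with a warning, and how this surfaces inside circlize is not shown here.

```r
# Two values that each fit in a 32-bit integer but whose sum does not
big_int <- c(.Machine$integer.max, 1L)

# Summing integers overflows: R warns and returns NA
int_sum <- suppressWarnings(sum(big_int))
print(int_sum)

# Summing the same values as numeric (double) works
num_sum <- sum(as.numeric(big_int))
print(num_sum)
```

Reading the position columns as `numeric` from the start avoids the problem without any later conversion.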
By default, `circos.initializeWithIdeogram()` uses all chromosomes which are available in the cytoband data to initialize the circular plot. Users can choose a subset of chromosomes by specifying `chromosome.index`. This argument also controls the order of chromosomes (Figure @ref(fig:genomic-initialize-ideogram-subset)).
circos.initializeWithIdeogram(chromosome.index = paste0("chr", c(3,5,2,8)))
text(0, 0, "subset of chromosomes", cex = 1)
circos.clear()
When there is no cytoband data for the specified species and chromInfo data is used instead, there may be many extra short contigs. `chromosome.index` is also useful for removing unnecessary contigs.
After the initialization of the circular plot, `circos.initializeWithIdeogram()` additionally creates a track with genomic axes and chromosome names, and creates another track with an ideogram (depending on whether cytoband data is available). The `plotType` argument controls which types of tracks to add.
circos.initializeWithIdeogram(plotType = c("axis", "labels"))
text(0, 0, "plotType = c('axis', 'labels')", cex = 1)
circos.clear()
circos.initializeWithIdeogram(plotType = NULL)
text(0, 0, "plotType = NULL", cex = 1)
circos.clear()
As with a general circular plot, the parameters for the layout can be controlled by `circos.par()`. Remember that when you explicitly set `circos.par()`, you need to call `circos.clear()` to finish the plotting.
circos.par("start.degree" = 90)
circos.initializeWithIdeogram()
circos.clear()
text(0, 0, "'start.degree' = 90", cex = 1)
circos.par("gap.degree" = rep(c(2, 4), 12))
circos.initializeWithIdeogram()
circos.clear()
text(0, 0, "'gap.degree' = rep(c(2, 4), 12)", cex = 1)
By default, `circos.initializeWithIdeogram()` initializes the layout and adds two tracks. When the `plotType` argument is set to `NULL`, the circular layout is only initialized and nothing is added. This makes it possible for users to completely design their own style of chromosome track.
In the following example, we use different colors to represent chromosomes and put chromosome names in the center of each cell.
set.seed(123)
circos.initializeWithIdeogram(plotType = NULL)
circos.track(ylim = c(0, 1), panel.fun = function(x, y) {
chr = CELL_META$sector.index
xlim = CELL_META$xlim
ylim = CELL_META$ylim
circos.rect(xlim[1], 0, xlim[2], 1, col = rand_color(1))
circos.text(mean(xlim), mean(ylim), chr, cex = 0.7, col = "white",
facing = "inside", niceFacing = TRUE)
}, track.height = 0.15, bg.border = NA)
circos.clear()
A chromosome is just a special case of a genomic category. `circos.genomicInitialize()` can initialize a circular layout with any type of genomic category. In fact, `circos.initializeWithIdeogram()` is implemented by `circos.genomicInitialize()`. The input data for `circos.genomicInitialize()` is also a data frame with at least three columns. The first column is the genomic category (for cytoband data, it is the chromosome name), and the next two columns are positions in each genomic category. The range in each category is inferred as the minimum position and the maximum position in the corresponding category.
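The range inference described above can be sketched in base R. This is an illustration of the rule as stated in the text, not circlize's actual implementation; the `regions` data frame, its gene names and the column names `min_pos` and `max_pos` are hypothetical.

```r
# Hypothetical records with several rows per category
regions <- data.frame(
  name  = c("geneA", "geneA", "geneB", "geneB", "geneB"),
  start = c(100, 400, 50, 10, 300),
  end   = c(200, 900, 80, 40, 350)
)

# For each category, take the minimum and maximum over all positions
by_name <- split(regions[, c("start", "end")], regions$name)
ranges <- t(sapply(by_name, function(d) range(unlist(d))))
colnames(ranges) <- c("min_pos", "max_pos")
print(ranges)
```

With several rows per category, the sector range spans from the smallest to the largest position seen in that category.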
In the following example, a circular plot is initialized with three genes.
df = data.frame(
name = c("TP53", "TP63", "TP73"),
start = c(7565097, 189349205, 3569084),
end = c(7590856, 189615068, 3652765))
circos.genomicInitialize(df)
Note that each gene does not have to be represented by a single row.
In the following example, we plot the transcripts for TP53, TP63 and TP73 in a circular layout.
tp_family = readRDS(system.file(package = "circlize", "extdata", "tp_family_df.rds"))
head(tp_family)
## gene start end transcript exon
## 1 TP53 7565097 7565332 ENST00000413465.2 7
## 2 TP53 7577499 7577608 ENST00000413465.2 6
## 3 TP53 7578177 7578289 ENST00000413465.2 5
## 4 TP53 7578371 7578554 ENST00000413465.2 4
## 5 TP53 7579312 7579590 ENST00000413465.2 3
## 6 TP53 7579700 7579721 ENST00000413465.2 2
In the following code, we first create a track which identifies three genes.
circos.genomicInitialize(tp_family)
circos.track(ylim = c(0, 1),
bg.col = c("#FF000040", "#00FF0040", "#0000FF40"),
bg.border = NA, track.height = 0.05)
Next, we put the transcripts one after the other for each gene. This simply requires adding lines and rectangles.
n = max(tapply(tp_family$transcript, tp_family$gene, function(x) length(unique(x))))
circos.genomicTrack(tp_family
, ylim = c(0.5, n + 0.5)
, panel.fun = function(region, value, ...) {
all_tx = unique(value$transcript)
for(i in seq_along(all_tx)) {
l = value$transcript == all_tx[i]
# for each transcript
current_tx_start = min(region[l, 1])
current_tx_end = max(region[l, 2])
circos.lines(c(current_tx_start, current_tx_end),
c(n - i + 1, n - i + 1), col = "#CCCCCC")
circos.genomicRect(region[l, , drop = FALSE], ytop = n - i + 1 + 0.4,
ybottom = n - i + 1 - 0.4, col = "orange", border = NA)
}
}
, bg.border = NA, track.height = 0.4)
circos.clear()
We first define a function `extend_chromosomes()` which copies the data for a subset of chromosomes and appends it to the original data frame.
extend_chromosomes = function(bed, chromosome, prefix = "zoom_") {
zoom_bed = bed[bed[[1]] %in% chromosome, , drop = FALSE]
zoom_bed[[1]] = paste0(prefix, zoom_bed[[1]])
rbind(bed, zoom_bed)
}
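To see what the function returns, it can be applied to a small toy data frame. The function body below is repeated from the text so that the sketch runs on its own; `toy_bed` and its values are made up for illustration.

```r
# Same function as in the text, repeated so this sketch is self-contained
extend_chromosomes <- function(bed, chromosome, prefix = "zoom_") {
  zoom_bed <- bed[bed[[1]] %in% chromosome, , drop = FALSE]
  zoom_bed[[1]] <- paste0(prefix, zoom_bed[[1]])
  rbind(bed, zoom_bed)
}

# Toy regions on three chromosomes (invented values)
toy_bed <- data.frame(
  chr   = c("chr1", "chr1", "chr2", "chr3"),
  start = c(1, 500, 1, 1),
  end   = c(400, 900, 700, 300),
  stringsAsFactors = FALSE
)

# Rows of chr1 and chr2 are duplicated under the names zoom_chr1 and zoom_chr2
extended <- extend_chromosomes(toy_bed, c("chr1", "chr2"))
print(unique(extended$chr))
```

The original rows are kept untouched, so the zoomed copies become additional categories next to the normal chromosomes.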
We use `read.cytoband()` to download and read cytoband data from UCSC. In the following code, the x ranges for normal chromosomes and zoomed chromosomes are normalized separately.
cytoband = read.cytoband()
cytoband_df = cytoband$df
chromosome = cytoband$chromosome
xrange = c(cytoband$chr.len, cytoband$chr.len[c("chr1", "chr2")])
normal_chr_index = 1:24
zoomed_chr_index = 25:26
# normalize normal chromosomes and zoomed chromosomes separately
sector.width = c(xrange[normal_chr_index] / sum(xrange[normal_chr_index]),
xrange[zoomed_chr_index] / sum(xrange[zoomed_chr_index]))
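The separate normalization can be checked on a small example. The sketch below uses four made-up chromosome lengths instead of the real cytoband data, and the names `chr_len`, `normal_idx`, `zoomed_idx` and `sector_width` are hypothetical.

```r
# Hypothetical chromosome lengths (arbitrary units)
chr_len <- c(chr1 = 250, chr2 = 240, chr3 = 200, chr4 = 190)

# Append the lengths of the chromosomes to be zoomed
xrange <- c(chr_len, chr_len[c("chr1", "chr2")])
normal_idx <- 1:4
zoomed_idx <- 5:6

# Normalize each group by its own total
sector_width <- c(xrange[normal_idx] / sum(xrange[normal_idx]),
                  xrange[zoomed_idx] / sum(xrange[zoomed_idx]))
print(round(sector_width, 3))
```

Each group sums to 1 on its own, so the two zoomed sectors together get the same relative weight as all normal sectors together.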
The extended cytoband data, which is in the form of a data frame, is sent to `circos.initializeWithIdeogram()`. You can see that the ideograms for chromosomes 1 and 2 are zoomed (Figure @ref(fig:genomic-zoom)).
circos.par(start.degree = 90)
circos.initializeWithIdeogram(extend_chromosomes(cytoband_df, c("chr1", "chr2")),
sector.width = sector.width)
Add a new track.
bed = generateRandomBed(500)
circos.genomicTrack(extend_chromosomes(bed, c("chr1", "chr2")),
panel.fun = function(region, value, ...) {
circos.genomicPoints(region, value, pch = 16, cex = 0.3)
})
Add a link from the original chromosome to the zoomed chromosome (Figure @ref(fig:genomic-zoom)).
circos.link("chr1", get.cell.meta.data("cell.xlim", sector.index = "chr1"),
"zoom_chr1", get.cell.meta.data("cell.xlim", sector.index = "zoom_chr1"),
col = "#00000020", border = NA)
circos.clear()
I have a data frame with common features between 4 groups of patients and cell types. I have a lot of different features, but the shared ones (present in more than 1 group) are just a few.
I want to make a circos plot that reflects the few connections between shared features across groups of patients and cell types, while giving an idea of how many unshared features there are in each group.
The way I think of it, it should be a plot with 4 sectors (one for each group of patient and cell type) with a few connections between them. Each sector size should reflect the total number of features in the group, and most of this area should not be connected to other groups, but empty.
# Prepare the data --------------------------------------------------------------
nonshared <- data.frame(patient=c(rep("Patient1",20), rep("Patient2",10)), cell.type=c(rep("CellType1",12), rep("CellType2",8),rep("CellType1",6), rep("CellType2",4)), feature=paste("a",1:30,sep=''))
sharedcells <- data.frame(patient=c(rep("Patient1",3), rep("Patient2",4)), cell.types=c(rep("CellType1||CellType2",3),rep("CellType1||CellType2",4)), features=c("b1||b1","b1||b1","b1||b1","b2||b2","b3||b3","b4||b4","b4||b5"))
sharedpats <- data.frame(patients=c(rep("Patient1||Patient2",2), rep("Patient1||Patient2",6)), cell.type=c(rep("CellType1",2),rep("CellType2",6)), features=c("c1||c1","c2||c1","c3||c3","c3||c4","c3||c5","c6||c5","c7||c7","c8||c8"))
sharedall1 <- data.frame(both=c(rep("Patient1-CellType1||Patient1-CellType2||Patient2-CellType1||Patient2-CellType2",4)), features=c("d1||d1||d1||d1","d2||d2||d2||d3","d4||d4||d3||d3","d5||d5||d5||d5"))
sharedall2 <- data.frame(both=c(rep("Patient1-CellType1||Patient1-CellType2||Patient2-CellType1",2)), features=c("d6||d6||d6","d7||d7||d7"))
sharedall3 <- data.frame(both="Patient1-CellType1||Patient1-CellType2||Patient2-CellType2", features="d8||d8||d9")
sharedall4 <- data.frame(both="Patient1-CellType1||Patient2-CellType1||Patient2-CellType2", features="d10||d10||d9")
sharedall5 <- data.frame(both=c(rep("Patient1-CellType2||Patient2-CellType1||Patient2-CellType2",3)), features=c("d11||d11||d11","d12||d13||d13","d12||d14||d14"))
sharedall6 <- data.frame()
sharedall7 <- data.frame(both=c(rep("Patient1-CellType2||Patient2-CellType1",2)), features=c("d15||d16","d17||d17"))
sharedall <- rbind(sharedall1, sharedall2, sharedall3, sharedall4, sharedall5, sharedall6, sharedall7)
knitr::kable(
nonshared,
caption = "Patient, cell type and feature"
)
patient | cell.type | feature |
---|---|---|
Patient1 | CellType1 | a1 |
Patient1 | CellType1 | a2 |
Patient1 | CellType1 | a3 |
Patient1 | CellType1 | a4 |
Patient1 | CellType1 | a5 |
Patient1 | CellType1 | a6 |
Patient1 | CellType1 | a7 |
Patient1 | CellType1 | a8 |
Patient1 | CellType1 | a9 |
Patient1 | CellType1 | a10 |
Patient1 | CellType1 | a11 |
Patient1 | CellType1 | a12 |
Patient1 | CellType2 | a13 |
Patient1 | CellType2 | a14 |
Patient1 | CellType2 | a15 |
Patient1 | CellType2 | a16 |
Patient1 | CellType2 | a17 |
Patient1 | CellType2 | a18 |
Patient1 | CellType2 | a19 |
Patient1 | CellType2 | a20 |
Patient2 | CellType1 | a21 |
Patient2 | CellType1 | a22 |
Patient2 | CellType1 | a23 |
Patient2 | CellType1 | a24 |
Patient2 | CellType1 | a25 |
Patient2 | CellType1 | a26 |
Patient2 | CellType2 | a27 |
Patient2 | CellType2 | a28 |
Patient2 | CellType2 | a29 |
Patient2 | CellType2 | a30 |
# library(circlize)
library(data.table)
library(magrittr)
library(stringr)
library(RColorBrewer)
# Split and pad with 0 ----------------------------------------------------
fun <- function(x) unlist(tstrsplit(x, split = '||', fixed = TRUE))
nonshared %>% setDT()
sharedcells %>% setDT()
sharedpats %>% setDT()
sharedall %>% setDT()
nonshared <- nonshared[, .(group = paste(patient, cell.type, sep = '-'), feature)][, feature := paste0('a', str_pad(str_extract(feature, '[0-9]+'), 2, 'left', '0'))]
sharedcells <- sharedcells[, lapply(.SD, fun), by = 1:nrow(sharedcells)][, .(group = paste(patient, cell.types, sep = '-'), feature = features)][, feature := paste0('b', str_pad(str_extract(feature, '[0-9]+'), 2, 'left', '0'))]
sharedpats <- sharedpats[, lapply(.SD, fun), by = 1:nrow(sharedpats)][, .(group = paste(patients, cell.type, sep = '-'), feature = features)][, feature := paste0('c', str_pad(str_extract(feature, '[0-9]+'), 2, 'left', '0'))]
sharedall <- sharedall[, lapply(.SD, fun), by = 1:nrow(sharedall)][, .(group = both, feature = features)][, feature := paste0('d', str_pad(str_extract(feature, '[0-9]+'), 2, 'left', '0'))]
dt_split <- rbindlist(
list(
nonshared,
sharedcells,
sharedpats,
sharedall
)
)
# Set key and self join to find shared features ---------------------------
setkey(dt_split, feature)
dt_join <- dt_split[dt_split, .(group, i.group, feature), allow.cartesian = TRUE] %>%
.[group != i.group, ]
# Create a "sorted key" ---------------------------------------------------
# key := paste(sort(.SD)...
# To leave only unique combinations of groups and features
dt_join <-
dt_join[,
key := paste(sort(.SD), collapse = '|'),
by = 1:nrow(dt_join),
.SDcols = c('group', 'i.group')
] %>%
setorder(feature, key) %>%
unique(by = c('key', 'feature')) %>%
.[, .(
group_from = i.group,
group_to = group,
feature = feature)]
# Rename and key ----------------------------------------------------------
dt_split %>% setnames(old = 'group', new = 'group_from') %>% setkey(group_from, feature)
dt_join %>% setkey(group_from, feature)
# Individual features -----------------------------------------------------
# Features without connections --------------------------------------------
dt_singles <- dt_split[, .(group_from, group_to = group_from, feature)] %>%
.[, N := .N, by = feature] %>%
.[!(N > 1 & group_from == group_to), !c('N')]
# Bind all, add some columns etc. -----------------------------------------
dt_bind <- rbind(dt_singles, dt_join) %>% setorder(group_from, feature, group_to)
dt_bind[, ':='(
group_from_f = paste(group_from, feature, sep = '.'),
group_to_f = paste(group_to, feature, sep = '.'))]
dt_bind[, feature := NULL] # feature can be removed
# Colour
dt_bind[, colour := ifelse(group_from_f == group_to_f, "#FFFFFF00", '#00000050')] # Change first to #FF0000FF to show red blobs
# Prep. sectors -----------------------------------------------------------
sectors_f <- union(dt_bind[, group_from_f], dt_bind[, group_to_f]) %>% sort()
colour_lookup <-
union(dt_bind[, group_from], dt_bind[, group_to]) %>% sort() %>%
structure(seq_along(.) + 1, names = .)
sector_colours <- str_replace_all(sectors_f, '.[a-d][0-9]+', '') %>%
colour_lookup[.]
# Gaps between sectors ----------------------------------------------------
gap_sizes <- c(0.0, 1.0)
gap_degree <-
sapply(table(names(sector_colours)), function(i) c(rep(gap_sizes[1], i-1), gap_sizes[2])) %>%
unlist() %>% unname()
# gap_degree <- rep(0, length(sectors_f)) # Or no gap
# Each "sector" is a separate patient/cell/feature combination
circos.par(gap.degree = gap_degree)
circos.initialize(sectors_f, xlim = c(0, 1))
circos.trackPlotRegion(ylim = c(0, 1), track.height = 0.05, bg.col = sector_colours, bg.border = NA)
for(i in 1:nrow(dt_bind)) {
row_i <- dt_bind[i, ]
circos.link(
row_i[['group_from_f']], c(0, 1),
row_i[['group_to_f']], c(0, 1),
border = NA, col = row_i[['colour']]
)
}
# "Feature" labels
circos.trackPlotRegion(track.index = 2, ylim = c(0, 1), panel.fun = function(x, y) {
sector.index = get.cell.meta.data("sector.index")
circos.text(0.5, 0.25, sector.index, col = "white", cex = 0.6, facing = "clockwise", niceFacing = TRUE)
}, bg.border = NA)
# "Patient/cell" labels
for(s in names(colour_lookup)) {
sectors <- sectors_f %>% { .[str_detect(., s)] }
highlight.sector(
sector.index = sectors, track.index = 1, col = colour_lookup[s],
text = s, text.vjust = -1, niceFacing = TRUE)
}
circos.clear()
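The "sorted key" step used in the script above, which keeps each unordered pair of groups only once per feature, can be sketched in base R without data.table. This is an alternative illustration of the same idea, not the code used above; the `pairs` data frame and its values are made up.

```r
# Toy self-join result: each shared feature appears once per direction
pairs <- data.frame(
  group_from = c("A", "B", "A", "C"),
  group_to   = c("B", "A", "C", "A"),
  feature    = c("f1", "f1", "f2", "f2"),
  stringsAsFactors = FALSE
)

# Build a direction-independent key by putting the smaller name first
pairs$key <- paste(pmin(pairs$group_from, pairs$group_to),
                   pmax(pairs$group_from, pairs$group_to), sep = "|")

# Keep the first row for each (key, feature) combination
unique_pairs <- pairs[!duplicated(pairs[, c("key", "feature")]), ]
print(unique_pairs)
```

Because A-to-B and B-to-A share the same key, only one link per pair and feature remains to be drawn.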
# counts of unique and shared features ------------------------------------
xlims <- dt_split[, .N, by = group_from][, .(x_from = 0, x_to = N)] %>% as.matrix()
links <- dt_join[, .N, by = .(group_from, group_to)]
colours <- dt_split[, unique(group_from)] %>% structure(seq_along(.) + 1, names = .)
library(circlize)
sectors = names(colours)
circos.par(cell.padding = c(0, 0, 0, 0))
circos.initialize(sectors, xlim = xlims)
circos.trackPlotRegion(ylim = c(0, 1), track.height = 0.05, bg.col = colours, bg.border = NA)
for(i in 1:nrow(links)) {
link <- links[i, ]
circos.link(link[[1]], c(0, link[[3]]), link[[2]], c(0, link[[3]]), col = '#00000025', border = NA)
}
# "Patient/cell" labels
for(s in sectors) {
highlight.sector(
sector.index = s, track.index = 1, col = colours[s],
text = s, text.vjust = -1, niceFacing = TRUE)
}
circos.clear()
A work by Matteo Cereda and Fabio Iannelli