The CpG sites (i.e. 5’C—phosphate—G3’) are regions of DNA where a cytosine is followed by a guanine nucleotide in the linear sequence of bases along its 5’ → 3’ direction.
Often, the cytosines in CpG dinucleotides are methylated (5-methylcytosines). In mammals, 70% to 80% of CpG cytosines are methylated. Methylating the cytosine within a gene can influence its expression.
CpG islands (also called CG-Rich Islands) are regions with a high frequency of CpG sites. In humans, about 70% of promoters located near the transcription start site of a gene contain a CpG island.
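Read the file CpGi.table.hg18.csv (you can find it in the Datasets folder) into a variable called cpgi. This file was downloaded from the compGenomRData package and is a comma-separated file.
Using the str( ) function, explore the cpgi dataframe.
# Path to the course repository on the author's machine; change it to match yours
DATA_DIR='/Users/tbecchi/Desktop/repository/BDSB/'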
cpgi=read.csv(paste0(DATA_DIR,"Exercises/Ex_per_date/Datasets/CpGi.table.hg18.csv"))
str(cpgi)
## 'data.frame': 28226 obs. of 10 variables:
## $ chrom : chr "chr1" "chr1" "chr1" "chr1" ...
## $ chromStart: int 18598 124987 317653 427014 439136 523082 534601 703847 752279 778726 ...
## $ chromEnd : int 19673 125426 318092 428027 440407 523977 536512 704410 753308 779074 ...
## $ name : chr "CpG: 116" "CpG: 30" "CpG: 29" "CpG: 84" ...
## $ length : int 1075 439 439 1013 1271 895 1911 563 1029 348 ...
## $ cpgNum : int 116 30 29 84 99 94 171 60 115 28 ...
## $ gcNum : int 787 295 295 734 777 570 1405 385 673 192 ...
## $ perCpg : num 21.6 13.7 13.2 16.6 15.6 21 17.9 21.3 22.4 16.1 ...
## $ perGc : num 73.2 67.2 67.2 72.5 61.1 63.7 73.5 68.4 65.4 55.2 ...
## $ obsExp : num 0.83 0.64 0.62 0.64 0.84 1.04 0.67 0.92 1.07 1.06 ...
dim(cpgi)
## [1] 28226 10
Show the first rows of cpgi using the head( ) function.
head(cpgi)
## chrom chromStart chromEnd name length cpgNum gcNum perCpg perGc obsExp
## 1 chr1 18598 19673 CpG: 116 1075 116 787 21.6 73.2 0.83
## 2 chr1 124987 125426 CpG: 30 439 30 295 13.7 67.2 0.64
## 3 chr1 317653 318092 CpG: 29 439 29 295 13.2 67.2 0.62
## 4 chr1 427014 428027 CpG: 84 1013 84 734 16.6 72.5 0.64
## 5 chr1 439136 440407 CpG: 99 1271 99 777 15.6 61.1 0.84
## 6 chr1 523082 523977 CpG: 94 895 94 570 21.0 63.7 1.04
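What happens if you set stringsAsFactors=TRUE in the reading function? Use again the str( ) function.
# With stringsAsFactors=TRUE the character columns (chrom and name) are read as factors
cpgi=read.csv("Datasets/CpGi.table.hg18.csv", stringsAsFactors=TRUE )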
str(cpgi)
## 'data.frame': 28226 obs. of 10 variables:
## $ chrom : Factor w/ 45 levels "chr1","chr1_random",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ chromStart: int 18598 124987 317653 427014 439136 523082 534601 703847 752279 778726 ...
## $ chromEnd : int 19673 125426 318092 428027 440407 523977 536512 704410 753308 779074 ...
## $ name : Factor w/ 439 levels "CpG: 100","CpG: 101",..: 18 222 213 421 439 432 80 382 17 202 ...
## $ length : int 1075 439 439 1013 1271 895 1911 563 1029 348 ...
## $ cpgNum : int 116 30 29 84 99 94 171 60 115 28 ...
## $ gcNum : int 787 295 295 734 777 570 1405 385 673 192 ...
## $ perCpg : num 21.6 13.7 13.2 16.6 15.6 21 17.9 21.3 22.4 16.1 ...
## $ perGc : num 73.2 67.2 67.2 72.5 61.1 63.7 73.5 68.4 65.4 55.2 ...
## $ obsExp : num 0.83 0.64 0.62 0.64 0.84 1.04 0.67 0.92 1.07 1.06 ...
# nrows=10 reads only the first 10 data rows
cpgi=read.csv("Datasets/CpGi.table.hg18.csv", nrows=10 )
dim(cpgi)
## [1] 10 10
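# skip=10 drops the first 10 lines of the file (header included), so the next data row is taken as column names
cpgi=read.csv("Datasets/CpGi.table.hg18.csv", skip =10 )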
head(cpgi)
## chr1 X778726 X779074 CpG..28 X348 X28 X192 X16.1 X55.2 X1.06
## 1 chr1 791838 792201 CpG: 24 363 24 243 13.2 66.9 0.79
## 2 chr1 795061 795491 CpG: 50 430 50 316 23.3 73.5 0.87
## 3 chr1 829557 830482 CpG: 83 925 83 525 17.9 56.8 1.11
## 4 chr1 834162 835746 CpG: 153 1584 153 1083 19.3 68.4 0.85
## 5 chr1 844628 844836 CpG: 16 208 16 140 15.4 67.3 0.68
## 6 chr1 848833 851495 CpG: 257 2662 257 1642 19.3 61.7 1.02
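Read the file again with header=FALSE. What happens?
# The header line is read as the first data row and the columns get the default names V1 to V10
cpgi=read.csv("Datasets/CpGi.table.hg18.csv", header=FALSE )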
head(cpgi)
## V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
## 1 chrom chromStart chromEnd name length cpgNum gcNum perCpg perGc obsExp
## 2 chr1 18598 19673 CpG: 116 1075 116 787 21.6 73.2 0.83
## 3 chr1 124987 125426 CpG: 30 439 30 295 13.7 67.2 0.64
## 4 chr1 317653 318092 CpG: 29 439 29 295 13.2 67.2 0.62
## 5 chr1 427014 428027 CpG: 84 1013 84 734 16.6 72.5 0.64
## 6 chr1 439136 440407 CpG: 99 1271 99 777 15.6 61.1 0.84
cpgi=read.csv("Datasets/CpGi.table.hg18.csv", header=TRUE)
head(cpgi)
## chrom chromStart chromEnd name length cpgNum gcNum perCpg perGc obsExp
## 1 chr1 18598 19673 CpG: 116 1075 116 787 21.6 73.2 0.83
## 2 chr1 124987 125426 CpG: 30 439 30 295 13.7 67.2 0.64
## 3 chr1 317653 318092 CpG: 29 439 29 295 13.2 67.2 0.62
## 4 chr1 427014 428027 CpG: 84 1013 84 734 16.6 72.5 0.64
## 5 chr1 439136 440407 CpG: 99 1271 99 777 15.6 61.1 0.84
## 6 chr1 523082 523977 CpG: 94 895 94 570 21.0 63.7 1.04
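The effect of these read.csv() arguments can be checked on a small throwaway file. The sketch below is only illustrative: the file is a temporary one created with tempfile(), and the toy columns are made up for the example, not taken from the CpG island table.

```r
# Write a tiny CSV to a temporary file (toy data, not the real table)
toy <- data.frame(chrom = c("chr1", "chr1", "chr2"),
                  length = c(100L, 250L, 80L))
toy_file <- tempfile(fileext = ".csv")
write.csv(toy, toy_file, row.names = FALSE)

# Default: the first line is used as header
full <- read.csv(toy_file)

# nrows limits the number of data rows that are read
first_two <- read.csv(toy_file, nrows = 2)

# skip drops lines before reading, header line included,
# so the first remaining data line becomes the header
skipped <- read.csv(toy_file, skip = 1)

# header=FALSE keeps the header line as an ordinary data row,
# which forces every column to be read as character
no_header <- read.csv(toy_file, header = FALSE)

# stringsAsFactors=TRUE converts character columns to factors
as_factor <- read.csv(toy_file, stringsAsFactors = TRUE)
```

Comparing str() on full, no_header and as_factor shows the same differences seen above on the real table.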
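Save the cpgi dataframe as an rds file with saveRDS(). You can also give a path such as file="~/filename.rds", as in Linux ~/ denotes the home folder. (Notice that Windows writes paths with a backslash (\) instead of a slash (/); in R you can still use the slash, or write the backslash doubled as \\.)
saveRDS(cpgi, "cpgi.rds")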
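Save the cpgi dataframe as a txt file. Make sure to use the quote=FALSE, sep="\t" and row.names=FALSE arguments. What do these arguments do?
# quote=FALSE: strings are written without surrounding quotes
# sep="\t": columns are separated by a tab
# row.names=FALSE: the row names are not written as an extra column
write.table(cpgi, "cpgi.txt", quote=FALSE, sep="\t", row.names=FALSE)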
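To see what the two saving formats actually store, a round trip on a toy data frame can help. This is a minimal sketch under stated assumptions: the data frame and the temporary file names are invented for the example.

```r
toy <- data.frame(name = c("CpG: 116", "CpG: 30"),
                  length = c(1075L, 439L))

# RDS keeps the R object exactly as it is (column types included)
rds_file <- tempfile(fileext = ".rds")
saveRDS(toy, rds_file)
from_rds <- readRDS(rds_file)

# Tab-separated text: no quotes around strings, no row names column
txt_file <- tempfile(fileext = ".txt")
write.table(toy, txt_file, quote = FALSE, sep = "\t", row.names = FALSE)
txt_lines <- readLines(txt_file)   # raw lines as written on disk

# Reading the text file back needs the same separator and header=TRUE
from_txt <- read.table(txt_file, header = TRUE, sep = "\t")
```

Because the name values contain a space, reading the text file back with the default separator (any whitespace) would split them; passing sep="\t" avoids that.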
Select the CpG islands located on chr1. HINT: subset cpgi using both [] (creating a logical vector with the == operator) and the subset() function.
cpgi=readRDS("cpgi.rds")
chr1=cpgi[cpgi$chrom=="chr1",]
chr1=subset(cpgi, chrom=="chr1")
Create a dataframe chr2 with CpG islands only on chr2. Save both chr1 and chr2 in an RData file. Then, remove both chr1 and chr2 variables from the R environment using rm(). What happens if you try to visualize chr1 now? Inspect the “Environment” tab.
chr2=subset(cpgi, chrom=="chr2")
save(chr1, chr2, file = "cpgi_chr1_chr2.RData")
rm(chr1)
rm(chr2)
chr1 # fails with: Error: object 'chr1' not found
load("cpgi_chr1_chr2.RData")
head(chr1) # load() restores the objects under the names they were saved with
## chrom chromStart chromEnd name length cpgNum gcNum perCpg perGc obsExp
## 1 chr1 18598 19673 CpG: 116 1075 116 787 21.6 73.2 0.83
## 2 chr1 124987 125426 CpG: 30 439 30 295 13.7 67.2 0.64
## 3 chr1 317653 318092 CpG: 29 439 29 295 13.2 67.2 0.62
## 4 chr1 427014 428027 CpG: 84 1013 84 734 16.6 72.5 0.64
## 5 chr1 439136 440407 CpG: 99 1271 99 777 15.6 61.1 0.84
## 6 chr1 523082 523977 CpG: 94 895 94 570 21.0 63.7 1.04
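The save(), rm() and load() cycle can be tried on two small objects. The names toy_a and toy_b and the temporary file are hypothetical, chosen only for this sketch.

```r
toy_a <- 1:3
toy_b <- letters[1:2]

# save() stores several objects, together with their names
rdata_file <- tempfile(fileext = ".RData")
save(toy_a, toy_b, file = rdata_file)

# rm() deletes them from the environment
rm(toy_a, toy_b)
gone <- !exists("toy_a")   # TRUE: using toy_a now would raise "object not found"

# load() recreates the objects under their original names
# and invisibly returns those names
restored <- load(rdata_file)
```

This is the main difference from saveRDS() and readRDS(), which store a single object without its name, so the name is chosen again when reading.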
In your environment you have “chr1” and “chr2” again. Let’s work on them!
Create a vector length_chr1 that contains values in column length in chr1
Create a vector length_chr2 that contains values in column length in chr2
Compute the quantiles of length_chr1 and length_chr2 (HINT: use quantile() function).
Compute mean, median and standard deviation of length_chr1 and length_chr2. Are there differences between the two? Comment on this, also considering quantiles.
Create three subsets of chr1: chr1_small, chr1_medium and chr1_large by using quantile values (<=30%, >30% and <=60%, and >60%)
Create three subsets of chr2: chr2_small, chr2_medium and chr2_large by using quantile values (<=30%, >30% and <=60%, and >60%)
Count how many times each name is repeated in chr1_small
Count how many times each name is repeated in chr2_large
In the cpgi dataframe, how many CpGs do you have per chromosome?
Create a vector casual with normally distributed numbers whose length is equal to the number of unique values of chr1_small$name
Set the names of casual as the unique values of chr1_small$name
Sort casual from the bigger to the smaller value
Create a column Gaussian in chr1_small by matching the column name with casual names.
Create a column Gaussian in chr2_small by matching the column name with casual names. Are there missing values? If yes, how many? To how many unique names do they correspond?
Filter chr2_small by excluding rows that have missing values in Gaussian
Save chr2_small and chr1_small into an RData file
Create a list dat that contains:
  the rows of chr1_large for which length is greater than 25000
  the number of values of length in chr2_large that are greater than 25000
  the last four columns of chr1_small
Check whether c("CpG: 45","CpG: 52", "CpG: 108") are present in the name column of chr2_medium
Find the positions in the name column in chr2_medium that correspond to c("CpG: 45","CpG: 52", "CpG: 108")
Extract from the dat list all columns except the third and rows that contain perGc values between 53 and 62
Find the positions of name in chr2_large that contain 2
Create a logical vector lo evaluating if values in name column in chr1_large contain number 3
Transform lo in a matrix with 7 columns. You will obtain a warning… why?
Add lo to dat and overwrite the file you saved before with the new list
length_chr1=chr1$length
length_chr2=chr2$length
q1=quantile(length_chr1, probs=seq(0,1,0.1))
q2=quantile(length_chr2, probs=seq(0,1,0.1))
mean(length_chr1)
## [1] 764.1153
mean(length_chr2)
## [1] 816.2726
median(length_chr1)
## [1] 562
median(length_chr2)
## [1] 626
sd(length_chr1)
## [1] 1197.294
sd(length_chr2)
## [1] 697.7922
# chr1 has shorter CpG islands in mean and median, but larger fluctuations (its standard deviation is bigger)
# Looking at the quantiles, the minimum length is equal, suggesting a cutoff for identifying islands. Up to the 90% quantile chr2 contains bigger islands, but chr1 has some longer islands (its 100% value is bigger)
chr1_small=subset(chr1, length<=q1["30%"])
nrow(chr1_small)
## [1] 741
chr1_medium=subset(chr1, length>q1["30%"] & length<=q1["60%"])
nrow(chr1_medium)
## [1] 738
chr1_large=subset(chr1, length>q1["60%"])
nrow(chr1_large)
## [1] 984
chr2_small=subset(chr2, length<=q2["30%"])
nrow(chr2_small)
## [1] 506
chr2_medium=subset(chr2, length>q2["30%"] & length<=q2["60%"])
nrow(chr2_medium)
## [1] 503
chr2_large=subset(chr2, length>q2["60%"])
nrow(chr2_large)
## [1] 671
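As an alternative to three separate subset() calls, the same three-way split can be sketched with cut() and split(). This is not the approach used in the solution above, just an equivalent formulation; the toy data frame and the column name size are assumptions made for the example.

```r
set.seed(1)
toy <- data.frame(length = sample(200:3000, 50))

# 30% and 60% quantiles, as in the exercise
q <- quantile(toy$length, probs = c(0.3, 0.6))

# cut() uses right-closed intervals by default:
# (-Inf, q30], (q30, q60], (q60, Inf)
toy$size <- cut(toy$length,
                breaks = c(-Inf, q, Inf),
                labels = c("small", "medium", "large"))

# split() returns one data frame per class
groups <- split(toy, toy$size)
group_sizes <- sapply(groups, nrow)
```

The right-closed intervals reproduce the <= and > conditions of the exercise, so every row falls in exactly one group.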
table(chr1_small$name)
##
## CpG: 14 CpG: 15 CpG: 16 CpG: 17 CpG: 18 CpG: 19 CpG: 20 CpG: 21 CpG: 22 CpG: 23
## 2 18 26 30 35 47 48 49 49 58
## CpG: 24 CpG: 25 CpG: 26 CpG: 27 CpG: 28 CpG: 29 CpG: 30 CpG: 31 CpG: 32 CpG: 33
## 50 51 40 38 27 21 28 13 13 18
## CpG: 34 CpG: 35 CpG: 36 CpG: 37 CpG: 38 CpG: 39 CpG: 40 CpG: 41 CpG: 42 CpG: 43
## 14 14 12 8 13 3 2 3 1 2
## CpG: 44 CpG: 45 CpG: 47 CpG: 49 CpG: 50
## 2 2 2 1 1
table(chr2_large$name)
##
## CpG: 100 CpG: 101 CpG: 102 CpG: 103 CpG: 104 CpG: 105 CpG: 106 CpG: 107
## 5 6 6 7 7 9 8 6
## CpG: 108 CpG: 109 CpG: 110 CpG: 111 CpG: 112 CpG: 113 CpG: 114 CpG: 115
## 5 5 5 6 6 6 6 10
## CpG: 116 CpG: 117 CpG: 118 CpG: 119 CpG: 120 CpG: 121 CpG: 122 CpG: 123
## 9 5 6 7 7 6 4 4
## CpG: 124 CpG: 125 CpG: 126 CpG: 127 CpG: 128 CpG: 129 CpG: 130 CpG: 131
## 6 5 4 5 7 4 1 7
## CpG: 132 CpG: 133 CpG: 134 CpG: 135 CpG: 136 CpG: 137 CpG: 138 CpG: 139
## 4 6 2 5 6 2 2 8
## CpG: 140 CpG: 141 CpG: 142 CpG: 143 CpG: 144 CpG: 145 CpG: 146 CpG: 147
## 1 3 3 4 2 1 5 2
## CpG: 148 CpG: 149 CpG: 150 CpG: 151 CpG: 152 CpG: 153 CpG: 154 CpG: 155
## 5 1 3 3 1 5 2 3
## CpG: 157 CpG: 158 CpG: 159 CpG: 160 CpG: 161 CpG: 163 CpG: 164 CpG: 166
## 2 1 2 2 4 4 2 3
## CpG: 167 CpG: 168 CpG: 169 CpG: 170 CpG: 171 CpG: 172 CpG: 173 CpG: 174
## 1 1 4 2 2 3 1 4
## CpG: 175 CpG: 176 CpG: 177 CpG: 179 CpG: 180 CpG: 181 CpG: 182 CpG: 183
## 1 4 2 2 4 2 2 3
## CpG: 184 CpG: 185 CpG: 186 CpG: 187 CpG: 188 CpG: 190 CpG: 191 CpG: 192
## 3 5 3 2 1 2 1 1
## CpG: 194 CpG: 195 CpG: 196 CpG: 197 CpG: 198 CpG: 201 CpG: 203 CpG: 206
## 1 1 2 2 4 1 1 1
## CpG: 207 CpG: 208 CpG: 210 CpG: 212 CpG: 213 CpG: 214 CpG: 221 CpG: 222
## 1 2 2 2 1 2 1 1
## CpG: 223 CpG: 224 CpG: 227 CpG: 228 CpG: 229 CpG: 230 CpG: 235 CpG: 236
## 1 1 1 2 1 1 1 1
## CpG: 238 CpG: 242 CpG: 243 CpG: 244 CpG: 246 CpG: 248 CpG: 249 CpG: 251
## 3 1 1 1 1 1 1 1
## CpG: 253 CpG: 256 CpG: 257 CpG: 263 CpG: 266 CpG: 275 CpG: 284 CpG: 285
## 1 1 1 1 1 1 1 1
## CpG: 288 CpG: 289 CpG: 290 CpG: 295 CpG: 300 CpG: 308 CpG: 309 CpG: 329
## 2 1 1 1 1 1 1 1
## CpG: 331 CpG: 334 CpG: 339 CpG: 341 CpG: 354 CpG: 406 CpG: 423 CpG: 468
## 1 1 1 1 1 1 1 1
## CpG: 50 CpG: 504 CpG: 518 CpG: 53 CpG: 55 CpG: 56 CpG: 57 CpG: 59
## 1 1 1 1 1 2 1 3
## CpG: 60 CpG: 61 CpG: 615 CpG: 62 CpG: 63 CpG: 64 CpG: 65 CpG: 66
## 4 2 1 2 2 4 3 2
## CpG: 67 CpG: 68 CpG: 69 CpG: 70 CpG: 71 CpG: 72 CpG: 73 CpG: 74
## 3 4 5 4 4 5 3 7
## CpG: 75 CpG: 76 CpG: 77 CpG: 78 CpG: 79 CpG: 80 CpG: 81 CpG: 82
## 10 7 8 5 10 4 8 7
## CpG: 83 CpG: 84 CpG: 85 CpG: 86 CpG: 87 CpG: 88 CpG: 89 CpG: 90
## 7 10 5 4 8 16 10 5
## CpG: 91 CpG: 92 CpG: 93 CpG: 94 CpG: 95 CpG: 96 CpG: 97 CpG: 98
## 5 8 9 7 12 10 4 7
## CpG: 99
## 5
table(cpgi$chrom)
##
## chr1 chr1_random chr10 chr10_random chr11
## 2463 31 1150 1 1371
## chr11_random chr12 chr13 chr13_random chr14
## 3 1221 605 2 788
## chr15 chr15_random chr16 chr16_random chr17
## 787 20 1491 3 1622
## chr17_random chr18 chr19 chr2 chr2_random
## 67 508 2544 1680 10
## chr20 chr21 chr21_random chr22 chr22_random
## 799 356 19 716 5
## chr3 chr3_random chr4 chr4_random chr5
## 1159 2 1019 21 1227
## chr5_h2_hap1 chr5_random chr6 chr6_cox_hap1 chr6_qbl_hap2
## 13 2 1251 142 137
## chr6_random chr7 chr7_random chr8 chr8_random
## 19 1552 23 1028 8
## chr9 chr9_random chrX chrX_random chrY
## 1230 17 891 42 181
casual=rnorm(length(unique(chr1_small$name)))
names(casual)=unique(chr1_small$name)
casual=casual[order(casual, decreasing = TRUE)]
chr1_small$Gaussian=casual[match(chr1_small$name, names(casual))]
chr2_small$Gaussian=casual[match(chr2_small$name, names(casual))]
unique(is.na(chr2_small$Gaussian))
## [1] FALSE TRUE
length(unique(subset(chr2_small, is.na(Gaussian))$name))
## [1] 5
# Keep only the rows where Gaussian is not missing
chr2_small=subset(chr2_small, !is.na(Gaussian))
chr2_small$Gaussian<-NULL
save(chr2_small, chr1_small, file = "chr1Small_chr2Small.RData")
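The lookup with match() and the handling of the resulting missing values can be followed step by step on a toy example. The names lookup, df and value are hypothetical, and the lookup values are fixed numbers instead of random ones so that the result is easy to check.

```r
# Lookup table: one value per name (toy values, not random)
lookup <- c("CpG: 20" = 0.5, "CpG: 21" = -1.2)

df <- data.frame(name = c("CpG: 21", "CpG: 20", "CpG: 99", "CpG: 21"))

# match() gives, for each name, its position in names(lookup), or NA if absent
df$value <- unname(lookup[match(df$name, names(lookup))])

n_missing <- sum(is.na(df$value))                   # number of rows with NA
missing_names <- unique(df$name[is.na(df$value)])   # names without a match

# Keep only the rows that found a match
df_complete <- subset(df, !is.na(value))
```

Here sum(is.na(...)) answers "how many missing values", while unique() on the unmatched names answers "how many unique names".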
dat=list(subs=subset(chr1_large,length>25000 ), num=sum(chr2_large$length>25000), mat=as.matrix(chr1_small[, (ncol(chr1_small)-3):ncol(chr1_small)]))
saveRDS(dat, "Lista.rds")
typeof(dat$num)
## [1] "integer"
c("CpG: 45","CpG: 52", "CpG: 108")%in%chr2_medium$name
## [1] TRUE TRUE FALSE
match(chr2_medium$name, c("CpG: 45","CpG: 52", "CpG: 108"))
## [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
## [26] NA NA NA NA NA NA NA NA NA NA NA 1 NA NA NA NA NA NA NA NA NA NA NA NA NA
## [51] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
## [76] NA 2 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 NA
## [101] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
## [126] NA NA NA NA NA NA NA NA 2 NA 2 NA NA NA NA NA NA NA NA NA NA NA NA NA NA
## [151] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
## [176] NA NA NA NA NA NA 1 NA NA 2 NA NA NA NA 1 NA NA NA NA NA NA NA NA NA NA
## [201] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 NA NA NA NA NA NA NA
## [226] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
## [251] NA NA 2 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
## [276] NA NA NA NA NA NA NA NA NA NA 2 NA NA NA NA NA 2 NA NA NA NA NA NA NA 1
## [301] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
## [326] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 NA NA NA NA NA NA NA
## [351] NA NA NA NA NA NA NA NA NA 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
## [376] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
## [401] NA NA NA 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
## [426] NA NA NA NA NA NA NA 1 NA NA NA NA NA NA NA NA NA NA 1 NA NA NA NA NA NA
## [451] NA NA NA NA NA NA NA NA NA NA NA NA 2 NA NA NA NA NA NA NA NA NA NA NA NA
## [476] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
## [501] NA NA NA
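The difference between %in%, match() and which() is easier to see on a short vector. The sketch below uses made-up values; names_col and wanted are hypothetical names introduced for the example.

```r
names_col <- c("CpG: 10", "CpG: 45", "CpG: 52", "CpG: 45")
wanted <- c("CpG: 45", "CpG: 52", "CpG: 108")

# Is each wanted value present at least once in the column?
present <- wanted %in% names_col

# For each element of the column: its index in wanted, or NA
idx <- match(names_col, wanted)

# Only the row positions that match one of the wanted values
hits <- which(names_col %in% wanted)
```

When only the positions are needed, which() combined with %in% gives a compact result instead of a long vector that is mostly NA.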
dat$mat[which(dat$mat[,"perGc"]>=53 & dat$mat[,"perGc"]<=62),-3]
## perCpg perGc Gaussian
## 10 16.1 55.2 1.03209945
## 25 17.0 61.6 0.23240908
## 47 16.9 61.1 0.76342535
## 48 19.7 58.4 -1.23314906
## 80 17.9 56.1 1.29513448
## 88 17.9 60.4 0.76342535
## 110 17.9 60.0 1.13433839
## 115 14.2 57.3 -0.05187486
## 129 15.7 61.7 0.80137753
## 135 15.7 61.7 0.80137753
## 155 14.6 61.7 -0.95276891
## 169 15.8 60.0 1.05810217
## 194 14.3 61.5 0.23240908
## 207 17.0 60.8 -0.59044507
## 228 15.4 58.7 -0.05187486
## 230 16.2 56.8 1.74230946
## 231 13.8 59.9 -0.28435228
## 238 14.9 60.7 -0.95276891
## 242 18.2 59.7 1.13433839
## 244 15.9 60.3 0.80137753
## 245 16.2 62.0 1.28946852
## 289 15.3 61.9 1.74230946
## 293 14.4 61.3 1.13433839
## 297 15.2 61.2 1.74230946
## 314 16.6 59.3 1.29513448
## 317 15.5 58.6 1.05810217
## 364 14.6 59.2 -0.95276891
## 407 17.4 58.9 1.74230946
## 421 14.2 56.3 1.13433839
## 468 14.7 60.8 1.13433839
## 472 15.0 62.0 1.29513448
## 490 18.8 60.8 0.80137753
## 491 15.4 61.8 -0.82738477
## 509 14.8 61.9 1.74230946
## 543 17.1 61.9 1.74230946
## 544 17.3 61.7 -0.59044507
## 564 17.9 54.7 0.23240908
## 662 13.0 54.6 -1.32768870
## 711 21.0 54.8 -0.82738477
## 752 16.8 61.1 0.80137753
## 754 17.9 58.3 0.80137753
## 761 22.0 61.4 -0.95874310
## 779 14.8 61.7 -0.28435228
## 822 20.9 59.5 0.03592213
## 825 22.7 61.8 0.80137753
## 837 18.9 60.6 0.76342535
## 876 15.9 59.7 -0.05187486
## 902 16.0 59.1 0.23240908
## 950 14.4 60.6 1.29513448
## 956 17.2 61.3 -1.32768870
## 991 14.4 56.5 -0.95276891
## 1081 14.7 60.8 1.29513448
## 1092 16.1 61.8 1.29513448
## 1156 15.2 59.8 1.03209945
## 1183 15.4 56.3 -1.32768870
## 1188 15.3 56.8 -0.74235710
## 1221 16.6 60.0 0.80137753
## 1226 16.6 60.8 -0.59044507
## 1227 17.7 62.0 1.13433839
## 1235 16.2 53.5 1.13433839
## 1239 13.5 55.6 1.29513448
## 1243 16.0 58.7 1.03209945
## 1253 18.0 60.9 -0.28435228
## 1259 15.2 54.6 -0.28435228
## 1274 15.6 57.5 0.76342535
## 1286 21.7 61.3 -0.28435228
## 1293 15.2 61.0 1.29513448
## 1310 15.0 61.0 1.29513448
## 1311 14.9 56.9 -0.74235710
## 1314 14.6 59.4 0.23240908
## 1324 15.0 60.6 -1.32768870
## 1326 16.4 61.2 -0.82738477
## 1341 16.1 58.8 0.76342535
## 1350 19.3 59.0 1.28946852
## 1420 17.9 62.0 -1.23314906
## 1427 15.2 61.4 -0.05187486
## 1493 18.5 61.0 0.23240908
## 1541 15.2 57.7 -0.74235710
## 1546 15.3 60.6 0.23240908
## 1549 14.4 60.5 -0.74235710
## 1558 18.4 61.5 -0.74235710
## 1559 16.1 61.2 1.74230946
## 1575 14.4 60.5 -0.74235710
## 1584 18.4 61.5 -0.74235710
## 1585 16.1 61.2 1.74230946
## 1594 14.1 60.4 0.23240908
## 1595 14.5 61.1 -0.05187486
## 1608 15.6 59.9 -0.82738477
## 1609 16.8 58.0 1.03209945
## 1611 15.6 59.9 -0.82738477
## 1616 15.1 60.0 1.05810217
## 1620 14.9 62.0 0.23240908
## 1657 16.2 58.5 -0.82738477
## 1659 16.4 55.1 0.76342535
## 1668 19.1 61.4 1.13433839
## 1678 20.8 58.9 1.13433839
## 1682 17.0 62.0 -0.28435228
## 1697 17.2 61.1 0.23240908
## 1706 18.0 55.9 0.23240908
## 1727 16.2 56.8 1.74230946
## 1734 15.1 60.8 1.13433839
## 1737 14.6 55.2 0.23240908
## 1761 19.4 58.8 1.13433839
## 1765 18.5 61.3 -1.23314906
## 1772 16.8 61.5 0.80137753
## 1806 15.5 60.6 0.80137753
## 1823 19.3 61.1 -0.74235710
## 1831 17.4 60.3 1.13433839
## 1858 14.7 54.3 1.05810217
## 1884 17.8 60.0 1.28946852
## 1907 14.1 59.2 -0.95276891
## 1913 14.7 60.5 0.76342535
## 1917 16.1 60.5 1.29513448
## 1972 15.1 61.8 1.74230946
## 1994 23.3 60.5 -0.74235710
## 2028 18.3 61.5 -1.32768870
## 2056 17.8 61.3 -1.32768870
## 2074 14.2 58.4 -0.05187486
## 2089 13.2 61.0 -0.95276891
## 2137 18.1 57.5 0.80137753
## 2172 21.2 57.6 0.76342535
## 2173 15.6 58.7 1.05810217
## 2183 22.1 57.1 0.76342535
## 2190 16.3 59.5 0.76342535
## 2196 19.2 59.4 1.13433839
## 2198 18.4 61.4 1.13433839
## 2209 14.4 57.2 -1.32768870
## 2212 19.0 61.1 1.13433839
## 2218 17.7 55.8 -0.82738477
## 2221 18.6 61.8 0.23240908
## 2223 21.5 58.4 -0.08993420
## 2286 22.5 61.7 -1.11316424
## 2330 15.2 60.1 1.13433839
## 2333 13.8 61.7 -1.32768870
## 2355 16.2 61.4 1.05810217
## 2372 14.4 58.7 -0.95276891
## 2416 14.4 59.0 -0.05187486
## 2427 16.1 61.1 1.05810217
grep("2", chr2_large$name)
## [1] 1 4 6 24 25 26 37 43 45 47 51 52 53 55 60 63 66 68
## [19] 69 75 79 86 91 94 96 108 113 115 119 122 126 130 133 141 142 143
## [37] 150 152 156 157 182 183 186 190 191 201 208 215 222 226 232 239 248 258
## [55] 261 262 263 264 266 267 269 273 274 279 285 287 289 292 298 303 306 313
## [73] 315 320 321 323 324 326 327 339 345 348 351 352 357 359 365 367 368 369
## [91] 377 384 398 406 417 421 425 426 430 433 436 437 443 452 457 458 467 475
## [109] 481 484 489 492 503 507 524 527 535 541 544 546 552 556 558 569 571 574
## [127] 576 577 584 587 589 592 595 599 604 607 608 614 619 623 628 632 637 640
## [145] 646 651 661 666 669
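# Note that this solution tests the names in chr2_large (671 rows)
lo=grepl("3", chr2_large$name)
# 671 values cannot fill a 7-column matrix exactly: R needs 96 rows (96 x 7 = 672 cells),
# so it recycles the first value to fill the last cell and raises the warning below
lo=matrix(lo, ncol = 7)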
## Warning in matrix(lo, ncol = 7): data length [671] is not a sub-multiple or
## multiple of the number of rows [96]
dat$lo=lo
saveRDS(dat, "Lista.rds")
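The recycling behind the matrix() warning can be reproduced with a short vector. This is a minimal sketch with a toy vector x, not part of the original solution.

```r
x <- 1:10   # 10 values, not a multiple of 3

# matrix() needs ceiling(10 / 3) = 4 rows, i.e. 12 cells;
# the 2 missing values are taken again from the start of x
msg <- tryCatch(matrix(x, ncol = 3),
                warning = function(w) conditionMessage(w))

m <- suppressWarnings(matrix(x, ncol = 3))
m_dim <- dim(m)          # 4 rows, 3 columns
recycled <- m[3:4, 3]    # the last two cells hold the recycled values 1 and 2
```

The matrix is still created, but its last cells do not correspond to real data, which is worth keeping in mind before using it.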