Create a dataframe age_weight from the vectors:
Age=c(0,1,2,3,4,5,6,7,8,9)
and
Weight=c(3.6,4.4,5.2,6,6.6,7.2,7.8,8.4,8.8,9.2)
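A minimal sketch of this first step, assuming the second column is meant to be called Weight (the name the later steps use):

```r
# Build the data frame from the two vectors given in the exercise
Age <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
Weight <- c(3.6, 4.4, 5.2, 6, 6.6, 7.2, 7.8, 8.4, 8.8, 9.2)
age_weight <- data.frame(Age = Age, Weight = Weight)

# Inspect the structure: 10 rows, 2 numeric columns
str(age_weight)
```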
Draw a scatterplot (using geom_point) of Age vs Weight. Hint: when defining your aesthetics, Age will be the x and Weight will be the y.
Color the border of all points with the same color by fixing it outside the aesthetic, and fix the point size at 3.
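One possible way to write this plot is sketched below. The color "steelblue" and the object name p_points are arbitrary choices for illustration. Note that with the default point shape, color applies to the whole point; shapes 21 to 25 have a separate border (color) and interior (fill).

```r
library(ggplot2)

age_weight <- data.frame(
  Age = c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9),
  Weight = c(3.6, 4.4, 5.2, 6, 6.6, 7.2, 7.8, 8.4, 8.8, 9.2)
)

# Age on x and Weight on y are mapped inside aes();
# color and size are set outside aes(), so they are the same for every point
p_points <- ggplot(age_weight, aes(x = Age, y = Weight)) +
  geom_point(color = "steelblue", size = 3)

p_points
```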
You can notice that a relationship exists between the two variables. Change the geometry to geom_line to see another way to represent this plot.
Combine the two plots by adding both a geom_line and a geom_point geometry to show both the individual points and the overall trend. Add a title to the plot.
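The combined plot could look like the sketch below. The title text and the object name p_both are placeholders, not part of the exercise.

```r
library(ggplot2)

age_weight <- data.frame(
  Age = c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9),
  Weight = c(3.6, 4.4, 5.2, 6, 6.6, 7.2, 7.8, 8.4, 8.8, 9.2)
)

# Layers are drawn in the order they are added:
# the line first, then the points on top of it
p_both <- ggplot(age_weight, aes(x = Age, y = Weight)) +
  geom_line() +
  geom_point(size = 3) +
  ggtitle("Weight by age")  # placeholder title

p_both
```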
Load the iris dataset from R (included in the datasets package that ships with R) and inspect the relationship between the sepal length and the sepal width. Which kind of plot can you use? Make it! If the type of plot you chose allows it, try changing colors, shapes and sizes.
If you haven’t already done it, rename the axis labels and add a proper title.
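A scatterplot is one natural choice for two continuous variables. The sketch below assumes that choice; the label texts and the mapping of Species to color and shape are illustrative only.

```r
library(ggplot2)

data(iris)  # iris ships with R in the datasets package

# Two continuous variables: a scatterplot, with species shown by color and shape
p_sepal <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(aes(color = Species, shape = Species), size = 2) +
  labs(
    x = "Sepal length (cm)",
    y = "Sepal width (cm)",
    title = "Iris sepal length vs sepal width"
  )

p_sepal
```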
Explore the distribution of petal width across different species of iris. Try:
notch=TRUE
Color the boxes according to species. What type of information does each plot give you?
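As a sketch of the boxplot variant, assuming geom_boxplot() is the intended geometry (its argument for notches is spelled notch):

```r
library(ggplot2)

# One box per species; fill is mapped to Species so each box gets its own color.
# notch = TRUE draws a notch around the median of each box.
p_box <- ggplot(iris, aes(x = Species, y = Petal.Width, fill = Species)) +
  geom_boxplot(notch = TRUE)

p_box
```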
See the 2d distribution by using petal length as x, sepal length as y and applying:
geom_hex()
stat_density_2d()
For examples and suggestions you can always consult the R Graph Gallery; in this case, see https://r-graph-gallery.com/2d-density-chart.html.
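Both variants can be sketched as follows. The value bins = 15 is an arbitrary choice, and geom_hex() needs the hexbin package to be installed when the plot is drawn.

```r
library(ggplot2)

base_2d <- ggplot(iris, aes(x = Petal.Length, y = Sepal.Length))

# Hexagonal binning: counts observations falling in each hexagon
# (requires the hexbin package at drawing time)
p_hex <- base_2d + geom_hex(bins = 15)

# 2d kernel density estimate drawn as contour lines, with the raw points on top
p_density <- base_2d +
  stat_density_2d() +
  geom_point(size = 1)

p_density
```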
Create a dataframe df from the following vectors:
person=c("Thomas", "Lisa", "Thomas", "Lisa", "Thomas", "Morris", "Morris", "Lisa", "Thomas", "Colin", "Colin", "Myrtha", "Colin", "Chloe", "Thomas", "Myrtha")
sport=c("yoga","yoga","tennis","crossfit","judo","football","ski","ski","weight_training","weight_training","power_lifting","pilates","nordic_walking","nordic_walking","nordic_walking","nordic_walking")
and then:
Using the table() function, make a summarized data frame df2 in which each person is associated with the number of sports he/she plays (Hint: use the as.data.frame() function)
+coord_flip()
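A sketch of the summary step is given below. The column names person and n_sports in df2 are my own choice, and the flipped bar chart is only a guess at how coord_flip() is meant to be used here.

```r
library(ggplot2)

person <- c("Thomas", "Lisa", "Thomas", "Lisa", "Thomas", "Morris", "Morris", "Lisa",
            "Thomas", "Colin", "Colin", "Myrtha", "Colin", "Chloe", "Thomas", "Myrtha")
sport <- c("yoga", "yoga", "tennis", "crossfit", "judo", "football", "ski", "ski",
           "weight_training", "weight_training", "power_lifting", "pilates",
           "nordic_walking", "nordic_walking", "nordic_walking", "nordic_walking")
df <- data.frame(person = person, sport = sport)

# table() counts the rows per person; as.data.frame() turns the counts into a data frame
df2 <- as.data.frame(table(df$person))
names(df2) <- c("person", "n_sports")  # assumed column names

# Assumed use of coord_flip(): a horizontal bar chart of the counts
p_counts <- ggplot(df2, aes(x = person, y = n_sports)) +
  geom_col() +
  coord_flip()

p_counts
```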
Add to df the column Times with the following commands:
set.seed(1234) # ensure reproducibility across randomization steps
df$Times=sample(c(1,2,3,4),nrow(df),replace = TRUE) # choose a random number in the range 1:4 for each row in df
Using person as x, Times as y and sport for fill, make a stacked barplot, a dodged barplot and a percentage barplot. Use col = "black" to better highlight the different groups.
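The three variants differ only in the position argument. The sketch below assumes geom_col() as the bar geometry (geom_bar(stat = "identity") is equivalent); the object names are illustrative.

```r
library(ggplot2)

person <- c("Thomas", "Lisa", "Thomas", "Lisa", "Thomas", "Morris", "Morris", "Lisa",
            "Thomas", "Colin", "Colin", "Myrtha", "Colin", "Chloe", "Thomas", "Myrtha")
sport <- c("yoga", "yoga", "tennis", "crossfit", "judo", "football", "ski", "ski",
           "weight_training", "weight_training", "power_lifting", "pilates",
           "nordic_walking", "nordic_walking", "nordic_walking", "nordic_walking")
df <- data.frame(person = person, sport = sport)

set.seed(1234)
df$Times <- sample(c(1, 2, 3, 4), nrow(df), replace = TRUE)

base_bar <- ggplot(df, aes(x = person, y = Times, fill = sport))

# Stacked (the default position): bars for each sport are piled on top of each other
p_stacked <- base_bar + geom_col(col = "black")

# Dodged: bars for each sport sit side by side
p_dodged <- base_bar + geom_col(col = "black", position = "dodge")

# Percentage: each person's stack is rescaled to a total of 1
p_percent <- base_bar + geom_col(col = "black", position = "fill")

p_stacked
```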
Using the iris dataset, make a scatterplot with the sepal length as x, the sepal width as y and the petal width as point size. Also:
fix shape=21
relate fill color to petal length (using fill=)
relate border color to iris species
fix a manual scale for fill colors (use +scale_fill_viridis(); you have to load the viridis library). Notice: there are lots of color functions adapted for ggplot2, or you can fix your own palettes using scale_fill_manual(). You will see some examples.
Choose a manual scale for border colors. Hint: you have to use +scale_color_manual() with three values, as there are three iris species, for example +scale_color_manual(values=c("magenta", "orange", "cyan")). You can also associate a specific color with a specific value, like this: +scale_color_manual(values=c("virginica"="magenta","versicolor"="orange", "setosa"="cyan")). Try some examples to gain confidence!
Add a fixed alpha value (transparency). Alpha accepts values between 0 and 1.
Apply theme_bw()
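Putting the steps of this exercise together might look like the sketch below. It assumes the viridis package is installed; alpha = 0.7 is an arbitrary value, and ggplot2's own scale_fill_viridis_c() would be an alternative that needs no extra package.

```r
library(ggplot2)
library(viridis)  # provides scale_fill_viridis()

p_full <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(
    # mapped aesthetics: they vary with the data
    aes(size = Petal.Width, fill = Petal.Length, color = Species),
    # fixed values: shape 21 has both a border (color) and an interior (fill)
    shape = 21,
    alpha = 0.7
  ) +
  scale_fill_viridis() +  # continuous fill scale for petal length
  scale_color_manual(
    values = c("virginica" = "magenta", "versicolor" = "orange", "setosa" = "cyan")
  ) +
  theme_bw()

p_full
```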
Make a scatterplot using Sepal.Length and Petal.Length as variables. Then add the correlation line using geom_smooth and method="lm".
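A short sketch of this plot, assuming Sepal.Length goes on x and Petal.Length on y (the exercise does not say which is which):

```r
library(ggplot2)

# method = "lm" fits a straight line by linear regression;
# the grey band is the confidence interval drawn by default
p_lm <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point() +
  geom_smooth(method = "lm")

p_lm
```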
Make the same plot using facets on the variable Species (try with both facet_wrap and facet_grid).
What are the differences if you use scales="free_x", scales="free_y" and scales="free" inside facet_wrap()?
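The faceting variants can be compared with a sketch like the one below; the object names are illustrative. By default all panels share the same axes; "free_x" lets the x range vary per panel, "free_y" the y range, and "free" both.

```r
library(ggplot2)

p_base <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point() +
  geom_smooth(method = "lm")

# One panel per species, wrapped into a grid of panels
p_wrap <- p_base + facet_wrap(~ Species)

# One panel per species, laid out as columns of a single row
p_grid <- p_base + facet_grid(. ~ Species)

# Each panel gets its own x and y ranges
p_free <- p_base + facet_wrap(~ Species, scales = "free")

p_free
```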