The base package in R allows nice graphs to be drawn, but more advanced packages give finer control and produce still nicer graphs. Two packages are mainly used, lattice and ggplot2; here I will present the basics of ggplot2 and the way it works.
The best way to understand how ggplot works is this quote:
“ggplot2 is designed to work in a layered fashion, starting with a layer showing the raw data then adding layers of annotation and statistical summaries. [..]” H.Wickham, ggplot2, Use R
More on the different components of the grammar of graphics
The most sensible way to work with ggplot is to create a basic object and then add layers and complexity to it as we go:
#load the data on fuel consumption of cars
#plotting mpg (miles per gallon) and wt (weight)
q<-ggplot(data=mtcars,aes(x=mpg,y=wt)) #store the basic information in an object "q"
#then add layers, basic scatterplot
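The code for these first steps could look like the following minimal sketch. It assumes the ggplot2 package is installed; the name `scatter` is only an illustrative choice for the finished plot.

```r
library(ggplot2)

# mtcars ships with R (datasets package); data() just makes the loading explicit
data(mtcars)

# base object: the data plus the aesthetic mappings, no layer yet
q <- ggplot(data = mtcars, aes(x = mpg, y = wt))

# adding a point layer turns the base object into a scatterplot
scatter <- q + geom_point()
print(scatter)
```

Note that `q` on its own draws only an empty panel with axes; nothing is plotted until a layer such as `geom_point()` is added.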
By default, ggplot sets the background to grey and adds x and y axis labels; the axis titles are the variable names used.
Everything in this graph can be changed and customized; let's start with the axis titles and the main title.
#change the x and y axis titles and the main title using the labs layer
q+geom_point()+labs(title="Scatterplot of weight and miles per gallon",x="Miles per Gallon",y="Car weight (1000 lbs)")
#for a long title we can set the line break using "\n"
q+geom_point()+labs(title="Scatterplot of weight\nand miles per gallon",x="Miles per Gallon",y="Car weight (1000 lbs)")
Now, if we want to see the effect of another variable, we can map it to the size, shape, or colour of the points; here is how:
#if the variable is continuous
#the colours are the default ones; we can set them with "scale_colour_gradient", which also sets the legend title
#in R we can list all available colours using "colors()", then select them by their index numbers
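As a sketch of the continuous case, the code below maps horsepower (`hp`) to the colour and size of the points. The choice of `hp`, the colour indices, and the legend title are my assumptions for illustration, not necessarily the variable and colours used originally.

```r
library(ggplot2)

q <- ggplot(data = mtcars, aes(x = mpg, y = wt))

# continuous variable mapped to colour: ggplot picks a default gradient
by_hp <- q + geom_point(aes(colour = hp))

# the same variable mapped to point size instead
by_hp_size <- q + geom_point(aes(size = hp))

# colors() returns a character vector of colour names; pick two by index
pal <- colors()[c(26, 552)]

# custom gradient between the two colours, with a legend title
by_hp_custom <- q + geom_point(aes(colour = hp)) +
  scale_colour_gradient(low = pal[1], high = pal[2], name = "Horse power")
print(by_hp_custom)
```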
#if the variable is categorical
#setting the colors and the legend title
#then, when there are at most 6 categories, we can use "shape" to differentiate between the classes of the gear variable
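For the categorical case, a possible sketch with the `gear` variable is shown below. The specific colours and the legend title are assumptions of mine; `gear` is stored as a number in mtcars, so it is converted to a factor first.

```r
library(ggplot2)

q <- ggplot(data = mtcars, aes(x = mpg, y = wt))

# factor() makes ggplot treat gear as discrete, giving one colour per level
by_gear <- q + geom_point(aes(colour = factor(gear)))

# set the colours by hand (one per gear level) and give the legend a title
by_gear_custom <- q + geom_point(aes(colour = factor(gear))) +
  scale_colour_manual(values = c("3" = "black", "4" = "blue", "5" = "red"),
                      name = "Number of gears")

# use point shape instead of colour to separate the gear classes
by_gear_shape <- q + geom_point(aes(shape = factor(gear))) +
  labs(shape = "Number of gears")
print(by_gear_shape)
```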
Now I will present a few other possibilities in ggplot, such as making boxplots, histograms, and barplots, or adding a regression line: all the things that we do very often.
#Simple boxplots of the distribution of values of Horse Power depending on the number of cylinders
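A boxplot of this kind could be written as follows; the axis titles are my own additions.

```r
library(ggplot2)

# cyl is numeric in mtcars, so factor() is needed to get one box per cylinder count
box <- ggplot(mtcars, aes(x = factor(cyl), y = hp)) +
  geom_boxplot() +
  labs(x = "Number of cylinders", y = "Horse power")
print(box)
```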
#Histogram of the weight variable: binwidth sets the width of the bins, fill the colour filling the bars, and colour the outline colour
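A minimal sketch of such a histogram is given below; the bin width of 0.5 and the two colours are arbitrary values chosen for illustration.

```r
library(ggplot2)

# only x is mapped: geom_histogram() counts the observations in each bin itself
hist_wt <- ggplot(mtcars, aes(x = wt)) +
  geom_histogram(binwidth = 0.5, fill = "skyblue", colour = "black")
print(hist_wt)
```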
#Barplot with two different formats
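I take the two formats to be stacked and side-by-side bars, which is an assumption on my part; the sketch below counts cars per number of cylinders, split by gear.

```r
library(ggplot2)

# shared base: cylinders on the x axis, bars filled by gear
bar_base <- ggplot(mtcars, aes(x = factor(cyl), fill = factor(gear))) +
  labs(x = "Number of cylinders", fill = "Number of gears")

# format 1: stacked bars (the default position of geom_bar)
bar_stacked <- bar_base + geom_bar()

# format 2: one bar per gear class, placed side by side
bar_dodged <- bar_base + geom_bar(position = "dodge")

print(bar_stacked)
print(bar_dodged)
```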
#Adding a regression line to the graph
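One way to add a regression line is `geom_smooth()` with `method = "lm"`, as in the sketch below. The object name `p`, the choice of mpg against weight, and `se = FALSE` (which hides the confidence band) are assumptions made for this example.

```r
library(ggplot2)

# save the scatterplot in an object so that layers can be added later
p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# linear regression of mpg on wt drawn on top of the points
p_lm <- p + geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
print(p_lm)
```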
#plot one regression line per group of gear
#separate each group into a different panel
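The two steps above, one line per gear group and then one panel per group, could be sketched like this; the variables and object names are again illustrative assumptions.

```r
library(ggplot2)

# mapping colour to factor(gear) also groups the data,
# so geom_smooth() fits one regression line per gear class
p_group <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(gear))) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(colour = "Number of gears")

# facet_grid() then gives each gear class its own panel
p_facets <- p_group + facet_grid(. ~ gear)

print(p_group)
print(p_facets)
```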
It is of course possible to play around with all these functions. For example, in facet_grid, writing the variable first, followed by the tilde and the dot (gear ~ .), lays the panels out as rows stacked on top of each other, while the reverse order (. ~ gear) places them side by side as columns.
A last point in this introduction to ggplot2 is the “%+%” operator, which allows you to update the content of a previously saved graph:
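To see the difference between the two orientations, here is a small sketch that uses the same scatterplot with both formulas; the object names are hypothetical.

```r
library(ggplot2)

base_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# formula is rows ~ columns: gear on the right gives one column per gear class
panels_columns <- base_plot + facet_grid(. ~ gear)

# gear on the left gives one row per gear class
panels_rows <- base_plot + facet_grid(gear ~ .)

print(panels_columns)
print(panels_rows)
```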
#using the update function
p %+% geom_point(shape=2)
I hope this short tutorial has given you the feeling that ggplot can really help you make nice graphs in R. Next will come more advanced topics such as setting theme elements or using ggplot for maps.