r

Installing R

For a long time I have wanted to start using R for statistical computation and visualisation. Here I document the steps needed to install an R environment with extra functionality (functions not included in the precompiled binaries for R that Debian provides).

In the example below, I avoid installing binaries in /home and give write permissions to "hans" in /usr/local/lib/R.

The extra functionality is in the package ca which I wanted for correspondence analysis.

# apt-get install r-base r-base-dev r-cran-rgl
# chown -R hans:staff /usr/local/lib/R
$ R
> install.packages("ca","/usr/local/lib/R/site-library")
> install.packages("FactoMineR","/usr/local/lib/R/site-library")
> install.packages("pgfSweave","/usr/local/lib/R/site-library/", repos = "http://cran.r-project.org")
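Before reinstalling, it can be handy to see which add-on packages are already present in a library. This is a minimal sketch; missing_pkgs is a hypothetical helper name of my own, not a function that ships with R, and the example call uses R's base library so that it runs anywhere:

```r
# Hypothetical helper: which of `pkgs` are not yet installed in library `lib`?
missing_pkgs <- function(pkgs, lib) {
  installed <- rownames(installed.packages(lib.loc = lib))
  setdiff(pkgs, installed)
}

# `.Library` is R's own base library, so "stats" is always found there;
# the second name is deliberately made up
print(missing_pkgs(c("stats", "no.such.package.xyz"), .Library))
```

With the paths from this page, missing_pkgs(c("ca", "FactoMineR"), "/usr/local/lib/R/site-library") would list what still needs to be installed, and a non-empty result can be passed on to install.packages().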

Packages installed by apt-get install r-cran-rcmdr:

r-cran-abind r-cran-car r-cran-effects r-cran-lmtest r-cran-multcomp r-cran-mvtnorm r-cran-rcmdr r-cran-relimp r-cran-sandwich r-cran-sm r-cran-strucchange r-cran-zoo

However, one package required by Rcmdr is not installed automatically: r-cran-rodbc. Having the data in an RDBMS is cool, and I already have postgresql installed on this machine, so the package odbc-postgresql looks promising (I haven't installed it yet, though).

Packages installed by apt-get install r-base-dev, which can be removed once the extra packages have been installed:

build-essential gfortran gcc g++ libncurses5-dev libreadline5-dev libjpeg62-dev libpcre3-dev libpng12-dev zlib1g-dev libbz2-dev refblas3-dev atlas3-base-dev

And for package rgl:

libglu1-mesa-dev

Updating R

update.packages(lib.loc = "/usr/local/lib/R/site-library/", repos = "http://cran.r-project.org")

Removing custom addons

remove.packages("ca")
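Without a second argument, remove.packages() looks in the first library of .libPaths(), which is not necessarily where the add-on was installed. A small sketch, assuming the site library used elsewhere on this page; is_installed is a hypothetical helper name:

```r
# Hypothetical helper: is `pkg` installed in the library directory `lib`?
is_installed <- function(pkg, lib) {
  pkg %in% rownames(installed.packages(lib.loc = lib))
}

lib <- "/usr/local/lib/R/site-library"

# Only try to remove the package if it is really there
if (is_installed("ca", lib)) {
  remove.packages("ca", lib = lib)
}
```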

Configure R

     local({
       ## add pgfSweave to the default packages and set a CRAN mirror
       old <- getOption("defaultPackages"); r <- getOption("repos")
       r["CRAN"] <- "http://ftp.sunet.se/pub/lang/CRAN/"
       options(defaultPackages = c(old, "pgfSweave"), repos = r)
       ## intended target dir for installation of local packages
       ## (note: this is only a variable local to this block; it does not change R's library paths)
       lib.loc <- "/usr/local/lib/R/site-library/"
       ## set the width
       cols <- 145
       options(width = as.integer(cols))
     })
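The lib.loc assignment inside local() only creates a local variable, so on its own it does not tell R where to install packages. If the intent is to put the site library first in the search path, something like the following might do it. This is a sketch under that assumption, using .libPaths() from base R:

```r
site_lib <- "/usr/local/lib/R/site-library"

# .libPaths() silently ignores directories that do not exist,
# so the check below mainly documents the intent
if (dir.exists(site_lib)) {
  .libPaths(c(site_lib, .libPaths()))
}

# Same width setting as in the profile above
options(width = 145L)

print(.libPaths())        # library search path now in effect
print(getOption("width"))
```

install.packages() uses the first element of .libPaths() when no library is given, which is why the order matters here.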

Special cases

building add-on packages with a non-default compiler

If your system's default compiler cannot be used (e.g. the current version of it has bugs), you can name another one by setting the CC variable in a Makevars file.

The R system and package-specific compilation flags can be overridden or added to by setting the appropriate Make variables in the personal file HOME/.R/Makevars-R_PLATFORM (but HOME/.R/Makevars.win or HOME/.R/Makevars.win64 on Windows), or if that does not exist, HOME/.R/Makevars, where ‘R_PLATFORM’ is the platform for which R was built, as available in the platform component of the R variable R.version.

(From http://stat.ethz.ch/R-manual/R-devel/doc/manual/R-admin.html#Customizing-package-compilation)

R.version

               _
platform       i486-pc-linux-gnu
[...]
version.string R version 2.13.0 (2011-04-13)

$ cat .R/Makevars-i486-pc-linux-gnu
CC=gcc-4.4

Now gcc-4.4, rather than the default gcc-4.6, will be used:

install.packages("lme4","/usr/local/lib/R/site-library/", repos = "http://cran.r-project.org")

...
gcc-4.4 -I/usr/share/R/include -I"/usr/lib/R/library/Matrix/include" -I"/usr/lib/R/library/stats/include" -fpic -std=gnu99 -O3 -pipe -g -c init.c -o init.o
gcc-4.4 -I/usr/share/R/include -I"/usr/lib/R/library/Matrix/include" -I"/usr/lib/R/library/stats/include" -fpic -std=gnu99 -O3 -pipe -g -c lmer.c -o lmer.o
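The file names from the R-admin quote above can be computed from within R, which saves typing the platform string by hand. A rough sketch that covers only the two non-Windows names mentioned in the quote; makevars_candidates is a hypothetical helper name:

```r
# Hypothetical helper: personal Makevars files, in the order described in the quote
makevars_candidates <- function(home = Sys.getenv("HOME"),
                                platform = R.version$platform) {
  file.path(home, ".R", c(paste0("Makevars-", platform), "Makevars"))
}

cand <- makevars_candidates()
print(cand)

# Report the first candidate that actually exists, if any
existing <- cand[file.exists(cand)]
if (length(existing) > 0) {
  cat("Personal Makevars file found:", existing[1], "\n")
} else {
  cat("No personal Makevars file found\n")
}
```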

Using R

A sample session in R:

$ R
> library(ca)
> data(smoke)
> plot(ca(smoke))

Here is how to read data from an SPSS-file.

library(foreign)
myobj <- read.spss("yrken.sav", to.data.frame=TRUE)

Starting a graphical user interface:

library(Rcmdr)

Basic statistical functions:

.Table <- table(myobj$KONSANDE)
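A frequency table of a single variable can be tried out without the SPSS file. The sketch below uses made-up answers in place of a real column such as myobj$KONSANDE:

```r
# Made-up data standing in for one column of the data frame
answers <- factor(c("yes", "no", "yes", "yes", NA), levels = c("yes", "no"))

counts <- table(answers)        # missing values are left out by default
shares <- prop.table(counts)    # the same table as proportions

print(counts)
print(round(shares, 2))
```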

Using only a subset of a dataframe:

mysmalltable <- data.frame(myobj[1:2], myobj[5:5])
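The same kind of column selection can be tested on mtcars, a dataset that ships with R. As far as I can tell, a single index vector picks the same columns as combining the pieces with data.frame():

```r
# Columns 1, 2 and 5, selected the way it is done above
a <- data.frame(mtcars[1:2], mtcars[5:5])

# The same columns with one index vector
b <- mtcars[c(1, 2, 5)]

print(names(a))
print(names(b))
```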

Simple correspondence analysis on a subset of a dataframe:

plot(ca(na.omit(data.frame(myobj[1:1], myobj[56:58], row.names = "YRKE"))), what = c("all", "all"), labels = c('2','2'))

Add supplementary points to the graph:

plot(ca(na.omit(data.frame(myobj[1:1], myobj[5:5], myobj[11:11], myobj[13:13], myobj[56:58], row.names = "YRKE")), supcol = 1:3), what = c("all", "all"), labels = c('2','2'))

Save the graph to a file by encapsulating the plot command within a pair of device setting commands:

png(file="politik-enkel.png", width=1600, height=1200)
plot(ca(na.omit(data.frame(myobj[1:1], myobj[56:58], row.names = "YRKE"))), what = c("all", "all"), labels = c('2','2'))
dev.off()

png() opens the file for writing, but the content of the file is written by dev.off().
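Because nothing is complete until dev.off() has run, it can be convenient to wrap the pair in a function that always closes the device. This is only a sketch: save_plot is a hypothetical helper of my own, and the example call uses pdf() instead of png() because pdf() is available in every R build:

```r
# Hypothetical helper: open a file device, draw, and always close the device
save_plot <- function(path, plot_fun, device = png, ...) {
  device(path, ...)       # open the file for writing
  on.exit(dev.off())      # closes the device even if plotting fails
  plot_fun()
  invisible(path)
}

# Example with a throwaway file and a trivial plot
out <- tempfile(fileext = ".pdf")
save_plot(out, function() plot(1:10), device = pdf)
print(file.exists(out))
```

For the graph above, the call would be along the lines of save_plot("politik-enkel.png", function() plot(ca(...)), width = 1600, height = 1200), with the plot command filled in as before.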

A better way of importing: this names the rows with the variable YRKE, which can then be used in correspondence analysis (if the rows are named with a number and "YRKE" is a variable, it cannot be used in simple CA since it would appear as a factor).

myobj <- data.frame(read.spss("yrken.sav", to.data.frame=TRUE), row.names = "YRKE")
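What row.names = "YRKE" does is easier to see on a toy data frame. The values below are invented for illustration; only the column name YRKE is taken from the real file:

```r
# Invented data: one name column and two numeric columns
toy <- data.frame(YRKE = c("baker", "teacher", "carpenter"),
                  A = c(10, 4, 7),
                  B = c(3, 12, 5))

# The YRKE column is moved into the row names and dropped from the variables
named <- data.frame(toy, row.names = "YRKE")

print(rownames(named))
print(names(named))
```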

On the other hand, multiple and joint correspondence analysis only accepts factors, so there is no need for row.names.

Last modified: 2007-10-17