
Installation & Configuration Notes for R

 

Downloading R

{If you want screen by screen instructions, read this web page, then use Paul Geissler's page}

1. Start at http://cran.r-project.org: Select your operating system under "Download and Install R".

2. On the next page:
a. Select base and follow link to download R. If you are running 64-bit MSwindows, you have a choice of installing the 32-bit version of R, which will run slightly faster, or the 64-bit version, which can handle very large data objects, or both versions. I use large raster datasets, so I will switch to 64-bit when my computer migrates to win7-64. More information is available in the link on the CRAN windows/base page.

3. FAQs for R can be found under Documentation in the left navigation panel: Select FAQs, and open the relevant FAQ.

4. Do not download R contributed packages at this time.

Running Setup

The defaults are ok, but I have a few suggestions:

1. Download the R-2.11.1.exe file from CRAN, then run the file when downloaded.

2. For MSwindows, Install in c:/R/ not in c:/Program Files/R/ (Make sure you have full write permissions to c:/R and subdirectories.)

Spaces in directory paths, like spaces in file names, can occasionally bite you. The core of R and most contributed packages can handle spaces embedded in file names under MSwin, but some contributed packages from unix users don't bother. It is a good idea to keep spaces and special characters such as most symbols other than dash and underscore out of directory and file names. Also, agency IT folk often lock down the c:\Program Files directory so users can't write to it. Whenever you download and install a new contributed package, it will need to write files to directories beneath R/. You don't want to rely on your IT support every time you install a new package, and they don't want to have to handle those helpdesk calls.

2. Select Components: add docs for packages grid and Matrix

When selecting components, add the technical manuals and pdf help pages.

3. "Startup options: Do you want to customize the startup options?"

Choose 'Yes (customized startup)'

4. "Display Mode: Do you prefer the MDI or SDI interface?"

If you will use the Rcommander GUI (covered in Paul Geissler's GUI version of this course) or the Tinn-R editor/GUI, you must choose SDI. Graphics packages built with tcltk also require SDI. Otherwise, I prefer MDI on a laptop with a small display: it keeps all of my R windows within a single large window, so I can minimize the containing window to clear all of them off the desktop temporarily. With either a huge monitor or dual-display, I preferred SDI, where each R window is a completely independent window on the monitor. This is an option you can easily change later by editing the etc/Rconsole file and switching the commenting (#) between the MDI = yes and MDI = no lines. [This option may be dropped from the latest installer.]

5. "Help Style: Which form of help display do you prefer?"

I choose html help so I can bookmark, open multiple tabs, etc.

6. Leave everything else as default, and finish installation.

Either go with the defaults or choose the following (note that not all of these options appear in the 2.11.1 installer):

Standard internet access
Start menu shortcut in R
Create quicklaunch icon if desired
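
Once setup has finished, the two concerns raised in step 2 (spaces in the installation path and write access to the library) can be checked from the R prompt. This is only a minimal sketch: the helper has_spaces is a name made up for this illustration, and file.access() is an approximate test of permissions on MSwindows, so treat its answer as a hint rather than a guarantee.

```r
# Hypothetical helper (not part of R): TRUE if a path contains a space.
has_spaces <- function(path) grepl(" ", path, fixed = TRUE)

r_home <- R.home()        # directory where this copy of R is installed
lib    <- .libPaths()[1]  # first library on the search path

cat("R is installed in:", r_home, "\n")
cat("Path contains spaces:", has_spaces(r_home), "\n")

# file.access() returns 0 when the requested access (mode = 2: write) is allowed.
cat("Library", lib, "writable:", file.access(lib, mode = 2) == 0, "\n")
```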

Tinn-R editor

You will want a good text editor when writing R code. Word processors such as MS word or OpenOffice writer are very poor choices: they save files with non-ASCII content, formatting, etc.

You can use NotePad or WordPad, which come with MS windows. However, several other text editors are more powerful, and colorize syntax, handle indentation, etc. The most commonly used editor may be Tinn-R, which integrates with the R command line window. Tinn-R is available from SourceForge at http://sourceforge.net/projects/tinn-r/

I use TextPad, because I have used it for editing text data files for over a decade, and I'm an old dog. TextPad has syntax definitions for R (as well as a few hundred other scripts). Editors are a matter of personal preference, and sometimes personal pride; see xkcd:

[Image: xkcd comic "Real Programmers"]

Installing Related Programs (tcl/tk, ggobi, DCOM, and gdal)

I strongly recommend that you install 3 open-source programs for use with R. First, tcl/tk is a toolkit for windowing and GUI interfaces, required for Rcmdr and rggobi. Second, ggobi is a system for visualization of multi-dimensional data that can be run from R with the rggobi package. Third, DCOM allows R to be run within MS products: either as a plugin inside Excel (adding R graphical capabilities inside Excel) or as embedded, live R objects in Word documents (automating the generation of the bulk of annual reports). If you use geospatial data, I strongly recommend installing gdal as well.

1: Go to the ggobi web page http://www.ggobi.org/.

Go to the downloads page. For MSwindows download and install GTK and then GGobi, for OS X download and install Gtk2 and then GGobi; for Linux grab the Debian package or Fedora rpm.

2: Go to the STATCON web page http://rcom.univie.ac.at/download.html.

Download statconnDCOM, then SWord, then RExcel if you want (I will show SWord but not RExcel).

3: If you use geospatial data, especially raster (grid) files, you should load the gdal tools for reading and writing various geospatial file formats.

For 32 bit MSwindows and 32 bit linux, the easiest approach is Frank Warmerdam's FWTools: http://fwtools.maptools.org/. This step can wait, as we won't use raster data for several months, until the advanced topics sessions. However, if you have to let your IT support do the installations, you might as well have them do it all at once.

Installing Packages

There is no need to download all of the contributed packages you think you will need while you are performing the initial installation process unless there are firewall issues involved. Both the R GUI and Rcmdr include menu items that let you choose from a pull-down list of packages, and then automatically download and install the selected packages from a CRAN mirror. The etc/Rprofile.site file has a place to define a default CRAN mirror, or the pulldown menus let you choose among cryptically-named mirrors.

1. Run R [If you have an IT person with admin permissions doing your installation, have them do this step].

2a. From the menu, select "packages", then install packages. If prompted: choose a mirror close to you. A list of packages will appear: Control-click to select more than one.

2b. Alternative: Instead of selecting packages from the menu, you can copy and paste the following lines into the R Console window at the command prompt (see the packages section below for more sets you may want to install):

install.packages(c("car", "conf.design", "corrgram", "DAAG",
"effects", "ellipse", "faraway", "gplots", "lattice",
"reshape", "plyr", "gdata", "leaps", "nlme", "lme4", "lmtest",
"MASS", "Rcmdr", "latticedl", "RcmdrPlugin.HH",
"RODBC", "foreign",
"sciplot", "tree", "rggobi", "R2wd"), dep=TRUE)

* Note: What is labeled "MASS(VR)" in the pull-down is named "MASS" in install.packages.

2c. Recommended Alternative: Download my file Install.R (remove the .txt from the file name), then edit your local copy with a text editor (textpad, notepad, etc., not MSword or OpenOffice Writer). Look at each block, and remove the '#' that comments out the lines with install.packages() and packages you want to install, for example, changing this:

##############################################
######## Survey design & analysis (spsurvey does GRTS)
# install.packages(c("ars","survey","spsurvey","RSurvey","memisc"),dep=TRUE)

#analysis of panel designs
# install.packages(c("pcse","plm"),dep=TRUE)

to this:

##############################################
######## Survey design & analysis (spsurvey does GRTS)
install.packages(c("ars","survey","spsurvey","RSurvey","memisc"),dep=TRUE)

#analysis of panel designs
install.packages(c("pcse","plm"),dep=TRUE)

 

Then, copy and paste the block into the R > prompt.

3. Exit R via File | Exit in the menu at the top, or by typing into the command line:

q()
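
A variation on step 2b is to ask R which of the packages you want are not yet installed, and request only those. The sketch below assumes nothing beyond base R; missing_packages is a hypothetical helper written for this example, and the short package list is just a sample taken from the longer list in step 2b.

```r
# Hypothetical helper (not part of R): which of the wanted packages are
# not yet present in any library on .libPaths()?
missing_packages <- function(wanted,
                             installed = rownames(installed.packages())) {
  setdiff(unique(wanted), installed)
}

wanted <- c("car", "lattice", "MASS", "nlme", "RODBC", "foreign")
to_install <- missing_packages(wanted)
print(to_install)

# Uncomment to download and install whatever is missing:
# if (length(to_install) > 0) install.packages(to_install, dep = TRUE)
```

Because the install call is commented out, pasting the block only reports what is missing; nothing is downloaded until you uncomment the last line.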

 

 

Additional Comments

Often the biggest difficulty is finding the packages that do what you need. If your need is ecological, a good place to start is the CRAN Task View for Environmetrics, an attempt to list common ecological analyses and the R packages available for each. [One obvious omission is mark/recapture analyses.] There are 25 task views for various fields. If that doesn't help, you can go to the main CRAN page for contributed packages and use search in your browser to search for key words for your topic.

Documentation

There are a large number of pdf documents available from CRAN at:

http://cran.r-project.org/manuals.html (main R reference manuals)
http://cran.r-project.org/other-docs.html (contributed documentation, from 300+ page books to 2 page cheatsheets)
http://cran.r-project.org/faqs.html

Note that the doc folder in your R installation directory (e.g., c:/R/R-2.10.0/doc) has a manual subdirectory with pdf copies of the main R manuals available from the first of these links. I recommend creating a directory such as c:\R\docs (distinct from c:/R/R-2.10.0/doc) and downloading documents you are most likely to need, as those documents don't change with new versions of R. For more information, follow the "Good References" link in the navigation panel on the left.
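
If you are not sure where the installed manuals ended up, R can report the location itself. A small sketch using only base R functions; whether any pdf files are listed depends on the components chosen during setup.

```r
# Directory holding the documentation shipped with this installation.
doc_dir    <- R.home("doc")
manual_dir <- file.path(doc_dir, "manual")

# List whatever manuals the installer placed there (possibly none, if the
# manuals were not selected during setup).
if (file.exists(manual_dir)) {
  print(list.files(manual_dir))
} else {
  cat("No manual directory found at", manual_dir, "\n")
}
```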

Configuring R

Like many well-behaved programs, R allows configuration options to be set in several ways: OS environmental variables can be set. Global options that apply to all users and all instances of R can be set in the etc/Rprofile.site file. Project or user specific configurations may be set in .Renviron and .Rprofile files in the home directory.

The most useful system environmental variables are:

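R_USER overrides the default of "My Documents" as the home directory for user files
HOME is used as the home directory if R_USER is not set (R_HOME is different: it is the directory where R itself is installed, and R sets it for you)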
R_LIBS points to an alternate location for installed packages (for when you can't obtain write permission to c:\R\R-2.11.1/library) Set R_LIBS=d:\R\library if you have write access to d:\R.
R_DEFAULT_PACKAGES sets the list of packages you want automatically loaded at startup in R.
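
To see what values a running R session actually picked up for these variables, you can query them from the prompt. This sketch uses only base R; variables that are not set come back as empty strings.

```r
# Environment variables named above; unset ones come back as "".
vars <- c("R_USER", "R_HOME", "R_LIBS", "R_DEFAULT_PACKAGES")
print(Sys.getenv(vars))

# What R actually ended up using:
print(path.expand("~"))  # home directory for user files
print(.libPaths())       # library search path, including any R_LIBS entries
```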

The .Rprofile file allows you to set different options for different projects. You can put a .Rprofile file in your home directory that will apply to all of your R sessions, or separate .Rprofile files in each project subdirectory to have project-specific configurations.
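
As an illustration, a project-level .Rprofile might look something like the following. The particular settings are assumptions chosen for the example, not recommendations from this course; the point is only that the file is ordinary R code that is run at startup.

```r
# Example contents for a project-level .Rprofile (all settings illustrative).
options(digits = 4)   # print fewer significant digits
options(scipen = 10)  # prefer fixed over scientific notation when printing

# .First() is run at the end of startup, after this file has been read.
.First <- function() {
  cat("Loaded project .Rprofile from", getwd(), "\n")
}
```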

Start-up Libraries

Certain packages are loaded every time RGui is started. The following steps explain how to add packages to that start-up list.

1. Open: c:/R/R-2.11.1/etc/Rprofile.site, using NotePad, TextPad or any other text editor other than MSword.

2. Append the following to the bottom of the file (making sure that the lines break exactly like this example):

local({
old <- getOption("defaultPackages")
options(defaultPackages = c(old, "car", "RODBC", "foreign", "DAAG", "MASS", "lattice", "latticedl", "sciplot", "tree", "lme4"))
})

3. If you want the Rcmdr GUI to start each time you start R, append the following to the bottom of the file:

local({
old <- getOption("defaultPackages")
options(defaultPackages = c(old, "car", "RODBC", "DAAG", "MASS", "lattice", "latticedl", "sciplot", "tree","nlme", "lme4", "RcmdrPlugin.HH", "Rcmdr"))
})

4. Save Rprofile.site and exit the text editor.

You can grab Tom's Rprofile.site here. (again, you have to delete the .txt appended to the file name).
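
After editing Rprofile.site and restarting R, it is worth confirming that the packages really were loaded. A minimal check, using base R only; attached_packages is a hypothetical helper defined here for convenience.

```r
# Hypothetical helper (not part of R): names of the packages currently
# attached to the search path, without the "package:" prefix.
attached_packages <- function() {
  entries <- grep("^package:", search(), value = TRUE)
  sub("^package:", "", entries)
}

print(getOption("defaultPackages"))  # what R was told to load at startup
print(attached_packages())           # what is actually attached right now
```

A package that appears in the first list but not the second most likely failed to load, often because it has not been installed yet.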

Setting the CRAN Mirror

1. Open: c:/R/R-2.11.1/etc/Rprofile.site, using NotePad, TextPad or any other text editor other than MSword.

2. In lines following "set a CRAN mirror", replace my.local.cran with the url you selected from the list of mirrors.

3. Remove the #'s in front of the lines following the "set a CRAN mirror" comment.

4. Save Rprofile.site and exit the text editor.
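
Once uncommented and edited, the mirror block should look roughly like the sketch below, assuming your Rprofile.site follows the stock template. The URL shown is only an example; substitute the mirror you selected.

```r
# Sketch of the uncommented "set a CRAN mirror" block in Rprofile.site.
# The URL is only an example; substitute the mirror you selected.
local({
  r <- getOption("repos")
  r["CRAN"] <- "http://cran.r-project.org"
  options(repos = r)
})

# Confirm the setting from the R prompt:
print(getOption("repos")["CRAN"])
```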

Computer Configuration

The key computer configuration issue for R is that unlike SAS (which can read arbitrarily large datasets serially on tapes and perform analyses in 2 passes), R keeps all data in memory. Thus, the amount of real and virtual memory you have limits the size of the analyses that can be performed, and on MSwindows machines R cannot work with more than 2.5-4GB (depending on the version of windows). If you can, add memory to your machine to get it to 2-4GB. If you can't, configuring your swap space (pagefile.sys) can have a large impact on performance. If you can, putting pagefile.sys on its own partition with fixed size (min & max the same, don't let windows manage it) will prevent it from becoming fragmented, which can slow your computer to a crawl. If you have more than one hard drive, put pagefile.sys on a different drive than the operating system and programs if you do lots of computations on small data files; put pagefile.sys on a different drive than your data files if you process large files and are generally i/o bound.
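
Because everything lives in memory, a rough size estimate before loading a dataset can save some grief. The sketch below assumes the data are stored as double-precision numbers (8 bytes per value); estimate_numeric_mb is a hypothetical helper written for this example, and it ignores R's small per-object overhead.

```r
# Hypothetical helper: approximate size in MB of a numeric (double) table,
# at 8 bytes per value and ignoring R's small per-object overhead.
estimate_numeric_mb <- function(n_rows, n_cols) {
  n_rows * n_cols * 8 / 1024^2
}

# A raster of 10,000 x 10,000 cells stored as doubles:
print(estimate_numeric_mb(10000, 10000))

# Compare the estimate with a real object of one million values.
x <- numeric(1e6)
print(object.size(x), units = "Mb")
print(estimate_numeric_mb(1e6, 1))
```

R often makes temporary copies of objects while it works, so it is prudent to leave headroom of several times the estimated size.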

As noted above, if you are running a 64-bit operating system, you have the option of using 32-bit or 64-bit versions of R. The 32-bit version usually runs faster; the 64-bit version allows larger (virtual) memory and files larger than 2GB.
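
If you have both versions installed, it is easy to lose track of which one a given window is running. A quick check from the prompt, using only values that base R exposes:

```r
# Pointer size in bytes is 4 for a 32-bit build and 8 for a 64-bit build.
bits <- .Machine$sizeof.pointer * 8

cat("R version:   ", R.version.string, "\n")
cat("Architecture:", R.version$arch, "\n")
cat("Build:        ", bits, "-bit\n", sep = "")
```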

For more specific issues, look at the R faq for your operating system:

R for Windows FAQ
R for Mac osX FAQ

Also, read the README (win) or NEWS (osX) file that was available on CRAN when you downloaded the binary installation file.

Permissions and Security

Permissions

Like most software installation, local admin rights are required to install R. [If you are running Vista, you will be prompted for admin credentials.] You do not need local admin rights to run R, except that you are likely to need to write to the c:\R\R-2.11.1\library and c:\R\R-2.11.1\etc directories for installing packages and tweaking your configuration.

Depending on the version of windows and knowledge & marching orders of your IT administrator, they should be able to keep everything in c:\Program Files locked down and give you permissions to c:\R and subdirectories. The advantage of this solution is that the help system will be automatically installed when you install packages. The alternate solutions require that you run link.html.help from the R console prompt after each package installation.

If you can't get R installed outside of c:\Program Files and can't get write permissions to c:\Program Files\R, create a different library directory somewhere that you do have write permissions (e.g., d:\R\library), and have the IT administrator edit the etc/Rprofile.site file to point R to the different library location (the .Library.site <- file.path() line). While they are at it, have them edit the CRAN mirror location, and add whatever packages you want in the defaultPackages line (see above). See section 6.2 of the R-admin.pdf guide for details.
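
For a single session, the same effect can be had without editing Rprofile.site by adding a writable directory to the library search path. This is a sketch under assumptions: use_library is a hypothetical helper, a temporary directory stands in for a real location such as d:/R/library, and the package named in the comments is only an example.

```r
# Hypothetical helper (not part of R): create a library directory if needed
# and put it first on the library search path for this session.
use_library <- function(path) {
  dir.create(path, recursive = TRUE, showWarnings = FALSE)
  .libPaths(c(path, .libPaths()))
  invisible(.libPaths())
}

# Demonstrated with a throwaway directory; in practice this would be a
# directory you can write to, such as d:/R/library.
my_lib <- file.path(tempdir(), "library")
use_library(my_lib)
print(.libPaths()[1])

# Packages can then be installed into and loaded from that library:
# install.packages("survey", lib = my_lib)
# library(survey, lib.loc = my_lib)
```

The change lasts only until R exits; setting R_LIBS or editing Rprofile.site makes it permanent.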

If you are using linux or unix, you only need to have write permissions to r/etc and r/library, so you can chmod those directories, or chmod -R from the R directory. If this is gibberish to you, please get help from your local unix wizard.

Firewall Issues

It appears some agency IT policies block any access to the web via port 80 (via non-browsers). This prevents users from using the R menu to download and install new packages, as R cannot contact the CRAN repositories. I can't solve this. If your location requires a proxy server, the solution may be to add a command-line parameter http_proxy= as suggested in the R-win FAQ at 2.19. If you know what you are doing and you know what your proxy server settings are, you might be able to do this yourself. Right click on your R icon on your desktop and select properties. In the shortcut tab, in the target field, add http_proxy= to the value so that it looks something like:

c:\R\R-2.10.0\bin\Rgui.exe http_proxy=http://user:password@proxyserver:80

One alternative is to look at using internet2.dll; again, as documented in the R-win FAQ. The other alternative is to go to a cran mirror via your browser or an ftp client, download all of the packages you might want to a local directory, then have R install packages from that local directory.
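
Installing from a local directory might look like the sketch below. The helper local_package_files, the directory names, and the placeholder file names are all assumptions made for this illustration; the install.packages() call itself is left commented out so that pasting the block changes nothing on your machine.

```r
# Hypothetical helper (not part of R): find downloaded package archives
# (.zip binaries for MSwindows, .tar.gz sources) in a local directory.
local_package_files <- function(dir) {
  list.files(dir, pattern = "\\.(zip|tar\\.gz)$", full.names = TRUE)
}

# Demonstrated on a throwaway directory holding two empty placeholder files.
pkg_dir <- file.path(tempdir(), "cran-downloads")
dir.create(pkg_dir, showWarnings = FALSE)
invisible(file.create(file.path(pkg_dir, c("examplepkg_1.0.zip", "notes.txt"))))
print(basename(local_package_files(pkg_dir)))

# With real archives in place, install them without contacting CRAN:
# install.packages(local_package_files("c:/R/downloads"), repos = NULL)
```

With repos = NULL, R does not fetch dependencies for you, so every package that the ones you want depend on needs to be downloaded to the same directory as well.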

Updating to a New Version

New versions of R seem to come out right after we start our R course (2.8.2 in 2008, 2.10.0 in 2009). In my experience, it has never been the case that staying with the older version was better, but most new versions have little effect on what we use. Because new versions of base R usually require new versions of each package, brute force updating can be a pain. But, the developers behind R and the power users are lazy folks, so there are a couple of tools that make the process simpler. My directions are (substitute your old version for 2.9.2 and your new version for 2.10.0 in these):

1. Create a new directory for the new version of R: c:/R/R-2.10.0

2. Copy the old library directory to that new directory (c:/R/R-2.9.2/library > c:/R/R-2.10.0/library), and the old Rprofile.site file from c:/R/R-2.9.2/etc to c:/R.

3. Uninstall the old version of R: Start menu > R > uninstall 2.9.2 is a bit faster than administrative tools > add/remove software

4. Install the new version of R from cran to the new directory (c:/R/R-2.10.0) by selecting installation target c:/R.

5: Copy Rprofile.site from where you stashed it to the new etc directory (from c:/R to c:/R/R-2.10.0/etc). You probably shouldn't do this for the update from 2.9.2 to 2.10.0, as the CHM option for the help system is no longer supported as of 2.10.0; instead, edit the Rprofile.site file the installer builds for you by adding the local mirror and the list of packages you want loaded at startup (see above).

6: Start R, and paste the following command at the prompt:

update.packages(checkBuilt=TRUE, ask=FALSE)

Alternatively, step 6 can be replaced by starting the new version of R, then selecting update packages from the packages menu.

I have perhaps 10 packages installed, and the entire update process requires less than 5 minutes of my time, and then another 45 minutes of unattended running to download the updated packages (I have a slow internet connection at work).
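
The checkBuilt=TRUE argument in step 6 tells update.packages() to treat packages built under an earlier major.minor release of R as old. If you would like to see which of your copied packages fall into that group before updating, something like the following sketch can list them. The helper built_before is hypothetical and only approximates the comparison that R performs internally.

```r
# Hypothetical helper (not part of R): TRUE for each package whose "Built"
# version is from an earlier major.minor release than `current`.
built_before <- function(built, current = as.character(getRversion())) {
  major_minor <- function(v) sub("^([0-9]+\\.[0-9]+).*$", "\\1", v)
  numeric_version(major_minor(built)) < numeric_version(major_minor(current))
}

# Version numbers compare component by component, so 2.9.2 is older than 2.10.0.
print(built_before(c("2.9.2", "2.10.0", "2.10.1"), current = "2.10.0"))

# Applied to the packages installed on this machine:
ip <- installed.packages()
ip <- ip[!is.na(ip[, "Built"]), , drop = FALSE]
print(rownames(ip)[built_before(ip[, "Built"])])
```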

Updated on 01/28/2010 | Webmaster of NPS R pages