R-Project for Statistical Computing

The R-Project is extensive and rigorous open-source software for statistics and graphics. It has become a tool of choice for professional statistical analysis, and it runs on Unix-like systems, Microsoft Windows and macOS.

Table of Contents

  • Introduction to R
  • Tutorials
  • Statistics, Spreadsheets and R
  • Examples of the Versatility of R
  • Bibliography

    Introduction to R

    R was initially written by Robert Gentleman and Ross Ihaka of the Statistics Department of the University of Auckland, New Zealand. It is freely distributable under the GNU General Public License.*

    A wide variety of contributed packages is associated with R, available from the Comprehensive R Archive Network (CRAN). They include routines for statistics, graphics, solving equations, signal processing, the analysis of complex surveys and statistical process control, as well as packages designed for use in spectroscopy and chemometrics. The availability of these packages is one of the major strengths of R, together with its extensibility and flexibility.

    Tutorials

  • Flash video tutorial, released by Decision Science News.

    Statistics, Spreadsheets and R

    To be easy to use, a spreadsheet such as Excel, OpenOffice Calc or Gnumeric does not separate functions from data. This usually does not matter when the data is simple and straightforward.

    When it comes to serious statistical calculations, however, a spreadsheet is not suitable. Numerical errors are likely to arise, and, furthermore, the user is often totally unaware that this has happened.

    Spreadsheets are great if the data is simple; they are positively dangerous if the data is complex.

    These issues do not apply to R since its functions and data are separated within the software.
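
    The text does not say which numerical errors it has in mind. One commonly cited kind, taken here purely as an illustrative assumption, is catastrophic cancellation in the one-pass "sum of squares" formula for the sample variance. The sketch below compares that formula with R's built-in var() on data with a large mean and a small spread; the data values and variable names are made up for the illustration.

```r
# Four values with a large mean and small spread; the true sample variance is 30.
x <- 1e9 + c(4, 7, 13, 16)
n <- length(x)

# One-pass formula: subtracts two nearly equal numbers of size ~4e18,
# so the digits that carry the answer are lost to rounding.
naive <- (sum(x^2) - n * mean(x)^2) / (n - 1)

# R's built-in estimator works with deviations from the mean instead.
stable <- var(x)

print(c(naive = naive, stable = stable))
```

    The one-pass result is far from 30, and nothing in the calculation itself signals that the answer is wrong, which is the kind of silent error described above.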

    Examples of the Versatility of R

    To show what R can do, we give a few simple examples.

    Two-Dimensional Brownian Motion Example

    The first example plots a two-dimensional Brownian motion in R.
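
    The listing below is a minimal sketch of how such a simulation might look, not necessarily the code behind the original figure. It assumes a unit time interval split into n = 1000 steps and builds each coordinate as a cumulative sum of independent Gaussian increments with variance dt; the names n, dt, B1 and B2 are chosen for this illustration.

```r
# Two-dimensional Brownian motion on the time interval [0, 1].
n  <- 1000     # number of time steps (assumed value)
dt <- 1 / n    # length of one time step

# Each coordinate starts at 0 and accumulates independent N(0, dt) increments.
B1 <- c(0, cumsum(rnorm(n, mean = 0, sd = sqrt(dt))))
B2 <- c(0, cumsum(rnorm(n, mean = 0, sd = sqrt(dt))))

# Plot the path in the plane: B1(t) on the abscissa, B2(t) on the ordinate.
plot(B1, B2, type = "l",
     xlab = "B1(t)", ylab = "B2(t)",
     main = "Two-dimensional Brownian motion")
```

    No random seed is set, so every run draws a new path; calling set.seed() with a fixed value beforehand would make a particular path reproducible.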

    Entered into R, this code outputs the Brownian motion as a graph, with B1(t) along the abscissa and B2(t) along the ordinate. Since we are modeling a random process, each run of the code produces a different-looking graph. Here is one of mine.

    Dirac Delta Function

    The second example plots the Dirac delta function as a Fourier series.
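
    As a hedged sketch (not necessarily the original listing), the partial sums of the Fourier series of the delta function on [-pi, pi] can be written as delta_N(x) = 1/(2*pi) + (1/pi) * sum of cos(n*x) for n = 1..N. The code below adds the harmonics one at a time and plots the result; the number of harmonics N = 50 and the grid size are assumed values.

```r
# Partial sum of the Fourier series of the Dirac delta function on [-pi, pi]:
#   delta_N(x) = 1/(2*pi) + (1/pi) * sum_{n = 1}^{N} cos(n * x)
x <- seq(-pi, pi, length.out = 2001)   # evaluation grid (assumed resolution)
N <- 50                                # number of harmonics (assumed value)

# Start from the constant term, then add one cosine harmonic per iteration.
delta_N <- rep(1 / (2 * pi), length(x))
for (n in 1:N) {
  delta_N <- delta_N + cos(n * x) / pi
}

plot(x, delta_N, type = "l",
     xlab = "x", ylab = "partial sum",
     main = "Fourier series approximation of the Dirac delta")
```

    The central peak has height (2N + 1)/(2*pi) and narrows as N grows, while the area under the curve stays equal to 1.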

    The Fourier series for the Dirac delta function is rapidly built up and you end up with this graph.

    Bibliography

    [1] Nicholas J. Horton, Elizabeth R. Brown and Linjuan Qian, “Use of R as a Toolbox for Mathematical Statistics Exploration”, The American Statistician, 58, 343 (2004). DOI:10.1198/000313004X5572

    [2] Katharine M. Mullen and Ivo H. M. van Stokkum, “An Introduction to the Special Volume ‘Spectroscopy and Chemometrics in R’”, Journal of Statistical Software, 18, 1 (2007).

    *     A different version of R, called Arc, has been proposed by the University of Minnesota. Arc stands for Applied Regression Including Computing and Graphics. The authors of Arc claim that their approach is more general than R and corresponds to the way people supposedly “think about data”. In my opinion, this is an unhelpful development. If R had been found wanting, as the authors of Arc claim, then they would have been better employed in developing R further, in the same spirit as the evolution of LaTeX, rather than in producing different software.