R
Definition
R is a programming language for statistical computing and data visualization. It has been adopted in the fields of data mining, bioinformatics and data analysis.
R on Euler
R version | Module command |
---|---|
4.3.2 | module load stack/2024-06 r/4.3.2 |
4.4.0 | module load stack/2024-06 r/4.4.0 |
Package installation
To install new packages, run
install.packages("<package_name>")
which will display a warning and ask Would you like to use a personal library instead? (y/n). Answer y; the packages are then installed into $HOME/R.
To display installed packages, run
installed.packages()
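For batch scripts, where nobody can answer the interactive prompt, it can help to create the personal library before calling install.packages(). The following is a minimal sketch, not an official Euler recipe: the helper names ensure_user_lib and is_installed are hypothetical, and it assumes the personal library location is the one R reports in the R_LIBS_USER environment variable.

```r
# Sketch: prepare a personal library so install.packages() needs no prompt.
# ensure_user_lib() and is_installed() are illustrative helper names.

ensure_user_lib <- function(lib = Sys.getenv("R_LIBS_USER")) {
  lib <- path.expand(lib)
  # Create the directory (and any missing parents) if it does not exist yet.
  if (!dir.exists(lib)) dir.create(lib, recursive = TRUE)
  # Put it first on the library search path for this session.
  .libPaths(c(lib, .libPaths()))
  invisible(lib)
}

# TRUE if a package of that name is found in the current library paths.
is_installed <- function(pkg) pkg %in% rownames(installed.packages())

# Example use (the install line is commented out because it needs network
# access; the CRAN mirror URL is an assumption, pick any mirror you prefer):
# lib <- ensure_user_lib()
# if (!is_installed("<package_name>")) {
#   install.packages("<package_name>", lib = lib,
#                    repos = "https://cloud.r-project.org")
# }
```

Passing lib and repos explicitly avoids both interactive questions (library location and mirror choice), which is why this form is convenient inside scripts.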
Packages that require additional modules for their dependency libraries
R package | Module command |
---|---|
XML | module load stack/2024-06 r/4.4.0 libxml2/2.10.3-xbqziof libiconv/1.17-uiaqkl2 |
terra, raster | module load stack/2024-06 r/4.4.0 cmake/3.27.7 udunits/2.2.28 openssl/3.1.3-zhfub4o gdal/3.4.3 geos/3.9.1 proj/9.2.1 sqlite/3.43.2 |
packages that require zlib | module load stack/2024-06 r/4.4.0 zlib-ng/2.1.4-xgiegbt |
Interactive session
Execute
module load stack/2024-06 r/4.3.2
to make R available in your command line. Then
R
launches an interactive session. You should see
>
and you can try a simple command
print("Hello, World!")
which should print
[1] "Hello, World!"
Example program
Create a file hello.r with the content
print("Hello, World!")
Bring R and Rscript to your command line with
module load stack/2024-06 r/4.4.0
Run the program via
Rscript hello.r
and the program should print
[1] "Hello, World!"
to your terminal.
Compute-intensive jobs
Compute-intensive jobs must be submitted to the batch system (Slurm), for example with
sbatch [sbatch options] --wrap="R --vanilla --no-echo < input_file.R > output_file"
R jobs can be parallelized with a variety of packages. The CRAN task view on high-performance computing gives a good overview: https://cran.r-project.org/web/views/HighPerformanceComputing.html
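As one illustration of parallelization, here is a minimal sketch using the parallel package that ships with R. It is only one of many possible approaches. It assumes the job was submitted with the Slurm option --cpus-per-task, so that Slurm sets the SLURM_CPUS_PER_TASK environment variable; the helper n_workers and the toy function slow_square are hypothetical names chosen for this example.

```r
library(parallel)

# Number of workers: the cores granted by Slurm, falling back to 1
# when the variable is unset, empty, or not a positive integer.
n_workers <- function() {
  n <- suppressWarnings(as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", "1")))
  if (is.na(n) || n < 1L) 1L else n
}

# Toy workload standing in for an expensive computation.
slow_square <- function(x) {
  Sys.sleep(0.01)
  x^2
}

# mclapply() forks worker processes (Linux/macOS only) and returns a list
# with one element per input, in input order.
results <- mclapply(1:8, slow_square, mc.cores = n_workers())

total <- sum(unlist(results))
print(total)
```

Reading the core count from the environment keeps the script in step with the resources actually requested, so the same file runs serially in an interactive session and in parallel inside a batch job.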