Parallel programming
and dealing with large data…
Pre-lecture activities
Important
In advance of class, please install
- future - this provides a unified parallel framework in R consistent
You can do this by calling
install.packages("future")
And load the package using:
library(future)
In addition, please read through
- Strategies for dealing with large data
- https://www.futureverse.org/packages-overview.html (just the future R package)
How much should I prepare before class?
You should have future installed and be familiar with its three basic functions: plan(), future(), and value().
We will learn more about these functions in class.
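As a preview, here is a minimal sketch of how the three functions fit together. It assumes the future package is installed. The multisession backend, the two workers, and the toy computation are illustrative choices, not something prescribed for class.

```r
library(future)

# plan() chooses how futures are resolved. Here: two background R sessions.
# (multisession and workers = 2 are illustrative choices.)
plan(multisession, workers = 2)

# future() starts evaluating the expression without blocking the main session.
f <- future({
  Sys.sleep(1)   # stand-in for a slow computation
  sum(1:10)
})

# value() waits until the future is resolved and returns its result.
result <- value(f)
print(result)

# Return to ordinary sequential evaluation and shut down the workers.
plan(sequential)
```

Changing only the plan() call (for example to plan(sequential)) leaves the rest of the code untouched, which is the main appeal of the framework.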
Lecture
Acknowledgements
Material for this lecture was borrowed and adapted from
Learning objectives
At the end of this lesson you will:
- Understand the basics of parallel computing
- Become familiar with basic functions in the future package
- Recognize different file formats for working with large data that is not stored locally
- Implement three ways to work with large data:
  - “sample and model”
  - “chunk and pull”
  - “push compute to data”
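To make the three strategies concrete, here is a rough sketch rather than the approach used in the pre-reading. It assumes the DBI and RSQLite packages are installed. An in-memory SQLite database holding the built-in mtcars data stands in for a large remote data source, and the table name, sample size, and model are all hypothetical.

```r
# Assumes DBI and RSQLite are installed. The in-memory database is a
# stand-in for a large remote source; mtcars is only toy data.
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "cars", mtcars)

# 1. Sample and model: pull a random subset, then model it locally.
cars_sample <- dbGetQuery(
  con, "SELECT mpg, wt FROM cars ORDER BY RANDOM() LIMIT 20"
)
fit_sample <- lm(mpg ~ wt, data = cars_sample)

# 2. Chunk and pull: pull one separable chunk at a time (here, one
#    cylinder group) and operate on each chunk on its own.
cyl_values <- dbGetQuery(con, "SELECT DISTINCT cyl FROM cars ORDER BY cyl")$cyl
fits_by_cyl <- lapply(cyl_values, function(k) {
  chunk <- dbGetQuery(
    con, "SELECT mpg, wt FROM cars WHERE cyl = ?", params = list(k)
  )
  coef(lm(mpg ~ wt, data = chunk))
})
names(fits_by_cyl) <- cyl_values

# 3. Push compute to data: let the database aggregate, and pull back
#    only the small summary table.
mpg_by_cyl <- dbGetQuery(
  con,
  "SELECT cyl, AVG(mpg) AS mean_mpg, COUNT(*) AS n
   FROM cars GROUP BY cyl ORDER BY cyl"
)

dbDisconnect(con)
```

In each case only a reduced piece of the data (a sample, a chunk, or a summary) ever reaches the R session.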
Slides
Class activity
For the rest of the time in class, we will practice using the future package. There are two tutorials for you to work through on your own, developed by Henrik Bengtsson for the UseR! 2024 conference:
Also, if you would like to try out the three strategies we learned about in class today for dealing with large data, try working through the pre-reading: