Learning Analytics
workshop support site
Intro 03 - save your data

lastmod: 11 May, 2020


Getting and saving the data

In a previous post (intro 02), we learned about a toy data file, located here, and we even viewed the data using the ‘read.csv’ command. We could use that command only because we knew what type of file it was (i.e., csv) and where it was located (i.e., its URL). However, we did not actually save the data in the folder dedicated to our R project, even though we had already created that folder (note: if you followed the instructions in the above-mentioned post, the directory ‘data’ is located inside your ‘learning-analytics’ directory).

The keen observer inside you will have noticed that we’re still using the Console to command R to do our bidding. Hang in there, we’re almost done with the Console, although you will soon miss it and even feel compelled to return to the source, to paraphrase the infamous Agent Smith of the Matrix.

Getting our data from its Internet location

Let’s get our data, again, and this time, let’s save it for future use. Write the code from the code box below in the Console (leaving out the comments).

# read the data from its URL using `read.csv`
read.csv("https://learning-analytics.dorinstanciu.com/post/intro-rstudio/data/toy-data.csv", fileEncoding = "UTF-8-BOM")

In executing the above command, we only repeated what we did in the last post: we’re now viewing the data, but it is not yet stored anywhere. To keep it, we must first assign it to an R object, i.e. a specific structure in R which holds our data. This is nothing more than giving it a name, actually.

# assign the data to an `R object` 
toydata <- read.csv("https://learning-analytics.dorinstanciu.com/post/intro-rstudio/data/toy-data.csv", fileEncoding = "UTF-8-BOM")

If you now look under the Environment tab, you should see an ‘object’ called ‘toydata’. No results are listed under your command in the Console, though. This is because, this time, the output of your command went to the Environment. That’s what you asked for, i.e., that R get the data from its source and store it within your working session. So, R did exactly what you asked.
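If you would like to confirm from the Console what the Environment tab shows, a few base R functions can help. The sketch below uses a small stand-in data frame (`toydata_demo`, with made-up columns and values) so that it runs without an Internet connection; in your own session you would call the same functions on `toydata`.

```r
# a hypothetical stand-in for `toydata`; the column names and values are made up
toydata_demo <- data.frame(
  id    = 1:3,
  score = c(10.5, 8.0, 9.25)
)

# list the objects currently stored in your environment
ls()

# check what kind of object it is, and how many rows and columns it holds
class(toydata_demo)
dim(toydata_demo)

# look at the structure and at the first few rows
str(toydata_demo)
head(toydata_demo)
```

None of these commands change the data; they only report on the object you already have.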

Fig. 1: Using the console to get the data from source and save it in an R object

Note: The keen observer that you are must have already noticed a difference between your RStudio screen and mine. Mine shows some extra text in the upper-left window, whereas yours should only show three main windows (or four), and nothing other than your commands in the Console and the toydata object in the Environment window (if all commands were given correctly). That’s perfectly alright; it happens because I’m building this website while constructing the examples at the same time.

Saving the data as an ‘R RDS object’

For now, we have an ‘R object’ in which we stored our data. But that object lives only in our current R session, and to recreate it we would have to download the data again. What if the data is removed from its URL at some point in the future? What if we cannot always connect to it? Since we have the data now, we might as well store it locally for future use.

# use the 'saveRDS' command to store the toydata as an RDS object 
saveRDS(toydata, "data/toydata.RDS")

If you did everything correctly and you now browse to your ‘data’ folder within the ‘learning-analytics’ folder, either with your computer’s file explorer or with RStudio’s built-in navigator, you will see an ‘RDS’ object inside the ‘data’ folder. If it is not there, you either wrote the command wrong (in which case you should have gotten an angry message from the Console) or you didn’t place your ‘data’ folder immediately inside your ‘learning-analytics’ folder.
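If the file did not show up, it can help to ask R where it is currently working and whether the target folder exists. The following is only a sketch: it uses a temporary folder (`demo_dir`) and a made-up data frame (`toydata_demo`) as stand-ins, so it does not touch your project. In your own project the path would simply be "data/toydata.RDS".

```r
# a hypothetical stand-in for `toydata`
toydata_demo <- data.frame(id = 1:3, score = c(10.5, 8.0, 9.25))

# where is R currently working? relative paths such as "data/..." start from here
getwd()

# a temporary folder standing in for your 'data' folder (an assumption for this sketch)
demo_dir <- file.path(tempdir(), "data")

# create the folder if it does not exist yet
if (!dir.exists(demo_dir)) {
  dir.create(demo_dir, recursive = TRUE)
}

# save the data, then confirm that the file is there
rds_path <- file.path(demo_dir, "toydata.RDS")
saveRDS(toydata_demo, rds_path)
file.exists(rds_path)

# read it back and check that nothing changed along the way
toydata_back <- readRDS(rds_path)
identical(toydata_demo, toydata_back)
```

`readRDS` is the counterpart of `saveRDS`: it is how you would load the stored object in a future session, without going back to the Internet source.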

Saving the data as a CSV file

Now that we have our data as an RDS object, we no longer need to fear a malfunctioning connection or the data being removed from its Internet source. Still, we might want to have it in a format that can be read with other software as well. And it just so happens that CSV files can be opened with the old trusted Microsoft Excel (and with any text editor, for that matter).

# use the 'write.csv' command to store the toydata file as a CSV file
write.csv(toydata, "data/toydata.csv")

If you’ve written the commands above correctly, your data folder should now show two objects, that is, a toydata.RDS object and a toydata.csv file.
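One detail worth knowing: by default, `write.csv` also writes the row names as an extra first column, which comes back under the name `X` when the file is read again. The sketch below illustrates this with a made-up data frame (`toydata_demo`) and a temporary file (`csv_path`), both assumptions for the example; whether you want `row.names = FALSE` for your own file is up to you.

```r
# a hypothetical stand-in for `toydata`
toydata_demo <- data.frame(id = 1:3, score = c(10.5, 8.0, 9.25))

# a temporary file, so that the sketch does not touch your 'data' folder
csv_path <- file.path(tempdir(), "toydata-demo.csv")

# default behaviour: the row names are written as an extra first column
write.csv(toydata_demo, csv_path)
names(read.csv(csv_path))

# with row.names = FALSE only the actual columns are written
write.csv(toydata_demo, csv_path, row.names = FALSE)
names(read.csv(csv_path))

# list the matching files in the folder, much like looking into your 'data' folder
list.files(tempdir(), pattern = "toydata")
```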

Fig. 2: Your saved data files inside the ‘data’ directory

So, here you are, a master of saving files. More power to you. Let’s move on to scripts and libraries…


Last modified on 2021-04-07