The goal of this assignment is to provide a “bridge” between the first two weeks of lectures and assignment 1 for those either new to R or struggling with how to approach the assignment.
This guided example will not provide a solution for programming assignment 1. However, it will guide you through some core concepts and give you some practical experience, which should make assignment 1 seem less daunting.
To begin, download this file and unzip it into your R working directory.
http://s3.amazonaws.com/practice_assignment/diet_data.zip
You can do this in R with the following code:
dataset_url <- "http://s3.amazonaws.com/practice_assignment/diet_data.zip"
download.file(dataset_url, "diet_data.zip")
unzip("diet_data.zip", exdir = "diet_data")
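If you rerun your script, the download above repeats every time. Here is a minimal sketch of one way to avoid that. It assumes a hypothetical helper named `fetch_diet_data` (not part of the assignment) that skips the work when the target folder already exists:

```r
# Hypothetical helper (not part of the assignment): download and unzip the
# data only when the target folder is missing.
fetch_diet_data <- function(url, zip_path = "diet_data.zip", exdir = "diet_data") {
  if (!dir.exists(exdir)) {
    # mode = "wb" writes the zip as binary, which matters on Windows
    download.file(url, zip_path, mode = "wb")
    unzip(zip_path, exdir = exdir)
  }
  invisible(exdir)  # return the folder name without printing it
}

# Usage (needs an internet connection the first time it runs):
# fetch_diet_data("http://s3.amazonaws.com/practice_assignment/diet_data.zip")
```

The check is deliberately simple: it only asks whether the folder exists, not whether all five files are inside it.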
If you’re not sure where your working directory is, you can find out with the getwd() command. Alternatively, you can view or change it through the Tools > Global Options menu in RStudio.
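As a quick sketch, you can also check and change the working directory from the console. The use of `tempdir()` here is only a stand-in for whatever folder you actually want to work in:

```r
old_wd <- getwd()      # remember where we started
print(old_wd)

setwd(tempdir())       # switch to another folder (a temporary one, for illustration)
print(getwd())

setwd(old_wd)          # switch back to the original folder
```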
So assuming you’ve unzipped the file into your R directory, you should have a folder called diet_data. In that folder there are five files. Let’s get a list of those files:
list.files("diet_data")
## [1] "Andy.csv"  "David.csv" "John.csv"  "Mike.csv"  "Steve.csv"
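`list.files()` also accepts a few arguments that are handy when a folder holds more than just data files. This small sketch assumes the `diet_data` folder from above sits in your working directory. If it does not, both calls simply return an empty character vector.

```r
# Keep only files whose names end in ".csv"
csv_files <- list.files("diet_data", pattern = "\\.csv$")

# full.names = TRUE prepends the folder, giving paths read.csv() can use directly
csv_paths <- list.files("diet_data", pattern = "\\.csv$", full.names = TRUE)

length(csv_files)  # number of CSV files found
```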
Okay, so we have 5 files. Let’s take a look at one to see what’s inside:
andy <- read.csv("diet_data/Andy.csv")
head(andy)
##   Patient.Name Age Weight Day
## 1         Andy  30    140   1
## 2         Andy  30    140   2
## 3         Andy  30    140   3
## 4         Andy  30    139   4
## 5         Andy  30    138   5
## 6         Andy  30    138   6
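`head()` shows six rows by default, but you can ask for a different number, and `tail()` does the same from the end of the data. The sketch below uses a small stand-in data frame, `andy_demo` (a name made up for this example), typed in from the six rows printed above so that it runs without the file. With the real data you would pass `andy` instead.

```r
# Stand-in for the first six rows of Andy.csv (copied from the output above)
andy_demo <- data.frame(
  Patient.Name = rep("Andy", 6),
  Age          = rep(30, 6),
  Weight       = c(140, 140, 140, 139, 138, 138),
  Day          = 1:6
)

head(andy_demo, 3)   # first three rows
tail(andy_demo, 2)   # last two rows
```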
It appears that the file has 4 columns: Patient.Name, Age, Weight, and Day. Let’s figure out how many rows there are by looking at the length of the ‘Day’ column:
length(andy$Day)
## [1] 30
30 rows. OK.
Alternatively, you could look at the dimensions of the data.frame:
dim(andy)
## [1] 30  4
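`nrow()` and `ncol()` return the two halves of `dim()` separately, which is convenient when you only need one of them. This sketch runs on a stand-in data frame (the hypothetical `andy_demo`, built from the six rows shown earlier), so the row count here is 6 rather than the 30 in the real file.

```r
# Stand-in for the first six rows of Andy.csv
andy_demo <- data.frame(
  Patient.Name = rep("Andy", 6),
  Age          = rep(30, 6),
  Weight       = c(140, 140, 140, 139, 138, 138),
  Day          = 1:6
)

nrow(andy_demo)            # number of rows
ncol(andy_demo)            # number of columns
length(andy_demo$Day)      # length of one column equals the number of rows

# dim() is the row count followed by the column count
identical(dim(andy_demo), c(nrow(andy_demo), ncol(andy_demo)))
```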
This tells us that we have 30 rows of data in 4 columns. There are some other commands we might want to run to get a feel for a new data file: str(), summary(), and names().
str(andy)
## 'data.frame':
## $