This post contains notes from the Coursera Data Analysis course.
Here are some basic R commands that should be useful for obtaining data and looking at it in R. Ideally, these commands help with steps 4, 5, and 6 of the 11 Steps to Data Analysis.
Load the data and just look at it
download.file('http://location.com', 'localfile.csv')
data <- read.csv('localfile.csv')
dim(data)
names(data)
quantile(data$column)
hist(data$column)
head(data)
summary(data)
str(data)
unique(data$column)
length(unique(data$column))
table(data$column)  # count of how many times each value appears in the column
table(data$column1, data$column2)
any(data$column < 100)
all(data$column > 100)
colSums(data)
colMeans(data, na.rm=TRUE)
rowMeans(data, na.rm=TRUE)
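To try these commands without downloading anything, here is a minimal sketch that runs them against mtcars, a dataset that ships with base R. Using mtcars in place of the downloaded CSV, and the columns mpg, cyl, and gear in place of data$column, are assumptions made for illustration only.

```r
# mtcars ships with base R; it stands in for the downloaded CSV (illustrative assumption)
data <- mtcars

dim(data)                    # number of rows and columns
names(data)                  # column names
head(data, 3)                # first three rows
str(data)                    # type and a preview of each column
quantile(data$mpg)           # min, quartiles, and max of one column

n_cyl <- length(unique(data$cyl))  # number of distinct values in a column
cyl_counts <- table(data$cyl)      # how many rows carry each value
table(data$cyl, data$gear)         # cross-tabulation of two columns

any(data$mpg < 100)          # TRUE if at least one value is below 100
all(data$mpg > 100)          # TRUE only if every value is above 100

# colSums/colMeans need numeric columns; every column of mtcars is numeric
col_means <- colMeans(data, na.rm = TRUE)
```

Note that R function names are case-sensitive, so colSums and colMeans must be capitalized exactly as shown.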
Look for missing values
is.na(data$column)
sum(is.na(data$column))
table(data$column, useNA="ifany")
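As a small sketch of how the missing-value checks behave, the example below builds a made-up five-row data frame with two NA values. The data frame and its values are hypothetical and exist only to show what each command returns.

```r
# Hypothetical toy data: five values, two of them missing
data <- data.frame(column = c(1, 2, NA, 2, NA))

missing <- is.na(data$column)   # logical vector, TRUE where a value is NA
n_missing <- sum(missing)       # TRUE counts as 1, so this is the number of NAs

# useNA = "ifany" adds an NA category to the counts only when NAs are present
counts <- table(data$column, useNA = "ifany")

# Many summary functions return NA unless told to drop missing values
mean(data$column)                           # NA
col_mean <- mean(data$column, na.rm = TRUE) # mean of the three non-missing values
```

The useNA argument accepts "no", "ifany", or "always", all in lowercase.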
For more information on any R command, type ? followed by the command name in the R console. For example, to learn more about the dim command, type ?dim