# Data Manipulation

## Learning Objectives

• tidyr Functions

• Wide to Long Example

# Tidyr

## tidyr Functions

tidyr is a set of functions that tidy up a data set so that:

• Every column is a variable

• Every row is an observation

• Every cell is a single value

## pivot_longer()

• The pivot_longer() function takes the columns that hold repeated measurements for an observation and stacks them into two columns: one holding the original column names and one holding the values.
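
To make the reshaping concrete, here is a minimal sketch on a made-up data set. The tibble `wide` and its columns `id`, `visit1`, and `visit2` are hypothetical and are not part of the course data.

```r
library(tidyr)

# Hypothetical wide data: one row per subject, one column per visit
wide <- tibble::tibble(
  id     = c(1, 2),
  visit1 = c(5.1, 6.3),
  visit2 = c(5.4, 6.0)
)

# Stack the two visit columns into a name column and a value column
long <- pivot_longer(
  wide,
  cols      = visit1:visit2,
  names_to  = "visit",
  values_to = "score"
)

long  # 4 rows (2 subjects x 2 visits) and 3 columns: id, visit, score
```

Each subject now contributes one row per visit, so the number of rows is the original row count times the number of stacked columns.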

## pivot_wider()

• The pivot_wider() function converts long data to wide data: it spreads the values of one column across several new columns, named by the entries of another column.
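
A small sketch of the reverse direction may help. The tibble `long` and its values are invented for illustration only.

```r
library(tidyr)

# Hypothetical long data: one row per subject and statistic
long <- tibble::tibble(
  id    = c(1, 1, 2, 2),
  stat  = c("mean", "sd", "mean", "sd"),
  value = c(10.2, 1.5, 9.8, 2.1)
)

# Spread the entries of `stat` into their own columns, filled from `value`
wide <- pivot_wider(long, names_from = stat, values_from = value)

wide  # 2 rows (one per id) and 3 columns: id, mean, sd
```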

## separate()

• The separate() function splits one variable into multiple variables at a given separator.
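
As a quick illustration, the sketch below splits a combined label at the "/" character. The tibble `df` is hypothetical; only the "time/stat" naming pattern is borrowed from the example later in these slides.

```r
library(tidyr)

# Hypothetical data with two pieces of information packed into one column
df <- tibble::tibble(
  id          = c(1, 2),
  measurement = c("v1/mean", "v2/sd")
)

# Split `measurement` at "/" into a `time` column and a `stat` column
parts <- separate(df, col = measurement, into = c("time", "stat"), sep = "/")

parts  # columns: id, time, stat
```

By default the original `measurement` column is dropped and replaced by the two new columns.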

# Example

## Wide to Long Data Example

This example converts data from wide to long using the functions in the tidyr package. Many statistical analyses require long data.

Use read_csv() from the readr package to read data_3_4.csv into an object called data1:

library(tidyverse)  # attaches readr, tidyr, and the %>% pipe
data1 <- read_csv(file = "http://www.inqs.info/files/hiss_3/data_3_4.csv")

## pivot_longer()

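• Here pivot_longer() stacks the twelve measurement columns, v1/mean through v4/median, into a name column (measurement) and a value column (value). Because the column names contain a slash, they must be wrapped in backticks:

# Backticks are required: without them R reads v1/mean as a division
df1 <- data1 %>%
  pivot_longer(cols = `v1/mean`:`v4/median`,
               names_to = "measurement",
               values_to = "value")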
#> pivot_longer: reorganized (v1/mean, v1/sd, v1/median, v2/mean, v2/sd, …) into (measurement, value) [was 1000x13, now 12000x3]

## separate()

• separate() then splits the measurement column at the "/" into two columns, time and stat:

df2 <- data1 %>%
  pivot_longer(cols = `v1/mean`:`v4/median`,
               names_to = "measurement",
               values_to = "value") %>%
  # "v1/mean" becomes time = "v1" and stat = "mean"
  separate(col = measurement, into = c("time", "stat"), sep = "/")
#> pivot_longer: reorganized (v1/mean, v1/sd, v1/median, v2/mean, v2/sd, …) into (measurement, value) [was 1000x13, now 12000x3]

## pivot_wider()

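• Finally, pivot_wider() spreads the stat values (mean, sd, median) into their own columns, leaving one row per original observation and time point:

df3 <- data1 %>%
  pivot_longer(cols = `v1/mean`:`v4/median`,
               names_to = "measurement",
               values_to = "value") %>%
  separate(col = measurement, into = c("time", "stat"), sep = "/") %>%
  # One column each for mean, sd, and median
  pivot_wider(names_from = stat,
              values_from = value)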
#> pivot_longer: reorganized (v1/mean, v1/sd, v1/median, v2/mean, v2/sd, …) into (measurement, value) [was 1000x13, now 12000x3]
#> pivot_wider: reorganized (stat, value) into (mean, sd, median) [was 12000x4, now 4000x5]
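
The same three steps can be tried without downloading the file. The sketch below builds a small stand-in data set with the same column naming pattern; the object `toy`, the column `id`, and the random values are assumptions made for illustration and are not the contents of data_3_4.csv.

```r
library(tidyr)
library(dplyr)  # supplies the %>% pipe

set.seed(1)
n <- 5  # hypothetical number of observations

# Build the names v1/mean, v1/sd, v1/median, ..., v4/median
col_names <- paste(rep(paste0("v", 1:4), each = 3),
                   rep(c("mean", "sd", "median"), times = 4),
                   sep = "/")

# Random values under those names, plus an id column in front
values <- matrix(rnorm(n * 12), nrow = n, dimnames = list(NULL, col_names))
toy <- tibble::as_tibble(cbind(id = 1:n, values))

tidy <- toy %>%
  pivot_longer(cols = `v1/mean`:`v4/median`,
               names_to = "measurement",
               values_to = "value") %>%
  separate(col = measurement, into = c("time", "stat"), sep = "/") %>%
  pivot_wider(names_from = stat, values_from = value)

dim(tidy)  # 20 rows (5 observations x 4 time points) and 5 columns
```

The row arithmetic matches the output above: 12 measurement columns collapse into 4 time points with 3 statistics each, so every original row becomes 4 rows.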