Hands-On Exploratory Data Analysis with R

The gather() function

There are times when our data is raw and unstacked, that is, laid out in a wide format where the values of one common attribute are spread across several columns. To reformat the data so that this attribute occupies a single variable, the gather() function takes multiple columns and collapses them into key-value pairs, duplicating all other columns as needed.
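As a minimal sketch of this idea, the following uses a small made-up data frame; the names scores, long_scores, subject, and score are illustrative assumptions and do not come from the book's dataset:

```r
library(tidyr)

# Hypothetical wide table: one row per student, one column per subject
scores <- data.frame(
  student = c("A", "B"),
  math    = c(90, 75),
  science = c(85, 80)
)

# Stack the two subject columns into key-value pairs;
# the student column is repeated for each pair
long_scores <- gather(scores, key = "subject", value = "score", math, science)
print(long_scores)
```

The two subject columns become four rows, each holding a student, a subject name (the key), and a score (the value).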

To better understand how the gather() function works, let's look at its syntax:

gather(data, key, value, ..., na.rm = FALSE, convert = FALSE)

Here, the parameters of the function are as follows:

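data: The data frame to reshape
key: Name of the new column that stores the names of the gathered columns
value: Name of the new column that stores the values of the gathered columns
...: The columns to gather; prefix a column name with - to exclude it
na.rm: If TRUE, rows whose value is NA are removed from the output
convert: If TRUE, type.convert() is run on the key column, which is useful when the gathered column names are really numbers, integers, or logical values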
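
The effect of na.rm and convert can be sketched on a small made-up data frame. The sales table and every name in it are assumptions introduced only for illustration:

```r
library(tidyr)

# Hypothetical wide table whose column names are years, with one missing value
sales <- data.frame(
  store  = c("north", "south"),
  `2019` = c(10, NA),
  `2020` = c(12, 15),
  check.names = FALSE  # keep "2019" and "2020" as column names
)

# Defaults: the row with the NA value is kept and the key column stays character
default_long <- gather(sales, key = "year", value = "units", -store)

# na.rm = TRUE drops the row whose value is NA;
# convert = TRUE turns the key column "2019"/"2020" into integers
clean_long <- gather(sales, key = "year", value = "units", -store,
                     na.rm = TRUE, convert = TRUE)

str(default_long)
str(clean_long)
```

Comparing the two str() outputs shows one fewer row in clean_long and an integer year column instead of a character one.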

Suppose we want to keep the manufacturer, model, and other attributes of each car as they are, and present the year as a key-value pair. We can achieve this on the mpg dataset with the help of the gather() function, demonstrated as follows:

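> library(tidyverse)
> # Gather the year column: the key column is named "mpg", the value column "Year of Establishment"
> mpg2 <- mpg %>% gather(key = "mpg", value = "Year of Establishment", year)
> View(mpg2)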

The output generated is displayed as follows:

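In the output, the year column has been replaced by a key-value pair for every model in the dataset: the key column holds the name of the gathered column (year), and the value column holds the year of establishment for that model. All other columns are left unchanged.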
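
Where the View() window is not available, the result can also be checked from the console. The following is only a sketch: the names mpg_long and attribute are illustrative assumptions, and the call gathers just the year column of the mpg dataset that ships with ggplot2:

```r
library(tidyr)
library(ggplot2)  # provides the mpg dataset

# Gather only the year column; the key and value column names are illustrative
mpg_long <- gather(mpg, key = "attribute", value = "Year of Establishment", year)

names(mpg_long)              # original columns minus year, plus the two new ones
unique(mpg_long$attribute)   # the key column holds the gathered column's name
nrow(mpg_long) == nrow(mpg)  # one gathered column, so the row count is unchanged
head(mpg_long[, c("manufacturer", "model", "attribute", "Year of Establishment")])
```

Because only one column is gathered, the reshaped table has the same number of rows as mpg; gathering several columns would multiply the row count by the number of gathered columns.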