Reshaping and Joining Data Frames


In this lesson, you will learn to reshape data frames by pivoting, decompose a data task into smaller steps, and join data frames.


Time Estimates:
     Videos: 20 min
     Readings: 20-60 min
     Activities: 60 min
     Check-ins: 3



Reshaping Data


Decomposition


Required Reading: Computational Thinking


Please read only the "Decomposition" and "Hidden Assumptions" sections. Stop when you reach the beginning of the "Turn it into a recipe (let’s make an algorithm)" section.

Tidy Data and Reshaping


Required Video: Reshaping Data (Pivoting)




Recommended Reading: R4DS 12.1-12.3: Tidy Data



Optional Reading: R4DS 12.4-12.5: Separate/Unite and Missing Values


(A few more tricks for data cleaning/wrangling, if you’re interested.)



Check-In 1: Pivoting


Question 1: Create a new dataset called cereals_3 that has three columns:

  • The name of the cereal

  • A column called “Nutrient” with values protein, fat, or fiber.

  • A column called “Amount” with the corresponding amount of the nutrient.

Question 2: Why didn’t we have to add a rowid to pivot wider in this case?
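For reference, here is a minimal sketch of the two pivoting verbs on a made-up table. The `snacks` data and its column names are hypothetical and are not the cereals dataset; the sketch assumes the tidyr and tibble packages are installed.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data: one row per snack, one column per nutrient
snacks <- tibble(
  name    = c("granola", "trail mix"),
  protein = c(4, 6),
  fat     = c(3, 9)
)

# Wide to long: nutrient column names become values of "Nutrient",
# and the cell contents move into "Amount"
snacks_long <- pivot_longer(
  snacks,
  cols      = c(protein, fat),
  names_to  = "Nutrient",
  values_to = "Amount"
)

# Long to wide: the inverse operation, spreading "Nutrient" back into columns
snacks_wide <- pivot_wider(
  snacks_long,
  names_from  = Nutrient,
  values_from = Amount
)
```

Printing `snacks_long` and `snacks_wide` side by side is a quick way to see how the same values are arranged in each shape.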


Check-In 2: Decomposition


Cereals in this dataset are placed on shelf 1, 2, or 3. We would like to know if these cereal placements correspond to different nutritional values; for example, perhaps sugary cereals made for children are on a lower shelf.

Create a new dataset called cereals_4 that has four columns:

  • The name of the manufacturer

  • The mean amount of sugar in cereals on shelf 1.

  • The mean amount of sugar in cereals on shelf 2.

  • The mean amount of sugar in cereals on shelf 3.
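A task like this can often be decomposed into a summarizing step followed by a reshaping step. The sketch below shows that pattern on a made-up table; the `sales` data, its columns, and the `region_` prefix are hypothetical, so treat it as one possible decomposition rather than the intended solution. It assumes the dplyr and tidyr packages are installed.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: several sales per store, each in a numbered region
sales <- tibble(
  store  = c("A", "A", "A", "B", "B", "B"),
  region = c(1, 1, 2, 1, 2, 2),
  price  = c(10, 20, 30, 40, 50, 70)
)

# Step 1: compute one mean per (store, region) pair, in long format
means_long <- sales %>%
  group_by(store, region) %>%
  summarize(mean_price = mean(price), .groups = "drop")

# Step 2: spread the regions across columns, one row per store
means_wide <- means_long %>%
  pivot_wider(
    names_from   = region,
    values_from  = mean_price,
    names_prefix = "region_"
  )
```

Checking the intermediate `means_long` table before pivoting makes it easier to confirm each step does what you expect.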


Joining data



Required Reading: Mutating Joins



Recommended Reading: R4DS Chapter 13: Relational Data
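
As a companion to the readings, here is a minimal sketch of three mutating joins on made-up tables. The `students` and `grades` data and the `id` key are hypothetical; the sketch assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data: the two tables share the key column "id",
# but id 1 appears only in students and id 4 only in grades
students <- tibble(id = c(1, 2, 3), name = c("Ana", "Ben", "Cy"))
grades   <- tibble(id = c(2, 3, 4), grade = c("A", "B", "C"))

# Keep only ids that appear in both tables
inner <- inner_join(students, grades, by = "id")

# Keep every row of students; unmatched rows get NA for grade
left <- left_join(students, grades, by = "id")

# Keep every id from either table, filling gaps with NA
full <- full_join(students, grades, by = "id")
```

Comparing the row counts of `inner`, `left`, and `full` shows how each join treats keys that lack a match in the other table.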