Assignment 2

Graphics and Data Munging Practice: 10 pts

Due: February 27th, 2022. There is no penalty for late submissions, but the assignment must be submitted no later than May 8th.


For this assignment, you will build upon the skills you learned while creating your first reproducible R Markdown document in the first assignment. You will explore some data graphically to inform a few research questions using data from the fivethirtyeight package. Turn in both the source file (the .Rmd) and the compiled version (html). Please create a new Rmd document for this assignment rather than continuing the one from the first assignment. Submit the completed assignment, including both the Rmd and html files, to ICON.

All graphics should be of high quality; this includes the formatting of axes, axis labels, etc. If none of the graphics are of high quality, a 2 pt penalty will apply over and above any item-specific reductions.

Research Questions

The following research questions will be used to guide the assignment, but you do not need to answer them directly. The numbered items in the Questions section below refer back to these research questions.

  1. Using the college_recent_grads data from the fivethirtyeight package, which majors are the most popular?
  2. Using the college_recent_grads data from the fivethirtyeight package, which major categories (not individual majors, but the broader groupings in major_category) are most unisex (i.e., have an equal number of males and females)?
  3. Related to #2, which major categories have the most disproportionate number of males or females within that major category?
  4. Is there any evidence of a relationship between unemployment rate and the popularity of a major? What about median salary (shown by median) and the popularity of a major?
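Before working on these questions, it may help to confirm that the data loads and to look at the columns involved. The following is a minimal sketch, assuming the fivethirtyeight and dplyr packages are already installed; the column names major, total, and unemployment_rate are not spelled out above and are taken from the package's version of the data set, so check them against the output of glimpse().

```r
# Assumes fivethirtyeight and dplyr are installed (install.packages() otherwise).
library(fivethirtyeight)
library(dplyr)

# college_recent_grads becomes available once fivethirtyeight is attached.
# glimpse() lists every column with its type and the first few values.
glimpse(college_recent_grads)

# Preview only the columns that the research questions refer to.
# major, total, and unemployment_rate are assumed names; verify with glimpse().
college_recent_grads %>%
  select(major, major_category, total, sharewomen, unemployment_rate, median) %>%
  head()
```

Checking the column types at this stage makes it easier to decide later which attributes suit a histogram, a bar chart, or a scatterplot.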

Questions

  1. Using an appropriate verb from dplyr, which majors are the most popular? Don’t print all the data in the output file to answer this question; keep the summary concise. 1 pt

  2. Which majors are the least popular? Don’t print all the data in the output file to answer this question; keep the summary concise. 1 pt

  3. Explore the distribution of the variable/attribute sharewomen visually. Summarize characteristics of this variable in a few sentences. Be sure to include any figure(s) or statistics as evidence to support your description. 1 pt

  4. Which major categories are the most unisex and which major categories are the most disproportionate? Similar to #1 and #2, please don’t print all of the majors, just highlight a few in each category. 1 pt

  5. Create a single figure that effectively shows which major categories are the most unisex and which are the most disproportionate. Discuss briefly why this figure is effective at answering research questions 2 and 3. 1 pt

  6. Create a figure that explores if there is a relationship between the popularity of a major and the unemployment rate. Discuss briefly how you defined popularity and why this figure helps to show the relationship between the two attributes. Be sure in your discussion to also state your interpretation of the figure. 1 pt

  7. Building off of #6, create a figure that explores if there is a relationship between the popularity of a major and the median salary (shown with the median attribute). Discuss how you defined popularity and why this figure helps to show the relationship between the two attributes. Be sure in your discussion to also state your interpretation of the figure. 1 pt

  8. Create a figure that explores if there is a relationship between the popularity of a major category and the unemployment rate. Discuss why this figure helps to show the relationship between the two attributes and also discuss how this figure may differ from the one created in #6. What additional features did you need to consider to make an effective visualization of this relationship given the data structure? Be sure in your discussion to also state your interpretation of the figure. Note, you may wish to use dplyr and ggplot2 to help with this question. 1 pt

  9. Create a figure that explores if there is a relationship between the popularity of a major category and the median salary (shown with the median attribute). Discuss why this figure helps to show the relationship between the two attributes and also discuss how this figure may differ from the one created in #6. What additional features did you need to consider to make an effective visualization of this relationship given the data structure? Be sure in your discussion to also state your interpretation of the figure. Note, you may wish to use dplyr and ggplot2 to help with this question. 1 pt

  10. Identify a new research question from the college_recent_grads data that interests you. State this research question, then create a figure that highlights/explores the research question. Discuss briefly why this figure does a good job of exploring the research question. What challenges did you have creating the figure? 1 pt
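Several of the questions above combine a dplyr summary with a ggplot2 figure that has readable axes. The sketch below illustrates that general pattern on a small made-up data frame; it is not a solution to any question. The object toy, its columns, and its values are hypothetical and are not taken from college_recent_grads, and weighting the group rate by group size is only one possible choice that you would need to justify in your own discussion.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: one row per item, two groups. Values are invented.
toy <- tibble(
  item  = c("A", "B", "C", "D"),
  group = c("X", "X", "Y", "Y"),
  size  = c(100, 300, 50, 150),
  rate  = c(0.05, 0.07, 0.02, 0.04)
)

# Collapse to one row per group. New names are used for the summaries so that
# weighted.mean() still sees the original per-row size column.
by_group <- toy %>%
  group_by(group) %>%
  summarise(
    group_rate = weighted.mean(rate, w = size),
    group_size = sum(size),
    .groups = "drop"
  )

# Keep printed output short: show only the largest group.
by_group %>% slice_max(group_size, n = 1)

# A scatterplot with labelled, formatted axes.
# scales is installed alongside ggplot2 as one of its dependencies.
p <- ggplot(by_group, aes(x = group_size, y = group_rate)) +
  geom_point(size = 3) +
  scale_x_continuous(labels = scales::comma) +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Group size (toy data)", y = "Size-weighted rate") +
  theme_bw()

p
```

Verbs such as slice_max(), arrange(), or head() keep the printed output to a handful of rows, and explicit labs() and scale_*() calls are one way to meet the formatting expectations for axes and axis labels.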
