Portfolio Part 2: Data Cleaning and Rough Draft
1 Portfolio: Data Cleaning and Rough Draft
The Portfolio component is a place for you to put your R skills into action on a problem you are interested in, with the goal of having a project you could share with future employers.
It should have the following qualities:
- It is a real-world application of R that has not already been worked out elsewhere (e.g. it isn’t a demo from some package or blog).
- It is interesting to you.
- It involves data and analyzing or presenting that data. The data may come from a lab or be something you have retrieved from the web. Some examples of good sources: FBI database on crime statistics, National Oceanic and Atmospheric Administration, World Health Organization, Twitter, Yahoo finance data, etc. If you are having problems finding a dataset, see the resources at the end of the project description.
- The analysis and presentation are useful in the real world.
These are real-world projects, but they are also class projects, and unforeseen problems can come up. If you find that it is going to be impossible to finish what you set out to accomplish, please contact your instructor to find a solution.
1.1 Portfolio Expectations
The final product will be a web page, hosted on your personal website, that includes the following content:
- Description of the proposed research questions
- Description of the data and data source(s)
- Description of data cleaning
- 2-3 data visualizations with commentary that answer the research questions
Though no code should be visible on the website, the Quarto document used to create the final product should contain all code for data cleaning and data visualization, commented and following coding style conventions.
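One way to keep code in the Quarto document while hiding it on the rendered page is to set chunk options at the top of each R chunk. The sketch below shows the body of a single chunk (in a .qmd file it would sit inside a `{r}` chunk). The label `summarise-ozone` and the use of the built-in `airquality` data set are illustrative assumptions, not part of the assignment.

```r
#| label: summarise-ozone
#| echo: false
#| warning: false
#| message: false

# The "#|" lines above are Quarto chunk options:
#   echo: false    -> the code is not shown on the rendered page
#   warning/message: false -> warnings and messages are not shown either
# The code still runs, so its results can appear on the page.

# Illustrative calculation on a built-in data set (stand-in for your data).
mean_ozone <- mean(airquality$Ozone, na.rm = TRUE)
mean_ozone
```

If every chunk should be hidden, the same effect can usually be achieved once for the whole document by setting `echo: false` under `execute` in the YAML header instead of repeating it per chunk.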
1.2 Part 1 - Data Cleaning
Begin the process of cleaning your data. Be sure to comment the code with the reason for each modification. You may need to transform your data in a few different ways to make the data visualizations you planned to create, so you might have 1-3 different ‘final’ clean data sets.
Each clean data set should be exported and saved in a folder within the project.
Remember that the project requires a minimum of 3 data transformations.
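As a rough illustration of the pattern described in this part, the sketch below applies three commented transformations and exports the result to a folder inside the project. It uses only base R and the built-in `airquality` data set as a stand-in for your own data. The folder name `data_clean`, the file name, and the choice of transformations are assumptions made for the example.

```r
# Stand-in for your raw data: the built-in airquality data set
# (daily New York air quality measurements, May to September 1973).
raw <- airquality

# Transformation 1: drop rows with a missing Ozone value,
# because the planned visualization summarises ozone and cannot use them.
clean <- raw[!is.na(raw$Ozone), ]

# Transformation 2: combine Month and Day into a single Date column,
# so observations can be ordered and plotted along a time axis.
clean$Date <- as.Date(sprintf("1973-%02d-%02d", clean$Month, clean$Day))

# Transformation 3: convert temperature from Fahrenheit to Celsius,
# because the intended audience is more familiar with Celsius.
clean$TempC <- round((clean$Temp - 32) * 5 / 9, 1)

# Export the clean data set to a folder within the project.
# "data_clean" is a hypothetical folder name; use whatever fits your project.
out_dir <- "data_clean"
dir.create(out_dir, showWarnings = FALSE)
out_file <- file.path(out_dir, "airquality_clean.csv")
write.csv(clean, out_file, row.names = FALSE)
```

If a second visualization needs the data in a different shape, the same steps can be repeated to produce and export a second clean file alongside the first.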
1.3 Part 2 - Data Visualization
Attempt to create a rough draft of at least one of your proposed visualizations. Identify anything you want to modify in your visualization but do not yet know how to do.
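A rough draft can be very plain. The sketch below is one possible shape for it: a basic plot followed by comments listing the changes you would like to make but have not yet worked out. It uses base R graphics and the built-in `airquality` data set purely as an illustration, so it runs without extra packages. A plotting package such as ggplot2 would follow the same pattern.

```r
# Stand-in data: mean ozone per month from the built-in airquality data set.
monthly_ozone <- tapply(airquality$Ozone, airquality$Month, mean, na.rm = TRUE)

# Rough draft: one bar per month, with basic labels only.
bar_mids <- barplot(
  monthly_ozone,
  main = "Mean ozone by month (rough draft)",
  xlab = "Month (number)",
  ylab = "Mean ozone (ppb)",
  col = "steelblue"
)

# Things I want to modify but do not know how to do yet:
# TODO: show month names instead of month numbers on the x-axis
# TODO: add error bars showing the variation within each month
# TODO: highlight the month with the highest mean in a different colour
```

Keeping the open questions as comments next to the plot code makes it easy to bring them to your instructor or to search for each one separately.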