About This Course
This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and scientific claims more generally, are published with their data and software code so that others can verify the findings and build upon them. The need for reproducibility is growing as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility lets people focus on the actual content of a data analysis rather than on superficial details reported in a written summary. It also makes an analysis more useful to others, because the data and code used to conduct the analysis are available. This course focuses on literate statistical analysis tools, which let you publish a data analysis in a single document that others can easily execute to obtain the same results.
What You’ll Learn
➤ Organize a data analysis to make it more reproducible
➤ Write up a reproducible data analysis using knitr
➤ Determine the reproducibility of an analysis project
➤ Publish reproducible web documents using Markdown
Skills You’ll Gain
➤ knitr
➤ Data Analysis
➤ R Programming
➤ Markup Language
What you will learn from this course
Module 1 – Concepts, Ideas, & Structure
➤ What is Reproducible Research About?
➤ Reproducible Research: Concepts and Ideas
➤ Scripting Your Analysis
➤ Structure of a Data Analysis
➤ Organizing Your Analysis
Module 2 – Markdown & knitr
➤ Coding standards
➤ Markdown
➤ R Markdown
➤ R Markdown Demonstration
➤ knitr
➤ Introduction to Course Project
Module 3 – Reproducible Research Checklist & Evidence-based Data Analysis
➤ Communicating results
➤ RPubs
➤ Reproducible Research Checklist
➤ Evidence-based Data Analysis
Module 4 – Case Studies & Commentaries
➤ Caching computations
➤ Case Study: Air Pollution
➤ Case Study: High Throughput Biology
➤ Commentaries on Data Analysis
➤ Introduction to Peer Assessment