Reproducible Research

About This Course

This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is growing as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility allows people to focus on the actual content of a data analysis rather than on superficial details reported in a written summary. In addition, reproducibility makes an analysis more useful to others because the data and code that actually produced the results are available. This course will focus on literate statistical analysis tools, which allow one to publish a data analysis in a single document that others can easily execute to obtain the same results.

What You’ll Learn

Organize data analysis to help make it more reproducible

Write up a reproducible data analysis using knitr

Determine the reproducibility of an analysis project

Publish reproducible web documents using Markdown

Skills You’ll Gain

Knitr

Data Analysis

R Programming

Markup Language

What you will learn from this course

Module 1 – Concepts, Ideas, & Structure

What is Reproducible Research About?

Reproducible Research: Concepts and Ideas

Scripting Your Analysis

Structure of a Data Analysis

Organizing Your Analysis

Module 2 – Markdown & knitr

Coding Standards

Markdown

R Markdown

R Markdown Demonstration

knitr

Introduction to Course Project

Module 3 – Reproducible Research Checklist & Evidence-based Data Analysis

Communicating Results

RPubs

Reproducible Research Checklist

Evidence-based Data Analysis

Module 4 – Case Studies & Commentaries

Caching Computations

Case Study: Air Pollution

Case Study: High Throughput Biology

Commentaries on Data Analysis

Introduction to Peer Assessment