Exploring Data with the Tidyverse
David Robinson

The tidyverse is a powerful collection of packages following a standard set of principles for usability. During this workshop David will demonstrate an exploratory data analysis in R using tidy tools, drawing on dplyr and ggplot2 for data transformation and visualization, as well as other packages from the tidyverse as they're needed. He'll narrate his thought process as attendees follow along and offer their own solutions.

Geospatial Statistics and Mapping in R
Kaz Sakamoto

Geospatial expert and Columbia Professor Kaz Sakamoto is leading this class on all things GIS. You'll learn about map projections, spatial regression, plotting interactive heatmaps with leaflet, and working with shapefiles.

Git for Data Science
Dan Chen

Daniel Chen, author of Pandas for Everyone, has given multiple talks at the New York R Conference about the data science workflow. In this workshop he'll teach how to use Git and project management practices for better organization and faster iteration.

Introduction to Survival Analysis
Elizabeth Sweeney

Time-to-event outcomes are common in a variety of statistical applications, but the statistical techniques needed to appropriately analyze data in the presence of censoring or when predictor variables are not observed at baseline are not always taught as part of a standard statistics curriculum. This workshop will introduce the statistical techniques needed to address common questions in the context of time-to-event outcomes. Topics covered will include types of censoring, the Kaplan-Meier estimator of the survival function, Cox proportional hazards regression, analysis of time-dependent covariates, and interval censoring methods to handle situations where the exact event time is not known. All common statistical analyses will be demonstrated in R, including use of the survival package and the ggsurvplot function from the survminer package.