Analyzing Data | Data Journalism Resources

A tutorial on locating structured data, reviewing it in a text editor and then finding the arithmetic mean in Excel. Video 1 of 3.

Calculating Median, Quartiles and Bounds

A tutorial on calculating the median, quartiles, upper and lower bounds on a data set in Excel. Video 2 of 3.
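For readers who want to check the same figures outside Excel, here is a minimal Python sketch. The values are made up for illustration, and it assumes that "upper and lower bounds" means the common 1.5 × IQR outlier fences; the video may define them differently. The `inclusive` method should correspond to Excel's QUARTILE.INC.

```python
import statistics

# Illustrative values only; not taken from the tutorial's data set.
values = [2, 4, 4, 5, 7, 9, 10, 12, 30]

# n=4 splits the data into quartiles and returns the three cut points.
# method="inclusive" interpolates between the minimum and maximum values.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Assumption: bounds are the 1.5 * IQR fences often used to flag outliers.
iqr = q3 - q1
lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr

# Any value outside the fences is a candidate outlier worth a second look.
outliers = [v for v in values if v < lower_bound or v > upper_bound]

print("Q1, median, Q3:", q1, median, q3)
print("Bounds:", lower_bound, upper_bound)
print("Outliers:", outliers)
```

A value outside the bounds is not necessarily an error; it is a prompt to go back to the source and ask why it is so far from the rest.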

Calculating Standard Deviation and Z Score

A tutorial on calculating the standard deviation and z score on a data set. Video 3 of 3.
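The same calculation can be sketched in Python. The scores below are invented for illustration, and the sketch assumes the population standard deviation (Excel's STDEV.P); swap in `statistics.stdev` for the sample version (STDEV.S) if that is what your analysis calls for. The helper name `z_score` is hypothetical.

```python
import statistics

# Illustrative values only; not taken from the tutorial's data set.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)
# Assumption: population standard deviation (divides by n, not n - 1).
sd = statistics.pstdev(scores)


def z_score(x, mean, sd):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / sd


z_of_nine = z_score(9, mean, sd)

print("Mean:", mean)
print("Standard deviation:", sd)
print("Z score of 9:", z_of_nine)
```

A z score near 0 means the value is close to the mean; a large positive or negative z score marks a value that is unusual for the data set.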

Measures of Central Tendency

An introduction to measures of central tendency: mean, mode and median. Calculating these values will help you better understand your data: is it normally distributed or skewed? Which value appears most frequently?
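As a rough illustration of how the three measures can disagree, the sketch below computes them in Python for a small, made-up list. Comparing the mean with the median is only a quick rule of thumb for skew, not a formal test.

```python
import statistics

# Illustrative values only: one large value pulls the mean upward.
values = [1, 2, 2, 3, 4, 5, 18]

mean = statistics.mean(values)      # arithmetic average
median = statistics.median(values)  # middle value once sorted
mode = statistics.mode(values)      # most frequent value

# Rule of thumb (an assumption, not a formal test): a mean well above
# the median hints at a right-skewed distribution, and vice versa.
if mean > median:
    skew_hint = "right-skewed"
elif mean < median:
    skew_hint = "left-skewed"
else:
    skew_hint = "roughly symmetric"

print("Mean:", mean, "Median:", median, "Mode:", mode)
print("Skew hint:", skew_hint)
```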

Introduction to OpenRefine

An introduction to OpenRefine, an open source tool for manipulating large, messy data sets. This tutorial explores facets and transformations to interview and analyze data. Video 1 of 3.

Introduction to Regular Expressions

Slides with a brief introduction to Regular Expressions, how they work, some grammar and examples. RegEx are useful to search for and match a pattern in your data and clean it up from there.
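To give a flavor of the match-then-clean workflow, here is a small sketch using Python's built-in `re` module. The sample text and both patterns are illustrative assumptions, not taken from the slides.

```python
import re

# Illustrative text with inconsistent spacing and two US ZIP code formats.
raw = "Mail  to   10027 or\t94105-1234."

# Match: five digits, optionally followed by a hyphen and four more digits.
# \b marks a word boundary so longer numbers are not partially matched.
zip_pattern = re.compile(r"\b\d{5}(?:-\d{4})?\b")
zip_codes = zip_pattern.findall(raw)

# Clean: collapse any run of whitespace (spaces, tabs) into a single space.
cleaned = re.sub(r"\s+", " ", raw).strip()

print(zip_codes)
print(cleaned)
```

The same pattern syntax largely carries over to other tools that support regular expressions, although details vary from one implementation to another.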

Regex on OpenRefine

An introduction to basic regular expressions in OpenRefine. Use these search patterns to edit your data in seconds. Video 2 of 3.

More OpenRefine: Extract, Apply, Cluster

Learn how to edit data in bulk using the clustering feature. See how to use the apply/extract commands to step backwards and forwards through your work, and to reapply that work to new or revised data sets. Video 3 of 3.
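To show the idea behind clustering, the sketch below groups spelling variants by a normalized key. It is a simplified approximation of a fingerprint-style key, written for illustration; it is not OpenRefine's actual implementation, and the function names and sample values are hypothetical.

```python
import re
from collections import defaultdict


def fingerprint(value):
    """Build a simplified key: lowercase, no punctuation, sorted unique words."""
    cleaned = re.sub(r"[^\w\s]", "", value.strip().lower())
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values sharing a key; keep only groups with more than one spelling."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


# Illustrative values only.
names = [
    "New York Times",
    "new york times",
    "Times, New York",
    "Washington Post",
    "The Guardian",
]

clusters = cluster(names)
print(clusters)
```

Each cluster is a set of candidates to merge into one spelling. A person should still review the suggestions, since values that share a key are not always the same entity.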

More OpenRefine and Regex

A tutorial with more tips for using OpenRefine to analyze, share and document data sets, plus a bit more on regular expressions.

Quantification and Statistical Inference

An overview class on what to count, how to “interview the data,” statistical models, the uses of multi-variable regression in journalism, and correlation vs. causation.

Randomness and Significance

A class on causality, p-hacking, reproducibility, and triangulation.

Visualization and Network Analysis

A class on visualization (built upon design principles from user experience considerations, graphic design, and the study of the human visual system) and social network analysis in journalism.

Merging two tables in OpenRefine

Read this tutorial to learn how to merge two data sets and analyze the results using OpenRefine.
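The underlying operation is a join on a shared key column. The sketch below shows that logic in plain Python with made-up tables and column names; it is not OpenRefine's own syntax, only an illustration of what a merge does and why unmatched rows deserve attention.

```python
# Illustrative tables only; county names and numbers are invented.
inspections = [
    {"county": "Adams", "inspections": 12},
    {"county": "Baker", "inspections": 7},
    {"county": "Clark", "inspections": 3},
]
populations = [
    {"county": "Adams", "population": 1000},
    {"county": "Baker", "population": 500},
]

# Index the second table by the shared key for quick lookups.
lookup = {row["county"]: row for row in populations}

# Keep every row of the first table; fill in None when no match exists.
merged = []
for row in inspections:
    match = lookup.get(row["county"])
    population = match["population"] if match else None
    merged.append({**row, "population": population})

# Rows without a match often point to spelling differences in the key.
unmatched = [row["county"] for row in merged if row["population"] is None]

print(merged)
print("Unmatched:", unmatched)
```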

The Curious Journalist's Guide to Data

A book about data journalism with few equations and no code that traces where data comes from, what journalists do with it, and where it goes after.

The ProPublica Guide to Bulletproofing Data

The ProPublica guide offers useful tips to make sure you're interviewing a data set comprehensively: create work logs, pull random samples, duplicate your work, show findings early, etc.

First Python Notebook

A step-by-step guide to analyzing data with Python and the Jupyter Notebook: create filters, merge cells, add values, sort or group by specified conditions, all with pandas, an open-source library.

Workbench

A platform that allows journalists to scrape, clean, analyze and visualize data without requiring any coding.

Overview

A visualization and analysis tool designed for sets of documents, from dozens to millions of pages of material. Originally built for investigative journalists, it’s also used for legal work, training machine learning models, and research of all types.

OpenRefine

Downloadable software that helps you sort and sift dirty data, cleaning it to the point where you can start your actual analysis.

TimelineJS

An open-source tool that enables anyone to build visually rich, interactive timelines.

Pandas

A high-performance data analysis tool for Python.

NLTK

A Python library built to process large amounts of text. Whether you’re analyzing Congressional bills, Twitter outrages or Shakespearean plays, NLTK has you covered.

scikit-learn

A Python package for machine learning and data analysis. It’s the Swiss Army knife of data science: it covers classification, regression, clustering, dimensionality reduction, and so much more.
