A platform that allows journalists to scrape, clean, analyze and visualize data without coding. Use it to clean messy data and to find and fix typos in seconds.
Learn how to edit data in bulk using the clustering feature. See how to use the extract and apply commands to step backwards and forwards through your work, and to reapply the same steps to new or revised data sets. Video 3 of 3.
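To make the idea of clustering concrete, here is a minimal Python sketch of one common approach, key-collision clustering: each value is reduced to a normalized key, and values that share a key are grouped as likely variants of the same entry. The function names `fingerprint` and `cluster` and the sample names are made up for illustration, and the tool shown in the video may cluster values differently.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a normalized key so that near-identical spellings collide."""
    # Decompose accented characters and drop the accent marks.
    text = unicodedata.normalize("NFKD", value)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase, then replace punctuation with spaces.
    text = re.sub(r"[^\w\s]", " ", text.lower())
    # Sort the unique tokens so word order does not matter.
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values that share a fingerprint; keep only groups with variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical sample data with the kinds of variants clustering catches.
names = ["New York", "new york", "York, New", "Boston", "São Paulo", "Sao Paulo "]
clusters = cluster(names)
for group in clusters:
    print(group)
```

Each printed group is a set of spellings that a person would then review and merge into one preferred value. Boston appears only once, so it is not reported.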
Slides with a brief introduction to regular expressions: how they work, some grammar and examples. Regular expressions (regex) are useful for searching for and matching patterns in your data so that you can clean it up from there.
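As a small illustration of matching a pattern and cleaning from there, the Python sketch below uses the standard `re` module on a few invented records of the form "Surname, Given name (year)". The sample lines and the pattern are assumptions made for this example and are not taken from the slides.

```python
import re

# Hypothetical messy records: inconsistent spacing and capitalization.
raw = ["  Smith,   John (2014) ", "DOE, JANE  (2016)", "Lee,Ann(2015)", "n/a"]

# Surname, comma, given name, then a four-digit year in parentheses.
# \s* tolerates any amount of whitespace between the parts.
pattern = re.compile(
    r"^\s*(?P<last>[A-Za-z]+)\s*,\s*(?P<first>[A-Za-z]+)\s*\((?P<year>\d{4})\)\s*$"
)

cleaned = []
unmatched = []
for line in raw:
    match = pattern.match(line)
    if match is None:
        # Keep lines that do not fit the pattern for manual review.
        unmatched.append(line)
        continue
    cleaned.append(
        (match["first"].title(), match["last"].title(), int(match["year"]))
    )

print(cleaned)
print(unmatched)
```

Collecting the lines that fail to match, rather than dropping them, keeps the cleanup auditable: nothing disappears from the data without someone looking at it.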
Chapter 5 of the handbook walks you through simple but effective first steps for verifying the quality of your data. Is the data complete? Are there duplicate records? There is no such thing as a completely reliable source when it comes to using data to produce meticulous journalism.
A guide by Quartz that delves into the kinds of problems that reporters encounter when working with (messy) data. Are values missing? Duplicated? Spelling inconsistent? This guide suggests solutions to these common issues.
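The three questions above (missing values, duplicates, inconsistent spelling) can each be turned into a quick check. The following Python sketch uses only the standard library on a tiny invented CSV; the column names and sample rows are hypothetical, and the checks are illustrative rather than the guide's own method.

```python
import csv
import io
from collections import Counter

# Hypothetical sample: one blank city, one repeated row, one case variant.
SAMPLE = """id,name,city
1,Ana,Lisbon
2,Ben,
3,Cara,Lisbon
3,Cara,Lisbon
4,Dev,lisbon
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Missing values: ids of rows where any field is empty or whitespace only.
missing = [
    row["id"] for row in rows if any(not (v or "").strip() for v in row.values())
]

# Exact duplicates: identical rows that appear more than once.
counts = Counter(tuple(row.items()) for row in rows)
duplicates = [dict(key) for key, n in counts.items() if n > 1]

# Inconsistent spelling: values that differ only by case or outer spaces.
variants = {}
for row in rows:
    city = row["city"].strip()
    if city:
        variants.setdefault(city.casefold(), set()).add(city)
inconsistent = {k: sorted(v) for k, v in variants.items() if len(v) > 1}

print("missing:", missing)
print("duplicates:", duplicates)
print("inconsistent:", inconsistent)
```

These checks only flag candidates. Whether a repeated row is a true duplicate or two legitimate records is a reporting question that the data alone cannot answer.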
A ProPublica guide that walks you through basic (and elaborate) checks for every dataset, from making sure you know how many records you should have to ensuring you have them all.
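A record-count check of this kind could look like the short Python sketch below. The helper `check_record_count`, the `id` field and the expected total of 5 are all assumptions made for illustration; in practice the expected count would come from the agency or documentation that supplied the data.

```python
def check_record_count(rows, expected_count, key):
    """Compare the loaded rows with the count the source says to expect."""
    ids = [row[key] for row in rows]
    unique_ids = set(ids)
    return {
        "loaded": len(ids),
        "expected": expected_count,
        # Negative means records are missing; positive means there are extras.
        "difference": len(ids) - expected_count,
        # Rows whose identifier has already been seen.
        "repeated_ids": len(ids) - len(unique_ids),
    }


# Hypothetical data: the source promised 5 records, but 4 arrived,
# and one identifier appears twice.
records = [{"id": 1}, {"id": 2}, {"id": 2}, {"id": 4}]
report = check_record_count(records, expected_count=5, key="id")
print(report)
```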