What is Messy Data?
Inconsistent formats, unnecessary whitespace, extra characters, typos, and more: messy data is the bane of analysis! In each column below, both cells record exactly the same information:
|Oct 14, 2015|1000 dollars|idaho|
|Wed, Oct 14th|US$1000|Idaho,|
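To make the problem concrete, here is a minimal sketch in plain Python (not Refine itself) of what normalizing the amount and state columns above might involve. The function names `normalize_amount` and `normalize_state` are hypothetical, and the sketch assumes whole-dollar amounts and that stray punctuation only appears at the ends of a value. The dates are left alone, since the second one carries no year and cannot be reconciled automatically.

```python
import re

def normalize_amount(raw: str) -> int:
    # Keep only the digits; assumes whole-dollar amounts with no decimals.
    digits = re.sub(r"\D", "", raw)
    return int(digits)

def normalize_state(raw: str) -> str:
    # Trim whitespace and stray punctuation from both ends, then title-case.
    return raw.strip(" ,.;").title()

# The two example rows from above: (date, amount, state).
rows = [
    ("Oct 14, 2015", "1000 dollars", "idaho"),
    ("Wed, Oct 14th", "US$1000", "Idaho,"),
]

# After cleaning, the amount and state values of both rows should match.
cleaned = [(normalize_amount(amount), normalize_state(state))
           for _date, amount, state in rows]
print(cleaned)
```

Once both rows reduce to the same values, they can be grouped, counted, and compared like any other consistent data.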
Multi-valued cells limit your ability to manipulate, clean, and use the data:
|“Using OpenRefine by Ruben Verborgh and Max De Wilde, September 2013”|
|“University of Idaho, 875 Perimeter Drive, Moscow, ID, 83844, p. 208-885-6111, firstname.lastname@example.org”|
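As a rough illustration of why multi-valued cells are awkward, the sketch below splits the address cell above into separate fields using plain Python (again, not Refine itself). The field names are hypothetical, and the approach assumes that every value is separated by a comma and that the values always arrive in this order, which real data rarely guarantees.

```python
# The multi-valued cell from the example above.
cell = ("University of Idaho, 875 Perimeter Drive, Moscow, ID, 83844, "
        "p. 208-885-6111, firstname.lastname@example.org")

# Hypothetical field names, assumed to match the order of values in the cell.
fields = ["organization", "street", "city", "state", "zip", "phone", "email"]

# Split on commas and trim the whitespace around each piece.
parts = [part.strip() for part in cell.split(",")]

# Pair each piece with its field name.
record = dict(zip(fields, parts))

# Drop the "p. " label so that only the phone number itself remains.
if record["phone"].startswith("p. "):
    record["phone"] = record["phone"][len("p. "):]

for name, value in record.items():
    print(f"{name}: {value}")
```

With one value per field, each piece can be sorted, filtered, or corrected independently. The citation cell would need different rules (for example, splitting on " by " and " and "), which is exactly why multi-valued cells are hard to handle in bulk.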
Luckily, OpenRefine provides powerful visualizations and tools to discover these kinds of data issues, then isolate and fix them.