OpenRefine

OpenRefine (formerly Google Refine) is a powerful tool for working with raw data: cleaning it up, converting it from one format to another, and extending it with web services and external data. OpenRefine helps you explore large data sets with ease.

In the real world, we are rarely lucky enough to work with “clean” data (i.e. data usable for analysis without preprocessing). Sometimes cleaning the data takes more time and effort than analyzing it. Tools like OpenRefine make that cleaning much easier.

OpenRefine lets you import data from different sources, for example Google Sheets, a SQL database, or a link to a table.

Before we load the file and start working with it, let’s look at its problems. First, the numbers in the “number” column are in different formats, which means we can’t calculate their sum or, say, their average. So we need to convert this column to a single numeric format. Second, the dates in the “date” column are also in different formats: in some rows the month is written in words, in others as a number; in some the separator is a dot, in others a slash. To be able to sort this column, we need to put all the dates in the same format.
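To make the two problems concrete, here is a minimal Python sketch of the same kind of normalization done outside OpenRefine. It is only an illustration, not how OpenRefine works internally. The function names, the sample values, and the list of date formats are assumptions made for the example, since the article's file is not reproduced here. It also assumes that a comma in a number is a decimal separator and that a space is a thousands separator.

```python
from datetime import datetime

# Assumed date layouts (day first): dot-separated, slash-separated,
# and month written out in English. Extend this tuple to match real data.
DATE_FORMATS = ("%d.%m.%Y", "%d/%m/%Y", "%d %B %Y")


def normalize_number(raw: str) -> float:
    """Convert strings such as '1 200', '3,5' or '42' to a float.

    Assumption: a space is a thousands separator and a comma is a
    decimal separator.
    """
    cleaned = raw.strip().replace(" ", "").replace(",", ".")
    return float(cleaned)


def normalize_date(raw: str) -> str:
    """Try each assumed format in turn and return an ISO 8601 date string."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f"unrecognized date format: {raw!r}")


# Hypothetical sample columns with mixed formats.
numbers = ["1 200", "3,5", "42"]
dates = ["05.03.2021", "5/3/2021", "5 March 2021"]

clean_numbers = [normalize_number(n) for n in numbers]
clean_dates = [normalize_date(d) for d in dates]

# Once the values share one format, sums and sorting are meaningful.
total = sum(clean_numbers)
sorted_dates = sorted(clean_dates)
```

Once every value is in a single format, the sum and average can be computed and the ISO dates sort correctly as plain strings. That is the same goal as the cleanup described above.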
