Google Refine Gets Fusion Tables Import and More

Google Refine, formerly known as Freebase Gridworks, has been updated to version 2.1. Refine is an open source tool for cleaning up messy data sets before linking them into systems such as Freebase. The update includes new HTML parsing functions, the ability to import Google Fusion Tables, and more.

Freebase Gridworks was one of the tools included in Google’s acquisition of Metaweb. We covered its last big update here.

New features include:

  • HTML parsing functions (based on JSoup)
  • Metaphone3 (American English) & Cologne Phonetic (German) coders & clustering
  • Google Fusion Table import support
  • Facet for exact duplicates
  • Ability to star favorite expressions for reuse later
  • Latest Apache POI library including a number of Excel bug fixes
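
To give a rough sense of what the clustering and duplicate-facet features in the list above do, here is a minimal sketch in Python. It is not Refine's code: the function names are hypothetical, and a simple normalizing "fingerprint" key stands in for the Metaphone3 and Cologne Phonetic coders, which are considerably more involved. The idea is the same, though: compute a key for each value and group the values whose keys collide.

```python
import string
from collections import Counter, defaultdict


def fingerprint(value):
    """Stand-in key (an assumption, not a phonetic coder):
    lowercase, strip punctuation, then sort the unique tokens."""
    table = str.maketrans("", "", string.punctuation)
    cleaned = value.strip().lower().translate(table)
    return " ".join(sorted(set(cleaned.split())))


def cluster_by_key(values, key_func=fingerprint):
    """Group distinct values whose keys collide.

    Only groups with two or more distinct spellings are returned,
    since those are the candidates for merging.
    """
    buckets = defaultdict(set)
    for value in values:
        buckets[key_func(value)].add(value)
    return [sorted(group) for group in buckets.values() if len(group) > 1]


def exact_duplicates(values):
    """Rough analogue of a duplicates facet: values that occur more than once."""
    counts = Counter(values)
    return {value: n for value, n in counts.items() if n > 1}


names = ["Acme, Inc.", "acme inc", "Inc. Acme", "Globex", "Globex"]
clusters = cluster_by_key(names)
duplicates = exact_duplicates(names)
print(clusters)    # variant spellings that share a key
print(duplicates)  # values repeated verbatim
```

Swapping a phonetic encoder in for `fingerprint` would group values that sound alike rather than values that merely normalize to the same tokens.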

As we’ve noted before, Google Refine can be used with other Google tools to create a fairly powerful stack for working with big data.

Refine competes with other data cleaning tools such as DataWrangler.
