Fuzzy Lookup Add-In for Excel

Data cleaning is often a major challenge when working with textual data. The Fuzzy Lookup Add-In for Excel is a new tool from Microsoft Research and BI Labs that identifies and matches textually similar strings in Excel. It is robust to spelling mistakes, synonyms, missing or added words, and a number of other data quality problems frequently encountered in real-world data. It supports most languages and works well across a wide variety of data domains. Common uses include cleaning up lists of names, addresses, products, or other entity descriptions that contain fuzzy duplicates. It can also fuzzy join two different tables. For instance, you might clean a table of dirty city and state data and augment it with zip codes by matching it against a clean reference table of cities, states, and zip codes.
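
To make the city, state and zip code example concrete, here is a minimal sketch of a fuzzy join in Python. It is not the add-in's own matching algorithm, which is not described here. It uses the standard library's difflib as a simple stand-in, and the reference rows, the function name fuzzy_join and the 0.8 similarity cutoff are all assumptions chosen for illustration.

```python
import difflib

# Hypothetical clean reference table: (city, state) -> zip code.
# These rows are illustrative sample data, not part of the add-in.
REFERENCE = {
    ("Seattle", "WA"): "98101",
    ("Portland", "OR"): "97201",
    ("Boston", "MA"): "02108",
}


def _key(city, state):
    """Normalize a (city, state) pair into one lowercase comparison string."""
    return f"{city.strip().lower()}, {state.strip().lower()}"


def fuzzy_join(dirty_rows, reference=REFERENCE, cutoff=0.8):
    """Attach the closest reference row to each dirty (city, state) row.

    Returns a list of (city, state, match) tuples, where match is a
    (ref_city, ref_state, zip_code) tuple, or None when no reference row
    reaches the similarity cutoff (an assumed threshold).
    """
    keys = {_key(c, s): (c, s, z) for (c, s), z in reference.items()}
    joined = []
    for city, state in dirty_rows:
        # get_close_matches ranks candidates by SequenceMatcher ratio.
        best = difflib.get_close_matches(
            _key(city, state), list(keys), n=1, cutoff=cutoff
        )
        joined.append((city, state, keys[best[0]] if best else None))
    return joined


if __name__ == "__main__":
    dirty = [("Seatle", "WA"), ("Bostn", "ma"), ("Springfield", "IL")]
    for row in fuzzy_join(dirty):
        print(row)
```

Rows with no sufficiently similar reference entry come back with None rather than a forced match, so they can be reviewed by hand. Character-level similarity of this kind handles misspellings but not synonyms or reordered words, which the add-in is described as tolerating.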
