Data cleaning is often a big challenge when working with textual data. The Fuzzy Lookup Add-In for Excel is a new tool from Microsoft Research and BI Labs that helps with the problem of identifying and matching textually similar string data in Excel. It is robust to spelling mistakes, synonyms, missing or added words and a number of other data quality problems frequently encountered in the real world. It has support for most languages and works well across a wide variety of data domains. Common uses include cleaning up lists of names, addresses, products or other entity descriptions which contain fuzzy duplicates. It can also be used to fuzzy join two different tables together. For instance, you might clean and augment a table of dirty city, state data with a zip code by matching it against a clean reference table of city, state and zip codes.
Fuzzy Lookup Add-In for Excel