Apples and Oranges

December 1st, 2008 by Stefanos Damianakis

The average computer user is already well aware that computers are limited in how they “look” at data. Ask yourself how many times you have looked up a contact in Outlook (or whatever app you use to store your contacts) only to get zero results back. Why? Because what you typed was off by just a single letter.

It seems pretty silly to us humans, but as far as computers are concerned, the result of comparing

“Damianakis” and “Damanakis”

is exactly the same as the result of comparing

“Damianakis” and “Smith”

or

“apples” and “oranges”

or

“Rahiem A Griffin” and “Rahiem Griffin”

When computers do these comparisons, the result is always “zero” – the two values are not equal – which is clearly wrong for the first and last examples.
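To make the contrast concrete, here is a minimal Python sketch. It is an illustration only, not the matching technology discussed on this blog. It runs the four pairs above through the built-in equality test and then through a classic edit-distance measure (Levenshtein distance), which counts the single-character insertions, deletions, and substitutions needed to turn one string into the other. The names `edit_distance`, `similarity`, and `compare_all` are hypothetical, chosen for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character edits turning a into b."""
    # previous[j] = distance between the prefix of a processed so far and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # keep (cost 0) or substitute (cost 1)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Scale the edit distance to a score between 0.0 (nothing alike) and 1.0 (identical)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


# The four pairs from the post.
PAIRS = [
    ("Damianakis", "Damanakis"),
    ("Damianakis", "Smith"),
    ("apples", "oranges"),
    ("Rahiem A Griffin", "Rahiem Griffin"),
]


def compare_all(pairs=PAIRS):
    """Return (left, right, exactly_equal, edit_distance, similarity) for each pair."""
    return [(a, b, a == b, edit_distance(a, b), similarity(a, b)) for a, b in pairs]


if __name__ == "__main__":
    for a, b, equal, distance, score in compare_all():
        print(f"{a!r} vs {b!r}: equal={equal}, edits={distance}, similarity={score:.2f}")
```

The equality test reports all four pairs as simply “not equal”. The edit distance, by contrast, puts the first and last pairs only one or two edits apart while the other two pairs are much further apart, which is the distinction a human makes at a glance.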

This problem has been around since the invention of the computer – comparing data elements (bytes) is built into every modern CPU. That comparison, of course, only tells you whether two values are identical or not; it says nothing about how similar they are.

This works well in a world where data is always perfect. But that’s not the real world. In the real world, data is never perfect: there are always small differences, inconsistencies, and errors that humans can easily handle but computers can’t.

So, why does this inability to compare data matter?

Because it is the root cause of many problems – problems that are created when we rely on computers to help us. A few examples:

  • airlines want to be sure they don’t let passengers on the no-fly list board planes
  • banks want to be sure they don’t transact business with people and organizations they’re not allowed to
  • hospitals want to find the right patient record
  • companies want to know who their customers are
  • state agencies want to know when citizens are repeatedly exposed to health risks (e.g., pesticides, lead, etc.)
  • law enforcement agencies want to scan their data to find particular criminals
  • and the list goes on and on

You can build a database of patients or drivers or criminals or products, etc., but after the data is collected, how do you use it effectively? If you rely on exact matching you will run into problems. Unfortunately, exact matching is all that database systems offer.
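As a rough illustration of what that looks like in practice, the sketch below uses Python's standard `sqlite3` module and an in-memory table. The table name `contacts`, its rows, and the helper names are all made up for this example. The exact `WHERE name = ?` lookup returns nothing for a name that is off by one letter. The “fuzzy” fallback, which pulls every name out of the database and ranks them with `difflib.get_close_matches`, is only an assumption about one simple workaround; it is not how any particular product does matching, and scanning the whole table would not scale to real data volumes.

```python
import difflib
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a tiny, made-up contacts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE contacts (name TEXT)")
    conn.executemany(
        "INSERT INTO contacts (name) VALUES (?)",
        [("Damianakis",), ("Smith",), ("Griffin",)],
    )
    return conn


def exact_lookup(conn: sqlite3.Connection, name: str) -> list:
    """Exact matching: the stored name must equal the query string character for character."""
    rows = conn.execute("SELECT name FROM contacts WHERE name = ?", (name,)).fetchall()
    return [row[0] for row in rows]


def fuzzy_lookup(conn: sqlite3.Connection, name: str, cutoff: float = 0.8) -> list:
    """Naive workaround: fetch every name, then rank them by similarity outside the database."""
    all_names = [row[0] for row in conn.execute("SELECT name FROM contacts")]
    return difflib.get_close_matches(name, all_names, n=3, cutoff=cutoff)


if __name__ == "__main__":
    conn = build_demo_db()
    query = "Damanakis"  # one letter short of the stored "Damianakis"
    print("exact:", exact_lookup(conn, query))
    print("fuzzy:", fuzzy_lookup(conn, query))
```

The `cutoff` value of 0.8 is an arbitrary choice for the demo: set it too low and unrelated names start to match, set it too high and small typos are missed again.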

The bottom line is that this core limitation translates into a broad range of problems that affect all of us.

Where does “Rahiem Griffin” come from? A while back David Rubinstein covered a tragic story that might have been avoided if computers were better able to handle real-world (i.e., imperfect) data.

Our thoughts and prayers go out to Officer Kenneth Baribault and his family.

Posted in Business, Technology


About Netrics HD

Data matching is a fundamental operation in many applications, from improving data quality to implementing master data management. Stef Damianakis, CEO of Netrics, a world leader in matching technology, shares his thoughts on the state of the technology and business of data matching.
