Tag: Business

Data Quality Enlightenment

November 30th, 2009

After years of neglect, data quality is slowly moving to the forefront of business technology as both a discipline and a thriving industry.

However, given that data quality license revenues are estimated at a relatively minuscule $400 million for 2009 (compared to $17 billion for DBMS license revenues), data quality is not quite center stage yet.

Therefore, in this post I want to discuss the increase in awareness by organizations that is necessary to give data quality its due.  I describe it as the three levels of Data Quality Enlightenment (DQE).

DQE Level 1 – Unaware

Organizations at Level 1 are blissfully unaware that slight discrepancies in their data create the potential for their business processes to fail.

Sometimes, the resulting failure is immediately visible.  Other times, it eventually becomes visible in a downstream application or after some period of time has passed.  Either way, the organization feels the impact of the failure as increased costs or decreased revenue, or both.

Upon finally recognizing the root cause of the problem to be data quality, organizations typically progress to Level 2.

DQE Level 2 – Aware

Organizations at Level 2 have come to realize that they must implement data quality measures to avoid the costs of “bad data.”

The logic usually goes like this – if data is not perfect our business processes can fail, therefore we must make sure that our data is always perfect.  What wonderfully flawed logic!

No matter how hard an organization tries, their data can never be perfect.  Why?  Because by its nature, the data, and access to it, changes over time.

Existing records are updated.  New records are created.  Both of these actions can be performed by existing or new people and by existing or new systems.

With the additional reality that these people and systems can be both internal and external to the organization, the complexity grows exponentially.

Therefore, is it realistic to expect that all data throughout the enterprise will always be kept perfect and standardized the exact same way?

Will humans accessing the data know and use the standard methods?  Will humans always know the exact and correct data they want?  Will multiple applications (within and between organizations) that need to share data use the same standards for data perfection?

Of course not.  Simply put, perpetually perfect data is not possible.  Don’t believe anyone who tells you otherwise.

Yet despite these facts, the majority of the data quality industry is still focused on attempting to achieve data perfection.

The common belief is that the way to data Utopia is by writing rules to parse, standardize and match data.  Of course the different rules have fancy technical names like “deterministic” and “probabilistic” but they all boil down to manual, static rules that need to be created, maintained, and updated in perpetuity.

The rules an organization has in place today for “perfect data” will have to change (update old rules and add new rules) as the data changes.
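To make the maintenance burden concrete, here is a minimal sketch of a static, hand-written standardization rule set. The rule table, the function names, and the sample addresses are all hypothetical and chosen only for illustration; no particular product works exactly this way.

```python
# Hypothetical hand-written standardization rules (illustrative only).
STREET_RULES = {
    "st": "street",
    "st.": "street",
    "ave": "avenue",
    "ave.": "avenue",
}


def standardize(address, rules):
    """Lowercase the address and replace every token that has a rule."""
    tokens = address.lower().split()
    return " ".join(rules.get(token, token) for token in tokens)


def deterministic_match(a, b, rules):
    """Two values match only if they are identical after standardization."""
    return standardize(a, rules) == standardize(b, rules)


# A variation the rules anticipate is handled.
print(deterministic_match("12 Main St.", "12 main street", STREET_RULES))

# A variation nobody anticipated ("Str") is missed...
print(deterministic_match("12 Main Str", "12 Main Street", STREET_RULES))

# ...until someone notices the miss and writes yet another rule.
updated_rules = {**STREET_RULES, "str": "street"}
print(deterministic_match("12 Main Str", "12 Main Street", updated_rules))
```

Every new variation in the data means another entry in the table, which is exactly the "in perpetuity" part.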

Unlike Level 1, where organizations quickly realize they must change and progress to Level 2, most organizations at Level 2 get stuck here and never progress to Level 3.

DQE Level 3 – Enlightened

Organizations reach Level 3 via the “eureka moment” in which they realize that getting and keeping data perfect at all times and forever is, fundamentally, an insane idea.

These organizations then seek to find a better way.

That better way is to enable all enterprise applications to function correctly despite the fact that the underlying operational data they use is not perfect.  And to do it without constantly updating and creating rules to parse, standardize, and match data.

The enlightened phase has only just begun with a select few organizations reaching Level 3.

Enlightenment is Inevitable

As is often the case, enlightenment comes from a simple yet powerful idea that breaks away from the constraints of conventional thought.

It’s only a matter of time before every enterprise application will no longer assume and require “perfect” data in order to function correctly.

When this finally happens, and it will, everyone will benefit.

Posted in Business, Innovation, Technology | 1 Comment »

Data Sherpas Needed

October 12th, 2009

In the recent New York Times article Training to Climb an Everest of Digital Data, Ashlee Vance reported on the challenges associated with managing – and deriving value from – massive repositories of data.

“Researchers and workers in fields as diverse as bio-technology, astronomy and computer science,” reports Vance, “will soon find themselves overwhelmed with information.  The next generation of computer scientists has to think in terms of what could be described as Internet scale.  Facebook, for example, uses more than 1 petabyte of storage space to manage its users’ 40 billion photos.  (A petabyte is about 1,000 times as large as a terabyte, and could store about 500 billion pages of text).”

According to Gartner Research, the volume of enterprise data is doubling every 18 months.  This rapid data proliferation is causing day-to-day business challenges to evolve faster than the existing applications (or new applications under development) can react.

“Science these days has basically turned into a data-management problem,” said Jimmy Lin, an associate professor at the University of Maryland, at a recent technology conference.

From the beginning of civilization, mathematics (the language of science) has been central to our advancement.  But our relatively newfound ability to collect massive amounts of digital data has ushered in a new era for leveraging and benefiting from mathematics.

Advancements in machine learning technology using sophisticated mathematical algorithms are making it possible not only to process large volumes of data rapidly but, more importantly, to help enterprises make better data-driven business decisions.

According to Vance, companies large and small, as well as universities and government agencies, are “looking for big data experts” capable of scaling today’s digital data mountains.

Perhaps tomorrow we will even see a listing in the classifieds (or more likely in a Twitter status update) that simply reads:

Data Sherpas Needed

Related Posts

The Growing Importance of Mathematics

Adaptive Software

Drowning in Imperfect Data

A Sisyphean Task…

Posted in Business, Technology, Trends | No Comments »

The Challenges of Data Transparency

October 5th, 2009

In the recent Federal Computer Week article Stimulus spenders race to the finish line, Alice Lipowicz reported on the efforts of state and local recipients to prepare detailed reports on how they are spending federal stimulus money.

Among the common concerns cited in the article was dealing with the sheer volume of data, and more importantly, the quality of the data.

“Transparency advocates predict that data quality,” reports Lipowicz, “will be as much of a concern at Recovery.gov as it is at USAspending.gov, which Congress established in 2006 to provide visibility into federal spending.  That site has been plagued with problems such as errors, missing data and mislabeled data.”

Data transparency is definitely a laudable goal, and not just for the government.  Organizations in every industry and of every size need to do a better job of making the data that was used to drive critical business decisions, especially financial decisions, available for review.

Missing or incomplete data is a common problem, but transparency cannot simply mean a massive dump of all available data.

Completeness without any regard for accuracy could possibly do more harm than good.  Data frequently contains numerous variations caused by different conventions, lack of standards, omissions, and other inconsistencies.

An excellent question raised in the article was:

“Data quality has been a problem for years, so why do we keep getting [more data] instead of addressing these priorities?”

I think that this question represents one of the most significant challenges for data transparency.

Should data be concealed until it has been verified to be of sufficient quality?  Or should data be provided as soon as it becomes available without regard for quality?

Please share your thoughts.

Related Posts

Drowning in Imperfect Data

A Sisyphean Task…

Posted in Business, Economy | No Comments »

From IT to BT (Business Technology)

May 18th, 2009

A lot has been written (and rightfully so) about the need to better align and to foster collaboration between the Business and Information Technology (IT). Traditionally this has been necessary because of the typically disparate skills and backgrounds of the people working on the business and technical sides of the enterprise.

In the early days of information technology, when computers represented a brave new world, the people in it were typically electrical engineers and mathematicians who spoke a language that the business world didn’t understand. They were so consumed with the difficulties and intricacies of the emerging technology that they had no time to dedicate to day-to-day business matters.

Today, rapid advancements in technology are becoming so commonplace that the next generation of college graduates will never have lived in a world without the Internet, laptop computers and cell phones that have more processing power and functionality than even the best desktop computers had less than two decades ago.

Our brave new world is seeing a dramatic increase in the number of people with skills and knowledge in both business and technology, and I believe that in the near future Information Technology (IT) will transition simply to Business Technology (BT).

If you ask me, it’s long overdue.

Posted in Business, Technology, Trends | 1 Comment »

Innovation – Do More with Less

December 15th, 2008

For individuals, families, businesses and organizations the theme for 2009 will undoubtedly be “Do More with Less.” The Google count is a mere 794,000 right now – I’m curious to see where it ends up 12 months from now.

For businesses and organizations, doing more with less in these fraught times is imperative for survival. On the IT side, technology professionals at all levels must stay focused on delivering business value more efficiently and effectively than ever before.

One of the key enablers for doing more with less is innovation. Innovation delivers advantage – in the form of saving costs, saving effort and improving results.

As we are all asked to do more with less, let’s not forget that driving innovation is one way to make it happen.

Netrics is all about delivering innovative enterprise software! I’ll write about some specific examples in upcoming posts.

Posted in Business, Economy, Technology | No Comments »

The Cloud brings Commoditization

December 8th, 2008

“The Cloud” buzz continues to grow and will certainly be one of the most talked about topics in 2009. In fact, it’s probably already on the top 10 list for 2008 – a quick Google search shows nine million matches for “cloud computing.”

I think it’s interesting to look at the first wave of changes that The Cloud will bring. As more and more relatively low-value but important IT services are delivered by SaaS providers, internal IT resources will be better able to focus on higher value projects: dealing with business transformation, working on effectiveness and efficiency projects and, of course, trying once more to deal effectively with increasingly high mountains of imperfect data.

What are some examples that are already happening? CRM and corporate email immediately jump to mind. Both are critical to organizations but at the same time both are being commoditized by The Cloud.

Case in point: at Netrics, both our CRM (Salesforce.com) and our corporate email (Microsoft Exchange Server) are handled for us by outside vendors, at a fraction of what it would cost us to do it ourselves. It also allows us to better allocate our technical resources, keeping them focused on enhancing and improving our products!

Posted in Business, Technology | No Comments »

Apples and Oranges

December 1st, 2008

The average computer user is already well aware that computers are limited in how they “look” at data. Ask yourself how many times you looked up a contact in Outlook (or whatever app you use to store your contacts) only to get zero results back… why? Because what you typed was off by just a single letter.

It seems pretty silly to us humans but as far as computers are concerned the result of comparing

“Damianakis” and “Damanakis”

is exactly the same as the result of comparing

“Damianakis” and “Smith”

or

“apples” and “oranges”

or

“Rahiem A Griffin” and “Rahiem Griffin”

When computers do these comparisons, the result is always “zero” (not equal), which is clearly wrong for the first and last examples.

This problem has been around since the invention of the computer – comparing data elements (bytes) is built into every modern CPU. The comparison is, of course, limited to testing for equality or inequality.

This works well in a world where data is always perfect. But that’s not the real world – in the real world, data is never perfect – there are always small differences, inconsistencies, and errors that humans can easily handle but computers can’t.
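As a rough illustration of the difference, here is a minimal sketch that compares the four pairs above twice: once with plain equality, and once with a similarity score. The score comes from Python's standard `difflib.SequenceMatcher`, used here only as a convenient stand-in; it is not the mathematics Netrics uses, and the `similarity` helper name is my own.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a score from 0.0 (nothing in common) to 1.0 (identical)."""
    return SequenceMatcher(None, a, b).ratio()


pairs = [
    ("Damianakis", "Damanakis"),
    ("Damianakis", "Smith"),
    ("apples", "oranges"),
    ("Rahiem A Griffin", "Rahiem Griffin"),
]

for a, b in pairs:
    # "a == b" is the only question an exact comparison can answer,
    # and it gives the same answer (False) for all four pairs.
    print(f"{a!r} vs {b!r}: equal={a == b}, similarity={similarity(a, b):.2f}")
```

Equality treats all four pairs identically, while even this simple score separates the near-misses (the first and last pairs) from the pairs that really are different.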

So, why does this inability to compare data matter?

Because it is the root cause of many problems – problems that are created when we rely on computers to help us. A few examples:

  • airlines want to be sure they don’t let passengers on the no-fly list board planes
  • banks want to be sure they don’t transact business with people and organizations they’re not allowed to
  • hospitals want to find the right patient record
  • companies want to know who their customers are
  • state agencies want to know when citizens are repeatedly exposed to health risks (e.g., pesticides, lead, etc.)
  • law enforcement agencies want to scan their data to find particular criminals
  • and the list goes on and on

You can build a database of patients or drivers or criminals or products, etc… but after the data is collected, how do you use it effectively? If you rely on exact matching you will run into problems. Unfortunately, exact matching is all that database systems offer.
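Here is a small sketch of that lookup problem, using an in-memory SQLite table. The `patients` table, the sample names, the two helper functions, and the 0.8 cutoff are all assumptions made up for the example, and the tolerant lookup leans on the standard library's `difflib.get_close_matches` purely as a stand-in for a real matching engine.

```python
import sqlite3
from difflib import get_close_matches

# Hypothetical table and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (name TEXT)")
conn.executemany(
    "INSERT INTO patients (name) VALUES (?)",
    [("Damianakis",), ("Smith",), ("Griffin",)],
)


def exact_lookup(name):
    """The usual database query: the name must match exactly."""
    rows = conn.execute(
        "SELECT name FROM patients WHERE name = ?", (name,)
    ).fetchall()
    return [row[0] for row in rows]


def tolerant_lookup(name, cutoff=0.8):
    """Scan every stored name and keep those similar enough to the query."""
    all_names = [row[0] for row in conn.execute("SELECT name FROM patients")]
    return get_close_matches(name, all_names, n=3, cutoff=cutoff)


print(exact_lookup("Damanakis"))     # one letter off: nothing comes back
print(tolerant_lookup("Damanakis"))  # the intended record is found
```

Note that the tolerant version reads the whole table on every query, so it is only a way to see the idea on a handful of rows, not something that would hold up on a large database.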

The bottom line is that this core limitation translates into a broad range of problems that affect all of us.

Where does “Rahiem Griffin” come from? A while back David Rubinstein covered a tragic story that might have been avoided if computers were better able to handle real-world (i.e., imperfect) data.

Our thoughts and prayers go out to Officer Kenneth Baribault and his family.

Posted in Business, Technology | No Comments »

The Monk Factor

May 20th, 2008


I came across this recent article in BtoB Magazine:

BtoB Magazine: “Marketers: Clean customer data a priority in 2008”
By Carol Krol
March 17, 2008

It’s great to see that simply collecting customer data is no longer good enough – the data needs to be usable to benefit the business.

Refining customer data quality and access to customer data have emerged as two of the top marketing investment priorities of b-to-b CMOs this year. Half of b-to-b marketers plan to put more resources against creating marketing databases, cleaning up customer data, improving sales force automation and CRM integration, according to Forrester Research in its “B2B CMO Investment Priorities for 2008” report.

The article references a recent Forrester Research report, a survey released by Alterian, and quotes several marketing executives at large well-known enterprises – all agree that something must be done. Dealing with imperfect customer data is now a top priority for 2008. Finally!

Now that we all agree on the problem, the question is how should this problem be solved? Of course, it’s often a multifaceted, cross-organizational solution that will always involve Technology, Process, and People (TPP). But all three components of TPP depend on each other – and any one of the three can be the weak link in the chain.

But in most cases, the weakest link is Technology – which is why more People are thrown at a problem, which in turn requires more Process to keep everyone working together efficiently. Then you get the snowball effect and the solution grows out of control and takes on a life (and expense) of its own.

So, the argument is that if we can improve the core Technology, then we can reduce the People part and streamline the Process part as well. The TPP solution will stay nicely under control and perform as it’s expected to perform – while keeping costs in check.

Which leads to this wonderful quote from the article:

“Data always degrade, get dirty, become obsolete or old. You need a department of 20 people like Adrian Monk [from the TV show “Monk”] who live to keep things organized.”

Brett Butler
Director of Global Sales and Marketing Practices, Lexmark International

Yes, I agree whole-heartedly – the nature of customer data (in fact, almost all database data) is that it is (1) never perfect and (2) constantly changing in ways that change.

But needing a department of twenty Adrian Monks? This is the result of Technology failing to deliver and thus requiring the Monk department to compensate.

So I’m curious, what is your organization’s Monk Factor?

The point is that with Technology pulling its weight, the Monk Factor can be very low – but it’ll never be zero. Moreover, the People part shouldn’t require that everyone have those extra-special Monk skills. After all, Adrian Monk is a genius – you can’t build a successful solution to a business problem if you require twenty Monks here and twenty Monks there… the solution won’t scale and neither will your organization. And by the way, don’t expect your Adrian Monks to complete Sisyphean Tasks either!

Use your Monks strategically and leverage the heck out of them – that approach will bring the most benefit to your organization.

The right innovative Technology can help you do just that.

Posted in Business, Technology | 1 Comment »

A Sisyphean Task…

May 19th, 2008

It never ceases to amaze me just how much (successful) brainwashing is out there… a common misconception is that the way to deal with the pervasive nature of “imperfect data” is to somehow magically keep all of the data “perfect” all of the time.

Like Sisyphus, we’d end up never ever reaching our goal.

Instead, what we need to focus on is enabling software systems that can handle imperfect data – while still doing what we can to keep the data in reasonable shape. This is the solution to strive for – and it really is independent of the software that Netrics makes. Of course, I believe that our software delivers the best such solution but that’s beside the point.

Posted in Business, Technology | No Comments »

High Fidelity Data

May 16th, 2008

Some folks asked if we watch too much TV here at Netrics… why did we name our blog “Netrics HD”?  Are we launching a high-def channel? Alas, no, we aren’t.

We chose the name because we believe in viewing database data – regardless of the application – through a “High Definition” lens. It’s this High Definition lens that enables us to deliver High Fidelity data, despite the fact that the data itself is not perfect (moreover it can never be perfect).

This High Definition lens happens to be powerful mathematics that learns about data, as opposed to rules-based (probabilistic and deterministic) solutions that require a lot of guessing.

Posted in Business | No Comments »


About Netrics HD

Data matching is a fundamental operation in many applications, from improving data quality to implementing master data management. Stef Damianakis, CEO of Netrics, a world leader in matching technology, shares his thoughts on the state of the technology and business of data matching.
