Tackling Poor Data Quality in the Age of Big Data
Jinfo Blog
11th June 2014
By Andrew Lucas
Abstract
The opportunity for organisations to benefit from big data may be limited by poor data quality. The cost to organisations of poor data is substantial. For marketing data, the majority of databases surveyed were found to be lacking basic data. It is more cost-effective to prevent data problems than to fix them afterwards.
Item
The benefits of big data are expected to be huge - revenues of $24 billion by 2016 according to IDC, with early adopters gaining a competitive margin over rivals.
But this potential big data bonanza may not be shared by all; poor data quality may undermine the efforts of some.
Data Accuracy is Essential
According to the "Business Intelligence Maturity Audit" (biMA®) carried out by the IT and business services group Steria, "Data quality is the Achilles heel of BI (Business Intelligence) and continues to be neglected, despite being the foundation of all BI analysis."
These findings are evidence that the truism of "garbage in, garbage out" (GIGO) can actually have a real world impact.
Data accuracy can be vital in avoiding problems in Anti-Money Laundering (AML) and Know Your Customer (KYC) datasets. For example, some of the technical issues with people's names were examined by Victoria Meyer in her FreePint article Combat Transcription Errors with Linguistic Identity Matching.
Impact on The Bottom Line & Beyond
Errors in marketing data also have a negative impact on business. D&B's white paper, "Gaining the Data Edge", quotes the Gartner Group as saying that "poor data quality negatively impacts a company's bottom line by an average of $8.2 million annually in operational inefficiencies, lost sales and unrealised new opportunities".
As well as the negative financial impact, poor quality marketing data can reduce the effectiveness of campaigns, cause embarrassment to an organisation by sending inappropriate communications and reduce the efficiency of sales staff.
The Essence of High Quality Data
A recent survey, "The State of Marketing Data" (PDF), from NetProspex, a B2B marketing data services company, highlights some of the challenges faced by companies.
The survey analysed 61 million records and found that 84% of marketing databases are barely functional. Of the records analysed, 88% lacked basic company data such as industry, company revenue or number of employees, whilst 64% did not include a phone number.
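Figures of this kind can be reproduced for an in-house database with a simple completeness check. The sketch below is illustrative only: the field names and sample records are invented, and it assumes a record counts as "lacking" when at least one of the listed fields is blank, which may not match the survey's own definition.

```python
from typing import Iterable, Mapping, Sequence

# Hypothetical field names, chosen for illustration only.
FIRMOGRAPHIC_FIELDS = ("industry", "revenue", "employees")


def is_blank(value: object) -> bool:
    """Treat None and empty or whitespace-only strings as missing."""
    return value is None or (isinstance(value, str) and not value.strip())


def missing_rate(records: Iterable[Mapping], fields: Sequence[str]) -> float:
    """Share of records in which at least one of `fields` is missing."""
    records = list(records)
    if not records:
        return 0.0
    missing = sum(
        1 for record in records
        if any(is_blank(record.get(field)) for field in fields)
    )
    return missing / len(records)


# Invented sample data, not taken from the survey.
sample = [
    {"industry": "Retail", "revenue": 5_000_000, "employees": 40, "phone": "020 7946 0000"},
    {"industry": "", "revenue": None, "employees": 12, "phone": None},
    {"industry": "Banking", "revenue": 9_000_000, "employees": 300, "phone": " "},
    {"industry": "Media", "revenue": None, "employees": None, "phone": "020 7946 0001"},
]

print(f"Lacking company data: {missing_rate(sample, FIRMOGRAPHIC_FIELDS):.0%}")
print(f"Lacking phone number: {missing_rate(sample, ('phone',)):.0%}")
```

Running a check like this on a regular schedule gives a baseline against which any clean-up or prevention effort can be measured.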
So what does high quality data look like? The IBM "What is Data Quality" post describes the characteristics of high quality data as:
- Complete
- Accurate
- Available
- Timely.
A Two-Pronged Approach to Improving Data Quality
The challenges presented by poor data quality can be tackled from both the front end and the back end, depending on the type of data.
For marketing information, where the data structure is relatively straightforward, the problems often arise from the initial input, or failure to input, the data - the GIGO syndrome.
In this scenario, according to D&B, "Data quality is a business issue, not an IT issue". The D&B report "The Big Payback on Quality Data" claims that "it is far more cost-efficient to prevent data issues than to resolve them".
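One way to read "prevention" in practice is to reject or flag incomplete records at the point of entry rather than cleaning them later. The following is a minimal sketch under that assumption; the required fields and the deliberately loose phone pattern are hypothetical and are not taken from the D&B report.

```python
import re
from typing import List

# Hypothetical entry rules, for illustration only.
REQUIRED_FIELDS = ("company", "industry", "phone")
# Loose pattern: optional leading +, then 7 to 20 digits, spaces, brackets or hyphens.
PHONE_PATTERN = re.compile(r"^\+?[0-9 ()-]{7,20}$")


def validate_entry(entry: dict) -> List[str]:
    """Return a list of problems; an empty list means the entry is accepted."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = entry.get(field)
        if value is None or not str(value).strip():
            problems.append(f"{field} is required")
    phone = str(entry.get("phone") or "").strip()
    if phone and not PHONE_PATTERN.match(phone):
        problems.append("phone does not look like a phone number")
    return problems


print(validate_entry({"company": "Acme", "industry": "Retail", "phone": "020 7946 0000"}))
print(validate_entry({"company": "Acme", "industry": "", "phone": "n/a"}))
```

The rules themselves are a business decision, which is consistent with D&B's point that data quality is a business issue; the code only enforces whatever the business agrees.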
Information Professionals Play a Key Role
For other types of big data the answer is often more to do with the "data about the data" - the metadata. Minimum metadata requirements need to be established for big data quality and management. Taxonomies also need to be defined to enable organisation-wide use of data.
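As a rough illustration of what a minimum metadata requirement and a taxonomy check might look like, the sketch below tests a dataset's metadata against a required set of keys and a controlled vocabulary. The key names and taxonomy terms are assumptions made for the example; a real standard would be defined by the organisation.

```python
# Hypothetical minimum metadata keys and controlled vocabulary.
MINIMUM_METADATA = {"title", "owner", "source", "created", "subject"}
TAXONOMY = {"finance", "marketing", "compliance"}


def metadata_gaps(metadata: dict) -> dict:
    """Report required keys that are absent or empty, and subjects outside the taxonomy."""
    missing = sorted(key for key in MINIMUM_METADATA if not metadata.get(key))
    subject = metadata.get("subject")
    off_taxonomy = [subject] if subject and subject not in TAXONOMY else []
    return {"missing": missing, "off_taxonomy": off_taxonomy}


complete = {
    "title": "Q1 leads",
    "owner": "Sales Ops",
    "source": "CRM export",
    "created": "2014-06-11",
    "subject": "marketing",
}
print(metadata_gaps(complete))
print(metadata_gaps({"title": "Q1 leads", "subject": "sales"}))
```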
Information professionals, with their knowledge of data structures (metadata, taxonomies and indexing), can play a key part in establishing the data standards of an organisation.
They also have a role in ensuring that people within the organisation understand the importance of capturing and entering accurate data.
Editor's Note
FreePint Subscribers can log in to read and share more in Andrew Lucas' article, Big Data Bonanza - But Only for Those With High Quality Data.
The FreePint Topic Series: Big Data in Action ran from April to June 2013. Visit the Topic page to find out more and see the links to the published articles.