Sunday, November 17, 2013

Data Management and Quality: Why and How?

Most of us are very familiar with the poor data quality found in the computer systems of large organizations. How often do you find your name misspelled on bills and letters sent to you? How often do you receive a duplicate mailing because of a duplicate record in the mailing database? How often are you asked to provide your home or business address again, just to receive a replacement credit card when the old one expires?

Once data is entered or captured in a computer system, issues with it start to appear. It rapidly becomes obsolete and needs updating.

Data mastering is another big issue facing most organizations. You can easily find six different versions of customer record data, all of which are considered to be ‘master customer data’. In fact, you will find tens of sources for certain data sets, such as the list of products or the list of company offices. Applications such as ERP, CRM, production systems, supply chain and corporate intranets all hold hundreds of data sets that are vulnerable to quality issues: they can be duplicated, out of date or out of sync with similar sets elsewhere in the organization.
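
To make the duplicate-record problem concrete, here is a minimal Python sketch of how copies of the same customer held in different systems might be flagged. The record layout, the source names and the choice of name plus email as the matching key are assumptions made for illustration only; real matching rules are usually much richer.

```python
from collections import defaultdict


def normalize(value: str) -> str:
    """Lower-case and collapse whitespace so trivial variations compare equal."""
    return " ".join(value.lower().split())


def find_duplicates(records):
    """Group records that share the same normalized (name, email) key."""
    groups = defaultdict(list)
    for record in records:
        key = (normalize(record["name"]), normalize(record["email"]))
        groups[key].append(record)
    # Only keys seen more than once point at possible duplicates.
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical customer records exported from three different systems.
records = [
    {"source": "CRM", "name": "Jane  Doe", "email": "Jane.Doe@example.com"},
    {"source": "ERP", "name": "jane doe", "email": "jane.doe@example.com "},
    {"source": "Billing", "name": "John Smith", "email": "john.smith@example.com"},
]

duplicates = find_duplicates(records)
for group in duplicates:
    print([record["source"] for record in group])
```

Exact matching on a normalized key only catches the simplest cases. Misspelled names, like the ones mentioned at the start of this post, would need fuzzy matching on top of it.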

Data management is therefore a highly strategic topic for the CIO of any big organization. Data quality is just as strategic, because quality management of data has been shown to be the only way to master data. This is referred to as corporate data quality management (CDQM).

The systems, practices, architectures and tools used to manage data and ensure its quality can be categorized as follows:
  1. Master Data Management (MDM): Several frameworks and tools are available on the software market to support proper master data management and provide a "single version of the truth" for key data entities such as customer, employee, product, partner or supplier. These frameworks manage master data and typically offer various means of making it available to systems inside or outside the enterprise.
  2. ETL: Tools used to extract, transform and load data across various systems.
  3. Big Data: Tools that use big data technologies, such as Hadoop MapReduce, to analyze large and unstructured data sets and produce findings and reports.
  4. Migration Tools: Companies often change systems, upgrade from one version of a system to another or introduce new systems to manage the business. Effective data migration tools help organizations populate newly introduced systems or system revisions with correct and clean data.
  5. Synchronization Tools: Systems come from different vendors and sometimes run on different technology platforms. Even within the same system, similar data sets such as customer data may need to be present in different databases. Tools exist on the software market that keep data synchronized across systems while applying the transformations required to fit each schema.
  6. Data Governance: A combination of people, processes and technology that ensures the accuracy and value of the data entered into the organization's systems.
  7. Service Oriented Architecture (SOA): SOA ensures that systems are built in a loosely coupled manner: each business process is embedded in its own service logic, and service interfaces provide the channels through which business processes and modules interact to deliver end-to-end business value. Building systems this way delivers, by design, better data quality and less data duplication and redundancy.
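
As an illustration of the ETL and synchronization categories above, the following Python sketch extracts rows from a source export, transforms them to fit a target schema and loads them into a target database. The column names, the `customer` table and the sample data are all hypothetical, and an in-memory SQLite database stands in for the target system. It is a sketch under those assumptions, not a description of any particular product.

```python
import csv
import io
import sqlite3

# Hypothetical export from a source system, with inconsistent formatting.
SOURCE_CSV = """cust_id,full_name,city
1, jane doe ,london
2,JOHN SMITH,Paris
"""


def extract(text):
    """Extract: read the source export into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(row):
    """Transform: rename fields to the target schema and clean the values."""
    return {
        "id": int(row["cust_id"]),
        "name": row["full_name"].strip().title(),
        "city": row["city"].strip().title(),
    }


def load(conn, rows):
    """Load: upsert by primary key so repeated runs do not create duplicates."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer "
        "(id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO customer (id, name, city) "
        "VALUES (:id, :name, :city)",
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the target system
load(conn, [transform(row) for row in extract(SOURCE_CSV)])
load(conn, [transform(row) for row in extract(SOURCE_CSV)])  # second sync run

result = conn.execute("SELECT id, name, city FROM customer ORDER BY id").fetchall()
print(result)
```

Keying the load on the record identifier is what lets the same job run again as a simple synchronization step: the second run overwrites the existing rows instead of adding copies.
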

"In 2011 alone, 1.8 zettabytes (or 1.8 trillion gigabytes) of data will be created, the equivalent to every U.S. citizen writing 3 tweets per minute for 26,976 years. And over the next decade, the number of servers managing the world's data stores will grow by ten times." (IDC study, as referenced by Computerworld magazine.) "The IDC study predicts that overall data will grow by 50 times by 2020, driven in large part by more embedded systems such as sensors in clothing, medical devices and structures like buildings and bridges."


4 comments:

Linda Boudreau said...

Insightful post, thanks for sharing. It's definitely a different time for CIOs with big data and data management. In our business, it's important to link together all areas of business from different data sources & divisions. One does need good data for good reporting and business insight, especially when dealing with data from multiple sources.

Simon Emmitt
Data Ladder

Prologic Corporation said...

This is a good article and a good site. Thank you for sharing this article. It helps us in the following categories:
healthcare, e-commerce, programming, multi-platform, inventory management, cloud-based solutions, IT consulting, retail, manufacturing, CRM, technology means, digital supply chain management, delivering high-quality service for your business applications, solutions for all industries, packaged applications, business applications, web services, data migration, business intelligence, business development, software development, etc.


Our address:
2002 Timberloch Place, Suite 200
The Woodlands, TX 77380
281-364-1799

prologic-corp

Mathew Stephen said...

There is a lot of information around the web about the latest technologies and how to get trained in them, like Big Data Course in Chennai, but this post is a unique one in my opinion. The strategy you have described here will help me get trained in future technologies. By the way, you are running a great blog. Thanks for sharing this.