
As we all begin to make our final plans to attend Data Leadership 2015 late in November (http://bit.ly/1YOKrJV), it struck me after reviewing the agenda once again that we have reached a point where many discrete & different forms of data are being used across most enterprises (Public, Private & NFP) on a regular basis. Much of this data now comes from outside the Organization in the form of Open Data, Reference Data, Social Media Data, etc. All of these data sources are managed to varying SLAs and Best Practices in respect to quality, veracity, latency, etc., making them extremely suspect at times in my opinion. However, most Enterprises do not question their sources of this external data and simply embrace it for the “Richness” that it provides, without considering the care & feeding that it has undergone over its lifetime. Why is there such implicit trust here, one might ask, especially in light of most Organizations’ challenges with the quality of their own data?

The notion of Data Leadership is one where Data, Information & Analytics are treated as core competencies by every organization. As such, they are strategic in nature and are major leverage points for the Organization to use in creating Competitive Advantage. These core competencies rely on the data that underpins them being of the highest quality, regardless of the metric used to evaluate it. This requirement transcends all industry segments and applies to Government and NGOs alike. Inaccurate or misleading data impacts everyone in a debilitating way. Given this, every Senior Executive has a Data Leadership accountability to make sure that the highest quality standards are maintained, even if the data is sourced from a 3rd party or from the Open Data Community. Herein lies the rub. How do you manage what you don’t control?

As data is monetized and sold by the pound by Reference Data providers, or freed up by the Open Data Community from the government silos where it has been hoarded for decades, it must be made “fit for purpose” and undergo rigorous conditioning to ensure that it is “in shape” for consumption regardless of the use case. This is not the case today with the vast majority of what I call 3rd Party Data, most specifically what is sourced from the Open Data portals that now proliferate across the landscape. Reference Data & Social Media Data are better managed over their lifecycles because there was always a profit motive behind their creation, but they still have their challenges. I will leave that discussion to a future article. For now, let’s focus on the Open Data world.

Open Data now comes from both Government entities (and NGOs) as well as Commercial interests. Both use these data sets internally to run their Organizations and then “hive off” some (or all) of the data for sharing with the Open Data Community. In most if not all cases, this is done as a side activity (begrudgingly) by IT Staff who are always hard pressed to find enough people, time & other resources to do their “day jobs”. This creates a dynamic that does not foster high quality data in any regard. To overcome this, we must have Data Leadership from those Executives who are accountable for delivering data products to the Open Data Community. They must ensure that all data under their watch meets the standard that would be acceptable internally by the Org, if not a higher one.

We still live in a “Garbage In, Garbage Out” world. You cannot have successful (or believable) Analytics Outcomes without good data as a foundation. Forget about creating Competitive Advantage if everyone continues to waste all their cycles on fixing bad data or questioning the source of their truths.

As there will be representatives from both the providers and the users of these 3rd Party data sources at DL 2015, I wanted to impart one basic message to all who are planning on attending: “Every type of data needs Data Leadership”.

As a community of data & analytics professionals, we must insist that all data be guided by some basic Governance principles that cover the useful lifecycle of the data assets being created and consumed. I look forward to discussing all of this further with everyone at Data Leadership 2015.

*This article appears in an edited form in the October 2015 issue of Information Age (http://bit.ly/1RCgB6p).