As the world becomes more connected, more data processing is being pushed to the edge. According to Gartner, by 2022, more than half of enterprise-generated data will be created and processed outside of data centers and the cloud. And that’s a prediction that was made before COVID-19, which is accelerating remote working and monitoring outside the confines of an office.
With data being generated in more locations and privacy regulations becoming more stringent, balancing data protection with broad access is becoming more complicated. But innovation can only move forward if we ensure the integrity and security of the data feeding the technologies we use in such diverse and distributed ways. For example, some have attributed recent declines in artificial intelligence (AI) investment to a lack of trust in the data feeding these algorithms. Good governance provides the visibility and quality that boost confidence in AI and other advanced innovations.
If you’re creating a data governance framework to support an agile infrastructure, you need to ask the right questions to ensure that you’re not just safeguarding data but also setting your culture up for agile innovation and collaboration. Below are some of the most crucial questions to ask, developed with help from senior account executive Mike Dampier:
1. Who deploys data?
Historically, data protection and governance have been challenging topics of conversation for decision makers as technologists urge business users to take ownership of “their” data. Today those discussions are even more complex. As business teams are called upon to innovate with data across multiple platforms, IT and business now share data deployment and management responsibilities.
But data deployed by IT and business units likely has very different lineage, provenance, and lifecycles. We recommend that IT still owns and deploys enterprise data, which is application-sourced, trusted, transactional, and master data. It’s the business team’s job to integrate this data with other sources, like external public domain data (weather) or purchased data (demographics/psychographics) in their sandboxes, lakes, and other environments.
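To make the division of labor concrete, here is a minimal, purely illustrative sketch (all dataset names, fields, and sources are hypothetical, not from any Teradata product): a business team joins IT-owned enterprise records with external public-domain weather data in a sandbox, tagging each field with its provenance so lineage survives the integration.

```python
# Hypothetical sketch: enriching enterprise sales records with external
# weather data in a sandbox, recording where each field came from.

ENTERPRISE_SALES = [  # IT-owned, application-sourced, trusted data
    {"store_id": 1, "date": "2020-06-01", "revenue": 1200.0},
    {"store_id": 2, "date": "2020-06-01", "revenue": 950.0},
]

PUBLIC_WEATHER = [  # external public-domain data pulled by the business team
    {"store_id": 1, "date": "2020-06-01", "temp_f": 78},
    {"store_id": 2, "date": "2020-06-01", "temp_f": 64},
]

def enrich_with_provenance(sales, weather):
    """Join the two sources and tag each field with its origin."""
    weather_by_key = {(w["store_id"], w["date"]): w for w in weather}
    enriched = []
    for row in sales:
        w = weather_by_key.get((row["store_id"], row["date"]), {})
        enriched.append({
            **row,
            "temp_f": w.get("temp_f"),
            "_provenance": {
                "revenue": "enterprise:sales_app",
                "temp_f": "external:public_weather",
            },
        })
    return enriched
```

The point of the `_provenance` tags is that when this enriched data set later meets enterprise data again, everyone can see which fields carry enterprise-grade trust and which came from outside sources.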
It’s crucial to know who is using these data types and how frequently they access them. We had this need in mind when we built Teradata Vantage, making it easier to monitor user login sessions and queries. Other platforms may require you to develop and deploy various monitoring methods — some fully automated, others mostly manual — to gain this visibility.
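On platforms without built-in monitoring, a minimal home-grown approach might aggregate raw access events into a per-user, per-dataset usage summary. The sketch below is a generic illustration with hypothetical log records, not any platform's actual audit format:

```python
from collections import Counter

# Hypothetical access events, e.g. parsed from platform audit logs.
ACCESS_LOG = [
    {"user": "analyst_a", "dataset": "sales_mart", "action": "query"},
    {"user": "analyst_a", "dataset": "sales_mart", "action": "query"},
    {"user": "etl_svc", "dataset": "enterprise_dw", "action": "login"},
]

def usage_summary(log):
    """Count how often each user touches each dataset."""
    return dict(Counter((rec["user"], rec["dataset"]) for rec in log))
```

Even a simple tally like this answers the governance question at hand: which teams are actually using which data, and how often.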
2. Who governs what?
This question can be tricky. If you guessed that the group that ingested the data should govern it, you would be partially correct. The business still owns data governance from a quality perspective, along with the business-related metadata associated with enterprise data. But the business now also governs the business context (business and technical metadata) of the data it ingests, ensuring that data can be integrated, if needed, with enterprise data.
3. Where is data deployed?
Ask this question and you’ll likely generate some vigorous discussion. If you’ve been shopping around for a cloud vendor, you’ve likely heard this line over and over again: “All your data should be with us.” We recommend deploying some of your data in the data lake, whether that’s on-premises or in the cloud. But some of your data should also reside in sandboxes, labs, and data marts. Additionally, your clean, curated, and trusted data should be in your data warehouse.
Most importantly, the platforms should be interconnected via a high-speed fabric that enables physical and virtual projections. IT is responsible for setting up the right data governance framework to support this infrastructure. Deploying centrally managed, scalable data virtualization technologies facilitates data sharing at runtime without having to build complicated data synchronization processes. This has real business value for certain analytical use cases and should be deployed as a standard capability within your analytical ecosystem.
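As a toy illustration of the data virtualization idea (a conceptual sketch, not Teradata's implementation): a virtual catalog resolves a logical dataset name to whichever backing source holds the data and fetches it at query time, so consumers see one interface while the data stays put — no synchronization jobs required.

```python
# Toy data-virtualization layer: logical names map to fetch functions,
# so consumers query one interface while data remains in its source system.

class VirtualCatalog:
    def __init__(self):
        self._sources = {}

    def register(self, logical_name, fetch_fn):
        """Bind a logical dataset name to a callable that pulls live data."""
        self._sources[logical_name] = fetch_fn

    def query(self, logical_name):
        """Resolve and fetch at runtime instead of copying data around."""
        return self._sources[logical_name]()

# Hypothetical sources: a warehouse table and a data lake feed.
catalog = VirtualCatalog()
catalog.register("warehouse.sales", lambda: [{"sku": "A1", "qty": 3}])
catalog.register("lake.clickstream", lambda: [{"page": "/home", "hits": 42}])
```

In a real deployment the fetch functions would be remote queries over that high-speed fabric, but the governance benefit is the same: one managed access point instead of many ad hoc copies.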
4. Does everyone agree on data definitions?
Business users need to know where to find the data they need and what that data means. As Jeff Burk at Datanami writes, “The amount, variety, and scope of business data available is growing exponentially, making it increasingly difficult to find, understand, and trust.” Burk recommends creating systems that help IT and business speak the same language. For example, a governed data catalog could translate data into business terms and connect and relate various data sets.
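A governed catalog entry can be as simple as a mapping from an agreed business term to its definition, its backing technical asset, and its owner. The sketch below uses entirely hypothetical terms and dataset names to show the shape of such a lookup:

```python
# Hypothetical governed data catalog: business terms map to an agreed
# definition plus the technical assets behind them, so IT and business
# speak the same language.

CATALOG = {
    "customer churn rate": {
        "definition": "Share of customers lost during a period.",
        "dataset": "analytics.churn_monthly",
        "owner": "marketing",
        "related_terms": ["customer lifetime value"],
    },
}

def lookup(term):
    """Return the governed entry for a business term, if one exists."""
    return CATALOG.get(term.strip().lower())
```

The lookup normalizes the term so business users can find the entry however they type it; production catalogs add search, lineage, and relationship graphs on top of this basic idea.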
5. Are your people dedicated to data governance?
As with any government, having the right technologies and processes in place does not guarantee good leadership. It’s the people who determine whether governance actually works. The tools and processes set up as part of a data governance framework will only work if people keep them up to date.
Communication between IT and the business is critical at every step. As IT sets up secure access to enterprise data and self-service analytics, it is the business’s job to be open and candid about what’s working and what’s not. The business can lean on IT to manage processes once they are created. IT, in turn, should expect to take on new governance roles in order to create a thriving, self-provisioning, and cost-effective analytical ecosystem.
It’s worth taking a moment to ask yourself these questions as you strive to design best-in-class data governance. The payoff of good governance goes beyond data protection and integrity, driving business value, more informed strategic decisions, and a more agile, collaborative culture.
Curious about how Teradata Vantage can help you with data governance?