February 28, 2020 Feature

Ingredients of a Data Strategy

By Mike Fleckenstein

As our knowledge-driven economy evolves to recognize the focal role that data plays in every human endeavor (learning, governance, commerce, health, social life and entertainment, spirituality, and even conflict), the legal community must develop a much more detailed understanding of, and nimble familiarity with, the nature of that data, its value to the entity, and the many aspects of its existence in use across the array of human transactions.

Success today and in the future depends on this appreciation for the role of data assets in business and government institutions; the fundamentals of that understanding and familiarity with data’s use form the core of a data-aware business strategy. This article lays out essential data strategy elements, misconceptions, and considerations, all of which must be understood in order to be successful in developing such a strategy, whether in a business entity, government agency, an individual legal practice or other capacity advising a data custodian.

Business Strategy as a Driver for Managing Data as an Organizational Asset

Simply put, a business strategy exists to maximize profit, minimize cost, and manage risk. In the public sector and in regulated industries like banking and health care, compliance is also a major driver. This does not necessarily translate into spending the least amount of money possible or making the most money without contextual awareness. If an organization wants to provide superior customer service, there is an associated up-front cost that the company might aim to recoup through things like loyalty and higher-end products and services.

As increasing numbers of organizations, both in the private and public sectors, become data-oriented, business models that show how data is driving value are developing. While these models are still evolving, some have been documented. For example, a Deloitte study1 identifies the following six types of data-oriented business models:

  • Product innovators enhance their products and services with data. This common model is often the basis for data monetization. In other words, organizations look for new patterns in their data, using analytics, to wring more money out of their existing or related products or services.
  • Systems innovators use data to integrate multiple product types. Here, an organization like a sports apparel company might expand into a related business, like wearable fitness trackers, and integrate its offerings.
  • Data providers gather and sell raw data without adding too much value to it. One example might be a cellular phone provider that sells anonymized network data to another company that performs crowdsourcing analytics.
  • Data brokers gather and combine data from multiple sources, create additional value with analytics and sell insights. This common model is reflected, for example, in credit rating agencies. These agencies obtain their data from numerous sources, cleanse it, integrate it, and sell subscription services.
  • Value chain integrators share data with system-integrator partners to extend product offerings or reduce costs. A good example of this model is a government agency. For example, the IRS shares its tax information with partner agencies for purposes of law enforcement and health care. The IRS, in turn, also leverages data from these and other agencies to reduce tax fraud.
  • Delivery network collaborators share data to drive deal making, foster marketplaces, and enable advertising. Similar to value chain, here participants share data to spur markets. One example is the sharing of data between airlines, booking agencies, car rental agencies, hotels, and advertisers.

It’s All About Analytics, Right?

A data strategy directly supports some aspect of the business strategy. This is more easily done in some models than in others. For example, enhancing a product or service by the application of data analytics is conceptually relatively straightforward. That is why the product innovator model is so popular. Numerous books have been written on data monetization around this model.2 A belief that organizations can improve their product or service is also a primary reason why organizations long to implement analytics, as data-analytics is the engine behind data monetization.

A similar observation can be made for cybersecurity (i.e., data security). To foster the hunger for analytics and cybersecurity, data science and cybersecurity degrees are significantly on the rise and companies are willing to pay handsome starting salaries to college graduates in these disciplines.3 These are the two most high-profile data management disciplines today—one focused on maximizing profit, the other on minimizing cost (or risk).

However, organizations regularly overlook other, equally important data management domains, such as data quality, data governance, data architecture, or metadata management, when formulating their data strategy.4 That is because it is exceedingly difficult to relate improvement in these disciplines to direct business outcomes.

Data quality is a good example. It may feel somewhat intuitive that high data quality translates into better business insight, but how do you tie improved business performance to an improved data quality metric? I’ve asked this question of multiple chief data officers, and the best answer they gave is that by improving the quality of targeted data for a peer group, they prompt their peers to publicly proclaim a specific cost savings or revenue gain. While such an answer may help fund further data quality initiatives, it is a far cry from a solid metric.

In addition, data quality is multifaceted, with the degree of quality specific to business need. Ralph Kimball demonstrates this in his “data highway,” which depicts multiple caches of increasing quality along with increasing latency.5 Here, loosely coupled data stores each accommodate business needs in an appropriately timely fashion. Represented on the left side of Figure 1, certain situations that are extremely time-sensitive, such as fraud detection or cyber-attack detection, demand immediate access to raw data but sacrifice data organization and quality. At the other end of the spectrum is the enterprise data warehouse with highly structured snapshots of data, collected over time, and the ability to produce repeatable, consistent reports, with significantly higher data quality, albeit at significantly higher latency. Thus, the better question to ask is: What’s the right data quality for my intended use?
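One way to make “the right data quality for my intended use” concrete is to record, for each use case, the minimum quality and the maximum latency it can tolerate, and then check each data store against those needs. The following Python sketch is illustrative only: the store names, the 0-to-1 quality scale, and every threshold are hypothetical assumptions and are not taken from Kimball’s paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStore:
    name: str
    quality_score: float    # 0.0 (raw) to 1.0 (fully cleansed); hypothetical scale
    latency_seconds: float  # delay before data becomes available in this store


@dataclass(frozen=True)
class UseCase:
    name: str
    min_quality: float
    max_latency_seconds: float


def suitable_stores(use_case, stores):
    """Return the names of stores that meet both the quality and latency needs."""
    return [
        s.name
        for s in stores
        if s.quality_score >= use_case.min_quality
        and s.latency_seconds <= use_case.max_latency_seconds
    ]


# Hypothetical stores along a raw-to-warehouse spectrum.
STORES = [
    DataStore("raw_stream", quality_score=0.3, latency_seconds=1),
    DataStore("operational_store", quality_score=0.6, latency_seconds=3600),
    DataStore("enterprise_warehouse", quality_score=0.95, latency_seconds=86400),
]

# Hypothetical requirements: fraud detection needs speed, reporting needs quality.
fraud_detection = UseCase("fraud detection", min_quality=0.2, max_latency_seconds=5)
executive_reporting = UseCase(
    "executive reporting", min_quality=0.9, max_latency_seconds=7 * 86400
)

print(suitable_stores(fraud_detection, STORES))
print(suitable_stores(executive_reporting, STORES))
```

Under these assumed numbers, fraud detection is served only by the raw stream and executive reporting only by the warehouse, which mirrors the trade-off between latency and quality described above.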

Why, you might ask, might data quality be as important as analytics when attempting to improve a product or service? What is hidden behind analytics is both the preparation it takes to extract meaningful insight from data and the stewardship required to manage the data that led to the insight in the first place. Wrangling data so that it can be effectively analyzed regularly consumes 80% of an analyst’s time.6 Do we really want to pay those handsome salaries to data scientists to spend the majority of their time wrangling data?

To further underline the importance of data quality, data preparation tools, like Tableau, that allow individuals with limited technical insight to manipulate, integrate, and visualize data more directly are pushing data wrangling into the hands of business users. This is helping to pique businesses’ interest, a good thing. However, business users can fall prey to inherent limitations: They may only be aware of data from their business unit; they often have limited insight into the underlying data model; and they may not realize other ways in which the same information is already being compiled and reported. This risks significant duplication of effort across the enterprise and a high level of data inconsistency if the data is not adequately governed.7 These are all examples of why a data strategy is needed.

Crafting the Data Strategy

Frequently, data strategies stress key visionary aspects like “data should be secure,” “data must be transparent,” or “data must be accessible.” It is true that these are important underlying principles for a data strategy, and organizations should give some thought to how these visionary principles fit into their data strategy. However, to be effective, a data strategy must outline some path towards realization.

At a minimum, the agreed-upon visionary principles should be underscored with achievable metrics. So, if accessibility is a visionary principle, you might require outward-facing applications to work on mobile platforms. Similarly, if security is a key visionary principle, you might require your mobile applications to have multifactor authentication. Beyond such metrics, it is now up to teams of business and IT professionals to craft the data strategy in more detail.

Assume a business strategy that is responsive to a business goal or problem. Analysis has broken down the goal or problem and identified major pain points and their priority to the business, and we now have some appreciation that data is key to addressing them.

To focus on the best data management technique(s) in addressing one or another pain point, we can begin by breaking the goal or problem into more succinct data management domain-specific questions. This will allow us to narrow the focus and employ—or at least begin with—the right data management domain to address it. The following list of sample business pain points and data management questions offers some insight. Though by no means exhaustive, the list provides examples of key data management questions that, when addressed, are likely to help address stated business problems.

Business Goal/Problem: inability to trace product supply-chain

Data Architecture: What data exists within my organization, where is it located, and how does it flow through the enterprise?

Business Goal/Problem: reduce inconsistent executive reporting

Data Governance: How can my organization reduce the risk of making business decisions based on poor or incorrect data?

Business Goal/Problem: unable to adequately address fraud using the enterprise data warehouse

Data Quality: What is the required degree of data quality for different data in my organization based on how we use that data?

Business Goal/Problem: legal or regulatory need to surface the right documents for eDiscovery or compliance

Metadata Management: What are the demands on my organization to surface historical structured and unstructured data?

Business Goal/Problem: customer care wants to definitively determine what is causing rashes when customers apply their lotion

Master Data Management: How does my organization accommodate a holistic view of product information?

Let’s look at the last example more closely. This is a real-world example from a large consumer-care company. A given beauty product from this company can have different ingredients by country, based on regulatory demands. Additionally, inactive ingredients can vary over time, based on market prices. Given such variability, the company needs the right master product definition in order to address customer concerns quickly.

In this case, the company set out to define a master data profile for product. Internal departments worked together to jointly determine the most important attributes of “product” to effectively meet each department’s business needs. This resulted in a robust product master of about 200 attributes. However, in building the product master, the team also had to take into account several other data management domains. For example, data quality was required to ensure the “right” quality product master; data architecture became important to ensure that attributes in the product master originated from the right authoritative source; and data governance needed to be addressed to determine how the product master can change over time.
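To illustrate why country and time both matter in such a product master, the following Python sketch models a formulation record and looks up which ingredients were in effect for a given country and date. It is a minimal sketch, not the company’s actual design: the field names, source-system names, ingredients, and dates are all hypothetical, and a real product master of about 200 attributes would be far richer.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Formulation:
    country: str
    valid_from: date
    valid_to: Optional[date]  # None means the formulation is still in effect
    ingredients: tuple
    source_system: str        # authoritative source for this record (hypothetical)


def formulation_for(formulations, country, on_date):
    """Return the formulation sold in `country` on `on_date`, or None if unknown."""
    for f in formulations:
        if f.country != country:
            continue
        if f.valid_from <= on_date and (f.valid_to is None or on_date <= f.valid_to):
            return f
    return None


# Hypothetical history for one lotion: ingredients differ by country and over time.
LOTION = [
    Formulation("US", date(2018, 1, 1), date(2018, 12, 31),
                ("water", "glycerin", "fragrance A"), "plm_us"),
    Formulation("US", date(2019, 1, 1), None,
                ("water", "glycerin", "fragrance B"), "plm_us"),
    Formulation("DE", date(2018, 1, 1), None,
                ("water", "glycerin"), "plm_eu"),
]

match = formulation_for(LOTION, "US", date(2019, 6, 15))
print(match.ingredients if match else "no formulation on record")
```

With a record like this, customer care can start from the country and purchase date in a complaint and narrow down which ingredients the customer was actually exposed to.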

Experience shows it is prudent to begin addressing a business goal or problem with a narrow angle when it comes to data strategy. In other words, it is useful to start with a specific data management domain to address a business issue. As the example above illustrates, the total solution typically encompasses multiple data management domains. However, starting with all required data management domains can feel overwhelming. The data domain entry point usually depends on multiple considerations. For example:8

  • What’s the business problem/goal? Do we want to tackle ineffective eDiscovery or a new product offering? The former typically starts by focusing on records management, while the latter might start with analytics. In each case, data quality will surface to play a key role—metadata for eDiscovery and data accuracy and consistency for analytics. Other domains that are likely factors include data architecture (e.g., the location of authoritative sources) and data governance (e.g., how will new sources be integrated?).
  • What are our existing skills, practices, or systems? Organizations often employ informal data stewards, long-time users who take it upon themselves to manage data for their part of the business. In other cases, organizations may already be full of business analysts employing self-service reporting and analytics tools, signaling that data architecture may be a good starting point.
  • What are the known applicable legal constraints on the use of data? If it is health care information in the United States, HIPAA likely applies. If it is personally identifiable information (PII), the Privacy Act applies to any federal agency collection or custody, and many state interactions. The team might address privacy constraints by examining architecture/interfaces, or data governance.

Executing the Data Strategy

So far, we’ve looked at how we might employ the right data management domain(s) to address a given business concern. But to effectively execute a data strategy, there’s more at play. A growing focus on topics like privacy and security is forcing both public- and private-sector organizations to address data management at the corporate level. Both the public and private sectors are trying to determine the right formula. This is evidenced by examples of the increasing variety of data-related leaders in different organizations, as shown in Figure 2. In addition, increasing complexity in the data management landscape raises the prospect that each entity in the data life cycle may be supported by its own legal advisor, each with a potentially unique priority (privacy, security, revenue maximization, consumer protection, or other data-driven concerns unique to the transactional domain). Arguably, the chief information officer position was created to help manage the rising use of “information technology.” However, over time the most pressing responsibilities of the CIO have evolved to become systems and networks. The growing focus on data has led to a tide of related information-management-oriented positions.

Of these positions, the chief data officer is getting particular attention. In the U.S. federal sector, the recently enacted Foundations for Evidence-Based Policymaking Act of 2018 stipulates a CDO for each federal agency.9 The private sector also is assertively hiring for this position.10 The recent “fourth annual” Gartner survey of actual, in-place CDOs reflects the importance of the CDO. For example, for 2018, 49% of CDOs reported to a top business executive and 53% of organizations surveyed increased their data and analytics budget from the past year, both trends that have been on the rise for the past few years.11

The survey also reflects a shift, albeit small, towards a more strategic versus a tactical focus. This is primarily highlighted in organizations’ tying the CDO’s efforts to business value. Much of the CDO’s focus, though, remains in “defensive” versus “offensive” data management. This interesting observation has its basis in a 2017 Harvard Business Review article, “What’s Your Data Strategy?”12 The authors argue that a data strategy can be separated into “offensive” and “defensive.” Data defense, the authors argue, is about minimizing risk and is focused on data management areas such as data privacy, data security, data quality, and data governance. Alternatively, data offense focuses on increasing the organization’s competitive position, resulting in higher revenue, profit, and customer satisfaction.

Hence, an organization that has a business strategy more focused on revenue and profit might develop a data strategy that is more offensive, whereas one with a business strategy more geared toward compliance might develop a data strategy that is more defensive. The authors clearly state that all organizations need both defensive and offensive data strategies but that the balance of where emphasis is placed depends on the business model. One is left wondering, given the Gartner survey responses, whether the across-the-board focus on “defensive” data management is in part based on the need to focus on foundational data management domains, like data quality, in order to effectively execute on more “offensive” data management, like analytics.

No doubt the scurry to hire CDOs is in part driven by organizations’ desire to leverage analytics. However, the Gartner survey of actual, in-place CDOs reflects many other concerns, including the need for a culture willing to adopt change, lack of resources and funding, poor data literacy, and lack of relevant skills, to name the top four. Such responses and the proliferation of related positions indicate that, while organizations realize the need for senior-level data leaders—at least in name—specific roles and responsibilities continue to evolve.

Multiple engagements with active CDOs across many lines of business confirm that all of the above are true. CDOs consistently state the need to actively engage their peers, like the CIO, CFO, COO, and others, to create awareness—a key ingredient in changing culture. CDOs in the public sector even stress the importance of working with lawmakers to ensure upcoming legislation gets it right on data management. Much of this effort to educate others is driven by the difficulty of directly tying data value to business outcomes.

Valuing Data to Underscore Data Strategy

Today, it is common to hear some variation of the phrase “treating data as an asset.” This is an interesting phrase that conveys people’s and businesses’ desire to value data like physical assets. This is, of course, because physical assets can be valued at every point in the supply chain and an improvement or fault at any point is readily reflected in terms of revenue and cost. This is harder to do for intangible assets like patents, copyrights, trademarks, brands, and data.

Over the last 20 years, there have been repeated attempts to assign value to intangible assets. One study examined the degree to which value has been tied to organizational intellectual property. It showed that over time, intangible assets increasingly determined corporate value. In studying all nonfinancial, publicly traded firms in the Compustat database, the report showed that in 1978, 80% of a firm’s value was associated with its tangible assets and 20% with its intangible assets. By 1988, the makeup had shifted to 45% tangible assets and 55% intangible assets. By 1998, only 30% of the value of firms studied was attributable to their tangible assets, whereas 70% was associated with the value of their intangibles.13
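The arithmetic behind such percentages can be sketched in a few lines of Python. The sketch assumes the definition given in endnote 13, under which intangible value is the difference between a firm’s market value and the value of its tangible assets; as that note observes, not everyone agrees on this formula. The function name and the sample figures are hypothetical.

```python
def intangible_share(market_value, tangible_assets):
    """Fraction of market value not explained by tangible assets.

    Assumes intangible value = market value - tangible assets (see endnote 13).
    """
    if market_value <= 0:
        raise ValueError("market_value must be positive")
    return (market_value - tangible_assets) / market_value


# Hypothetical firm: market value of 100 with tangible assets valued at 30.
share = intangible_share(market_value=100.0, tangible_assets=30.0)
print(f"intangible share of value: {share:.0%}")
```

A firm with these hypothetical figures would show the same 30/70 split between tangible and intangible value that the study reported for 1998.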

Another study looked at different approaches previously researched to value information, including the different accounting models based on cost, market value, and revenue potential.14 The report concluded that the best cost approximation of data is based on future cash flow, something not always easy or even possible to accomplish. The report had another interesting conclusion on why data may not be formally treated as an asset on corporate books: It is financially advantageous for companies to treat the cost of information as an expense rather than an asset because companies can avoid the tax implications of showing data on their balance sheet. It is interesting to note that, in the above CDO Gartner survey, 8% of organizations actually kept an internal ledger of key data assets.

The Internal Revenue Service (IRS) guidelines for valuing intangible assets list “technical data” as one type of intangible asset.15 According to the IRS, the value of an intangible asset can be determined in the same way as for tangible assets: using a cost basis, gauging its value in the marketplace, or basing it on the revenue potential of the asset in question. The rules allow auditors to apply any combination of these approaches.
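The three approaches can be contrasted with a simplified Python sketch. These are textbook-style simplifications and not the IRS’s prescribed procedure: the function names, the choice to average comparable prices, the discount rate, and all sample figures are assumptions made for illustration. The revenue-based function is a plain discounted cash flow calculation.

```python
def cost_basis_value(acquisition_cost, preparation_cost):
    """Cost approach: what it cost to acquire and prepare the data set."""
    return acquisition_cost + preparation_cost


def market_based_value(comparable_prices):
    """Market approach: average price of comparable data sets (assumed method)."""
    if not comparable_prices:
        raise ValueError("need at least one comparable price")
    return sum(comparable_prices) / len(comparable_prices)


def income_based_value(expected_cash_flows, discount_rate):
    """Revenue approach: present value of expected yearly cash flows."""
    return sum(
        cash_flow / (1 + discount_rate) ** year
        for year, cash_flow in enumerate(expected_cash_flows, start=1)
    )


# Hypothetical figures for a single data set.
print(cost_basis_value(acquisition_cost=50_000, preparation_cost=25_000))
print(market_based_value([80_000, 120_000]))
print(round(income_based_value([40_000, 40_000, 40_000], discount_rate=0.10), 2))
```

The three results will rarely agree, which is one reason a rule allowing any combination of the approaches leaves considerable room for judgment.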

To that end, new approaches keep popping up. Published in 2018, the book Infonomics attempts to value data using standard formulas.16 I’ve run across initiatives in which organizations evaluated the quality of customer data using the Infonomics model, tied the results back to the amount of revenue generated by a given customer, and projected lost revenue due to lacking data quality.17 On a larger scale, governments have made attempts to value their data as well. For example, in a 2013 study, the UK assessed the value of its public sector information.18 There was even a patent (U.S. Patent No. 10,210,551B1), issued in February 2019, that attempts to provide automated data valuation techniques using data relevance score calculations for a set of specific domains.19
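The customer-data initiative mentioned above can be sketched roughly in Python. This does not reproduce the Infonomics formulas: it assumes a simple completeness score as the quality measure and assumes that revenue at risk scales linearly with the quality shortfall. The field names, the at-risk rate, and the customer figures are all hypothetical.

```python
def completeness(record, required_fields):
    """Share of required fields that are present and non-empty."""
    filled = sum(1 for field in required_fields if record.get(field) not in (None, ""))
    return filled / len(required_fields)


def projected_lost_revenue(customers, required_fields, revenue_at_risk_rate):
    """Sum of revenue assumed lost to poor data quality across customers.

    Assumes loss = annual revenue * quality shortfall * revenue_at_risk_rate.
    """
    total = 0.0
    for customer in customers:
        shortfall = 1.0 - completeness(customer["record"], required_fields)
        total += customer["annual_revenue"] * shortfall * revenue_at_risk_rate
    return total


REQUIRED = ["name", "email", "phone", "address"]

# Hypothetical customers: one complete record, one with half the fields missing.
CUSTOMERS = [
    {"annual_revenue": 1000.0,
     "record": {"name": "A", "email": "a@example.com", "phone": "555-0100", "address": "1 Main St"}},
    {"annual_revenue": 2000.0,
     "record": {"name": "B", "email": "", "phone": None, "address": "2 Main St"}},
]

print(projected_lost_revenue(CUSTOMERS, REQUIRED, revenue_at_risk_rate=0.10))
```

Even a rough projection of this kind gives a data quality initiative a monetary figure to discuss, although the result is only as credible as the assumed at-risk rate.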

There’s no doubt that a clearer data valuation model will aid any data strategy. As more data valuation models are created, organizations will find they can better manage their “data as an asset,” and our government might take increasing interest in taxing “data as an asset.” Just as with physical assets, such valuation will drive improved data strategies.

Conclusion

Today’s data strategies must reflect business value by being closely tied to the business strategy. That is hard to do with data. It is often done by crunching large amounts of data (i.e., analytics) to improve products or services. As the variety of data-oriented business models and data leadership roles steadily increases, leaders, and their supporting legal staff, will increasingly be called on to defend their analyses. Basing a data strategy on robust underlying data management practices significantly eases this burden.

Endnotes

1. See Deloitte The Netherlands, New Business Models with Data: Point of View (Oct. 7, 2014), https://ec.europa.eu/futurium/sites/futurium/files/deloitte_pov_-_new_business_models_with_data.pdf.

2. See, e.g., A. Wells & K. Chiang, Monetizing Your Data (Wiley 2017); S. Liozu & W. Ulaga, Monetizing Data—A Practical Roadmap for Framing, Pricing & Selling Your B2B Digital Offers (Value Innoruption Advisors Publ’g 2018).

3. See, e.g., J. Mitchell et al., Which College Graduates Make the Most?, Wall St. J., Nov. 20, 2019, at Computer and Information Sciences.

4. For a more complete list of data management domains, see Data Mgmt. Ass’n, DAMA-DMBOK2 Framework (Mar. 6, 2014), https://dama.org/sites/default/files/download/DAMA-DMBOK2-Framework-V2-20140317-FINAL.pdf.

5. R. Kimball, Kimball Grp., Newly Emerging Best Practices for Big Data (Sept. 30, 2012), http://www.kimballgroup.com/wp-content/uploads/2012/09/Newly-Emerging-Best-Practices-for-Big-Data1.pdf.

6. S. Lohr, For Big Data Scientists, “Janitor Work” Is Key Hurdle to Insight, N.Y. Times (Aug. 17, 2014), https://www.nytimes.com/2014/08/18/technology/for-big-data-scientists-hurdle-to-insights-is-janitor-work.html; A. Ruiz, The 80/20 Data Dilemma, InfoWorld (Sept. 26, 2017), https://www.infoworld.com/article/3228245/the-80-20-data-science-dilemma.html.

7. R. van der Meulen, Managing the Data Chaos of Self-Service Analytics, Gartner Research (Dec. 17, 2015), http://www.gartner.com/smarterwithgartner/managing-the-data-chaos-of-self-service-analytics/.

8. For a deeper dive into addressing business issues with the right data management domain, see M. Fleckenstein & L. Fellows, Modern Data Strategy, at ch. 6, Implementing a Data Strategy (Springer 2018).

9. See Foundations for Evidence-Based Policymaking Act of 2018, Pub. L. No. 115-435, 132 Stat. 5529, https://www.congress.gov/bill/115th-congress/house-bill/4174.

10. See, e.g., Rethinking the Role of the Chief Data Officer, Forbes (May 22, 2019), https://www.forbes.com/sites/insights-intelai/2019/05/22/rethinking-the-role-of-chief-data-officer/#75c0f0721bf9.

11. See the webinar and accompanying attachment: M. Rollings, The Current State and Future of the Chief Data Officer, Gartner (2019), https://www.gartner.com/en/webinars/25161/the-current-state-and-future-of-the-cdo.

12. L. DalleMule & T. Davenport, What’s Your Data Strategy?, Harv. Bus. Rev. (May–June 2017), https://hbr.org/2017/05/whats-your-data-strategy.

13. M.M. Blair, Brookings Inst., Unseen Wealth (2001). Note: Dr. Blair defines the value of intangible assets as the difference between a company’s market value and the value of the company’s tangible assets. Not everyone agrees on this formula. However, there is agreement that intangible assets are growing as part of overall corporate asset makeup. See, e.g., S.S. Harrison & H.P. Sullivan, Edison in the Boardroom: How New Leading Companies Realize Value from Their Intellectual Property, at A Brief History (John Wiley & Sons 2006).

14. D. Moody & P. Walsh, Measuring the Value of Information: An Asset Valuation Approach, ECIS (1999).

15. IRS, Internal Revenue Manual § 4.48.5, Intangible Property Valuation Guidelines (Autumn 2014), http://www.irs.gov/irm/part4/irm_04-048-005.html.

16. D. Laney, Infonomics (Gartner Inc. 2018).

17. This has been done by Yasith Fernando of EY. Yasith can be contacted at yasith7@gmail.com.

18. See Deloitte, U.K. Dep’t for Bus. Innovation & Skills, Market Assessment of Public Sector Information (May 2013), https://www.gov.uk/government/publications/public-sector-information-market-assessment.

19. See Calculating Data Relevance for Valuation, U.S. Patent No. 10,210,551 (filed Aug. 15, 2016) (issued Feb. 19, 2019), http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.htm&r=10&p=1&f=G&l=50&d=PTXT&S1=Todd-Stephen.INNM.&OS=IN/Todd-Stephen&RS=IN/Todd-Stephen.

Mike Fleckenstein is a Principal with The MITRE Corporation. ©2019 The MITRE Corporation. All rights reserved. The author’s affiliation with The MITRE Corporation is provided for identification purposes only, and is not intended to convey or imply MITRE’s concurrence with, or support for, the positions, opinions, or viewpoints expressed by the author.