A business look at Data Mesh: Promote Data as Products

Disclaimer: at the time of writing, Zhamak Dehghani is writing an (awesome and challenging) book about her vision. I will therefore stay as far as possible from its (current) content to avoid any spoilers. That is why I do not expand much on the definition of Data Mesh; if you need more, I recommend (re)reading her brilliant blog posts, e.g., How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh.

Before getting started, here is a short list of what you'll take away from this article:

  1. Data Mesh can't work without a data-as-a-product mindset.
  2. Considering data as products allows monetizing data (and reporting it in P&Ls).
  3. Data Mesh encourages the right culture by:
    - Fixing the (so-far inverted) responsibility on data: from consumers to producers.
    - Fostering feedback and transparent communication between consumers and producers.
  4. Data Mesh allows users to be recommended the right data (product) rather than having to search for it.

Data as a Product

Data Mesh fundamentally rests on several pillars, which are neither technologies nor methods but principles. This is why, IMHO, we see such enthusiasm about it: as it goes beyond architecture, it spans several data management areas and touches many people and roles.

That's the thing: data management is a very vast field, split into several areas as introduced in this report, in which you'll find the data management wheel proposed by DAMA (see figure below). Although those areas help to focus on specific challenges, they also tend to create silos between the associated responsibilities, which they shouldn't.

To me, Data Mesh tackles one of the biggest and most bizarre issues resulting from this siloing: in the data world, consumers are responsible for validating the data they use.

© Data Management Wheel (DAMA, What Is Data Governance? Understanding the Business Impact)

Among all the fantastic concepts and principles that Zhamak proposes, I’ll focus in this article on Data as a Product.

In (very, very) short, the domain that created or generated the data manages it like any other product, such as a new insurance policy at WholeState, a new movie by Deesnay, or the latest uStuff from Pear.

So, Data Mesh gives back to the domain the responsibility for the product's quality, alongside its availability, robustness, compliance, etc.

The one aspect of products I'll focus on here is the one that drew my interest as an entrepreneur and product owner: its promotion.

Successful Data as a Product

It is well known that no product really sells itself. Even the most brilliant and useful products in our history needed sufficient awareness. Of course, how to create awareness for a product is a debate in itself; nevertheless, we can agree that it doesn't focus on features, especially for innovative products.

A (data as a) product needs to be understood by its primary consumers. They must first be told what they gain from it, compared with crafting one themselves or using another.

Often with data, those primary consumers (internal business, external customers) have requested it; they have primarily defined their needs. The domain concentrates on creating trust (e.g., defining SLOs, …) in the product it manages, which comes at a certain price.

The good news is that, from that point, your data is generating value. Thus, just as IT can bill its services and machines, the domain's P&L could also include data products.

However, if you limit your data product to only the first (few?) business users, it will look like what we call “a customized product”, which is the wrong approach for products in general.

Consequently, because of the investments a domain has to put into its data products and its interest in improving its P&L, data products need to be promoted to be successful. In the beginning, promotion will mainly rely on the initial use cases; this will grow your base of consumers, who will find other use cases for the data. These new use cases are key to growing your product's footprint on the market, which is why the domain must know about them and spread the word about them.

A question we may then ask is: how? How do we create awareness about these use cases in our fast-paced world?

We’ll come to that below; however, there is another question to be considered first: how do I know about these use cases?

(How to) Know your customers

Because managing a product requires investments (time, resources, …), the domain will want to see a return on them. This interest becomes even more important once the investments and the value of the data are reported in the P&L.

As usual, to control the ROI, you will either find ways to minimize the cost or maximize the revenue. Let's discuss the second (as it is more fun). It is pretty common to value data by its usage and its usefulness for a business case, such as marketing segmentation. Therefore, to maximize the revenue of a data product, the domain must find alternative usages to expand the footprint (importance) of its (data) product.

For this, it would be funky to look at some growth hacking techniques, which are dedicated to increasing the AAARRR:

  • Awareness: the number of known use cases that talk about your data.
  • Acquisition: the number of use cases envisioned for your data.
  • Activation: the number of products that tested your data.
  • Retention: the number of products that used your data more than once.
  • Revenue: the number and amount of payments per product for your data.
  • Referral: the number of times your data is referred (more on this in the last section).
Intuitive representation of growth hacking for data as a product
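
To make these metrics a bit more tangible, here is a minimal sketch of how they could be counted from a log of usage events. Everything in it is an assumption made for illustration: the `UsageEvent` record, the event kinds ("mentioned", "planned", "read", "paid", "referred"), and the mapping of each kind to a funnel stage are hypothetical and not part of Data Mesh or of any specific tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    """Hypothetical record of one interaction with a data product."""
    data_product: str
    consumer: str        # application or use case touching the data product
    kind: str            # assumed kinds: mentioned, planned, read, paid, referred
    amount: float = 0.0  # only meaningful for "paid" events


def funnel(events, data_product):
    """Count the AAARRR metrics for one data product."""
    evs = [e for e in events if e.data_product == data_product]
    reads = Counter(e.consumer for e in evs if e.kind == "read")

    def consumers(kind):
        return {e.consumer for e in evs if e.kind == kind}

    return {
        "awareness": len(consumers("mentioned")),
        "acquisition": len(consumers("planned")),
        "activation": len(reads),                              # read at least once
        "retention": sum(1 for n in reads.values() if n > 1),  # read more than once
        "revenue": sum(e.amount for e in evs if e.kind == "paid"),
        "referral": sum(1 for e in evs if e.kind == "referred"),
    }


events = [
    UsageEvent("customers", "segmentation", "mentioned"),
    UsageEvent("customers", "segmentation", "planned"),
    UsageEvent("customers", "segmentation", "read"),
    UsageEvent("customers", "segmentation", "read"),
    UsageEvent("customers", "segmentation", "paid", 100.0),
    UsageEvent("customers", "churn-model", "mentioned"),
    UsageEvent("customers", "churn-model", "read"),
    UsageEvent("customers", "churn-model", "referred"),
    UsageEvent("orders", "forecast", "read"),
]

metrics = funnel(events, "customers")
print(metrics)
```

In practice, the interesting part is where such events come from; collecting them by hand would not scale.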

Improving those metrics will depend on two important components: your product's overall quality and the feedback from the fields.

Of course, growing the product's impact exposes the domain to a broader audience, and therefore increases its responsibility for the product's quality. In other words, it is expanding its circle of concern (cf. The 7 Habits of Highly Effective People) by increasing the number of customers and the diversity of goals.

Consciously or not, this can be the biggest blocker to considering data as a product, as no one likes to be exposed, at least not without being appropriately equipped.

Let's dig into this blocker a bit further and unveil the two underlying critical fears:

  • What if my product is misused? (e.g., Cambridge Analytica)
  • What if I screw up? (e.g., I change my data whilst the CFO uses it for the annual report)

Totally fair, right? Yup, because being responsible for a product means you are response-able for it (cf. The 7 Habits of Highly Effective People). The domain must be able to respond to all situations associated with its data.

And you know what’s funny? Well, consumers will have the same concerns 😄.

However, this is not an excuse to give up! My answer to this is to create an open and transparent bilateral communication flow between the domain and the consumers on, at least:

  • SLOs (service level objectives): the consumers know which datastrophes they might have to face (sooner or later).
  • Context (use cases, applications, projects, …): the domain is always aware of new or updated usage of their product and reduces the frustration linked to those unknown unknowns.
Information flow between domain and consumers

Because the number of consumers and usages will scale, creating and maintaining this flow manually will become a challenge, increasing costs and lowering the productivity of both the domain and consumers. Therefore, I recommend that this information be generated automatically as much as possible, persisted, and made accessible directly from the applications and tools (contact Kensu or me to know more about this).
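
As an illustration of what such a persisted, two-way flow could hold, here is a minimal sketch. It assumes that SLOs are simple upper bounds on an observed value and that consumers declare the application, project, and purpose of their usage; the class names, fields, and example values are all hypothetical and do not describe any particular product's API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SLO:
    """Objective published by the domain (assumed to be an upper bound)."""
    name: str         # e.g. "freshness_hours"
    max_value: float  # the observed value should stay at or below this


@dataclass(frozen=True)
class UsageContext:
    """Context reported by a consumer of the data product."""
    application: str
    project: str
    purpose: str


@dataclass
class DataProduct:
    name: str
    domain: str
    slos: dict = field(default_factory=dict)      # domain -> consumers
    contexts: list = field(default_factory=list)  # consumers -> domain

    def publish_slo(self, slo):
        self.slos[slo.name] = slo

    def register_usage(self, context):
        self.contexts.append(context)

    def breached(self, observed):
        """Names of the published SLOs that the observed values violate."""
        return sorted(
            name for name, value in observed.items()
            if name in self.slos and value > self.slos[name].max_value
        )

    def projects_to_notify(self, observed):
        """Projects the domain should warn when at least one SLO is breached."""
        if not self.breached(observed):
            return []
        return sorted({c.project for c in self.contexts})


revenue = DataProduct(name="revenue", domain="finance")
revenue.publish_slo(SLO("freshness_hours", 24))
revenue.publish_slo(SLO("null_ratio", 0.01))
revenue.register_usage(UsageContext("reporting-app", "annual-report", "CFO yearly figures"))
revenue.register_usage(UsageContext("segmentation-job", "marketing-campaign", "customer segmentation"))

observed = {"freshness_hours": 30, "null_ratio": 0.0}
late = revenue.breached(observed)
notify = revenue.projects_to_notify(observed)
print(late, notify)
```

With both directions stored in one place, the domain knows who is affected when it changes or breaks something, and consumers know what they can expect.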

At this point, the domain has the opportunity to hack the growth of its product, which is the topic of the following section.

How to promote the product efficiently

Earlier, I said that products do not sell themselves; then I explained what communication has to take place to allow the domain to hack the growth of a (data as a) product.

This section concludes with an alternative way to create awareness about your product by shifting how consumers are directed to valuable data products.

Typical data catalogs are used to find data. As they are usually composed of a descriptive metadata repository (the type of columns, lineage references, …) and glossaries, they consider data features as first-class citizens.

Therefore, with such data catalogs, end-users can get to the data they were looking for.

However, that is not sufficient to discover the data they should be looking for.

Let me make an analogy with Google to make this point clearer.

Google search can be used as a shortcut for a specific website you have in mind or a specific section: searching “gmail” or “price Tesla”. Google will get you to this specific content quite easily.

Or, you could use it to find content (website, PDF, doc, data, …) about the information you are searching for, like “online mailbox” or “costs of electric cars”. In this case, you'll be recommended the same websites, of course, although you may not have been aware of them yet (OK… GMail and Tesla are not such great examples…). So you discovered them.

What a search engine is really able to do is recommend (if you're lucky) the right content matching your needs, but it doesn't stop there. It can also recommend associated searches, which allow end-users to discover alternative ways to achieve their goals!

Very clever…

This is possible because such catalogs (Google search is a catalog) can leverage the relationships between contents and queries. More importantly, they infer information from the complex graph these relationships form, shedding light on the underlying use cases and goals.

This explains why data catalogs are evolving towards the usage of so-called active metadata: the information about “how is the data used and what for”. You have probably recognized what this information is! It is the context information from the previous section, which consumers must provide to the domain.

For a Data Mesh, composed of several data products as shown in the picture below, the contextual information is composed of both the information about data intelligence applications (cf this blog) and their participation in larger projects (business cases).

Data Mesh — Data as Products — Applications — Projects

Therefore, a search engine can use this context to recommend (data) products to potential new consumers whose queries are similar to known use cases already at work. Obviously, companies becoming data-driven must enable such capabilities to scale their organization.
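
To give an idea of the mechanics, here is a deliberately naive sketch of such a recommendation: it ranks data products by the word overlap (Jaccard similarity) between a consumer's query and the descriptions of known use cases. The similarity measure, the product names, and the use-case descriptions are assumptions chosen for illustration; a real search engine would rely on far richer signals.

```python
import re


def tokens(text):
    """Lowercased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Share of words two sets have in common (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(query, use_cases, top_n=3):
    """Rank data products by their best-matching known use case.

    use_cases is a list of (data_product, use_case_description) pairs,
    i.e. the context reported by existing consumers.
    """
    q = tokens(query)
    best = {}
    for product, description in use_cases:
        score = jaccard(q, tokens(description))
        if score > best.get(product, 0.0):  # keep only products with some overlap
            best[product] = score
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


known_use_cases = [
    ("customer_profiles", "marketing segmentation of retail customers"),
    ("customer_profiles", "churn prediction for retail customers"),
    ("claims_history", "fraud detection on insurance claims"),
]

result = recommend("segmentation of customers for a marketing campaign", known_use_cases)
print(result)
```

The point is that the new consumer never names a table or a column: the query describes a goal, and the known use cases do the matching.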

As you will read in the forthcoming book, the context (business needs, code, system, …) is one of the important components of Data Mesh: how you leverage it will define your success in implementing it.

What’s next

We are only scratching the surface of Data Mesh's capabilities; I am looking forward to being amazed by what Zhamak Dehghani is thinking about. I’ll try my best to be helpful whenever I can on that topic…

In the meantime, if you are looking for solutions to automate the communication flow discussed here, including the monitoring capabilities, I recommend contacting Kensu and learning about its Data Intelligence Management Platform.

I get energy from using my expertise in mathematics and data technologies to build innovative solutions that help organisations - former Spark Notebook creator