The value of any information or data product is its ability to inform the prediction of future events.

Data is always a measure of the past; for something to be recorded, it must have occurred. For data to be valuable, we must believe that past events can provide a guide to future events.

This means that the value of data depends on the available means of interpretation. Data without interpretation may have latent value, but it is interpretation that makes that value manifest. The method of interpretation can be any sensemaking and predictive process: a human brain, an organisation or a machine learning algorithm.

It’s worth noting that the distinction between data and model is a false dichotomy: data is always a model too, with assumptions about the environment baked into it. But it’s useful to think of the two as separate and belonging to different orders. Data is a model that attempts to reflect the underlying reality of the past.

If certain data and a certain model can provide accurate predictions, both the data and the model are valuable. If predictions are not accurate, the model or the data or both are flawed. But they may still be valuable.

Anyone or anything that can predict the future accurately is intrinsically valuable. For businesses, being able to predict the future accurately will lead to increased revenue, reduced costs or both. Anyone or anything that is perceived to be able to predict the future will also be perceived to be valuable, and so will be valuable. Indeed, it’s unclear whether the first category of value, value from actual rather than perceived predictive ability, exists at all.

It is difficult to tell actual value from perceived value because predictions guide actions and actions change the course of events. And we never get to observe the counterfactual: we cannot see what would have happened if we had acted differently. We only have what happened.

When thinking about data products, this gives rise to the question: how can we create products that are perceived to support prediction?