Corporate Information Factory (CIF) Resources by Bill Inmon, Inmon Data Systems

Data Warehouse And Contextual Data: Pioneering A New Dimension

Taurum per cornua prehende.

One of the intriguing aspects of data warehousing is the notion of storing and managing data over a lengthy period of time. Data warehousing calls for the management of data over a five to ten year time frame or even longer.

In the past, classical operational information systems have focused their attention on the very current data that a corporation has. In the operational world the emphasis is on how much an account balance is, right now. Or how much is in inventory, right now.  Or what the status of a shipment is, right now. Of course every organization has a need to know about information that is current. But there is real value in looking at information over the spectrum of time as well. When an organization is able to look at information over a lengthy spectrum of time, trends become apparent that simply are not observable when looking at current information. One of the most important defining characteristics of the data warehouse is this ability to store, manage, and access data over a long period of time.

With the lengthy spectrum of time that is part of a data warehouse comes the awareness of a new dimension of data - that of context. In order to explain the importance of contextual information, an example is in order.

 

A SIMPLE EXAMPLE

Suppose a manager asks for a report from the data warehouse for 1995. The report is generated and the manager is pleased. In fact the manager is so pleased that a similar report for 1990 is requested. Since the data warehouse carries historical information, such a request is not hard to accommodate. The report for 1990 is generated. Now the manager holds the two reports - one for 1995 and one for 1990 - in his hands and declares that the reports are a disaster!

The data warehouse architect examines the reports and sees that one financial statement for 1995 shows $50,000,000 in revenue while the report for 1990 shows a value of $10,000 for the same category. The manager declares that there is no way that any account or category could have increased in value that much in five years' time.

Before giving up, the data warehouse architect points out to the manager that there are other relevant factors that do not show up in the report. In 1990 there was a different source of data than there was in 1995. In 1990 the definition of a product was not the same as it was in 1995. In 1990 there were different marketing territories than there were in 1995. In 1990 there were different calculations, such as for depreciation, than there were in 1995. In addition, there were many different external considerations, such as differences in inflation, taxation, economic forecasts, and so forth. Once the context of the reports is explained to the manager, the contents now appear to be quite acceptable and explainable.

In this simple but common example where the contents of data stand naked over time, the contents by themselves are quite inexplicable and unbelievable. However, when context is added to the contents of data over time, the contents and the context become quite enlightening.
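One way to picture the architect's explanation is to store a small context record next to each year's data and compare the records before comparing the numbers. The following is only a minimal sketch: the `ReportContext` class, its field names, and every sample value are hypothetical and chosen for illustration, not taken from any real warehouse.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ReportContext:
    """Hypothetical context record kept alongside one year's data."""
    data_source: str
    product_definition: str
    marketing_territories: str
    depreciation_method: str


def context_differences(a: ReportContext, b: ReportContext) -> dict:
    """Return every context field whose value differs between two periods."""
    left, right = asdict(a), asdict(b)
    return {
        name: (left[name], right[name])
        for name in left
        if left[name] != right[name]
    }


# Illustrative values only; none of these come from the article.
ctx_1990 = ReportContext(
    "legacy ledger", "hardware only", "4 regions", "straight-line"
)
ctx_1995 = ReportContext(
    "billing system", "hardware and services", "9 regions", "straight-line"
)

diffs = context_differences(ctx_1990, ctx_1995)
for name, (old, new) in diffs.items():
    print(f"{name}: 1990={old!r}, 1995={new!r}")
```

A report that prints these differences beside the two revenue figures gives the reader the same explanation the architect had to supply by hand.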

In order to interpret and understand information over time, a whole new dimension of information is required, and that dimension is context. While content of information remains important, the comparison and understanding of information over time mandates that context be an equal partner to content. And in years past, context has been an undiscovered, unexplored dimension of information.

 

THREE TYPES OF CONTEXTUAL INFORMATION

There are three types of contextual information that must be managed. They are:

  • simple contextual information,

  • complex contextual information, and

  • external contextual information.

Simple contextual information is contextual information that relates to the basic structure of the data itself. Simple contextual information includes such things as:

  • the structure of data,

  • the encoding of data,

  • the naming conventions used for data,

  • the metrics describing the data, such as:

      • how much data is there,

      • how fast is the data growing,

      • what sectors of the data are growing,

      • how is the data being used, etc.

Simple contextual information has been managed in the past by dictionaries, directories, system monitors, and so forth.
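As a rough illustration of what such a dictionary entry might hold, the sketch below gathers structure, encoding, and a simple volume metric into one record. The `SimpleContext` class, its fields, and the sample table are assumptions made for this example; they do not describe any particular dictionary or monitoring product.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleContext:
    """Hypothetical dictionary entry describing one warehouse table."""
    table_name: str
    columns: dict      # column name -> data type (the structure of data)
    encodings: dict    # column name -> {stored code: meaning}
    row_counts: dict = field(default_factory=dict)  # year -> number of rows

    def growth(self, start_year: int, end_year: int) -> float:
        """Fractional growth in row count between two recorded years."""
        start = self.row_counts[start_year]
        end = self.row_counts[end_year]
        return (end - start) / start


# Illustrative values only.
shipments = SimpleContext(
    table_name="shipment",
    columns={"shipment_id": "integer", "status": "char(1)"},
    encodings={"status": {"P": "pending", "D": "delivered"}},
    row_counts={1990: 1000, 1995: 1500},
)

growth_90_95 = shipments.growth(1990, 1995)
print(f"{shipments.table_name} grew by {growth_90_95:.0%} from 1990 to 1995")
```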

Complex contextual information describes the same data as simple contextual information, but from a different perspective.  Complex contextual information addresses such aspects of data as:

  • product definitions,

  • marketing territories,

  • pricing,

  • packaging,

  • organization structure,

  • distribution, etc.

Complex contextual information is some of the most useful and at the same time, some of the most elusive information there is to capture.  It is elusive because it is taken for granted and is in the background. It is so basic that no one thinks to define what it is or how it changes over time. And yet, in the long run, complex contextual information plays an extremely important role in the understanding and interpretation of information over time.
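Because the difficulty lies in how such definitions change over time, one possible approach is to record each change with the date it took effect, so that the definition in force at any moment can be looked up later. This is a sketch under that assumption only; `VersionedContext` and the sample territories are hypothetical.

```python
from bisect import bisect_right
from datetime import date
from typing import Optional


class VersionedContext:
    """Keeps every version of one piece of context with its effective date."""

    def __init__(self) -> None:
        self._dates = []   # effective dates, kept in ascending order
        self._values = []  # value that took effect on the matching date

    def record(self, effective: date, value: str) -> None:
        """Store a new version, keeping the history sorted by date."""
        i = bisect_right(self._dates, effective)
        self._dates.insert(i, effective)
        self._values.insert(i, value)

    def as_of(self, when: date) -> Optional[str]:
        """Return the version in force on `when`, or None if none existed yet."""
        i = bisect_right(self._dates, when)
        return self._values[i - 1] if i else None


# Illustrative values only.
territories = VersionedContext()
territories.record(date(1993, 7, 1), "9 regions")
territories.record(date(1988, 1, 1), "4 regions")

print(territories.as_of(date(1990, 6, 30)))
print(territories.as_of(date(1995, 6, 30)))
```

Nothing is overwritten when a definition changes, so a 1990 report can be read against the 1990 definition rather than against today's.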

External contextual information is information outside the corporation that nevertheless plays an important role in the understanding of information over time. Some examples of external contextual information include:

  • economic forecasts, such as:

      • inflation,

      • financial,

      • taxation,

      • economic growth, etc.

  • political information,

  • competitive information,

  • technological advancements,

  • consumer demographic movements, etc.

External contextual information says nothing directly about a company but says everything about the universe that the company must work and compete in. External contextual information is interesting both in terms of its immediate manifestation and in terms of changes over time. Like complex contextual information, there is very little organized attempt to capture and measure this information. It is so large and so obvious that it is taken for granted. But because it is so large and obvious, it is quickly forgotten and difficult to reconstruct when needed.

 

CAPTURING AND MANAGING CONTEXTUAL INFORMATION

One of the reasons that both complex and external contextual information are so hard to capture and quantify is that they are so unstructured. Compared to simple contextual information, external and complex contextual information is very amorphous. Another complicating factor is that contextual information changes quickly. What is of interest and relevant one minute is irrelevant and passé the next. It is this constant flux in the state of affairs, together with the amorphous state of external and complex contextual information, that makes these types of information so hard to systematize.

 

LOOKING AT THE PAST

One can make the argument that the information systems profession has had contextual information in the past. Past attempts at dictionaries, repositories, directories, libraries, and the like are all movements towards the management of simple contextual information. For all the good intentions, these attempts have had some notable limitations that have greatly short-circuited their effectiveness. Some of the shortcomings of past attempts at the management of simple contextual information are:

  • the information management attempts were aimed at the information systems developer, not the end user. As such, there was very little visibility to the end user. Consequently, the end user had very little enthusiasm or support for something that was not apparent,

  • the attempts at contextual management were passive. A developer could opt to use or not use the contextual information management facilities. Many chose to work around those facilities,

  • the attempts at contextual information management were in many cases removed from the development effort. In case after case, application development was done in 1965 and the data dictionary was done in 1985. By 1985, there were no more development dollars. Furthermore, the people who could have helped the most in organizing and defining simple contextual information were long gone to other jobs and/or companies,

  • the attempts to manage contextual information were limited to only simple contextual information. There was no attempt to try to capture or manage external or complex contextual information.

The Chinese have a saying - "we live in interesting times." Such an attitude describes the state of affairs when it comes to the management of contextual information. The information systems profession has had thirty years of becoming expert in the management of the content of information. Because of data warehousing and its emphasis on information management over time, we are now embarking on the exciting new world of understanding and managing the context of information. We are but in our infancy as a profession in coming to grips with the context of information.