REMINISCENCE - DATA BASE

by BERNARD A HODSON

Canada's oil industry pioneered data base development. After the discovery of oil in Alberta, Imperial Oil created the largest data base in the industry, if not the world. It consisted of 19 separate categories of data relating to Western Canadian oil and gas wells, covering rock porosity, permeability, chemical analyses, hydrocarbon presence, pressures, and production.

Creating it involved overcoming several problems. Similar geological formations had different names within Provinces, so a name linkage system was developed to ensure consistency. Complex geophysical data needed to be stored in a form suitable for retrieval and analysis.

The identification of a well has to be acceptable in a Court of Law. A drilled well can have several hydrocarbon producing formations geographically distant from the wellhead, so one well might have several coordinates. Manitoba's survey system was based on the position of the Red River, which has changed course several times since Canada's birth. Saskatchewan and Alberta have the Legal Sub Division system, which divides the territory into sections, each one mile square. British Columbia has the BC Centizone system.

A legally acceptable identifier required a search of all Canadian archives to check survey accuracy, and several discrepancies were found. Acceptable identification was eventually established, with cross links to the category files, data for which frequently needed a different locator. This involved the creation of a 19th category to identify linkages.

The data base was established without knowing how it would be used. First uses were minimal but, as word got around, requests for retrieval rose dramatically from what was now known as the WDS, the 'Well Data System'.

Not enough programmers were available to handle all the WDS retrieval requests and other important work. A study was undertaken to determine whether developed computer programs used any standard functions (seven or eight were expected), so that some generalisation could be made. The analysis found even fewer standard operations, enabling development of a generalised program, GIRLS (Generalised Information Retrieval and Listing System). GIRLS handled retrieval requests from the WDS and was user friendly enough to allow non programmers to generate sophisticated retrieval requests. Details of the WDS were presented in the literature and to IBM, who incorporated the concepts in their MIS (Management Information System) offering.

Retrieval requests had previously been handled by a manufacturer produced Report Program Generator, which required people to 'program' each report. GIRLS removed this boring chore from sophisticated programmers, enabling them to further use their skills in the art and science of finding oil. It foreshadowed the later development of a Turing Universal machine, which would subsequently be built with software rather than hardware.

The WDS was expensive to develop but the investment paid off handsomely. Alberta had a glut of oil and companies were limited in their production by established 'quotas'. Every six months a request could be made to the Court for a quota increase, with a short response time for objectors. A simulation program had been developed using information from the WDS, which had extensive information about competitor wells. The computer played the competitor quota request against Imperial's interests. In most cases the figures produced by the simulation refuted the competitor claims, saving Imperial hundreds of millions of dollars because the Court did not assign it a reduced quota.

The WDS was well documented so that users understood the data content. The documentation defined well locations, the units used (linear, pressure and flow), parameter definitions, geological names, etc. The developed documentation standards were distributed to affiliates worldwide.

A huge amount of technical data existed (technical reports, seismic and geological information, production records, partnership agreements, technical correspondence), and a way of meaningfully accessing this data was needed. What transpired was known as a 'double dictionary', forerunner of the later data base 'inverted file' technique. It involved developing a set of key words and placing alongside each one the names and locations of all documents containing that key word.

To search for documents containing several key words one scanned the file for the document list associated with the first keyword and then matched it with the list of documents associated with the second key word, repeating the operation for additional key words. The first computer generation of the 'double dictionary' used magnetic tape to store the files. The same concept was later adapted to magnetic drums, magnetic discs, then to compact discs, and was also transferred to affiliates around the world.
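The key word matching described above can be illustrated with a short modern sketch. It is not the original implementation, which is not documented here. The document names, key words and function names are hypothetical, and the merge-style matching of sorted lists is an assumption, chosen because it suits sequential media such as magnetic tape. For brevity the sketch stores document names only, not their locations.

```python
from collections import defaultdict


def build_double_dictionary(documents):
    """Map each key word to a sorted list of the documents containing it."""
    index = defaultdict(set)
    for name, key_words in documents.items():
        for word in key_words:
            index[word.lower()].add(name)
    return {word: sorted(names) for word, names in index.items()}


def match_lists(left, right):
    """Return the entries common to two sorted lists in a single forward pass."""
    matched = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] == right[j]:
            matched.append(left[i])
            i += 1
            j += 1
        elif left[i] < right[j]:
            i += 1
        else:
            j += 1
    return matched


def search(index, key_words):
    """Find documents containing every key word, one key word at a time."""
    if not key_words:
        return []
    result = index.get(key_words[0].lower(), [])
    for word in key_words[1:]:
        result = match_lists(result, index.get(word.lower(), []))
    return result


# Hypothetical documents and key words, for illustration only.
DOCUMENTS = {
    "report-001": ["porosity", "permeability"],
    "report-002": ["porosity", "pressure"],
    "report-003": ["porosity", "permeability", "pressure"],
}
INDEX = build_double_dictionary(DOCUMENTS)

print(search(INDEX, ["porosity", "permeability"]))
```

Because each list is kept in sorted order, every match needs only one forward pass over the two lists, which is why the same idea carries over unchanged from tape to later storage media.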