In last week’s blog post we assessed the challenges of migrating diverse Oil & Gas data to corporate data stores. This week, we will explain the metadata-driven method we use to support the migration of legacy databases to our clients’ standards.

Our Transformation Manager migration tool is used in a once-only ‘Prepare Phase’ to automatically create transforms that will be used during every project migration. These generic transforms will be run during the Extract, Verify and Update phases of the migration process (see the diagram below).

Data migration process

Clients’ standards are defined using two spreadsheets, which are the starting point for each migration of a legacy project.

The standard values spreadsheet (see below) contains a worksheet for each table whose standard values it defines. These tables include a number of verification code tables (_VC), a number of reference data tables (_R), and other tables, e.g. OW_DATA_SOURCE, that contain standards data. Each worksheet consists of a header row specifying the fields defined by the company’s standards, followed by rows which provide the set of permitted values.

Standard-Values-Spreadsheet

The data dictionary spreadsheet, shown below, defines data to be entered into the OW Data_Dict Table to specify default values or data ranges covered by the company standards.

Data-Dictionary-Spreadsheet

The data dictionary spreadsheet is used to set default values, which may be null, and the range of values each field must conform to, expressed either as a numeric range (e.g. ‘0..10e+4’ for CASING_SIZE) or as a set of permitted value strings.
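As an illustration of how such a rule might be checked, the following Python sketch parses a rule that is either a numeric range or a set of value strings. The ‘0..10e+4’ syntax is quoted from the example above; the function and the comma-separated set format are assumptions for illustration, not Transformation Manager’s actual behaviour.

```python
# Hypothetical check of a data-dictionary rule: either a numeric range
# written 'low..high' (as in the CASING_SIZE example) or a
# comma-separated set of permitted value strings.

def parse_rule(rule):
    """Return a predicate that tests whether a field value conforms."""
    if ".." in rule:
        low, high = (float(part) for part in rule.split("..", 1))
        return lambda value: low <= float(value) <= high
    allowed = {s.strip() for s in rule.split(",")}
    return lambda value: value in allowed

casing_ok = parse_rule("0..10e+4")      # the range quoted in the text
print(casing_ok("9.625"))               # True: within 0..100000
print(casing_ok("-1"))                  # False: below the lower bound
```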

For any project, a command line script is run during the Extract phase to determine the non-standard values the project uses. This automatic process reviews the project tables, identifying any values currently in use that must be updated to conform to the client’s standards, and writes them to a set of mapping CSV files.
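In outline, this extract step amounts to a set difference written out as a mapping file. In the rough sketch below, the table name, column values and two-column CSV layout (current value, with the replacement left blank for review) are invented for illustration; the tool’s real file format may differ.

```python
# Illustrative sketch of the Extract step: compare the values a project
# actually uses against the standard set, and write any non-standard
# values to a mapping CSV for a user to complete later.
import csv

def extract_mapping(project_values, standard_values, csv_path):
    non_standard = sorted(set(project_values) - set(standard_values))
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["CURRENT_VALUE", "STANDARD_VALUE"])  # user fills column 2
        for value in non_standard:
            writer.writerow([value, ""])
    return non_standard

# e.g. values found in a project's data source column
flagged = extract_mapping(
    ["OpenWorks", "openworks", "OW"],   # values used in the project
    ["OpenWorks"],                      # the client's standard values
    "OW_DATA_SOURCE_mapping.csv",
)
print(flagged)  # ['OW', 'openworks']
```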

A mapping file (.CSV) is created for each table and its natural key field, as in the spreadsheet below; it contains the values currently in use that are not present in the client’s standards.

Example-Mapping-File

An absent mapping spreadsheet indicates that all current values are consistent with the standard values. A mapping file initially contains the set of existing values used in the project and acts as the interface where a user enters the standard value to be used instead. Mapping spreadsheets already created for a previous project can also be supplied; these are updated during the extract process, so the spreadsheet will already contain some standard values as recommendations for the project migration.
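The merge of a previous project’s mapping file with newly discovered values might look like the following sketch: new values are appended while mappings a user has already entered are preserved. The CSV layout (CURRENT_VALUE, STANDARD_VALUE) and the function are assumptions for illustration, not the tool’s documented format.

```python
# Hypothetical sketch of re-using mapping files across projects: values
# newly found in the project are added with a blank replacement, while
# standard values already entered by a user are kept as recommendations.
import csv
import os

def update_mapping(csv_path, current_values):
    mapping = {}
    if os.path.exists(csv_path):
        with open(csv_path, newline="") as f:
            for row in csv.DictReader(f):
                mapping[row["CURRENT_VALUE"]] = row["STANDARD_VALUE"]
    for value in current_values:
        mapping.setdefault(value, "")   # new value; standard value still to be chosen
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["CURRENT_VALUE", "STANDARD_VALUE"])
        for current, standard in sorted(mapping.items()):
            writer.writerow([current, standard])
    return mapping

# A first extract creates the file; later extracts keep the user's entries.
first = update_mapping("OW_R_mapping.csv", ["openworks", "OW"])
print(sorted(first))  # ['OW', 'openworks']
```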

One of the principal activities the user undertakes, referred to as the Review phase, is to check the mapping spreadsheets and provide the standard values that are to be used instead of the non-standard values presently used. Once this has been completed, the user again runs a command line script to verify that, once the client’s standard values are applied, the project will be consistent with the standards. Transformation Manager transforms are run to verify that this is the case. The output is a set of reports; one of these, as viewed in Excel, is shown below.

Example-Review-Report
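In outline, the verification applies the completed mappings and reports anything that would still be outside the standards. The sketch below assumes a simple two-column mapping CSV (current value, chosen standard value); neither the layout nor the function is Transformation Manager’s actual implementation.

```python
# A minimal sketch of the Verify step: apply the user's replacements
# from a completed mapping CSV, then report any values that would still
# be non-standard after migration. File layout and names are invented.
import csv

def verify(project_values, standard_values, mapping_csv):
    with open(mapping_csv, newline="") as f:
        mapping = {row["CURRENT_VALUE"]: row["STANDARD_VALUE"]
                   for row in csv.DictReader(f)}
    # A blank replacement means the user has not yet chosen a standard value.
    migrated = [mapping.get(v) or v for v in project_values]
    return [v for v in migrated if v not in standard_values]

# A mapping where 'OW' has been left blank by the reviewer.
with open("mapping.csv", "w", newline="") as f:
    f.write("CURRENT_VALUE,STANDARD_VALUE\nopenworks,OpenWorks\nOW,\n")

remaining = verify(["OpenWorks", "openworks", "OW"], ["OpenWorks"], "mapping.csv")
print(remaining)  # ['OW'] -- must be resolved before the Update phase can run
```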

During the Verify phase, the user corrects all non-standard values and data dictionary errors until the reports indicate that a completely consistent project will be produced if migration proceeds.

Up to this stage, the project data has been accessed in read-only mode and the project has remained available to all other users. It is not until the final Update phase of the project migration that exclusive access to the project is required.

Once verification has been completed, the final Update phase is again run from the command line and completes the migration process.

The Extract, Verify and Update phases implemented using Transformation Manager are fully automatic, and provide warnings if the migration process is not completely compatible with the client’s standards. The main user input is in the Review phase, where a user with knowledge of the project specifies which standard values will replace any non-standard values presently used.

Throughout the migration phases, the user operates in a simple user environment, with minimal involvement of system or database administrators. The project remains online until the final Update phase, which requires exclusive access to the project database for a relatively short duration.

In conclusion, if the business wants to leverage the benefits of corporate data that conforms to a defined set of standards, metadata-driven standardization tools will enable the business to truly implement and maintain those standards. Moreover, such a programme can be delivered with a fraction of the effort, time and risk that often undermine even well-supported initiatives.

Download and keep this information in our free white paper: Implementing corporate data standards in the field using metadata-centric tools.

Further reading


    data migration planning guide