Data integration is about creating a single stream of consolidated data. Ensuring that this stream is accurate requires careful planning, and a range of assessments must take place before any data integration actually occurs. The steps below illustrate the proven plan that our consultants follow to deliver integrated data successfully to our clients.

1. Define the project

Setting clear objectives for the project ensures that its success can be measured and monitored. Consider what form the consolidated data has to be in to provide maximum usefulness for the organisation. An example objective could be daily movement of 100% of sales data from a company’s retail outlets to the Customer Relationship Management system at head office, with 98% ‘up’ time. See examples of the objectives we set for our data integration projects here.
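An objective of this kind is only useful if it can be checked against real figures. The sketch below shows one way to do that in Python, assuming a 100% completeness target and a 98% uptime target measured per day. The class and field names are hypothetical and chosen purely for illustration; they are not part of any particular product.

```python
from dataclasses import dataclass


@dataclass
class DailyTransferReport:
    """Hypothetical summary of one day's transfer (illustrative field names)."""
    records_at_source: int
    records_delivered: int
    minutes_available: int
    minutes_in_day: int = 24 * 60


def completeness(report: DailyTransferReport) -> float:
    """Percentage of source records that reached the target system."""
    if report.records_at_source == 0:
        return 100.0  # assumption: a day with nothing to move counts as complete
    return 100.0 * report.records_delivered / report.records_at_source


def uptime(report: DailyTransferReport) -> float:
    """Percentage of the day during which the integration was available."""
    return 100.0 * report.minutes_available / report.minutes_in_day


def meets_objective(report: DailyTransferReport,
                    completeness_target: float = 100.0,
                    uptime_target: float = 98.0) -> bool:
    """True only when both targets are met for the day."""
    return (completeness(report) >= completeness_target
            and uptime(report) >= uptime_target)
```

A day with every record delivered and 20 minutes of downtime would pass; a day with 1% of records missing would fail, however good the uptime.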

Define the scope of the project by including its parameters in your objectives. Ensure that all relevant databases, datasets and software are listed.

Discover whether there is a need for real-time access to data. If not, how frequently should the data be transferred?

Mitigate risk by listing all potential issues, both during the project’s implementation and afterwards, alongside plans to mitigate them. Risks can range from unresponsive system custodians to unreliable data feeds. See examples of how we mitigate risks during data integration projects here.

2. Understand the systems

Make sure that you review all of the systems involved with the data, from extraction to every system that uses the final, consolidated output. Ensure adequate connectivity between the systems, including any in the cloud, and identify any configurations necessary for the data to be transferred. Depending on the systems involved, our data integration consultants might plan to deploy data connectors, SFTP ports or APIs to enable data interchange.

Discover whether any manual processes are currently being used, and whether or not these processes can scale. You may also wish to consider whether legacy systems still contribute data and whether these can be replaced by a more modern system.

Data security must be planned for all stages of a data integration. Consider how the transfer of data from one system to another might affect its security. DataHub, our data integration system, ensures first-rate security by using 2048-bit encryption before the data is transmitted and transferring data via SFTP.

3. Design the data integration framework

Gain a good understanding of the data to be integrated. For instance, is it structured or unstructured? What are its sources? How good is its quality?

Determine the requirements for the final consolidated data. What datasets are required? How will the data be delivered and how frequently? For instance, our DataHub system can deliver data to dashboards, Excel files and cloud systems.

List all the data sources needed to meet the project’s requirements and how they can be accessed. For instance:

Remote server A: Direct from database
Remote server B: Collect file
Remote server C: Query web API
Local server X: File dropped
Local server Y: Receive push notifications
Local server X: Lookup data, email > Excel
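
A list like the one above is easier to maintain if it is also held in a machine-readable form. The following Python sketch records the same six entries as a small registry. The class names, the `location` values and the helper functions are assumptions made for illustration, not a description of how any specific tool stores its sources.

```python
from dataclasses import dataclass
from enum import Enum


class AccessMethod(Enum):
    DATABASE = "direct from database"
    FILE_COLLECT = "collect file"
    WEB_API = "query web API"
    FILE_DROP = "file dropped"
    PUSH = "receive push notifications"
    LOOKUP = "lookup data, email > Excel"


@dataclass(frozen=True)
class DataSource:
    server: str
    location: str  # "remote" or "local" (illustrative labels)
    method: AccessMethod


# The example sources from the list above.
SOURCES = [
    DataSource("A", "remote", AccessMethod.DATABASE),
    DataSource("B", "remote", AccessMethod.FILE_COLLECT),
    DataSource("C", "remote", AccessMethod.WEB_API),
    DataSource("X", "local", AccessMethod.FILE_DROP),
    DataSource("Y", "local", AccessMethod.PUSH),
    DataSource("X", "local", AccessMethod.LOOKUP),
]


def sources_by_location(sources):
    """Group sources by location so that connectivity can be reviewed per site."""
    grouped = {}
    for source in sources:
        grouped.setdefault(source.location, []).append(source)
    return grouped


def servers_with_multiple_methods(sources):
    """Servers that are accessed in more than one way and need extra care."""
    methods = {}
    for source in sources:
        methods.setdefault((source.location, source.server), set()).add(source.method)
    return sorted(server for (_, server), used in methods.items() if len(used) > 1)
```

Holding the list as data makes it simple to spot, for example, that local server X is accessed in two different ways.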

Map out the current data flow, if there is one, then determine the data flow required once the project is implemented. You will then be in a position to define a framework to manage the data flow on an ongoing basis. Our consultants use our DataHub system, which provides a fully tested framework that’s already in use in a wide range of integration projects. Based on our proprietary Transformation Manager software, DataHub is flexible enough to allow our consultants to create bespoke systems to meet each client’s individual requirements.

This example framework is based on DataHub:

[Figure: DataHub flow for data integration]

You can create a similar system with scripts or off-the-shelf data integration tools. Evaluating alternative data integration tools will require consideration of budget, in-house expertise and your organisation’s readiness to adopt new technology.

4. Define how the data will be processed

Where data is supplied by a third party, ensure that the specification of file contents is defined and signed off early. If the data is already available within your organisation, ensure that sample data is available for testing early on in the project.

Profile all the data, whatever its form, to ensure that it meets requirements and that any issues are identified:

  • Dates
  • Monetary values
  • Length
  • Type
  • Format
  • Gaps in the dataset.
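
As an illustration of the checks listed above, here is a minimal Python profiling sketch that uses only the standard library. It assumes the sample data arrives as rows of text fields (for example from `csv.DictReader`), that dates should be in ISO format, and that one text field has a maximum length. The function name, parameters and issue labels are hypothetical.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation


def profile_rows(rows, date_field, money_field, text_field, max_length,
                 date_format="%Y-%m-%d"):
    """Return a list of (row number, field, issue) tuples for a sample dataset."""
    issues = []
    for row_no, row in enumerate(rows, start=1):
        # Gaps: any empty or absent value.
        for field, value in row.items():
            if value is None or value.strip() == "":
                issues.append((row_no, field, "missing value"))

        # Dates: must parse in the expected format.
        date_value = (row.get(date_field) or "").strip()
        if date_value:
            try:
                datetime.strptime(date_value, date_format)
            except ValueError:
                issues.append((row_no, date_field, "bad date format"))

        # Monetary values: must parse as a decimal number.
        money_value = (row.get(money_field) or "").strip()
        if money_value:
            try:
                Decimal(money_value)
            except InvalidOperation:
                issues.append((row_no, money_field, "not a monetary value"))

        # Length: the text field must fit the target column.
        text_value = row.get(text_field) or ""
        if len(text_value) > max_length:
            issues.append((row_no, text_field, "too long"))
    return issues
```

Running a profile like this on sample data early in the project gives the data owner a concrete list of rows to correct, rather than a general complaint about quality.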

You may also need data quality rules to ensure that the data is fit for purpose. Consider:

  • Validation
  • Cleansing
  • Deduplication
  • Consolidation.
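
The four kinds of rule can be sketched in a few lines of Python. The example below assumes customer records with `customer_id`, `email` and `store` fields, treats an email without an `@` as invalid, and keeps the first record seen for each customer. All of these are illustrative assumptions; real rules will come from the project's own requirements.

```python
def cleanse(record):
    """Cleansing: trim whitespace and normalise casing."""
    return {
        "customer_id": record["customer_id"].strip(),
        "email": record["email"].strip().lower(),
        "store": record["store"].strip().title(),
    }


def is_valid(record):
    """Validation: require an identifier and a plausible email address."""
    return bool(record["customer_id"]) and "@" in record["email"]


def deduplicate(records):
    """Deduplication: keep the first record seen for each customer_id."""
    seen = {}
    for record in records:
        seen.setdefault(record["customer_id"], record)
    return list(seen.values())


def apply_rules(feeds):
    """Consolidation: merge several feeds into one clean list plus a reject list."""
    combined = [cleanse(record) for feed in feeds for record in feed]
    valid = [record for record in combined if is_valid(record)]
    rejected = [record for record in combined if not is_valid(record)]
    return deduplicate(valid), rejected
```

Returning the rejected records alongside the clean ones means that nothing is silently dropped, and the rejects can be reported back to the source system's owner.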

The reporting requirements identified at Stage 1 will determine whether the data is mapped to new structures or converted into a new format.

Consider how the end user is made aware of whether or not the integration system is working. You may require alerts for system-level issues, such as connectivity, or data-specific issues such as invalid entries. Determine who will receive each type of alert.
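
One simple way to record who receives each type of alert is a routing table. The Python sketch below assumes two alert kinds, "system" and "data", and uses placeholder email addresses; the names and addresses are hypothetical, and sending the message itself is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    kind: str      # "system" (e.g. connectivity) or "data" (e.g. invalid entries)
    message: str


# Placeholder recipients; replace with the people agreed during planning.
ROUTES = {
    "system": ["it-operations@example.com"],
    "data": ["data-steward@example.com"],
}
FALLBACK = ["integration-owner@example.com"]


def recipients_for(alert, routes=ROUTES, fallback=FALLBACK):
    """Return the recipients for an alert, using the fallback for unknown kinds."""
    return routes.get(alert.kind, fallback)
```

The fallback recipient ensures that an alert of an unexpected kind still reaches someone, rather than being lost.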

5. Implement the project

Choose a project champion to ensure that the integration has a sufficiently high profile within the organisation. He or she can ensure that adequate resources are available and not diverted by other projects.

Identify the stakeholders. Which departments within the organisation use the data or systems and should be involved in the project? Who within those departments will be involved in the design and implementation of the project? Which senior leaders will have input and oversight?

Test the systems thoroughly before implementation, using sample data. This ensures that data quality rules and data mappings have been implemented correctly.
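
Testing a mapping with sample data can be as simple as pairing each sample input with the output you expect. The Python sketch below assumes a hypothetical till export with `Date`, `Total` and `Shop` fields that must be mapped to a hypothetical target structure. The field names and formats are invented for illustration.

```python
from datetime import datetime
from decimal import Decimal


def map_sale(source_row):
    """Map one row of the (hypothetical) till export to the target structure."""
    return {
        "sale_date": datetime.strptime(source_row["Date"], "%d/%m/%Y").date().isoformat(),
        "amount_pence": int(Decimal(source_row["Total"]) * 100),
        "outlet": source_row["Shop"].strip().upper(),
    }


def check_mapping(samples):
    """Return (input, expected, actual) for every sample that maps incorrectly."""
    failures = []
    for source_row, expected in samples:
        actual = map_sale(source_row)
        if actual != expected:
            failures.append((source_row, expected, actual))
    return failures
```

An empty list from `check_mapping` means every sample mapped as expected; any entries show exactly which input produced the wrong output.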

It is important to future-proof the system to avoid dependence on the knowledge of one or two individuals. It’s also important that changes or extensions to the data can be made easily. Long- and short-term maintenance will be required, which in the case of DataHub is carried out by our support team. You will also need a recovery plan.

As you can see, successful data integration is all about upfront planning. Find out more about how we implement these steps in the real world.
