
Identifying the unique customer in data lakes and data warehouses

Thursday 8 April 2021 Data-Led Marketing

By Jon Ede and David Sealey

How identity resolution enables advanced personalisation in Big Data environments

Most organisations have plenty of data. The issue is not how much you have but what you can do with it. You may be sitting on a goldmine of data insight, but you can’t use it to refine personalised communications and marketing unless you can connect it with the digital platforms and tools that will deliver the personalisation.

Tackling this challenge is key to success in a world where consumers are increasingly intolerant of content and approaches that aren’t directly relevant to them. If you want to compete effectively, you need to use all the information you have to create unified identities that can inform personalised customer experiences.

The problem becomes more acute if you are using a data lake as the primary store of customer data. The concept and architecture of data lakes do not lend themselves to the enforced structure, maintenance and standardised logging that good identity resolution requires. At the opposite end of the spectrum, more traditional data warehouses can lack the structure or the ability to capture and use the large quantities of digital data that can form the backbone of modern identity resolution. These platforms may also be difficult to integrate with other MarTech and AdTech.


Tackling identity resolution in data lakes

If you work in a big data environment, you may struggle to identify unique customers. It can be risky to rely on insights developed from data lakes, because they have no enforced structure at the point of data capture. This makes it challenging to define a customer view in a consistent manner, with all data points incorporated.

CACI’s identity resolution application ResolvID has a smart way of tackling this problem. It takes in all potentially identifiable information as it’s loaded into the data lake and continues to iterate and develop the customer picture over time. It combines small quantities of data from sources including transactions, clickstream data and geolocation to build the identity of a customer. The structure of the customer is defined within the platform, and ResolvID makes its keys available for use by any other resource.
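The idea of building up an identity from small pieces of data over time can be illustrated with a minimal sketch. This is not how ResolvID is implemented; it is a generic union-find approach, and it assumes that any identifiers seen together in one event belong to the same person. The `IdentityResolver` class, its method names and the sample identifiers are all hypothetical.

```python
from collections import defaultdict


class IdentityResolver:
    """Hypothetical sketch: groups identifiers that were observed together."""

    def __init__(self):
        # Maps each identifier to its parent in a union-find forest.
        self._parent = {}

    def _find(self, key):
        # Register unseen identifiers as their own group.
        self._parent.setdefault(key, key)
        # Walk up to the root, shortening the path as we go.
        while self._parent[key] != key:
            self._parent[key] = self._parent[self._parent[key]]
            key = self._parent[key]
        return key

    def observe(self, identifiers):
        """Record one event. Assumption: all identifiers in it are one person."""
        identifiers = list(identifiers)
        if not identifiers:
            return
        root = self._find(identifiers[0])
        for other in identifiers[1:]:
            other_root = self._find(other)
            if other_root != root:
                self._parent[other_root] = root

    def customer_key(self, identifier):
        """Return the current key of the group this identifier belongs to."""
        return self._find(identifier)

    def customers(self):
        """Return every resolved group as a set of identifiers."""
        groups = defaultdict(set)
        for key in list(self._parent):
            groups[self._find(key)].add(key)
        return list(groups.values())


resolver = IdentityResolver()
# A clickstream event links an email address to a browser cookie.
resolver.observe([("email", "a@example.com"), ("cookie", "c-1")])
# A later transaction links a payment token to the same email address.
resolver.observe([("card", "tok-9"), ("email", "a@example.com")])
# A geolocation ping from a device that is not yet linked to anyone above.
resolver.observe([("device", "d-7"), ("cookie", "c-2")])
```

In this sketch the payment token and the first cookie end up under the same key even though they never appeared in the same event, while the device remains a separate customer until a later event connects it. A production system would also need rules for splitting wrongly merged identities, which this sketch leaves out.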


Tackling identity resolution in data warehouses

If your organisation uses enterprise or operational data warehouses, it can be difficult to connect the finance, CRM and order data held within them directly with digital customer data platforms (CDPs) that serve web, automated marketing and mobile app content.

CDPs maintain digital identifiers, but they need a single customer identifier to truly connect the business with digital channels. Because of the complex nature of data held in the data warehouse, it can be challenging to resolve to an individual’s identity.

Using ResolvID as a middle layer, you can create a unified identity and identity graph that can work in both environments. It connects the systems to enable rapid deployment, unlocking value in your data for personalised campaigns. You’ll achieve a single view of the customer and can report crucial unified measures such as lifetime value.
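As a rough illustration of why a shared key matters, the sketch below joins warehouse orders and CDP purchases through a crosswalk of identifiers and totals spend per unified customer. It is an assumption-laden example rather than ResolvID’s interface: the `identity_map` structure, the field names and the sample values are hypothetical, and lifetime value is simplified here to total historic spend.

```python
from collections import defaultdict

# Hypothetical crosswalk from an identity resolution layer:
# (source system, local identifier) -> unified customer key.
identity_map = {
    ("crm", "CUST-001"): "U1",
    ("cdp", "cookie-abc"): "U1",
    ("cdp", "app-123"): "U1",
    ("crm", "CUST-002"): "U2",
}

# Orders held in the data warehouse, keyed by CRM identifier.
warehouse_orders = [
    {"crm_id": "CUST-001", "value": 120.0},
    {"crm_id": "CUST-002", "value": 40.0},
]

# Purchases captured by the CDP, keyed by digital identifier.
cdp_purchases = [
    {"digital_id": "cookie-abc", "value": 30.0},
    {"digital_id": "app-123", "value": 15.5},
    {"digital_id": "cookie-zzz", "value": 9.0},  # not yet resolved to a customer
]


def lifetime_value(identity_map, warehouse_orders, cdp_purchases):
    """Total spend per unified customer, plus spend that could not be resolved."""
    totals = defaultdict(float)
    unresolved = 0.0
    rows = [("crm", o["crm_id"], o["value"]) for o in warehouse_orders]
    rows += [("cdp", p["digital_id"], p["value"]) for p in cdp_purchases]
    for system, local_id, value in rows:
        unified = identity_map.get((system, local_id))
        if unified is None:
            unresolved += value
        else:
            totals[unified] += value
    return dict(totals), unresolved


totals, unresolved = lifetime_value(identity_map, warehouse_orders, cdp_purchases)
```

Without the crosswalk, the spend attached to `cookie-abc` and `app-123` could not be attributed to the CRM customer `CUST-001`, so the unified measure would be understated.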


Read the previous blog to find out which factors to consider when delivering personalised content and marketing that fuel sales and retention in competitive consumer markets. Get in touch if you’d like to discuss identity resolution in your big data environment with one of our campaign data experts.

Further information

To stay up to date with our future articles, follow us on our LinkedIn page.

Download our full Identity Resolution series.
