Summary
CACI’s Ocean database contains variables relating to consumer attitudes and behaviours of the UK population at individual and household level.
Whilst Ocean was already a market-leading solution, a major update gave CACI the opportunity to rebuild many of the associated predictive models using AI techniques, both to improve the modelling further and to make predictions more balanced and “fair” across demographic subgroups such as sex and age groups.
Industry
Technology
Products used
Challenge
Traditional classification techniques optimise “mathematical accuracy,” which measures the proportion of predicted labels that match the true labels. Optimising solely for this measure can, however, result in an imbalance in prediction quality between Yes and No labels (that is, whether particular behaviours, interests or attitudes are exhibited) and in unfairness across demographic subgroups such as sex and age. This is especially so when the true Yes/No label proportions are naturally imbalanced, i.e. where a behaviour is strongly skewed towards a particular sex or age group.
Addressing these deficiencies is an area of ongoing research within the AI community.
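To illustrate why accuracy alone can mislead, the following minimal sketch uses hypothetical toy data (not Ocean data) in which only 10% of true labels are Yes. A model that always predicts No scores 90% accuracy while never identifying a single Yes. The helper names `accuracy` and `recall_for_label` are illustrative assumptions, not part of any CACI tooling.

```python
def accuracy(y_true, y_pred):
    """Proportion of predicted labels that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_for_label(y_true, y_pred, label):
    """Of the cases whose true label is `label`, the share predicted as `label`."""
    predictions = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in predictions) / len(predictions)


# Hypothetical, naturally imbalanced labels: 90 "No" and 10 "Yes".
y_true = ["No"] * 90 + ["Yes"] * 10
# A degenerate model that always predicts the majority label.
y_pred = ["No"] * 100

acc = accuracy(y_true, y_pred)
recall_yes = recall_for_label(y_true, y_pred, "Yes")
recall_no = recall_for_label(y_true, y_pred, "No")
# Averaging the per-label recalls exposes the imbalance that accuracy hides.
balanced = (recall_yes + recall_no) / 2

print(f"accuracy={acc:.2f} recall_yes={recall_yes:.2f} "
      f"recall_no={recall_no:.2f} balanced={balanced:.2f}")
```

Here the headline accuracy looks strong, yet prediction quality for the Yes label is zero, which is the kind of imbalance described above.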
Ocean enhances clients’ understanding of their customers by indicating their likely attitudes and behaviours.
Traditional modelling methods can be biased in terms of prediction quality for different sexes and/or age groups.
The challenge was to remove this bias, which was achieved by developing new AI-based techniques that can optimise across both sex and age groups.
Solution
Advances in machine learning science and computational power allow Ocean to use a targeted technique for each variable rather than a one-size-fits-all approach.
CACI has developed new in-house classification techniques that significantly improve on standard methods, ensuring balanced prediction quality both between Yes and No predictions and across demographic subgroups.
Fairness can be measured in various ways. CACI specifically optimises its predictions against the Equalised Odds Difference, by default across sex (Male/Female/Unknown), or alternatively across age bands, or across both.
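As a rough illustration of the measure, the sketch below computes an Equalised Odds Difference using one commonly used definition: the larger of the between-group gap in true positive rate and the between-group gap in false positive rate. This is an assumption about the definition, not a description of CACI's in-house implementation. The data is hypothetical, labels are coded 1 for Yes and 0 for No, and every group is assumed to contain at least one true Yes and one true No.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return {group: (true_positive_rate, false_positive_rate)}."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            counts[g]["tp" if p == 1 else "fn"] += 1
        else:
            counts[g]["fp" if p == 1 else "tn"] += 1
    rates = {}
    for g, c in counts.items():
        tpr = c["tp"] / (c["tp"] + c["fn"])  # assumes the group has a true Yes
        fpr = c["fp"] / (c["fp"] + c["tn"])  # assumes the group has a true No
        rates[g] = (tpr, fpr)
    return rates


def equalised_odds_difference(y_true, y_pred, groups):
    """Larger of the TPR gap and the FPR gap across groups (0 means parity)."""
    rates = group_rates(y_true, y_pred, groups)
    tprs = [tpr for tpr, _ in rates.values()]
    fprs = [fpr for _, fpr in rates.values()]
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))


# Hypothetical toy data: four records per sex category.
groups = ["Male"] * 4 + ["Female"] * 4 + ["Unknown"] * 4
y_true = [1, 1, 0, 0] * 3
y_pred = [1, 1, 0, 0,   # Male:    TPR 1.0, FPR 0.0
          1, 0, 0, 0,   # Female:  TPR 0.5, FPR 0.0
          1, 1, 1, 0]   # Unknown: TPR 1.0, FPR 0.5

eod = equalised_odds_difference(y_true, y_pred, groups)
print(f"Equalised Odds Difference: {eod:.2f}")
```

A value of 0 indicates identical true and false positive rates in every group; larger values indicate a wider gap in prediction quality. To assess sex and age together under the same sketch, each group key could be a pair such as `("Female", "25-34")`, where the age band labels are again hypothetical.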
Results
Fairness optimisation has been implemented across age and sex, so that attributes and behaviours are predicted more accurately whilst bias in prediction quality between these subgroups is eliminated.
In addition, a set of insightful driver variables has been added, enabling the modelling to achieve a better understanding of the real world, and over 100 new variables have been introduced for the latest version of Ocean.
