This blog is the third in a series on how we built a comprehensive South African consumer dataset and segmentation. Read Part 1 on how we merged numerous datasets, and Part 2 on building a national consumer segmentation.
After our success in creating a combined South African consumer dataset, we thought it would be useful to apply it to better understand geographic locations. After evaluating several methods, we settled on the approach described below.
We started with Census 2011 to obtain the population per microsegment for each location. We then made numerous adjustments to account for the age of the Census data and to incorporate more recent data sources (see our first blog in this series). Finally, we used these microsegment populations as weights to average the microsegment information from our combined dataset, producing profiles of locations all the way down to subplace level.
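The weighting step can be illustrated with a minimal sketch. This is not our production code: the function name `location_profile`, the microsegment labels and all of the numbers are hypothetical, and the sketch assumes that each attribute is stored as a single value (such as a share of people) per microsegment.

```python
def location_profile(populations, segment_values):
    """Population-weighted average of a microsegment attribute for one location.

    populations    -- {microsegment: adjusted population in this location}
    segment_values -- {microsegment: attribute value from the combined dataset}
    Both structures are assumptions made for this illustration.
    """
    total = sum(populations.values())
    if total == 0:
        raise ValueError("location has no population")
    weighted_sum = sum(pop * segment_values[seg] for seg, pop in populations.items())
    return weighted_sum / total


# Hypothetical subplace with three microsegments (made-up populations).
populations = {"seg_a": 200, "seg_b": 300, "seg_c": 500}
# Hypothetical share of each microsegment that holds some attribute.
attribute_share = {"seg_a": 0.10, "seg_b": 0.40, "seg_c": 0.20}

profile = location_profile(populations, attribute_share)
print(round(profile, 2))  # (200*0.10 + 300*0.40 + 500*0.20) / 1000 = 0.24
```

Because the weights are the populations of the location itself, two locations with the same microsegments but different population mixes end up with different profiles, which is what lets the same combined dataset describe any area from national level down to a subplace.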
Our approach to building the combined dataset deals with most major issues, but some limitations remain:
- While we chose the overlap variables that are the most predictive of individuals’ attributes across the board, some attributes are not as well predicted by our overlap variables as others. For example, location can have a strong influence on the language people speak over and above the overlap variables we chose, but because we do not use location as an overlap variable, we cannot account for this.
- This issue becomes more pronounced when considering attributes that are specific to certain locations. For example, almost nobody in Cape Town is going to be reading publications, or listening to radio stations, that are only released in Johannesburg. But our dataset cannot account for this. For this reason, when constructing our location profiles we refrain from reporting on information that we deem to be highly influenced by location, even after accounting for our overlap variables.
In future iterations of the ENS we will build in location, where available, as a defining variable.
While we are very happy with the ENS and believe it is the most comprehensive and accurate view of the South African consumer available in the market, we recognise that there is always scope for improvement. We are constantly working on ways to improve this product to provide even more valuable information and use cases. Some items we are working on and should be able to release soon include:
- Enhancing the location profiles with variables from location-specific datasets that cannot be merged into the ENS
- Performing a location-based segmentation to find areas that are most similar or most different
- Imputing the location-specific variables that are contained in the ENS but have been excluded from the current detailed location profiles