
Using Census data to derive a new area-based measure of deprivation - Section 2: Data

Section 2: Data

To derive the new area-based measure, as well as carry out the subsequent analysis, it was necessary to access and link a variety of data sources.

The first of these was publicly available 2011 Census data. The Census is a UK-wide collection that occurs every ten years and is mandatory for all households to complete. The questionnaire covers a range of topics, including employment, education, and home and vehicle ownership. It is administered by the Office for National Statistics (ONS) in England and Wales, while the Northern Ireland Statistics and Research Agency (NISRA) and National Records of Scotland (NRS) gather the relevant data for Northern Ireland and Scotland, respectively. As well as achieving a very high level of coverage across the population, the Census forms held by the UK Data Service (2022) show that questions are asked in a generally consistent way across all four nations. Indeed, a report by the ONS (2015) indicates that many of the published outputs on different aspects of the Census are either broadly or highly comparable.

As specified earlier, the smallest geography for which data is subsequently released to the public is the output area (or small area in Northern Ireland). In England and Wales, ONS (2021a) highlight that the aspiration was for output areas to contain approximately 125 households, while also being as homogeneous as possible (based on tenure and dwelling type). In Scotland, no such requirement was set on homogeneity, with NRS (2015) indicating that output areas are expected to contain between 20 and 78 households. NISRA (2019) point out that small areas in Northern Ireland average around 160 households/400 individuals and are intended to be socially similar. Our starting point was therefore to obtain key statistics at output area level from the 2011 Census, supplied by ONS (2021b), NRS (2021b) and NISRA (2021), relating to various indicators, including:

a) Age structure (KS102)

b) Health and provision of unpaid care (KS301)

c) Household tenure (KS402)

d) Household composition (KS105)

e) Qualifications and students (KS501)

f) National Statistics Socioeconomic Classification (NSSEC) (KS611)

As stated in the introduction, qualifications (KS501) and occupation (KS611) were the two variables we used to create our index. Data on age, housing tenure, household structure and self-reported health were utilised to enable us to develop some summary statistics on our index and the extent to which it may be correlated with low income/(material) deprivation. For example, The Health Foundation (2020) demonstrate that poorer self-reported health is correlated with lower household income, while HM Government (2014) note that lone parent households have a higher risk of experiencing long-term poverty. Meanwhile, a report by Welsh Government (2023) highlights that living in social housing, having poor health or being a single parent are all associated with a greater probability of facing material deprivation.

This was followed by ingesting look-up files for each of the home nations. ONS (2018a) assigns output areas to an English region, while ONS (2018b) indicates how output areas in England and Wales map to LSOAs, middle layer super output areas (MSOAs) and local authority districts. NRS (2011) supply the output area to data/intermediate zone look-up file for Scotland. Note, however, that 2011 output areas in Scotland only nest perfectly into council areas, with best fit aggregations having to be applied at other levels of geography. For Northern Ireland, NISRA (2013a) disseminate information that highlights how small areas map to larger geographical domains, such as wards and local government districts (LGDs). However, small areas do not nest properly into LGDs, so assignment is commonly determined by the location of the majority of households. As this data source provides the 1992 LGDs/wards, we also utilised an additional file supplied by NISRA (2013b) to obtain the updated 2014 LGDs. This file also contains supplementary detail on how 2014 District Electoral Areas (DEAs) match up to the 2014 LGDs. The rationale behind linking these look-up files to our Census data was that we would then have the codes needed to bring in the Indices of Deprivation and/or urban-rural classifications, which are formed at a higher level of geography.
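The linkage described above can be illustrated with a minimal sketch. It is not the code used for this work: the area codes, field names and the `left_join` helper are all hypothetical, and the example assumes each output area maps to exactly one LSOA. It attaches higher-level geography codes to output area records, then uses those codes to bring in a deprivation decile, while recording any keys that fail to match.

```python
def left_join(rows, lookup, key, new_fields):
    """Attach fields from `lookup` to each row, keyed on row[key].

    Rows with no match are kept (new fields set to None) and their
    keys are returned separately so that gaps can be inspected.
    """
    joined, unmatched = [], []
    for row in rows:
        match = lookup.get(row[key])
        if match is None:
            unmatched.append(row[key])
            match = {}
        joined.append({**row, **{f: match.get(f) for f in new_fields}})
    return joined, unmatched


# Hypothetical Census key statistics at output area (OA) level.
census = [
    {"oa": "OA_0001", "households": 130},
    {"oa": "OA_0002", "households": 118},
    {"oa": "OA_9999", "households": 125},  # deliberately absent from the look-up
]

# Hypothetical OA -> LSOA / local authority look-up.
oa_lookup = {
    "OA_0001": {"lsoa": "LSOA_01", "lad": "LAD_A"},
    "OA_0002": {"lsoa": "LSOA_01", "lad": "LAD_A"},
}

# Hypothetical deprivation data published at LSOA level.
deprivation = {"LSOA_01": {"imd_decile": 3}}

# Step 1: add the higher-level geography codes to the Census records.
with_geog, missing_oa = left_join(census, oa_lookup, "oa", ["lsoa", "lad"])

# Step 2: use the new LSOA code to bring in the deprivation measure.
linked, missing_lsoa = left_join(with_geog, deprivation, "lsoa", ["imd_decile"])

print(linked)
print("OAs without a look-up entry:", missing_oa)
```

Keeping unmatched rows rather than dropping them makes it easier to spot areas where a look-up file and the Census extract disagree, which matters when geographies do not nest perfectly.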

One of the drawbacks of the Indices of Deprivation is that they are less useful in capturing deprivation in rural areas. Consequently, we wanted to ensure that our dataset contained information on the urban-rural classification for each nation to assess the extent to which SEISA addresses this limitation. In England and Wales, a grouping has been developed by the Department for Environment, Food and Rural Affairs (DEFRA, 2021) and we examine their detailed 10-fold categorisation when conducting our analysis for these two countries. For Scotland, the Scottish Government (2019a) have developed a file that outlines how data zones map to higher level geographies, with this also containing an urban-rural classification at varying levels of granularity.

The Indices of Deprivation (both the composite measure and individual domains) were then ingested for all four countries. The Ministry of Housing, Communities and Local Government (2019) was the government department responsible for publishing the most recent version for England. As they also release supplementary data relating to the income deprivation affecting children index (IDACI) and given the relevance of this variable to our work, this was also brought into our dataset. 2019 was also the year that the latest version of the Welsh Index of Multiple Deprivation (WIMD) was disseminated by Welsh Government (2019), while the Scottish Government (2020) circulated their updated Scottish Index of Multiple Deprivation (SIMD) data a year later. In Northern Ireland, NISRA (2017) distributed the Northern Ireland Multiple Deprivation Measure (NIMDM), which also included some extra data on the urban-rural nature of an area, as well as an Income Deprivation Affecting Children (IDAC) indicator.

Additionally, as we stated in the introduction to our paper, interest lies in understanding how indices developed from the Census correlate with income. Concerns around sensitivity and the potential impact on non-response have precluded questions on income from being included in the Census. However, one file we can ingest to enhance our knowledge of how a derived index may be associated with income is the ONS (2020) small area income estimates for England and Wales. These figures were derived using a model-based approach applied to a dataset comprising both survey and administrative sources. Data at MSOA level is available for the financial year 2011/12 and contains four different income measures – total and net weekly household income, as well as equivalised net weekly household income (before and after taking into account housing costs). However, similar data is not available for ingestion into our master dataset in either Scotland or Northern Ireland.
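Because the income estimates are published at MSOA level while Census indicators are held at output area level, comparing the two requires a common geography. The sketch below shows one possible approach, which is an assumption rather than the method used in this work: output area values are averaged up to MSOA level using household counts as weights, and the result is then correlated with MSOA income. All codes, values and function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def weighted_mean_by_group(rows):
    """rows: iterable of (group, value, weight) -> {group: weighted mean}."""
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for group, value, weight in rows:
        numerator[group] += value * weight
        denominator[group] += weight
    return {g: numerator[g] / denominator[g] for g in numerator}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical output area records: (MSOA code, index value, households).
oa_rows = [
    ("MSOA1", 0.20, 100),
    ("MSOA1", 0.40, 100),
    ("MSOA2", 0.50, 50),
    ("MSOA2", 0.70, 150),
    ("MSOA3", 0.10, 200),
]

# Hypothetical MSOA-level net weekly household income estimates.
msoa_income = {"MSOA1": 600.0, "MSOA2": 450.0, "MSOA3": 800.0}

# Aggregate the index to MSOA level, weighting by household count.
msoa_index = weighted_mean_by_group(oa_rows)

# Compare only the MSOAs present in both sources.
common = sorted(set(msoa_index) & set(msoa_income))
r = pearson(
    [msoa_index[m] for m in common],
    [msoa_income[m] for m in common],
)
print(f"Correlation between index and income: {r:.3f}")
```

In this toy data a higher index value denotes greater disadvantage, so the correlation with income is negative. Weighting by households is only one choice; population weights or a simple unweighted mean would be equally plausible alternatives.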
