XEMPLOCGR_1.1.3

Field description: Location of employment (region)
Field abbreviation: XEMPLOCGR
Field version: 1.1.3
Field length: 4
Field type: Char

Valid entries

Code Label
A North East
B North West
D Yorkshire and The Humber
E East Midlands
F West Midlands
G East of England
H London
J South East
K South West
XK UK region unknown
XF England region unknown
XI Wales
XH Scotland
XG Northern Ireland
Z Guernsey, Jersey and the Isle of Man
GREU Geographic region - Other European Union
GRAF Geographic region - Africa
GRAS Geographic region - Asia
GRAU Geographic region - Australasia
GRME Geographic region - Middle East
GRNA Geographic region - North America
GROE Geographic region - Other Europe
GRSA Geographic region - South America
OS Geographic region - Unknown
NOTK Not known

Dependent fields

  • XMLOCGR
  • XEMPLOCN
  • XWRKLOCGR

Depend upon fields

  • ZEMPPCODE
  • ZEMPAREA
  • ZEMPCOUNTRY
  • EMPPLOC

Additional information

This algorithm uses look-up tables built from data originally supplied by the Office for National Statistics (ONS). Postcodes (full, outward and area) are mapped to UK government office regions for UK domiciled students where possible. Where a partial postcode maps to a unique UK government office region, that region code is retained; otherwise the field is set to the appropriate country code.

Country codes are returned for other UK locations, and are mapped to geographic regions for non-UK domiciled students.

Valid entry Z (Guernsey, Jersey and the Isle of Man) is derived from country entries GG (Guernsey), JE (Jersey), IM (Isle of Man) and XL (Channel Islands not otherwise specified). The smaller Channel Islands of Alderney and Sark are included under the Bailiwick of Guernsey. Officially, the Crown Dependencies of Guernsey, Jersey and the Isle of Man are not part of the UK or the EU. However, for HESA analysis purposes they are often grouped with and generally assumed to be part of the United Kingdom.

Code XF is the National Statistics Country Classification (NSCC) code for England, but is used in this field as England region unknown, when data cannot be assigned to one of the nine England region codes.

The definition of the European Union is as at 1 December in the academic year to which the data relates.

Valid entry NOTK (Not known) includes those for whom location of employment cannot be determined from the information returned and those for whom location information was not provided.

Contains OS data, © Crown copyright and database right 2020
Contains Royal Mail data, © Royal Mail copyright and database right 2020
Source: Office for National Statistics licensed under the Open Government Licence v.3.0

Geographical mappings for Northern Ireland are based upon Crown Copyright and are reproduced with the permission of Land & Property Services under delegated authority from the Keeper of Public Records, © Crown copyright and database right 2020. NIMA MOU577.4

Technical Specification

The algorithm uses the HESA Data Management table D_postcode which is created from the ONS Postcode Directory (ONSPD) and restricted to the November YYYY update (postcode mapping is valid at this date), where YYYY represents the year following the academic year of collection. For graduate outcomes 2017/18, the November 2019 ONSPD data is used. The table includes the following fields: Full Postcode (PostCode, VARCHAR(8)), Outward Postcode (OutwardPostcode, VARCHAR(4)), Area Postcode (AreaCode, VARCHAR(2)), County/Region/Unitary Authority/Local Government District code (DomicileCode, VARCHAR(4)) and Government Office Region code (RegionCode, VARCHAR(2)).

The algorithm uses the HESA Data Management table D_country which is created from an overseas lookup and restricted to the November YYYY update (country mapping is valid at this date), where YYYY represents the year following the academic year of collection. For graduate outcomes 2017/18, the November 2019 ONSPD data is used. The table includes the following fields: Country Code (CountryCode, VARCHAR(2)) and Region Code (GeographicGroupCode, NVARCHAR(4)).
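As an illustration of how a partial postcode might be resolved only when it maps to a single region, the following is a minimal sketch. The row type mirrors the D_postcode field names listed above; the helper name `unique_region_index` and the sample rows are hypothetical and are not real ONSPD records.

```python
from collections import defaultdict
from typing import NamedTuple


class PostcodeRow(NamedTuple):
    """One row of a D_postcode-like table (field names follow the text above)."""
    PostCode: str
    OutwardPostcode: str
    AreaCode: str
    DomicileCode: str
    RegionCode: str


def unique_region_index(rows, key_field):
    """Map each key (e.g. outward postcode) to a region, but only when
    every row sharing that key has the same RegionCode."""
    regions = defaultdict(set)
    for row in rows:
        regions[getattr(row, key_field)].add(row.RegionCode)
    return {key: next(iter(vals)) for key, vals in regions.items() if len(vals) == 1}


# Illustrative rows only (hypothetical values).
rows = [
    PostcodeRow("AA1 1AA", "AA1", "AA", "D001", "A"),
    PostcodeRow("AA1 2BB", "AA1", "AA", "D001", "A"),
    PostcodeRow("AA2 1AA", "AA2", "AA", "D002", "B"),
]

by_outward = unique_region_index(rows, "OutwardPostcode")
by_area = unique_region_index(rows, "AreaCode")
```

Here each outward code resolves to one region, while the area code "AA" spans two regions and is therefore left out of `by_area`. The same pattern could be applied to DomicileCode where a step requires a single domicile value rather than a single region.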

Carry out the following steps in order, working top down and stopping at the first step whose criteria are satisfied:

1. If ZEMPPCODE is not 99999999 and the value is found in D_PostCode.PostCode, then return D_PostCode.RegionCode.

2. If ZEMPAREA is not NOTK and the value is found in D_PostCode.DomicileCode and D_PostCode.DomicileCode is not XF, XG, XH or XI, then return D_PostCode.RegionCode.

3. If ZEMPPCODE is not 99999999 and the outward part of the postcode up to the first space (if there is one - maximum of 4 characters) is a value in D_PostCode.OutwardPostcode and maps to a single value in D_PostCode.DomicileCode (excluding domicile codes XF, XG, XH and XI), then return D_PostCode.RegionCode.

4. If ZEMPPCODE is not 99999999 and the postcode area (maximum of 2 characters) is a value in D_PostCode.PostcodeArea and maps to a single value in D_PostCode.DomicileCode (excluding domicile codes XF, XG, XH and XI) then return D_PostCode.RegionCode.

5. If ZEMPAREA equals one of the following values XF, XG, XH, XI then return ZEMPAREA.

6. If ZEMPAREA equals one of the following values XL, IM then return Z.

7. If ZEMPAREA equals one of the following values A, B, D, E, F, G, H, J, K then return ZEMPAREA.

8. If ZEMPAREA equals Z then return Z.

9. If ZEMPPCODE is not 99999999 and the outward part of the postcode up to the first space (if there is one - maximum of 4 characters) is a value in D_PostCode.OutwardPostcode and maps to a single value in D_PostCode.RegionCode equal to A, B, D, E, F, G, H, J, K or Z then return D_PostCode.RegionCode.

10. If ZEMPPCODE is not 99999999 and the outward part of the postcode up to the first space (if there is one - maximum of 4 characters) is a value in D_PostCode.OutwardPostcode and maps to a single value in D_PostCode.RegionCode not in A, B, D, E, F, G, H, J, K or Z then return D_PostCode.RegionCode.

11. If ZEMPPCODE is not 99999999 and the postcode area (maximum of 2 characters) is a value in D_PostCode.PostcodeArea and maps to a single value in D_PostCode.RegionCode equal to A, B, D, E, F, G, H, J, K or Z then return D_PostCode.RegionCode.

12. If ZEMPPCODE is not 99999999 and the postcode area (maximum of 2 characters) is a value in D_PostCode.PostcodeArea and maps to a single value in D_PostCode.RegionCode not in A, B, D, E, F, G, H, J, K or Z then return D_PostCode.RegionCode.

13. If ZEMPCOUNTRY is GG, JE, XL or IM then return Z.

14. If ZEMPCOUNTRY is not NOTK and in D_Country.CountryCode, then return D_Country.GeographicGroupCode.

15. If ZEMPCOUNTRY is not NOTK and is a value in D_Country.GeographicGroupCode, then return D_Country.GeographicGroupCode.

16. If EMPPLOC is not NULL or an empty string, then map the EMPPLOC value to the corresponding country code (01 to XF, 02 to XH, 03 to XI, 04 to XG, 05 to OS).

17. Else, set to NOTK.
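The steps above can be sketched as a single top-down function. This is a simplified illustration, not the HESA implementation: the `lookups` dictionary, its keys and the helper names are hypothetical, the "single value" lookups are assumed to have been pre-built (with domicile codes XF, XG, XH and XI already excluded), and the postcode area is assumed to be the leading letters of the postcode. Steps 9 and 10, and steps 11 and 12, are collapsed because each pair returns the same value. The EMPPLOC mapping is taken from the processing table in this specification.

```python
ENGLISH_REGIONS = {"A", "B", "D", "E", "F", "G", "H", "J", "K"}
COUNTRY_LEVEL = {"XF", "XG", "XH", "XI"}
CROWN_DEPENDENCIES = {"GG", "JE", "XL", "IM"}
EMPPLOC_TO_CODE = {"01": "XF", "02": "XH", "03": "XI", "04": "XG", "05": "OS"}
NO_POSTCODE = "99999999"


def outward_part(postcode):
    """Text up to the first space, capped at 4 characters."""
    return postcode.strip().split(" ")[0][:4]


def postcode_area(postcode):
    """Leading letters, capped at 2 characters (assumed postcode format)."""
    letters = ""
    for ch in postcode.strip():
        if not ch.isalpha() or len(letters) == 2:
            break
        letters += ch
    return letters


def derive_xemplocgr(zemppcode, zemparea, zempcountry, empploc, lk):
    """Return the result of the first step whose criteria are met."""
    has_pc = zemppcode != NO_POSTCODE
    outward = outward_part(zemppcode) if has_pc else None
    area = postcode_area(zemppcode) if has_pc else None

    if has_pc and zemppcode in lk["full"]:                       # step 1
        return lk["full"][zemppcode]
    if (zemparea != "NOTK" and zemparea not in COUNTRY_LEVEL
            and zemparea in lk["domicile"]):                     # step 2
        return lk["domicile"][zemparea]
    if has_pc and outward in lk["outward_unique_domicile"]:      # step 3
        return lk["outward_unique_domicile"][outward]
    if has_pc and area in lk["area_unique_domicile"]:            # step 4
        return lk["area_unique_domicile"][area]
    if zemparea in COUNTRY_LEVEL:                                # step 5
        return zemparea
    if zemparea in {"XL", "IM"}:                                 # step 6
        return "Z"
    if zemparea in ENGLISH_REGIONS or zemparea == "Z":           # steps 7-8
        return zemparea
    if has_pc and outward in lk["outward_unique_region"]:        # steps 9-10
        return lk["outward_unique_region"][outward]
    if has_pc and area in lk["area_unique_region"]:              # steps 11-12
        return lk["area_unique_region"][area]
    if zempcountry in CROWN_DEPENDENCIES:                        # step 13
        return "Z"
    if zempcountry != "NOTK" and zempcountry in lk["country"]:   # step 14
        return lk["country"][zempcountry]
    if zempcountry != "NOTK" and zempcountry in lk["groups"]:    # step 15
        return zempcountry
    if empploc:                                                  # step 16
        return EMPPLOC_TO_CODE.get(empploc, "NOTK")
    return "NOTK"                                                # step 17


# Illustrative lookups only (hypothetical values, not real reference data).
lookups = {
    "full": {"AA1 1AA": "A"},
    "domicile": {"D001": "A"},
    "outward_unique_domicile": {"AA1": "A"},
    "area_unique_domicile": {},
    "outward_unique_region": {"AA1": "A", "AA2": "B"},
    "area_unique_region": {},
    "country": {"FR": "GREU"},
    "groups": {"GREU"},
}
```

For example, `derive_xemplocgr("AA2", "NOTK", "NOTK", "", lookups)` falls through the earlier steps and is resolved by the outward postcode lookup, while a record with no usable inputs falls through to NOTK.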

Carry out processing top down:

Field | Condition | XEMPLOCGR
ZEMPPCODE (CHAR 8) | If ZEMPPCODE is a value in D_PostCode.PostCode | D_PostCode.RegionCode
ZEMPAREA (CHAR 4) | If ZEMPAREA is a value found in D_PostCode.DomicileCode and D_PostCode.DomicileCode is not XF, XG, XH or XI | D_PostCode.RegionCode
ZEMPPCODE (CHAR 8) | If the outward part of ZEMPPCODE is a value in D_PostCode.OutwardPostcode and maps to a single value in D_PostCode.DomicileCode (excluding domicile codes XF, XG, XH and XI) | D_PostCode.RegionCode
ZEMPPCODE (CHAR 8) | If the postcode area of ZEMPPCODE is in D_PostCode.PostcodeArea and maps to a single value in D_PostCode.DomicileCode (excluding domicile codes XF, XG, XH and XI) | D_PostCode.RegionCode
ZEMPAREA (CHAR 4) | If ZEMPAREA equals one of the following values XF, XG, XH, XI | ZEMPAREA
ZEMPAREA (CHAR 4) | If ZEMPAREA equals one of the following values XL, IM | Z
ZEMPAREA (CHAR 4) | If ZEMPAREA equals one of the following values A, B, D, E, F, G, H, J, K | ZEMPAREA
ZEMPAREA (CHAR 4) | If ZEMPAREA equals Z | Z
ZEMPPCODE (CHAR 8) | If the outward part of ZEMPPCODE is a value in D_PostCode.OutwardPostcode and maps to a single value in D_PostCode.RegionCode equal to A, B, D, E, F, G, H, J, K or Z | D_PostCode.RegionCode
ZEMPPCODE (CHAR 8) | If the outward part of ZEMPPCODE is a value in D_PostCode.OutwardPostcode and maps to a single value in D_PostCode.RegionCode not in A, B, D, E, F, G, H, J, K or Z | D_PostCode.RegionCode
ZEMPPCODE (CHAR 8) | If the postcode area of ZEMPPCODE is in D_PostCode.PostcodeArea and maps to a single value in D_PostCode.RegionCode equal to A, B, D, E, F, G, H, J, K or Z | D_PostCode.RegionCode
ZEMPPCODE (CHAR 8) | If the postcode area of ZEMPPCODE is in D_PostCode.PostcodeArea and maps to a single value in D_PostCode.RegionCode not in A, B, D, E, F, G, H, J, K or Z | D_PostCode.RegionCode
ZEMPCOUNTRY (CHAR 2) | If ZEMPCOUNTRY equals one of the following values GG, JE, XL or IM | Z
ZEMPCOUNTRY (CHAR 2) | If ZEMPCOUNTRY found in D_Country.CountryCode | D_Country.GeographicGroupCode
ZEMPCOUNTRY (CHAR 2) | If ZEMPCOUNTRY found in D_Country.GeographicGroupCode | D_Country.GeographicGroupCode
EMPPLOC (CHAR 2) | 01 | XF
EMPPLOC (CHAR 2) | 02 | XH
EMPPLOC (CHAR 2) | 03 | XI
EMPPLOC (CHAR 2) | 04 | XG
EMPPLOC (CHAR 2) | 05 | OS
EMPPLOC (CHAR 2) | NULL or empty string | NOTK

Once the above process has been undertaken, a second level of refinement is carried out where ZEMPPCODE is not a full valid postcode and any of the following hold:

  • ZEMPAREA = NOTK
  • ZEMPAREA is less than 4 characters
  • ZEMPPCODE is a partial postcode and the postcode country contradicts the country provided in EMPPLOC
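The trigger for this second level of refinement can be expressed as a small predicate. This is a sketch under assumptions: the function and parameter names are hypothetical, and the caller is assumed to have already worked out whether the postcode is a full valid one and which country (if any) the partial postcode and EMPPLOC each imply.

```python
def needs_refinement(is_full_valid_postcode, zemparea,
                     partial_postcode_country=None, empploc_country=None):
    """Return True when the second level of refinement applies.

    partial_postcode_country and empploc_country are country codes
    (e.g. "XF", "XH") or None when not available.
    """
    if is_full_valid_postcode:
        return False
    contradicts = (
        partial_postcode_country is not None
        and empploc_country is not None
        and partial_postcode_country != empploc_country
    )
    return zemparea == "NOTK" or len(zemparea) < 4 or contradicts
```

For instance, a record with a partial postcode implying Scotland but an EMPPLOC implying England would be sent for refinement, even if ZEMPAREA holds a four-character code.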

A sequence of more complex steps, as detailed below, is then performed:

Create a mapping file of common area names to the corresponding county / unitary authority code (where possible) or Government office region:

  • County / unitary authority
  • Lower super output area
  • Medium super output area
  • Electoral ward
  • Local authorities
  • Government office regions
  • Parliamentary constituencies
  • Built up areas
  • Scotland common area names
  • Northern Ireland common area names

Clean up the mapping file by removing area names that are common short words, such as park, town, city, east and north. Clean up the free text supplied by the graduate in a similar manner, and remove special characters and additional spaces.
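A minimal sketch of this clean-up is shown below. The stop-word set contains only the examples named in the text, so the full list is an assumption, as are the function names. Note that this simple pattern also strips accented letters, which a fuller version would probably want to keep.

```python
import re

# Hypothetical stop-word list: only the examples given in the text.
COMMON_WORDS = {"park", "town", "city", "east", "north"}


def clean_text(text):
    """Lower-case, replace special characters with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def clean_area_names(names):
    """Clean each area name and drop those that are just a common short word."""
    cleaned = (clean_text(name) for name in names)
    return [name for name in cleaned if name and name not in COMMON_WORDS]
```

For example, `clean_text("  Leeds,  West-Yorkshire!! ")` gives `"leeds west yorkshire"`, and `clean_area_names(["Park", "Leeds", "North"])` keeps only `"leeds"`.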

  • Full postcode returned in free text - Use postcode to map to four-character code in D_PostCode.RegionCode. [No further processing required]
  • Partial postcode returned in free text - Use postcode to map to four-character code in D_PostCode.RegionCode provided EMPPLOC is consistent with D_PostCode.CountryCode*. [No further processing required]

*EMPPLOC is considered consistent with D_PostCode.CountryCode if any of the following are satisfied:

  • D_Postcode.RegionCode = A, B, D, E, F, G, H, J, K and EMPPLOC = 01
  • D_Postcode.RegionCode = XH and EMPPLOC = 02
  • D_Postcode.RegionCode = XI and EMPPLOC = 03
  • D_Postcode.RegionCode = XG and EMPPLOC = 04
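The consistency rule above can be written as a short check. This is a sketch; the function name is hypothetical, and any region code not listed in the rule is treated as inconsistent.

```python
ENGLISH_REGIONS = {"A", "B", "D", "E", "F", "G", "H", "J", "K"}
EXPECTED_EMPPLOC = {"XH": "02", "XI": "03", "XG": "04"}


def empploc_consistent(region_code, empploc):
    """True when EMPPLOC agrees with the country implied by the region code."""
    if region_code in ENGLISH_REGIONS:
        return empploc == "01"
    expected = EXPECTED_EMPPLOC.get(region_code)
    return expected is not None and expected == empploc
```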

Otherwise undertake all following steps:

  • Look for a common area name contained within free text
  • Look for free text returned within common area name
  • Look for same free text being returned by other graduates who have provided a postcode (if there are multiple options, look for the most common occurrence)
  • Look for area names within free text with spelling mistakes (match on proportion of letters / beginning and end of area name)
  • Look for Government office region names within the free text (e.g. Yorkshire) and remove any mappings which result in a Government office region other than the one found
  • Look for "big area names" (based on the standard list of county / unitary codes) within free text, e.g. where "Leeds" has been written, this is more likely to be "Leeds" in the north than "Leeds" in the south, in the absence of any other information being provided
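Some of the text-matching steps above (area name within free text, free text within area name, and spelling mistakes) can be sketched as follows. The specification describes spelling matches in terms of the proportion of letters and the beginning and end of the area name; this sketch swaps in the standard library's `difflib.SequenceMatcher` similarity ratio as a stand-in, and the threshold, function name and match-type labels are assumptions. Inputs are assumed to be cleaned already.

```python
from difflib import SequenceMatcher


def candidate_matches(free_text, area_names, threshold=0.85):
    """Return (area_name, match_type) pairs for one cleaned free-text value."""
    matches = []
    words = free_text.split()
    for name in area_names:
        if name == free_text:
            matches.append((name, "exact"))
        elif name in free_text:
            # Common area name contained within the free text.
            matches.append((name, "area_in_text"))
        elif free_text in name:
            # Free text contained within a common area name.
            matches.append((name, "text_in_area"))
        elif any(SequenceMatcher(None, word, name).ratio() >= threshold
                 for word in words):
            # Approximate match standing in for the spelling-mistake step.
            matches.append((name, "spelling"))
    return matches
```

For example, the misspelt `"shefield"` is matched to `"sheffield"` as a spelling match, while `"newcastle"` is matched to `"newcastle upon tyne"` because the free text sits within the area name.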

After carrying out these steps, the best match is identified using a scoring system. If there is only one option, use that; otherwise prioritise according to the order below. This process aligns with that used to derive XEMPLOCUC, but picks up the government office region containing the county / unitary authority code derived in XEMPLOCUC. Some additional steps are undertaken at the end, where the text provided does not allow a mapping at county / unitary level but it is still possible to map to a government office region.

  • Country match
  • Government office region match from partial postcode
  • Matches area name exactly with free text
  • Longest overlap of area name in free text, e.g. if the text contains "Newcastle Upon Tyne", map to "Newcastle Upon Tyne" and disregard "Newcastle"
  • Match on spelling mistake
  • Match on domicile Government office region, e.g. if the student only writes "Newcastle" and was previously living in the North East, map to "Newcastle Upon Tyne"
  • Match on provider Government office region, e.g. if the student only writes "Newcastle" and was previously studying in the North East, map to "Newcastle Upon Tyne"
  • Match on another graduate writing the same text
  • Match on big area name, e.g. where there are two areas called "Leeds", map to the one in the north in the absence of any other information
  • Where it has been possible to map to two or more counties / unitary authorities which fall within the same government office region, XEMPLOCUC is mapped to country and XEMPLOCGR to the government office region
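The priority order above can be sketched as a ranking over candidate matches. The specification mentions a scoring system without giving its details, so this is only an illustration: the match-type labels, the `best_match` name and the tie-break on longer area names (echoing the longest-overlap rule) are all assumptions.

```python
# Match types in priority order, highest first (labels are hypothetical).
PRIORITY = [
    "country",
    "region_from_partial_postcode",
    "exact",
    "longest_overlap",
    "spelling",
    "domicile_region",
    "provider_region",
    "other_graduate",
    "big_area",
]
RANK = {match_type: i for i, match_type in enumerate(PRIORITY)}


def best_match(candidates):
    """Pick one (area_name, match_type) pair, or None if there are none.

    A single candidate is used as is; otherwise the highest-priority
    match type wins, with longer area names preferred on ties.
    """
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    return min(candidates, key=lambda c: (RANK[c[1]], -len(c[0])))
```

With this ordering, an exact match on "leeds" would be preferred over a big-area-name match, and "newcastle upon tyne" would be preferred over "newcastle" when both are found by overlap.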

Carry out some consistency checks and update as appropriate:

Where a graduate has completed both, compare location of employment with location of self-employment. If the free text is the same for both and a postcode is provided for one, map to the code obtained from the postcode. Compare across academic years to ensure consistency. For example, if a spelling mistake has been mapped in one year and not in another, the more detailed information can be used.

Revision history

Date Version Notes
2021-06-14 1.1.3 Additional details for a second level of refinement of the derived data have been added to the technical specification
2020-03-18 1.1.2 Copyright information added

Contact Liaison by email or on +44 (0)1242 388 531.