XBUSLOCUC_1.1.3
Field description | Field abbreviation | Field version | Field length | Field type |
---|---|---|---|---|
Location of Self-employment (county/unitary authority level) | XBUSLOCUC | 1.1.3 | 4 | Char |
Valid entries
Dependent fields
- XMLOCUC
- XWRKLOCUC
Depend upon fields
- ZBUSPCODE
- ZBUSAREA
- ZBUSCOUNTRY
- BUSEMPPLOC
Additional information
This algorithm uses look-up tables built from data initially supplied by the Office for National Statistics (ONS). Postcodes (full, outward and area) are mapped to UK counties, regions, unitary authorities and local government districts for UK domiciled students where possible. Where a partial postcode maps to a unique county/unitary authority, that code is retained; otherwise the appropriate country code is set.
Country codes are returned for other UK and non-UK domiciled students.
Contains OS data, © Crown copyright and database right 2020
Contains Royal Mail data, © Royal Mail copyright and database right 2020
Source: Office for National Statistics licensed under the Open Government Licence v.3.0
Geographical mappings for Northern Ireland are based upon Crown Copyright and are reproduced with the permission of Land & Property Services under delegated authority from the Keeper of Public Records, © Crown copyright and database right 2020. NIMA MOU577.4
Technical Specification
The algorithm uses the HESA Data Management table D_postcode which is created from the ONS Postcode Directory (ONSPD) and restricted to the November YYYY update (postcode mapping is valid at this date), where YYYY represents the year following the academic year of collection. For graduate outcomes 2017/18, the November 2019 ONSPD data is used. The table includes the following fields: Full Postcode (PostCode, VARCHAR(8)), Outward Postcode (OutwardPostcode, VARCHAR(4)), Area Postcode (AreaCode, VARCHAR(2)), County/Region/Unitary Authority/Local Government District code (DomicileCode, VARCHAR(4)) and Government Office Region code (RegionCode, VARCHAR(2)).
The algorithm uses the HESA Data Management table D_country which is created from an overseas lookup and restricted to the November YYYY update (country mapping is valid at this date), where YYYY represents the year following the academic year of collection. For graduate outcomes 2017/18, the November 2019 ONSPD data is used. The table includes the following fields: Country Code (CountryCode, VARCHAR(2)) and Region Code (GeographicGroupCode, NVARCHAR(4)).
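As a rough illustration of how the D_PostCode fields described above could support the "maps to a single value" look-ups used by the partial-postcode steps that follow, the Python sketch below indexes a partial postcode to a code only when that code is unique. The sample rows, `PostcodeRow` and `unique_index` are hypothetical names and made-up data; they are not part of the HESA Data Management tables.

```python
from collections import defaultdict
from typing import NamedTuple


class PostcodeRow(NamedTuple):
    postcode: str        # PostCode, VARCHAR(8)
    outward: str         # OutwardPostcode, VARCHAR(4)
    area: str            # AreaCode, VARCHAR(2)
    domicile_code: str   # DomicileCode, VARCHAR(4)
    region_code: str     # RegionCode, VARCHAR(2)


# Made-up rows for illustration only; these are not real ONSPD data.
ROWS = [
    PostcodeRow("ZZ1 1AA", "ZZ1", "ZZ", "0001", "A"),
    PostcodeRow("ZZ1 1AB", "ZZ1", "ZZ", "0001", "A"),
    PostcodeRow("ZZ2 1AA", "ZZ2", "ZZ", "0002", "A"),
]


def unique_index(rows, key, value):
    """Map each key to its value only where the key maps to exactly one value."""
    seen = defaultdict(set)
    for row in rows:
        seen[getattr(row, key)].add(getattr(row, value))
    return {k: next(iter(vs)) for k, vs in seen.items() if len(vs) == 1}
```

With these rows, each outward code resolves to one domicile code, while the area `ZZ` spans two domicile codes and so only resolves at region level.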
Carry out the following steps in order from the top, stopping at the first step whose criteria are satisfied:
1. If ZBUSPCODE is not 99999999 and the value is found in D_PostCode.PostCode, then return D_PostCode.DomicileCode.
2. If ZBUSAREA is not NOTK and the value is found in D_PostCode.DomicileCode and D_PostCode.DomicileCode is not XF, XG, XH or XI, then return D_PostCode.DomicileCode.
3. If ZBUSPCODE is not 99999999 and the outward part of the postcode up to the first space (if there is one - maximum of 4 characters) is a value in D_PostCode.OutwardPostcode and maps to a single value D_PostCode.DomicileCode (excluding domicile codes XF, XG, XH and XI), then return D_PostCode.DomicileCode.
4. If ZBUSPCODE is not 99999999 and the postcode area (maximum of 2 characters) is a value in D_PostCode.PostcodeArea and maps to a single value D_PostCode.DomicileCode (excluding domicile codes XF, XG, XH and XI), then return D_PostCode.DomicileCode.
5. If ZBUSAREA equals one of the following values XF, XG, XH, XI then return ZBUSAREA.
6. If ZBUSAREA equals one of the following values XL, IM then return ZBUSAREA.
7. If ZBUSAREA equals one of the following values A, B, D, E, F, G, H, J, K then return XF.
8. If ZBUSPCODE is not 99999999 and the outward part of the postcode up to the first space (if there is one - maximum of 4 characters) is a value in D_PostCode.OutwardPostcode and maps to a single value in D_PostCode.RegionCode equal to A, B, D, E, F, G, H, J or K then return XF.
9. If ZBUSPCODE is not 99999999 and the outward part of the postcode up to the first space (if there is one - maximum of 4 characters) is a value in D_PostCode.OutwardPostcode and maps to a single value in D_PostCode.RegionCode equal to Z then return XL.
10. If ZBUSPCODE is not 99999999 and the outward part of the postcode up to the first space (if there is one - maximum of 4 characters) is a value in D_PostCode.OutwardPostcode and maps to a single value in D_PostCode.RegionCode not in A, B, D, E, F, G, H, J, K or Z then return D_PostCode.RegionCode.
11. If ZBUSPCODE is not 99999999 and the postcode area (maximum of 2 characters) is a value in D_PostCode.PostcodeArea and maps to a single value in D_PostCode.RegionCode equal to A, B, D, E, F, G, H, J or K then return XF.
12. If ZBUSPCODE is not 99999999 and the postcode area (maximum of 2 characters) is a value in D_PostCode.PostcodeArea and maps to a single value in D_PostCode.RegionCode equal to Z then return XL.
13. If ZBUSPCODE is not 99999999 and the postcode area (maximum of 2 characters) is a value in D_PostCode.PostcodeArea and maps to a single value in D_PostCode.RegionCode not in A, B, D, E, F, G, H, J, K or Z then return D_PostCode.RegionCode.
14. If ZBUSCOUNTRY is not NOTK and in D_Country.CountryCode, then return D_Country.CountryCode.
15. If ZBUSCOUNTRY is GREU, then return EU.
16. If ZBUSCOUNTRY is not NOTK and in D_Country.GeographicGroupCode, then return OS.
17. If BUSEMPPLOC is not NULL or empty string then match BUSEMPPLOC value to corresponding country code.
18. Else, set to NOTK.
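The eighteen steps above can be sketched as a single cascade in Python. This is a minimal sketch under stated assumptions, not the HESA implementation: the look-up tables are passed in as plain lists and sets, the postcode area is taken to be the leading one or two letters, the "excluding XF, XG, XH and XI" wording is read as rejecting a unique result that is one of those codes, and an unrecognised BUSEMPPLOC value falls through to NOTK. The function and helper names are hypothetical.

```python
import re

ENGLAND_REGIONS = set("ABDEFGHJK")   # region codes returned as XF
UK_COUNTRIES = {"XF", "XG", "XH", "XI"}
BUSEMPPLOC_CODES = {"01": "XF", "02": "XH", "03": "XI", "04": "XG", "05": "OS"}


def single(rows, key_col, key, value_col):
    """Return the value when `key` maps to exactly one `value_col` value, else None."""
    values = {row[value_col] for row in rows if row[key_col] == key}
    return values.pop() if len(values) == 1 else None


def region_to_code(region):
    """Steps 8-13: English regions become XF, Z becomes XL, others pass through."""
    if region in ENGLAND_REGIONS:
        return "XF"
    return "XL" if region == "Z" else region


def derive_xbuslocuc(zbuspcode, zbusarea, zbuscountry, busempploc,
                     postcode_rows, country_codes, geographic_groups):
    has_postcode = bool(zbuspcode) and zbuspcode != "99999999"
    outward = area = None
    if has_postcode:
        outward = zbuspcode.split(" ")[0][:4]
        # Assumption: the postcode area is the leading one or two letters.
        letters = re.match(r"[A-Za-z]{1,2}", zbuspcode)
        area = letters.group(0) if letters else None
        code = single(postcode_rows, "PostCode", zbuspcode, "DomicileCode")  # step 1
        if code:
            return code
    partials = (("OutwardPostcode", outward), ("PostcodeArea", area))
    domiciles = {row["DomicileCode"] for row in postcode_rows}
    if zbusarea != "NOTK" and zbusarea in domiciles and zbusarea not in UK_COUNTRIES:
        return zbusarea                                                      # step 2
    for key_col, key in partials:                                            # steps 3-4
        code = single(postcode_rows, key_col, key, "DomicileCode") if key else None
        if code and code not in UK_COUNTRIES:
            return code
    if zbusarea in UK_COUNTRIES or zbusarea in {"XL", "IM"}:                 # steps 5-6
        return zbusarea
    if zbusarea in ENGLAND_REGIONS:                                          # step 7
        return "XF"
    for key_col, key in partials:                                            # steps 8-13
        region = single(postcode_rows, key_col, key, "RegionCode") if key else None
        if region:
            return region_to_code(region)
    known_country = bool(zbuscountry) and zbuscountry != "NOTK"
    if known_country and zbuscountry in country_codes:                       # step 14
        return zbuscountry
    if zbuscountry == "GREU":                                                # step 15
        return "EU"
    if known_country and zbuscountry in geographic_groups:                   # step 16
        return "OS"
    if busempploc:                                                           # step 17
        # Assumption: unmapped BUSEMPPLOC values fall through to NOTK.
        return BUSEMPPLOC_CODES.get(busempploc, "NOTK")
    return "NOTK"                                                            # step 18
```

Ordering the checks as early returns mirrors the "stop at the first match" rule, so a later, coarser step can never override an earlier, more specific one.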
| ZBUSPCODE (CHAR 8) | ZBUSAREA (CHAR 4) | ZBUSCOUNTRY (CHAR 2) | BUSEMPPLOC (CHAR 2) | XBUSLOCUC |
|---|---|---|---|---|
| If ZBUSPCODE is a value in D_PostCode.PostCode | | | | D_PostCode.DomicileCode |
| | If ZBUSAREA is a value found in D_PostCode.DomicileCode and D_PostCode.DomicileCode is not XF, XG, XH or XI | | | D_PostCode.DomicileCode |
| If the outward part of ZBUSPCODE is a value in D_PostCode.OutwardPostcode and maps to a single value in D_PostCode.DomicileCode (excluding domicile codes XF, XG, XH and XI) | | | | D_PostCode.DomicileCode |
| If the postcode area of ZBUSPCODE is in D_PostCode.PostcodeArea and maps to a single value in D_PostCode.DomicileCode (excluding domicile codes XF, XG, XH and XI) | | | | D_PostCode.DomicileCode |
| | If ZBUSAREA equals one of the following values XF, XG, XH, XI | | | ZBUSAREA |
| | If ZBUSAREA equals one of the following values XL, IM | | | ZBUSAREA |
| | If ZBUSAREA equals one of the following values A, B, D, E, F, G, H, J, K | | | XF |
| If the outward part of ZBUSPCODE is a value in D_PostCode.OutwardPostcode and maps to a single value in D_PostCode.RegionCode A, B, D, E, F, G, H, J, K | | | | XF |
| If the outward part of ZBUSPCODE is a value in D_PostCode.OutwardPostcode and maps to a single value in D_PostCode.RegionCode Z | | | | XL |
| If the outward part of ZBUSPCODE is a value in D_PostCode.OutwardPostcode and maps to a single value in D_PostCode.RegionCode not in A, B, D, E, F, G, H, J, K or Z | | | | D_PostCode.RegionCode |
| If the postcode area of ZBUSPCODE is in D_PostCode.PostcodeArea and maps to a single value in D_PostCode.RegionCode A, B, D, E, F, G, H, J, K | | | | XF |
| If the postcode area of ZBUSPCODE is in D_PostCode.PostcodeArea and maps to a single value in D_PostCode.RegionCode Z | | | | XL |
| If the postcode area of ZBUSPCODE is in D_PostCode.PostcodeArea and maps to a single value in D_PostCode.RegionCode not in A, B, D, E, F, G, H, J, K or Z | | | | D_PostCode.RegionCode |
| | | If ZBUSCOUNTRY found in D_Country.CountryCode | | D_Country.CountryCode |
| | | If ZBUSCOUNTRY = GREU | | EU |
| | | If ZBUSCOUNTRY found in D_Country.GeographicGroupCode | | OS |
| | | | 01 | XF |
| | | | 02 | XH |
| | | | 03 | XI |
| | | | 04 | XG |
| | | | 05 | OS |
| | | | NULL or empty string | NOTK |
Once the above process has been carried out, a second level of refinement is carried out where ZBUSPCODE is not a full valid postcode and any of the following hold:
- ZBUSAREA = NOTK
- ZBUSAREA is less than 4 characters
- ZBUSPCODE is a partial postcode and the postcode country contradicts the country provided in BUSEMPPLOC
A sequence of more complex steps, detailed below, is then performed:
Create a mapping file of common area names to the corresponding county / unitary authority code (where possible) or Government office region:
- County / unitary authority
- Lower super output area
- Medium super output area
- Electoral ward
- Local authorities
- Government office regions
- Parliamentary constituencies
- Built up areas
- Scotland common area names
- Northern Ireland common area names
Clean up the mapping file by removing area names which are common short words such as park, town, city, east, north etc. Clean up the free text supplied by the graduate in a similar manner, and remove special characters and additional spaces.
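The clean-up described above might look like the following Python sketch. It is illustrative only: the stop-word set contains just the examples named in the text, special characters are replaced by spaces rather than deleted, and the function names and the sample codes are hypothetical.

```python
import re

# Only the example words named in the text; a real list would be longer.
COMMON_WORDS = {"park", "town", "city", "east", "north"}


def clean_text(text):
    """Lower-case, replace special characters with spaces and collapse spaces."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def clean_mapping(mapping):
    """Drop area names that are just a common short word after cleaning."""
    cleaned = {}
    for name, code in mapping.items():
        key = clean_text(name)
        if key and key not in COMMON_WORDS:
            cleaned[key] = code
    return cleaned


def clean_free_text(text):
    """Clean graduate free text and drop common short words from it."""
    words = [w for w in clean_text(text).split() if w not in COMMON_WORDS]
    return " ".join(words)
```

Applying the same normalisation to both the mapping file and the free text keeps later comparisons simple string operations.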
- Full postcode returned in free text – Use postcode to map to four-character code in D_PostCode.DomicileCode. [No further processing required]
- Partial postcode returned in free text – Use postcode to map to four-character code in D_PostCode.DomicileCode provided BUSEMPPLOC is consistent with D_PostCode.CountryCode*. [No further processing required]
*BUSEMPPLOC is considered consistent with D_PostCode.CountryCode if any of the following are satisfied:
- D_PostCode.RegionCode is one of A, B, D, E, F, G, H, J, K and BUSEMPPLOC = 01
- D_PostCode.RegionCode = XH and BUSEMPPLOC = 02
- D_PostCode.RegionCode = XI and BUSEMPPLOC = 03
- D_PostCode.RegionCode = XG and BUSEMPPLOC = 04
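The consistency rule listed above can be expressed as a small predicate. This is a sketch only; `is_consistent` and its parameter names are hypothetical, with `country_answer` standing for the 01 to 04 country value compared in the list.

```python
ENGLAND_REGIONS = set("ABDEFGHJK")
EXPECTED_ANSWER = {"XH": "02", "XI": "03", "XG": "04"}


def is_consistent(region_code, country_answer):
    """True when the postcode's region agrees with the reported country value."""
    if region_code in ENGLAND_REGIONS:
        return country_answer == "01"
    return EXPECTED_ANSWER.get(region_code) == country_answer
```

Any region code not covered by the list (for example Z) is treated as inconsistent here, which is an assumption rather than something the text states.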
Otherwise undertake all following steps:
- Look for a common area name contained within free text
- Look for free text returned within common area name
- Look for same free text being returned by other graduates who have provided a postcode (if there are multiple options, look for the most common occurrence)
- Look for area names within free text with spelling mistakes (match on proportion of letters / beginning and end of area name)
- Look for Government office region names within the free text. (e.g. Yorkshire) and remove any mappings which result in a Government office region other than the one found
- Look for "big area names" (based on the standard list of county / unitary codes) within free text. e.g. Where "Leeds" has been written, this is more likely "Leeds" in the north, rather than the south in the absence of any other information being provided
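The first, second and fourth look-ups in the list above could be sketched in Python as follows. Note the substitution: the text describes spelling matches in terms of the proportion of letters and the beginning and end of the area name, whereas this sketch uses the standard library's `difflib.get_close_matches` similarity ratio as a stand-in. The function name, the reason labels, the cutoff and the sample names are all hypothetical, and the free text is assumed to be already cleaned.

```python
import difflib


def find_candidates(free_text, area_names, cutoff=0.75):
    """Return {area_name: reason} for area names matched against the free text."""
    candidates = {}
    padded = f" {free_text} "
    for name in area_names:
        if f" {name} " in padded:
            candidates[name] = "name in text"      # area name inside free text
        elif padded in f" {name} ":
            candidates[name] = "text in name"      # free text inside area name
    for word in free_text.split():
        # Stand-in for the spelling-mistake match described in the text.
        for name in difflib.get_close_matches(word, list(area_names), n=3, cutoff=cutoff):
            candidates.setdefault(name, "spelling")
    return candidates
```

Padding both strings with spaces restricts matches to whole words, so a short name is not found inside an unrelated longer word.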
After carrying out these steps, the best match is identified using a scoring system. If there is only one option, use that, otherwise prioritise according to the following order:
- Country match
- Government office region match from partial postcode
- Matches area name exactly with free text
- Longest overlap of area name in free text, e.g. if the text contains "Newcastle Upon Tyne", map to "Newcastle Upon Tyne" and disregard "Newcastle"
- Match on spelling mistake
- Match on domicile Government office region, e.g. if the student only writes "Newcastle" and was previously living in the North East, map to "Newcastle Upon Tyne"
- Match on provider Government office region, e.g. if the student only writes "Newcastle" and was previously studying in the North East, map to "Newcastle Upon Tyne"
- Match on another graduate writing the same text
- Match on big area name e.g., there are two areas called "Leeds", map to north in absence of any other information
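One way to read the priority order above is as a lexicographic comparison: candidates are compared on the first criterion, and later criteria only break ties. The Python sketch below takes that reading, which is an assumption, since the text does not define the scoring system in detail. The criterion keys and `best_match` are hypothetical names.

```python
# Criteria in the order listed above; earlier entries dominate later ones.
PRIORITY = [
    "country_match",
    "region_match_from_postcode",
    "exact_name_match",
    "overlap_length",
    "spelling_match",
    "domicile_region_match",
    "provider_region_match",
    "other_graduate_match",
    "big_area_name",
]


def best_match(candidates):
    """Pick the candidate that wins on the earliest criterion where they differ."""
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    return max(candidates, key=lambda c: tuple(c.get(k, 0) for k in PRIORITY))
```

For example, two candidates that both match on country are separated by the length of the overlap, so "Newcastle Upon Tyne" (19 characters) would be preferred over "Newcastle" (9 characters).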
Carry out some consistency checks and update as appropriate:
Where a graduate has completed both, compare location of employment with location of self-employment. If the free text is the same for both and there is a postcode provided for one, map to the code obtained from the postcode.
Compare across academic years to ensure consistency. For example, if a spelling mistake has been mapped in one year and not in another, the more detailed information can be used.
Revision history