
HESA Student record 2012/13


Data standards in the HESA Student record


Version 1.0 Produced 2012-09-27

See the Information Standards Board web site for further information.

The Information Standards Board (ISB) for education, skills and children's services (escs) is the overarching authority and governing body for the management and assurance of information and data standards across the escs system. It is both adopting existing data standards and developing new data standards for this domain and is subsuming the standards developed by the MIAP programme. The Aligned Data Definitions adopted by the ISB define a standard set of data definitions and policies.

This document describes the adoption of relevant data standards in the HESA Student Record.

UK Register of Learning Providers

Adoption of the UK Provider Reference Number (UKPRN) is a key element in the standardisation of data definitions between stakeholders' systems. It is envisaged that, over time, all stakeholders will adopt the UKPRN as the primary identifier for institutions. See the UK Register of Learning Providers (UKRLP) web site for further information.

Unique Learner Number (ULN)

The allocation of ULNs was piloted with a first major roll out in 2007 covering students who were on the National Pupil Database. Systems will be put in place to issue ULNs to mature students and students from overseas. Students who have been allocated a ULN should be aware of this fact and should have access to their number. However, it is likely to be a number of years before all students entering higher education are able to provide this information.

Fields affected by the ISB Aligned Data Definitions (ADD)

The following table shows those fields affected by the ADD. Fields marked as an ADD field are contained within the current set of ADD; those marked as an ADD standard are not specifically included in the current set of ADD, but are affected by ADD standards and/or data policy. Details of the specification can be found in the documentation for each field.

Field | ADD field / ADD standard | Notes
Institution.UKPRN | Y | Valid UK Provider Reference Number from the UKRLP. This replaces the INSTID field.
Student.ULN | Y | Standard structure with checksum.
Student.BIRTHDTE | Y | ISO 8601 date format. Representation of not known with reason code.
Student.SURNAME | Y | Standard character set. Maximum field length of 100 characters.
Student.SNAME16 | Y | Standard character set. Maximum field length of 100 characters.
Student.FNAMES | Y | Standard character set. Maximum field length of 100 characters. Representation of not applicable with a reason code.
Student.NATION | Y | Standard coding frame based on the ISO 3166-1 Alpha-2 list of country codes and adapted by the Office for National Statistics.
Student.TTPCODE | Y | BS 7666 format. Representation of not known with reason code.
Instance.COMDATE | Y | ISO 8601 date format.
Instance.ENDDATE | Y | ISO 8601 date format. Representation of not applicable with reason code.
Instance.MCDATE | Y | ISO 8601 date format. Representation of not applicable with reason code.
Instance.PHDSUB | Y | ISO 8601 date format. Representation of not applicable with reason code.
Instance.SPLENGTH | Y | Representation of not applicable with reason code.
Instance.YEARLGTH | Y | Representation of not applicable with reason code.
EntryProfile.DOMICILE | Y | Standard coding frame based on the ISO 3166-1 Alpha-2 list of country codes and adapted by the Office for National Statistics.
EntryProfile.POSTCODE | Y | BS 7666 format. Representation of not known with reason code.
Course.CTITLE | Y | Standard character set.
Module.MTITLE | Y | Standard character set.
Instance.OWNINST | Y | Standard character set.
Instance.OWNSTU | Y | Standard character set.
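
As an illustration of two of the constraints listed in the table, the following minimal Python sketch checks a date value and a name value before submission. It is not part of the HESA specification. The function names are hypothetical, and it assumes that "ISO 8601 date format" means the extended calendar form YYYY-MM-DD and that the 100-character maximum is counted in Unicode characters; the documentation for each field remains the authority on both points.

```python
import re
from datetime import date

# Maximum length stated in the table for SURNAME, SNAME16 and FNAMES.
MAX_NAME_LENGTH = 100

# Assumption: dates use the ISO 8601 extended calendar form YYYY-MM-DD.
_DATE_PATTERN = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def is_iso8601_date(value: str) -> bool:
    """Return True if value is a real calendar date written as YYYY-MM-DD."""
    match = _DATE_PATTERN.fullmatch(value)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # rejects impossible dates such as 2012-02-30
    except ValueError:
        return False
    return True


def is_valid_name(value: str) -> bool:
    """Return True if value is non-empty and within the maximum field length."""
    return 0 < len(value) <= MAX_NAME_LENGTH
```

A value such as "2012-09-27" passes the date check, whereas "27/09/2012" and "2012-02-30" do not. Empty values are rejected here because the data policy represents missing data separately, with a reason code.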

Data policy

In addition to individual field definitions, the Aligned Data Definitions set out areas of data policy that apply to many fields or across the return specification as a whole. The following areas of data policy are adopted in the HESA Student Record. The full set of data policy statements can be found in the ADD documentation on the ISB web site.

  • Use of XML schemas that adhere to the W3C XML Schema Recommendation
  • Use of the Unicode character set
  • Use of UTF-8 for encoding Unicode characters
  • Representation of "no data" - Where there is not a specific code for "no data", an empty string should be used. This must be accompanied by a reason code to indicate the reason for missing data. This approach should be adopted in cases where field entries are not defined in a closed list.
  • Representing the Reason for Missing Data - Anywhere that a "null" value is used, it must be accompanied by a reason code:

    1: not provided (reason not specified)

    2: not sought (no request has been made for the information)

    3: refused (information requested but not provided)

    8: see supplementary field

    9: not applicable

    If the code is "8" additional information must be provided.

    This code must be encoded in XML schema as optional attributes, with a business rule to indicate when it must be included.

  • Representing Metadata - Additional information (such as the reason for lack of data) could be represented as either elements or attributes; in this record it shall be represented using attributes in XML Schema. Elements are the more flexible approach, but require another level of hierarchy. Attributes meet the requirements for this data, and their use in this context matches the implementation of other e-GIF systems.
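
The "no data" and reason-code policies above can be sketched in Python using the standard library's xml.etree.ElementTree module. This is an illustrative sketch only: the attribute name ReasonForNull and the helper names build_field and to_utf8_xml are assumptions made for the example, not taken from this document, and the actual names and business rules are defined by the record's XML schema. The rule that code "8" requires additional information is not modelled here.

```python
import xml.etree.ElementTree as ET

# Reason codes as listed in the data policy above.
REASON_CODES = {
    "1": "not provided (reason not specified)",
    "2": "not sought (no request has been made for the information)",
    "3": "refused (information requested but not provided)",
    "8": "see supplementary field",
    "9": "not applicable",
}

# Assumption: the attribute name is hypothetical; check the schema for the real one.
REASON_ATTRIBUTE = "ReasonForNull"


def build_field(name, value=None, reason=None):
    """Build an element holding either a value or an empty string plus a reason code."""
    element = ET.Element(name)
    if value:
        if reason is not None:
            raise ValueError("a reason code only accompanies an empty value")
        element.text = value
        return element
    if reason not in REASON_CODES:
        raise ValueError("an empty value needs a reason code from the list")
    element.text = ""  # "no data" is represented as an empty string
    element.set(REASON_ATTRIBUTE, reason)  # metadata carried as an attribute
    return element


def to_utf8_xml(element):
    """Serialise the element to bytes, encoding Unicode characters as UTF-8."""
    return ET.tostring(element, encoding="utf-8")
```

For example, build_field("BIRTHDTE", reason="1") yields an empty BIRTHDTE element carrying the reason code as an attribute, while build_field("BIRTHDTE") raises ValueError because an empty value has no accompanying reason.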

Contact Liaison by email or on +44 (0)1242 388 531.