Validation overview

Version 1.0 Produced 2021-03-23

HESA has developed extensive quality assurance procedures and runs a range of automated validation checks (quality rules) against all submissions. This document describes the different stages of quality rules and at what point during the submission process they are applied.

XML files must be encoded in UTF-8 if they contain characters beyond the standard ASCII character set. Providers are advised to declare the encoding used in their XML files (i.e. <?xml version="1.0" encoding="UTF-8" ?>) and to ensure that their files are actually saved with that encoding. Files with an explicit encoding declaration other than UTF-8 will be rejected, and files with no declared encoding will be assumed to be UTF-8. If the encoding is not declared, or does not match the actual file encoding, there is a risk that data contained in the files may be changed on submission to HESA.
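The encoding requirements above can be checked locally before submission. The following is a minimal sketch, not HESA's implementation: the function name check_encoding and its return strings are assumptions made for illustration. It reads the encoding from the XML declaration, if there is one, and then confirms that the bytes really decode as UTF-8.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration, if present.
_DECL = re.compile(rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']')


def check_encoding(data: bytes) -> str:
    """Return 'ok...' or a short description of the encoding problem.

    Hypothetical helper for illustration; not part of any HESA tool.
    """
    # Skip a UTF-8 byte order mark if the file starts with one.
    body = data[3:] if data.startswith(b"\xef\xbb\xbf") else data
    match = _DECL.match(body)
    # An explicit declaration other than UTF-8 would be rejected.
    if match and match.group(1).decode("ascii").lower() != "utf-8":
        return "declared encoding is not UTF-8"
    # The declaration (or the UTF-8 default) must match the actual bytes.
    try:
        body.decode("utf-8")
    except UnicodeDecodeError:
        return "bytes are not valid UTF-8"
    return "ok" if match else "ok (encoding undeclared, assumed UTF-8)"
```

Running the check on the raw bytes of the file (opened in binary mode) matters here, because opening the file in text mode would already apply an encoding and could hide a mismatch.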

Business stage validation

Business stage validation (previously INSERT-stage validation) consists of checks on the structure and logic of an individual submission. Two types of validation check are carried out at business stage:

Schema checks

Checks that the XML is 'well formed' and that it conforms to the rules of the schema definition (the XSD files):


  • Every element in the file must have the correct opening and closing tags
  • Elements must be correctly nested
  • Elements must contain the correct type of data (strings, dates, valid entries etc.)
  • Characters with special meaning in XML (such as < less than, > greater than, & ampersand, ' apostrophe and " quotation mark) must be replaced with the appropriate entity reference. Further information is available at:
  • Elements must be submitted in the sequence defined in the XSD
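Two of the checks listed above, well-formedness and entity references, can be illustrated with the Python standard library. This is a sketch only: the element name Name and the helper names are hypothetical, and conformance to the XSD files (data types and element sequence) is not covered here because it needs a schema-aware validator.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text: str) -> bool:
    """True if the text parses as XML (tags closed and correctly nested)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def escape_value(value: str) -> str:
    """Replace the five special characters with their entity references."""
    # escape() handles &, < and > by default; the two quote characters
    # are added explicitly through the extra mapping.
    return escape(value, {"'": "&apos;", '"': "&quot;"})


# 'Name' is a hypothetical element used only for this example.
raw = "Smith & Sons <Ltd>"
bad = f"<Name>{raw}</Name>"                  # unescaped & and < break the XML
good = f"<Name>{escape_value(raw)}</Name>"   # entity references keep it well formed
```

Here is_well_formed(bad) is False and is_well_formed(good) is True, since the escaped value no longer contains a bare ampersand or angle bracket.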

Business stage quality rules

A set of rules to check the business logic of the submission.


  • Consistency between pairs or groups of elements
  • Elements that are compulsory under certain circumstances
  • Range checks

Note that business stage quality rules can only be carried out when all structural errors (schema checks) have been resolved.
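The three kinds of business stage rule described above can be sketched as follows. Every field name and threshold in this example is an assumption made for illustration; none of them is taken from the collection's actual quality rules.

```python
from datetime import date
from typing import List, Optional, TypedDict


class Record(TypedDict):
    # Hypothetical fields for illustration only; not HESA field names.
    start_date: date
    end_date: Optional[date]
    end_reason: Optional[str]
    credit_points: int


def business_errors(rec: Record) -> List[str]:
    """Return the messages for every illustrative rule the record breaks."""
    errors = []
    # Consistency between a pair of elements.
    if rec["end_date"] is not None and rec["end_date"] < rec["start_date"]:
        errors.append("end_date is earlier than start_date")
    # Element that is compulsory only under certain circumstances.
    if rec["end_date"] is not None and not rec["end_reason"]:
        errors.append("end_reason is required when end_date is present")
    # Range check (the 0 to 999 bounds are an assumed example).
    if not 0 <= rec["credit_points"] <= 999:
        errors.append("credit_points is outside 0 to 999")
    return errors
```

The function assumes the record has already passed the schema checks, which mirrors the ordering described above: the values are typed dates and integers by the time the business logic looks at them.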

Validation kit

HESA provides a downloadable validation kit to assist providers in preparing their data. The kit is a piece of software that uses a number of collection-specific validation 'packages' to provide some basic structural and 'sense' checks prior to data submission, with the aim of reducing the number of errors encountered when submitting data to the data collection system. The validation package for a collection performs both types of business stage validation.

Business stage quality rules can be switched off when running data through the validation kit, both locally and at HESA. Schema checks can never be switched off.

Continuity validation

In addition to business stage validation, HESA also applies continuity rules on submission to the data collection system.

UHN is the linking mechanism used by HESA to track continuing student instances of study between HESA reporting years. Continuity validation details potential discrepancies with student instance linking within the inserted data file. For more information on the UHN link and continuity reporting please see Understanding student instance continuity and the UHN link.

Continuity validation includes a check to ensure that for students continuing on an instance, there is a record in a previous year's return that can be linked to the record in the current year's return on the basis of a matching UHN link. Therefore where entry profile data is submitted at the start of the instance and not in subsequent years, these continuity checks ensure that such data can be found and linked to the incoming instance.

Continuity validation also checks that records on the Expected Instance Population list (previously referred to as the HIN target list), generated from the previous year's return, are included in the current year's return. Where entry profile data is submitted at the start of the instance as well as in subsequent years, continuity validation ensures that such data does not change and is as on entry to the course (unless corrections are being made or unknown values are being populated).
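The two linking checks described in this section can be sketched as set comparisons. This is an illustration under stated assumptions, not HESA's continuity rules: the UHN is treated as an opaque string, and the function name, parameters and message wording are all hypothetical.

```python
from typing import Dict, List, Set


def continuity_issues(
    previous_uhns: Set[str],
    expected_uhns: Set[str],
    current: Dict[str, bool],
) -> List[str]:
    """List potential linking discrepancies.

    previous_uhns: UHNs found in the previous year's return.
    expected_uhns: UHNs on the Expected Instance Population list.
    current: maps each UHN in this year's return to whether the
             instance is continuing (hypothetical data shape).
    """
    issues = []
    # A continuing instance needs a matching record in the previous return.
    for uhn, continuing in sorted(current.items()):
        if continuing and uhn not in previous_uhns:
            issues.append(f"{uhn}: continuing instance with no previous-year record")
    # Every expected record should appear in the current return.
    for uhn in sorted(expected_uhns - set(current)):
        issues.append(f"{uhn}: expected instance missing from current return")
    return issues
```

For example, with previous-year UHNs A, B and C, an expected list of A and B, and a current return containing A and D as continuing instances, the sketch reports D as unlinked and B as missing.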

Exception stage validation

A further stage of validation runs when data is submitted to the data collection system. Exception stage validation (previously COMMIT-stage validation) includes quality rules that require comparisons with data across an entire return and/or against reference data held at HESA. For this reason, these checks cannot be run within the validation kit.

Validation kit software

A validation kit capable of schema checks is expected to be made available soon. A validation package incorporating the validation rules for the collection 'rolled-on' from 2020/21 will then follow. This package will be subsequently updated to use any new or amended rules for 2021/22, and re-released.

Contact Liaison by email or on +44 (0)1242 388 531.