Unistats record 2019/20

Unistats record 2019/20 - Validation overview

Version 1.0 Produced 2019-04-26

Validation overview

HESA has developed extensive quality assurance procedures and runs a range of automated validation checks (quality rules) against all submissions. This document describes the different stages of quality rules and at what point during the submission process they are applied.

XML files must be encoded with UTF-8 if they contain characters beyond the standard ASCII character set. Providers are advised to specify the encoding used in their XML files (i.e. <?xml version="1.0" encoding="UTF-8" ?>) and to ensure that their files are actually saved with that encoding. Files with an explicit encoding declaration other than UTF-8 will be rejected. Files with undeclared encoding will be assumed to be UTF-8. If encoding is not specified or does not match the actual file encoding, providers are warned that there is a risk that data contained in the files may be changed on submission to HESA.
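The encoding guidance above can be checked locally before submission. The following is a minimal sketch only, not HESA's implementation: the function name `check_encoding` and its four result labels are assumptions made for illustration. It reads the declared encoding from the XML declaration and then tests whether the bytes actually decode as UTF-8.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the start of the file.
_DECL_RE = re.compile(rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._\-]+)["\']')


def check_encoding(data: bytes) -> str:
    """Classify raw XML bytes as 'ok', 'undeclared', 'rejected' or 'mismatch'.

    The labels are illustrative assumptions, not HESA terminology.
    """
    # Strip an optional UTF-8 byte order mark before looking for the declaration.
    body = data[3:] if data.startswith(b"\xef\xbb\xbf") else data
    match = _DECL_RE.match(body)
    declared = match.group(1).decode("ascii").upper() if match else None

    if declared is not None and declared != "UTF-8":
        return "rejected"      # explicit declaration other than UTF-8
    try:
        body.decode("utf-8")   # are the bytes really saved as UTF-8?
    except UnicodeDecodeError:
        return "mismatch"      # declared or assumed UTF-8, but saved differently
    return "ok" if declared else "undeclared"
```

For example, reading a file with `open(path, "rb").read()` and passing the bytes to `check_encoding` would flag a file saved as Latin-1 but left undeclared as a `mismatch`, which is the case where data may be changed on submission.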

Business stage validation

Business stage validation (previously INSERT-stage validation) comprises checks on the structure and logic of an individual submission. Two types of validation check are carried out at business stage:

Schema checks

Checks that the XML is 'well formed' and that it conforms to the rules of the schema definition (the XSD files).

Examples:

  • Every element in the file must have the correct opening and closing tags
  • Elements must be correctly nested
  • Elements must contain the correct type of data (strings, dates, valid entries etc.)
  • Characters with special meaning in XML (such as < less than, > greater than, & ampersand, ' apostrophe and " quotation mark) that appear within element content must be replaced with the appropriate entity reference. Further information is available at: http://www.w3schools.com/xml/xml_syntax.asp
  • Elements must be submitted in the sequence defined in the XSD
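Some of the structural checks listed above can be illustrated with the Python standard library. This is a hedged sketch rather than the validation kit's own logic: it covers only well-formedness and entity escaping, not conformance to the XSD (which needs a schema-aware validator), and the element name `TITLE` in the usage note is hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML (matching, correctly nested tags)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def escape_value(value: str) -> str:
    """Replace XML special characters in a value with entity references."""
    # escape() handles &, < and > by default; the mapping adds both quote characters.
    return escape(value, {"'": "&apos;", '"': "&quot;"})
```

With these helpers, `is_well_formed("<TITLE>Art & Design</TITLE>")` fails because of the bare ampersand, while wrapping the value with `escape_value` first produces text that parses.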

Business stage quality rules

A set of rules to check the business logic of the submission.

Examples:

  • Consistency between pairs or groups of elements
  • Elements that are compulsory under certain circumstances
  • Range checks

Note that business stage quality rules can only be carried out when all structural errors (schema checks) have been resolved.
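The ordering described above, structural checks first and business rules only afterwards, can be sketched as follows. Everything specific here is an assumption for illustration: the element names `COURSE`, `HASFEE` and `FEE`, the range bounds and the two rules themselves are hypothetical and are not taken from the Unistats quality rules. Well-formedness stands in for the full schema check.

```python
import xml.etree.ElementTree as ET


def check_business_rules(root):
    """Apply two hypothetical business rules to every COURSE element."""
    messages = []
    for i, course in enumerate(root.iter("COURSE"), start=1):
        has_fee = course.findtext("HASFEE")
        fee = course.findtext("FEE")
        # Element that is compulsory under certain circumstances (hypothetical rule).
        if has_fee == "1" and fee is None:
            messages.append(f"course {i}: FEE is required when HASFEE is 1")
        # Range check (hypothetical bounds); non-numeric values also fail it.
        if fee is not None and not (fee.isdigit() and 0 <= int(fee) <= 99999):
            messages.append(f"course {i}: FEE out of range")
    return messages


def validate(xml_text):
    """Return (stage, messages); business rules run only once the structure is sound."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return "schema", [str(exc)]
    return "business", check_business_rules(root)
```

A structurally broken file never reaches `check_business_rules`, which mirrors the point that business stage quality rules cannot be evaluated until all schema errors are resolved.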

Validation kit

HESA provides a downloadable validation kit to assist providers in the preparation of their data. The validation kit provides some basic structural and 'sense' checks prior to data submission, with the aim of reducing the number of errors encountered when submitting data to the data collection system. The validation kit performs both types of business stage validation.

Business stage quality rules can be switched off when running data through the validation kit, both locally and at HESA. Schema checks can never be switched off.

Exception stage validation

A further stage of validation will run when data is submitted to the data collection system. Exception stage validation (previously COMMIT-stage validation) includes quality rules that require comparisons with data across an entire return and/or against reference data held at HESA. These checks cannot be run within the validation kit for this reason.

Unistats validation

Following the submission of a Unistats file that has successfully passed Business and Exception stage validation, further checks are applied to confirm the level of aggregation in the Unistats dataset.

The Unistats stage quality rules are only run for courses that have triggered the QR.Cyy061.KISCourse.HESACourse.4 rule. This rule highlights cases where multiple linked HESA courses have sets of JACS codes that fall into different CAH level 3 groupings from each other. This may happen where providers have revised the JACS codes for their courses over the years, as returned on the Student record, AP student record or ILR, or where genuinely different courses are being linked to.

HESA can approve switches to QR.Cyy061.KISCourse.HESACourse.4 provided that all of the aggregated data items from the Student record, DLHE and NSS can be published on Unistats at course level. The data must be publishable at course level because it cannot be published at subject level, owing to the inconsistencies between the JACS codes of the linked courses.

Once a validation switch for QR.Cyy061.KISCourse.HESACourse.4 has been applied and once the NSS data is available in July, the data will be exposed to a series of Unistats stage quality rules:

  • QR.Cyy061.KISCourse.KISCOURSEID.1
  • QR.Cyy061.KISCourse.KISCOURSEID.2
  • QR.Cyy061.KISCourse.KISCOURSEID.3

These three rules separately check the aggregation level for the Student, DLHE and NSS data and highlight any courses for which Unistats cannot publish all of the data items at course level. In the example below, the course has failed because the Student record data cannot be published at course level.

[Figure: tableunistats.png, example of a course failing a Unistats stage quality rule]

Where one or more of the Unistats stage quality rules are triggered, providers will need to amend the HESA or ILR course links for the affected courses. Within the Quality rule report, the switched errors section will include a switch for QR.Cyy061.KISCourse.HESACourse.6. This contains details of the KISCourses with links to multiple courses with different combinations of subject codes. Where these KISCourses also appear in the Unistats rules, providers will need to review the course links made to the KISCourse and remove links to the HESA/ILR courses with the incorrect subject codes. In order to pass validation, all of the linked HESA/ILR courses will need to have matching subject information. Once the relevant HESACOURSEIDs have been removed, HESA will then need to remove the original switch that was applied to these courses for the data to be valid.

Validation kit software

A validation kit is available for download as an MSI installation file. To install the software you simply need to run the MSI file on a Windows PC. If you do not have access to a Windows PC, please contact Institutional Liaison ([email protected]) for further instructions.

When the validation program is opened it automatically checks the HESA web server for the latest set of rules and updates the package if appropriate.

A quick start guide is included in the application.

The kit uses the Microsoft .NET framework. This is likely to be installed already on many computers, as it is required by much of Microsoft's own software. You can download the .NET framework from Microsoft's website.

If you have any queries with regard to the validation kit then please contact Institutional Liaison ([email protected]).


Need help?

Contact Liaison by email or on +44 (0)1242 388 531.