
XML stands for Extensible Markup Language and can be used to structure, store, and transport data.
XML is a widely adopted international standard for data transfer that allows hierarchical data structures to be transferred in a single file.
The majority of HESA returns are made using XML.
The specification of each record begins with the data model. A data model (sometimes known as an Entity Relationship Model) defines the entities covered by the specification and the relationship between the entities.
Entities have attributes; for example, gender is an attribute of a student. A field is therefore an attribute of an entity. HESA documentation uses the structure Entity.SHORTFIELDNAME (e.g. Student.GENDER) when referring to specific fields.
An XML file contains elements. An element is identified using tags, which are written in angle brackets ('pointy brackets'), and each element must have a start tag and an end tag. Each tag contains the element name, and the end tag also contains a / character before the name. Elements can contain data; for example, a birth date element might look like this:
<BIRTHDTE>1975-06-18</BIRTHDTE>
Elements can be nested within other elements to represent hierarchical data structures. Therefore, in the HESA Student record an element can be an entity or an attribute (i.e. a field). For example, this Student entity has attributes of name and date of birth:
<Student>
  <FNAMES>Joeseph William</FNAMES>
  <SURNAME>Bloggs</SURNAME>
  <BIRTHDTE>1975-06-18</BIRTHDTE>
</Student>
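To illustrate how the nesting maps onto entities and fields, the following is a minimal sketch that reads the Student example with Python's standard xml.etree.ElementTree module. The helper read_field is a hypothetical name introduced here for illustration; it is not part of any HESA tooling. It simply resolves a reference written in the Entity.SHORTFIELDNAME style.

```python
import xml.etree.ElementTree as ET

# The Student entity from the example above, held as a string.
STUDENT_XML = """
<Student>
  <FNAMES>Joeseph William</FNAMES>
  <SURNAME>Bloggs</SURNAME>
  <BIRTHDTE>1975-06-18</BIRTHDTE>
</Student>
"""


def read_field(entity, reference):
    """Return the text of a field named in Entity.SHORTFIELDNAME form.

    Hypothetical helper for illustration only.
    """
    entity_name, field_name = reference.split(".")
    if entity.tag != entity_name:
        raise ValueError(f"expected a {entity_name} element, got {entity.tag}")
    # findtext returns None when the child element is absent.
    return entity.findtext(field_name)


# The parent element is the entity; its child elements are the fields.
student = ET.fromstring(STUDENT_XML)
fields = {child.tag: child.text for child in student}

print(student.tag)
print(fields)
print(read_field(student, "Student.BIRTHDTE"))
```

Here the parsed root element plays the role of the entity, and iterating over it yields the fields in the order they appear in the file.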
There are many XML training resources on the web, including the W3C XML Schema primer, which can be found at www.w3.org/TR/xmlschema-0.
More information about XML can be found in the Technical Formats document at http://www.hesa.ac.uk/submit-tech.
| Terminology | Definition |
| --- | --- |
| Entity | A single entity groups together a set of fields which have the same relationship. |
| Field | A field is an attribute (data item) of an entity. |
| Reason for null | Used to describe a field requiring an explanation for a null value. This must be accompanied by a reason code, for example, 2: not sought, 3: refused or 9: not applicable. |
| Parent element | XML is made up of elements. Entities are known as 'parent elements'. |
| Child element | XML is made up of elements. Fields are known as 'child elements' as they belong to a 'parent element' (i.e. an entity). |
| Nested | Elements (fields and entities) can belong to one another and thus are 'nested'. |
| Schema | The schema describes the structure of the XML document (number of elements, whether an element can be empty, default/fixed values, etc.) and valid entries. The schema is defined by the XSD. |
What is a schema error?
As with any other data format, the file must follow the correct structure. Schema errors are triggered where data fields are not in the order defined by the schema. Validation cannot run fully until these schema errors are corrected. The XSD files in the record coding manual show the position of the entities and fields within each data stream.
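As a rough illustration of an ordering problem, the sketch below checks the child elements of an entity against an expected sequence. It is not a substitute for validating against the XSD. The EXPECTED_ORDER list and the find_order_errors function are assumptions made for this example, and the real field order must be taken from the XSD for the relevant record.

```python
import xml.etree.ElementTree as ET

# Hypothetical field order for a Student entity; the real order comes from the XSD.
EXPECTED_ORDER = ["FNAMES", "SURNAME", "BIRTHDTE"]


def find_order_errors(entity, expected_order):
    """List child elements that are unknown or appear after a later field."""
    position = {tag: index for index, tag in enumerate(expected_order)}
    errors = []
    highest_seen = -1
    for child in entity:
        index = position.get(child.tag)
        if index is None:
            errors.append(f"{child.tag}: not defined for {entity.tag}")
        elif index < highest_seen:
            errors.append(f"{child.tag}: out of order")
        else:
            highest_seen = index
    return errors


in_order = ET.fromstring(
    "<Student><FNAMES>Joeseph William</FNAMES><SURNAME>Bloggs</SURNAME>"
    "<BIRTHDTE>1975-06-18</BIRTHDTE></Student>"
)
# Same fields, but SURNAME has been placed before FNAMES.
out_of_order = ET.fromstring(
    "<Student><SURNAME>Bloggs</SURNAME><FNAMES>Joeseph William</FNAMES>"
    "<BIRTHDTE>1975-06-18</BIRTHDTE></Student>"
)

print(find_order_errors(in_order, EXPECTED_ORDER))
print(find_order_errors(out_of_order, EXPECTED_ORDER))
```

A full schema validator reports more than ordering (data types, valid entries, and so on); this sketch only shows why a correctly populated field can still fail when it sits in the wrong position.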
How do I complete a field that doesn't apply?
The flexibility of the XML model means that fields can have minimum and maximum occurrences, so that where a field does not apply or is not required it does not need to be completed. In other formats, fields would need to be completed with the default code in order to pass validation. The minimum and maximum occurrences are defined within the record description for each field. These should be viewed in conjunction with the coverage of the field, as the field may be required for certain groups.
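The idea of minimum and maximum occurrences can be sketched as a simple count check. The limits in OCCURRENCES below are invented for illustration (they are not taken from any HESA record description), and find_occurrence_errors is a hypothetical helper. A minimum of 0 means the field may be left out of the file altogether.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical (minimum, maximum) occurrences per field; None would mean unbounded.
# Real values are defined in the record description and XSD for each field.
OCCURRENCES = {
    "FNAMES": (0, 1),
    "SURNAME": (1, 1),
    "BIRTHDTE": (1, 1),
}


def find_occurrence_errors(entity, occurrences):
    """Compare how often each field occurs with its allowed range."""
    counts = Counter(child.tag for child in entity)
    errors = []
    for tag, (minimum, maximum) in occurrences.items():
        count = counts[tag]
        if count < minimum:
            errors.append(f"{tag}: found {count}, need at least {minimum}")
        elif maximum is not None and count > maximum:
            errors.append(f"{tag}: found {count}, allowed at most {maximum}")
    return errors


# FNAMES is omitted entirely, which is acceptable because its minimum is 0.
optional_omitted = ET.fromstring(
    "<Student><SURNAME>Bloggs</SURNAME><BIRTHDTE>1975-06-18</BIRTHDTE></Student>"
)
# SURNAME and BIRTHDTE are missing although their minimum is 1.
required_missing = ET.fromstring(
    "<Student><FNAMES>Joeseph William</FNAMES></Student>"
)

print(find_occurrence_errors(optional_omitted, OCCURRENCES))
print(find_occurrence_errors(required_missing, OCCURRENCES))
```

Note that this check ignores coverage: a field with a minimum of 0 may still be required for particular groups of students.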
Can multiple files be submitted?
Yes, institutions can submit multiple XML files to the system; however, they must ensure that the data is complete and discrete. A common issue arises where, for example, a student is studying two programmes and each programme is contained in a separate file. It is therefore important to ensure that any information common to both files is consistent.
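One way to check consistency before submission is to compare the shared fields for each student across files. The sketch below is illustrative only: the Records wrapper element and the STUDENTID and PROGRAMME field names are placeholders invented for this example rather than names from a HESA schema, and the helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET

# Two hypothetical files holding the same student on different programmes.
# <Records>, <STUDENTID> and <PROGRAMME> are placeholder names, not HESA fields.
FILE_A = (
    "<Records><Student><STUDENTID>S001</STUDENTID><SURNAME>Bloggs</SURNAME>"
    "<BIRTHDTE>1975-06-18</BIRTHDTE><PROGRAMME>P1</PROGRAMME></Student></Records>"
)
FILE_B = (
    "<Records><Student><STUDENTID>S001</STUDENTID><SURNAME>Bloggs</SURNAME>"
    "<BIRTHDTE>1975-06-19</BIRTHDTE><PROGRAMME>P2</PROGRAMME></Student></Records>"
)

# Fields that should hold the same value wherever the student appears.
COMMON_FIELDS = ["SURNAME", "BIRTHDTE"]


def common_values(xml_text, key_field, common_fields):
    """Map each student's key to the values of the shared fields."""
    root = ET.fromstring(xml_text)
    values = {}
    for student in root.iter("Student"):
        key = student.findtext(key_field)
        values[key] = {name: student.findtext(name) for name in common_fields}
    return values


def find_inconsistencies(first, second):
    """Return (key, field, first value, second value) for every mismatch."""
    problems = []
    for key in sorted(first.keys() & second.keys()):
        for field, value in first[key].items():
            other = second[key][field]
            if value != other:
                problems.append((key, field, value, other))
    return problems


problems = find_inconsistencies(
    common_values(FILE_A, "STUDENTID", COMMON_FIELDS),
    common_values(FILE_B, "STUDENTID", COMMON_FIELDS),
)
print(problems)
```

In this example the two files disagree on the student's date of birth, which is the kind of mismatch that should be resolved before the files are submitted.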
The specification of an XML structure is contained in an XML Schema Definition (XSD) file. This defines the elements, their optionality and their structure.
HESA have also produced schema trees to assist institutions in interpreting the XSD and to illustrate the order of the fields. Please note that the schema trees should be used in conjunction with the XSD and not instead of it.
To assist institutions further, HESA have produced a number of sample files to show how the XML file may look for the different records. The files are designed only to show the structure of the data and to pass schema validation.