Unistats dataset

The Discover Uni website provides comparable sets of information about full- and part-time undergraduate courses. It is run by the Office for Students and is designed to meet the information needs of prospective students.

Please note: Errors were present in the employment table of the 2020/21 to 2022/23 Unistats datasets. The percentage values displayed in the columns ‘study’ and ‘both’ were switched for some HE courses. Corrected data for the current Unistats dataset was issued with the weekly update at 09:30 on Wednesday 8 February 2023. Because correct current data is available, older versions of the data for the two years 2020/21 and 2021/22 will not be corrected and remain affected. For access to time series data on Graduate Outcomes, please visit our HE Graduate Outcomes Data pages.

Unistats dataset

As an additional technical resource for analysts and developers, we have made the raw dataset that underlies the Discover Uni website available for download. This incorporates information from the Unistats record. The download is presented as a *.zip file containing the data in both XML format and multiple *.csv files.

Supporting files, documents and information about updates to the data are provided below.

Longitudinal Education Outcomes (LEO) data

Longitudinal Education Outcomes (LEO) data is available in the LEO3.csv and LEO5.csv files, and between the GOSalary and Tariff entities in the XML Unistats dataset. It is also available in the LEO3SEC.csv and LEO5SEC.csv files, and within the SectorSal entity in the XML Unistats dataset.

The LEO data in the Unistats dataset is derived from data extracts owned by the Department for Education (DfE). The DfE do not accept responsibility for any inferences or conclusions derived from the LEO data by third parties.

Further information can be found on the OfS website, where you can also provide feedback on this data.

Onward use of Unistats data

We are keen to ensure that any end users of Unistats data are given sufficient information to allow correct use of the data which has been made available to them.

Several of the courses featured in the Unistats dataset have very small numbers of students. As a result, a change of only one student can make a substantial difference to reported figures. We advise caution when reporting on courses with small numbers of students, particularly when making comparisons between courses.
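To illustrate the scale of this effect, the following minimal sketch computes how far a reported percentage moves when one student changes category. The function name and the cohort sizes are illustrative assumptions and are not taken from the Unistats dataset.

```python
def one_student_shift(cohort_size):
    """Return the change, in percentage points, caused by one student
    moving into or out of a category in a cohort of the given size."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return 100 / cohort_size


# Illustrative cohort sizes only; these are not real Unistats figures.
for size in (10, 50, 200):
    print(f"cohort of {size}: one student = {one_student_shift(size):.1f} percentage points")
```

For a cohort of 10 students, a single student accounts for 10 percentage points, whereas for a cohort of 200 the same change accounts for half a percentage point.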

The Unistats dataset contains many datapoints about courses offered by UK providers. In some cases the course is too small, or the response rates to surveys are too low, for the data to be published about the course specifically. In such cases, data describing the course is aggregated either across multiple years, or with other similar courses at the same provider.

Some courses in the Unistats dataset cover more than one subject area – this may be because it is a joint honours course, or the course covers a diverse subject. When the data for these courses is published at subject level, the course will appear several times in each Unistats table: once for each subject.
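Analysts counting courses from a subject-level table may therefore want to group rows by course first. The sketch below shows one way to do this. The field names course_id and subject, and the sample values, are hypothetical placeholders; the actual column names should be taken from the dataset overview and XSD schema.

```python
from collections import defaultdict


def subjects_per_course(rows, course_key="course_id", subject_key="subject"):
    """Map each course identifier to the sorted list of subjects it appears under.

    `rows` is an iterable of dicts, for example rows read with csv.DictReader.
    The default key names are placeholders, not real Unistats column names.
    """
    subjects = defaultdict(set)
    for row in rows:
        subjects[row[course_key]].add(row[subject_key])
    return {course: sorted(found) for course, found in subjects.items()}


# Made-up sample rows: course C1 is published under two subjects.
rows = [
    {"course_id": "C1", "subject": "SUBJ_A"},
    {"course_id": "C1", "subject": "SUBJ_B"},
    {"course_id": "C2", "subject": "SUBJ_A"},
]
by_course = subjects_per_course(rows)
print(len(rows), "rows,", len(by_course), "distinct courses")
```

Counting the keys of the result, rather than the rows, avoids counting a joint honours course once per subject.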

As a result of the above, we advise anyone wishing to make onward use of the Unistats data to contact us to understand how these points are reflected in the dataset, and how to highlight them to end users. We aim to ensure that there is minimal risk of onward misinterpretation of the Unistats data.

Terms and conditions

The Unistats dataset is free to copy, use, share, and adapt for any purpose. It is published under the Creative Commons Attribution 4.0 International (CC BY 4.0) licence. You must give appropriate credit (HESA), provide a link to the licence, and indicate if any changes have been made.

Download the Unistats dataset

When you click the button above, the download will begin. The file is delivered as a compressed archive (*.zip) containing a single XML file, a readme.txt file, and a number of *.csv files.
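Once downloaded, the contents of the archive can be checked before extraction. The following is a minimal sketch using Python's standard zipfile module; the function name and the example file name are assumptions, since the real *.zip file name includes the date and time of the update.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def summarise_archive(archive):
    """Group the member names of a zip archive by file extension.

    `archive` may be a file path or a binary file-like object.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            suffix = PurePosixPath(name).suffix.lower() or "(none)"
            groups[suffix].append(name)
    return {ext: sorted(names) for ext, names in groups.items()}
```

For example, summarise_archive("unistats_download.zip") (a hypothetical file name) would be expected to return one entry under ".xml", one under ".txt" for the readme, and several under ".csv".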

Supporting files and documents

Unistats record 2022/23: Coding manual for the 2022/23 data collection.

XSD schema file: Unistats output schema - Provides detail of the structure of the data file.

Overview of the dataset: Unistats dataset file structure and description.

Subject codes:

Lookup table for CAH 1.3.4 (applicable to latest data)

Lookup table for CAH 1.3.3 (applicable to data from 25 February 2020 to 14 April 2021)

Lookup table for CAH 1.2 (applicable to data from 18/19 to 25 February 2020)

Lookup table for JACS 3.0 (applicable to data from 12/13 to 17/18).

Lookup table for JACS 2.0 (applicable to data from 09/10 to 11/12).

KISAIM codes: List of KISCourse.KISAIM valid entries.

Data updates

Updates to the Unistats dataset will be made as required, and in parallel with the Discover Uni website, when contributing higher and further education providers wish to update their information, or when HESA or the Office for Students (OfS) make changes to the underlying data. These updates occur weekly on Wednesday mornings. The file name of the *.zip file includes the date and time.

Please see the OfS statement on corrections and revisions to the data presented on the Discover Uni website for a full description of how Unistats data changes throughout the year and how these changes are recorded.

If you have any questions about the dataset, please email HESA’s Official Statistics team, or call +44 (0)1242 388 513.

Older Unistats data

If you are using an older version of the Unistats dataset please use the supporting files for the relevant collection year. These are available via the 'collection' links below.

Please note that Unistats data may not be comparable across years.

Methodologies and source data used to compile the Unistats dataset have been updated and improved over time. Individual courses are also subject to changes in student numbers and survey responses. This means data for a course may be aggregated in some years but not others.

Download Unistats dataset 2019/20

Unistats Collection 2019/20 for dataset downloaded 2019-09-11 to 2020-10-13 [Please note, due to updates to the Common Aggregation Hierarchy, there are two versions of the Unistats output schema - one for use prior to 26 February 2020, and one for use from 26 February onwards]

Download Unistats dataset 2017/18

Unistats Collection 2017/18 for dataset downloaded 2017-09-04 to 2018-08-29 [Please note, due to the inclusion of LEO data, there are two versions of the Unistats output schema - one for use prior to 5 July 2018, and one for use from 5 July to 29 August 2018]

Download Unistats dataset 2016/17

KIS Collection 2016/17 for dataset downloaded 2016-09-01 to 2017-09-03 [Please note, due to changes in the TEF, there are two versions of the Unistats output schema - one for use prior to 22 June 2017, and one for use from 22 June to 3 September 2017].