Graduate Outcomes and other data on graduates

While Graduate Outcomes is the only national survey designed specifically to provide insight into the experiences of higher education graduates, the domains of several other datasets overlap to an extent with the domain of the Graduate Outcomes survey. Graduates in further study at UK higher education providers will be recorded in the HESA Student Record, and linking the two datasets can provide further information about the quality of Graduate Outcomes data. Beyond HESA, both the Longitudinal Educational Outcomes (LEO) study and the Labour Force Survey (LFS) collect data on education and salary, with the LFS also including detailed information on employment and occupation. While Graduate Outcomes, LEO, and the LFS can provide complementary views of graduates in the workforce, it is important to understand key differences between the three data sources.

Graduate Outcomes and the HESA Student record

In Spring 2021, HESA analysts carried out a quality assurance investigation based on linked data from the HESA Student record and the 2017/18 Graduate Outcomes dataset. A linked dataset was constructed linking all graduates in the 2017/18 target population with the Student records from 2017/18 to 2019/20, and fuzzy matching of data items contained in both Student and Graduate Outcomes was used to identify those members of the Graduate Outcomes population who appeared to be in further study according to the Student record during the relevant Graduate Outcomes census week. By investigating the characteristics of graduates who appeared to be in further study in both datasets, those who recorded themselves in Graduate Outcomes as engaged in a course of further study or training but could not be found in the Student record, and those who appeared to be in further study in the Student record but not in the Graduate Outcomes dataset, we hoped to evaluate the extent to which data on further study in the two datasets is consistent and comparable. 
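
The three-way comparison described above can be illustrated with a minimal Python sketch. It is not the procedure HESA analysts used: the function names, the group labels, and the example dates are all hypothetical, and the sketch assumes that the fuzzy matching step has already linked each graduate to their Student record enrolments.

```python
from datetime import date


def overlaps(start, end, week_start, week_end):
    """Return True if an enrolment spanning [start, end] overlaps the census week."""
    return start <= week_end and end >= week_start


def classify(go_in_study, enrolments, week_start, week_end):
    """Place one linked graduate in a comparison group (hypothetical labels).

    go_in_study: self-reported further study in Graduate Outcomes (bool)
    enrolments:  list of (start, end) date pairs from the linked Student record
    """
    in_student = any(overlaps(s, e, week_start, week_end) for s, e in enrolments)
    if go_in_study and in_student:
        return "both"
    if go_in_study:
        return "graduate_outcomes_only"
    if in_student:
        return "student_record_only"
    return "neither"


# Illustrative census week only; real census weeks differ by cohort.
WEEK_START, WEEK_END = date(2018, 12, 3), date(2018, 12, 9)
course = [(date(2018, 9, 24), date(2019, 9, 20))]

print(classify(True, course, WEEK_START, WEEK_END))
print(classify(True, [], WEEK_START, WEEK_END))
print(classify(False, course, WEEK_START, WEEK_END))
```

Counting graduates in each group, and then breaking those counts down by characteristics, is one way to see where the two sources agree and where they diverge.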

Our initial investigation of linked Graduate Outcomes and Student data in 2021 suggested that, where individuals can be found in both datasets, the two datasets match quite closely, which in turn suggests that the Graduate Outcomes data is generally robust. The areas of mismatch between the two datasets formed the basis for a set of research questions which we began to investigate in Spring 2022, this time linking data from the first three years of Graduate Outcomes data to the HESA Student Record. 

For those graduates who reported themselves to be in further study in Graduate Outcomes who could also be found in further study in the Student record, we set out to investigate the alignment of the two datasets. For those who could not be found in the Student record, we looked to identify any variation by cohort, patterns in the type of qualification reported, and clusters at particular providers. For those who reported no further study, we set out to investigate how many could nevertheless be found in the HESA student data in census week, how many could be found in the HESA data to have completed interim study, and how close to the census week were the start and end dates of their current or interim study. Turning to those graduates who reported that they were not in study during census week but that they had undertaken some interim study since completing their initial qualification, we looked to see what proportion of that interim study could be found in the linked Student data and how well aligned the Student data was with self-reported data on interim study in terms of provider, level, and mode of study. 

The results of our 2022 investigation into linked Graduate Outcomes and Student data were published in June 2022, following the 2019/20 Graduate Outcomes statistical release[0]. While survey responses on the level and mode of further study generally aligned closely to the Student record, there seemed to be some confusion on the part of graduates regarding the level of vocationally oriented qualifications, with some graduates saying they were studying for professional qualifications where administrative data indicated that they were on taught postgraduate courses. We also found some discrepancies regarding the start and end dates of further study, with some graduates reporting that they were enrolled on qualifications which appeared in the Student record either to end shortly before or to commence shortly after the census week.
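
The date discrepancies described above could be flagged with a check along the following lines. This is a sketch under stated assumptions rather than the method used in the investigation: the function name, the sign convention, and the example dates are hypothetical.

```python
from datetime import date


def gap_to_census_week(start, end, week_start, week_end):
    """Days separating an enrolment from the census week.

    Returns 0 if the enrolment overlaps the census week, a negative number
    if it ended before the week began, and a positive number if it started
    after the week ended. (Sign convention is an assumption of this sketch.)
    """
    if end < week_start:
        return (end - week_start).days
    if start > week_end:
        return (start - week_end).days
    return 0


# Illustrative census week only.
WEEK_START, WEEK_END = date(2018, 12, 3), date(2018, 12, 9)

# A course ending three days before census week, and one starting eight days after.
ended_before = gap_to_census_week(date(2018, 9, 1), date(2018, 11, 30), WEEK_START, WEEK_END)
starts_after = gap_to_census_week(date(2018, 12, 17), date(2019, 6, 30), WEEK_START, WEEK_END)
print(ended_before, starts_after)
```

Small gaps in either direction would correspond to the near misses noted above, where a graduate reports current study that the Student record places just outside census week.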

When we looked at interim study, most reported interim study could be found in the linked Student data, with a relatively high degree of alignment between the two datasets, although we did see some apparent confusion relating to level of study, particularly where graduates were enrolled on postgraduate taught courses prior to commencing postgraduate research degrees. We also saw some graduates apparently answering the survey questions on interim study with regard to the qualification they had completed fifteen months ago.

We made several recommendations based on our investigation. We recommended that HESA should continue to monitor discrepancies between Graduate Outcomes and Student data with regard to level of study and course start and end dates, as well as monitoring the extent to which the interim study questions appear to be completed about the course completed fifteen months ago. On the basis of further investigation into the issues around level of study, we also suggested some minor changes to the survey wording in order to make it clearer which qualifications count as professional qualifications and which count as taught postgraduate; this wording was implemented for the fifth year of surveying. Ongoing exploration of these areas will help us continue to identify opportunities for improvement of the survey and associated guidance. 

Graduate Outcomes and external data on graduates

The LEO dataset, which was first published in 2017, brings together education data from the Department for Education (DfE) along with employment, earnings, and benefits data from the Department for Work and Pensions (DWP) and Her Majesty’s Revenue and Customs (HMRC). Using these sources, LEO provides earnings and benefits information for graduates one, three, five, and ten years after completion of their qualifications; it also includes data on personal characteristics (gender, ethnicity, and age), university attended, subject studied, qualification achieved, and graduate movement between home region, provider region, and current region.[1]

Unlike Graduate Outcomes, which, as a survey, depends on the individual responses of graduates, the LEO dataset is drawn from administrative data and includes information on all graduates from English providers in paid work in the UK; since LEO earnings data comes directly from HMRC, it is free of some of the risks of inaccuracy inherent in self-reported salary data. LEO does not, however, include data on hours worked, so it is not possible to distinguish between graduates who are in full-time work and those who are working part-time; this can be a particular issue for data on female graduates, who are more likely to be working part-time than their male counterparts.[2] LEO also does not include data on graduates doing voluntary or unpaid work, and, because the LEO earnings data does not include self-assessment earnings, LEO data on graduates in self-employment cannot be entirely representative.[3] LEO includes data on industry of employment, but it does not include more detailed information about occupation. The LEO record tells us what graduates earn and in what industries they are employed, but it gives us only limited information about what graduates do.[4]

Graduate Outcomes and LEO thus provide different pictures of the graduate population in the UK. One of the goals in the design of the Graduate Outcomes survey was to provide statistical outputs which could contextualise data on graduates from other sources, such as LEO, and this goal is reflected in the breadth of information collected in the Graduate Outcomes survey.[5] While the LEO dataset provides data on a small number of variables for most graduates in the UK, and while it, moreover, tracks changes in earnings over time, the Graduate Outcomes survey provides a more detailed picture of each annual cohort at a single point in their post-university careers. The LEO dataset measures graduate outcomes only in terms of whether graduates are in paid employment and, if so, how much they are earning in what industry, while the Graduate Outcomes survey collects a broader range of information about what graduates are doing and how they feel about it.

While LEO is specifically geared towards collecting data about employment outcomes for higher education graduates, the LFS is a household survey designed to collect data about the employment circumstances of the UK population as a whole. It was first run in 1973 as a biennial survey and shifted to an annual survey in 1984; since 1992, the LFS has been collected quarterly, with a switch from seasonal to calendar quarters in 2006. Households participating in the LFS are surveyed for five consecutive quarters, with a fifth of the overall sample being replaced each quarter. Where LEO collects administrative data on all graduates in employment in the UK, the LFS is administered to a systematic sample of approximately 35,000 households in Great Britain, plus approximately 2,500 households from Northern Ireland; conclusions about overall patterns in employment circumstances are thus drawn from a relatively small portion of the UK population.[6]
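
The rotating design described above, in which each household is surveyed for five consecutive quarters and a fifth of the sample is replaced each quarter, can be illustrated with a short Python sketch. It is a simplification that assumes equal-sized entry cohorts and no attrition; the function names and integer quarter numbering are hypothetical.

```python
WAVES = 5  # each household is surveyed for five consecutive quarters


def cohorts_in_sample(quarter):
    """Entry quarters of the cohorts interviewed in a given quarter."""
    return set(range(quarter - WAVES + 1, quarter + 1))


def overlap_share(q1, q2):
    """Share of the sample common to two quarters, assuming equal-sized cohorts."""
    return len(cohorts_in_sample(q1) & cohorts_in_sample(q2)) / WAVES


print(overlap_share(10, 11))  # consecutive quarters
print(overlap_share(10, 14))  # quarters one year apart
print(overlap_share(10, 15))  # five quarters apart
```

Under these assumptions, four fifths of the sample is shared between consecutive quarters, one fifth between quarters a year apart, and none beyond that.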

Unlike the LFS, which is concerned with the entire UK labour force, Graduate Outcomes is concerned only with those who have completed HE qualifications in a given year, and, while there will inevitably be some level of non-response, Graduate Outcomes aims to collect data from the entire target population. With 361,215 responses in the first year, 380,980 in the second, 374,885 in the third, 355,165 in the fourth, and 351,232 in the fifth, the Graduate Outcomes sample is thus much larger than the annual sample collected by the LFS, despite the narrower focus of the Graduate Outcomes survey.[7]

Although both Graduate Outcomes and the LFS include questions about employment and education, the focuses of the two surveys are different. The LFS is primarily focused on employment, but participants are also asked to respond to the ONS4 subjective wellbeing (SWB) questions and to a series of questions about their educational attainment.[8] Since not all LFS respondents have the same educational qualifications, the educational information collected in the survey allows for some comparison of outcomes between people with different educational histories. All Graduate Outcomes respondents, on the other hand, are higher education graduates, so different comparisons are possible; rather than encouraging comparisons between graduates and non-graduates, Graduate Outcomes encourages comparisons between different categories of graduates.

Respondents to the LFS can be at any stage in their careers; for those who have higher education qualifications, this means that they may be selected to participate in the LFS shortly after finishing their qualifications, or they may be selected many years later. Even within the subset of LFS respondents with higher education qualifications, there will therefore be a wider variation in experiences and possible outcomes than is likely to be visible in Graduate Outcomes, where graduates are deliberately surveyed at the same point in their post-university careers. While Graduate Outcomes provides a cross-section of the experiences of higher education graduates 15 months after finishing their qualifications, the LFS can provide glimpses into what their lives may be like at a variety of different points.

If we are looking for a complete picture of what happens to higher education graduates in the UK, Graduate Outcomes, LEO, and the LFS all provide different pieces of the puzzle. Although the datasets could fruitfully be used in conjunction with each other – the use of the same set of SWB questions in Graduate Outcomes and the LFS might, for example, allow for some research into the comparative SWB of graduates and non-graduates – in making any comparison between the three data sources, it will be important to recognise the differences in methodology and coverage between the sources. To return to the example of SWB comparisons, although LFS and Graduate Outcomes respondents answer the same four questions about SWB, they are faced with those questions at different points in their careers, and differences in SWB may depend on a range of factors not necessarily connected to education.

In addition to enabling careful comparisons between graduates and the population as a whole or between different stages in graduates’ careers, the existence of other datasets with overlapping domains is likely to be important in the future development of Graduate Outcomes. When LEO data was first published, the DfE conducted a comparison between the LEO and DLHE datasets; HESA has in the past carried out similar comparisons in order to check the quality of DLHE salary data, and a further, detailed comparison of LEO and Graduate Outcomes would provide useful information about the respective strengths and weaknesses of the two datasets.[9] HESA also hopes in future years to explore the possibility of linking the Graduate Outcomes record with other relevant datasets, including LEO salary data.[10] Doing so will not only allow us to streamline our collection processes, but also, and perhaps more importantly, it will allow us to provide a fuller view of the trajectories of graduates after they leave higher education.

[1] Department for Education. 2021. Tax Year 2018/19. Graduate Outcomes (LEO).

[2] Due to the limitations of LEO as a representative measure of female earnings, researchers from the Institute for Fiscal Studies chose to focus on the earnings of sons in their recent report for the Social Mobility Commission, The Long Shadow of Deprivation:

[3] Department for Education. 2017. Employment and earnings outcomes for higher education graduates by subject and institution: experimental statistics using the Longitudinal Educational Outcomes (LEO) data.

[4] Office for Statistics Regulation. 2019. Exploring the public value of statistics about post-16 education and skills in England. Office for Statistics Regulation Systematic Review Programme.

[5] Further discussion of the goals which shaped the design of the survey can be found in relevant sections of the Graduate Outcomes Survey methodology; see and

[6] Office for National Statistics. 2018. Labour Force Survey User Guide, Volume 1: LFS Background and Methodology. Available from:

[7] HESA. 2021. Graduate Outcomes Cohort D Review: C18071 2018/19.

HESA. 2020. Graduate Outcomes Cohort D Review: C17071 2017/18.

[8] Office for National Statistics. 2020. Labour Force Survey User Guide, Volume 2: User guide to the LFS questionnaire.

[9] Department for Education. 2016. Employment and earnings outcomes for higher education graduates: experimental statistics using the Longitudinal Educational Outcomes (LEO) data.

[10] HESA. Key principles of Graduate Outcomes.