Assessing the quality of further and interim study data collected in the Graduate Outcomes survey
Key findings
- For the most part, where graduates report in the survey that they have undertaken a period of further study at a HESA provider, we are able to find a corresponding link in HESA administrative records.
- Responses to questions in the survey on level and mode of further study also generally align closely to what we observe in HESA data.
- One slight area of confusion relating to further study appeared to be that some graduates who are studying for more vocationally orientated qualifications such as a PGCE select the option ‘professional qualification’ in the survey, despite our records indicating they are on a postgraduate taught course.
- When examining those individuals who reported in the Graduate Outcomes survey that they were engaged in further study at the time of the census week within a provider that submits data to HESA, we found 17% of these graduates appeared to be studying at this provider outside of the census week according to HESA records.
- A high agreement rate is found between survey and administrative data for level and mode of interim study when focusing on those for whom we find a match based on provider prior to census week.
- Of those reporting on study undertaken between the end of their original course and the survey census week, between 10% and 13% (depending on the survey year) appear to have reported on their original course.
- The proportion of graduates (at all levels of qualification) who undertake full-time (significant) higher education level interim study is low at only 8%.
- Recommendation 1: We suggest further monitoring around the issue of some graduates selecting ‘professional qualification’ in instances where our data suggests otherwise. Should this pattern continue to occur, we shall consider whether the design of the question may benefit from being modified or whether any additional guidance might be helpful.
- Recommendation 2: We propose conducting further exploration into the current misalignment we observe between survey and HESA records on whether a graduate was in study during census week.
- Recommendation 3: We recommend retaining the further study questions in the Graduate Outcomes survey while any further assessment of the alignment of survey and administrative records takes place.
- Recommendation 4: We advise continued tracking of the extent to which graduates appear to report about the course they completed fifteen months ago in the interim study section. If this seems to persist, we shall assess options such as whether amending the wording of the question or issuing more guidance could assist with reducing this discrepancy.
- Recommendation 5: Users of the published Graduate Outcomes statistics are currently provided with a filter for those who undertook “significant interim study” - a binary tab that enables users to include or exclude such graduates from the figures. The statistical evidence supporting use of this filter appears to be weak. We suggest engagement with users should be undertaken on the basis of the statistical evidence to ascertain whether the interim study filter serves any useful function. If it is determined that it does not, we should consider discontinuing its use.
1. Introduction
Back in November 2021, HESA outlined the approach we were taking to evaluate and continuously improve the Graduate Outcomes survey. One of the elements that we noted would be the subject of future work was additional exploration of those who were in further study at the time of completing the questionnaire or who had undertaken some form of interim study between graduation and the 15-month survey point.
During the NewDLHE review consultation period, respondents indicated their interest in seeing us link HESA Student data to the Graduate Outcomes survey to explore the benefits and efficiencies this might bring. With this goal in mind, we completed a small-scale investigation in Spring 2021 into the extent of mismatch between survey and administrative records (i.e. HESA Student data) for those who reported they were in further study at 15 months in the first year of Graduate Outcomes data, covering 2017/18 qualifiers. Our key conclusion from that research was that there seemed to be a high degree of alignment between responses to the survey and HESA administrative records (see the 2021 quality report for more information).
Here, we extend previous work by expanding our linked HESA-Graduate Outcomes dataset to include data from subsequent years of the survey in order to determine whether the patterns we saw in the first year hold true for later collections. We also conduct a more detailed examination on the extent and nature of discrepancies between survey and administrative data, enabling us to identify and understand the most common areas of mismatch between the two sources. Furthermore, for the first time, we also look at graduates who have undertaken a period of interim study, with a particular focus on how well survey data on interim study aligns with administrative records. This insight summarises the results of our supplementary analysis.
2: Why does this matter?
As a producer of statistics, we aim to work in accordance with the Code of Practice for Statistics and to uphold its three pillars of trust, quality, and value. Part of this involves ensuring that the statistics we produce are relevant and take user needs into consideration. There is ongoing policy interest in graduate progression to further study; the percentage of graduates in further study is used in regulation in England, for example, and access to postgraduate study is emerging as a focus for those working on issues around participation in higher education more generally. In designing the Graduate Outcomes survey, HESA therefore made sure that graduates in further study would be identifiable in the data. The 15-month survey point, moreover, was selected in part because it would give an opportunity to learn more about graduate trajectories, including work or study between graduation and the survey point.
Given the level of interest in data on further study, it is important for us to understand the quality of the information we collect on this topic. Since data from the Graduate Outcomes survey is self-reported, there is the possibility that respondents are unable or unwilling to provide correct details about their outcomes, either because they struggle to understand and interpret the question being asked or because they do not wish to disclose their activities after graduation. If such problems with self-reported data should be widespread, they would have the potential to reduce the reliability of statistics disseminated on the destinations of qualifiers. Comparing data on both interim and further study collected through the Graduate Outcomes survey with our own administrative records allows us to assess the accuracy of the survey data being gathered, in line with Q3.3 of the Code of Practice. This principle states that ‘statistics should be validated through comparison with other relevant statistics and data sources.’
User feedback has raised the question of whether a high degree of alignment between Graduate Outcomes survey data and our administrative records could open up the possibility of using administrative data to fill gaps caused by non-response on the part of graduates in further study. While administrative sources are sometimes used as a way of dealing with missing data resulting from non-response in a national Census (this approach has been discussed in New Zealand, for example), such a procedure would not be appropriate in this instance. The Graduate Outcomes activity question asks graduates to identify all of their activities during survey week; since we do not hold administrative data on any activities other than further study, filling in activity based on the HESA Student record would not give us a complete picture of the activities of non-respondents in further study.
We are also obligated to collect and publish data in an efficient and proportionate manner, with V5 of the Code of Practice advising producers of official statistics to assess the suitability of existing data sources and only undertake a new data collection if necessary. At present, we have two different sources of data on graduates engaged in further study at UK providers that submit data to HESA – we have their Graduate Outcomes responses, and we also hold administrative data from the HESA Student record, so there may be room to increase efficiency in this area. By carrying out this investigation, we can therefore examine whether there could be opportunities in future to utilise data linking approaches in place of asking certain survey questions to reduce the burden we impose on graduates.
3: Data and methodology
Our sample of interest in this study was UK-domiciled graduates who qualified and formed part of the Graduate Outcomes population in 2017/18, 2018/19 or 2019/20. In order to examine whether graduates have undertaken a period of study following graduation at a UK higher education provider that submits data to HESA (we do not hold data on those studying at other educational establishments), we need to be able to track individuals across academic years in our Student or Student Alternative records. We do so by developing a HESA person identifier for each student – formed via a complex matching process utilising student keys, names, dates of birth, postcode and other personal characteristics such as sex. Due to the variables we use, this procedure can only be carried out for UK-domiciled graduates, and therefore international graduates do not form part of our final dataset.
We follow individuals from their academic year of graduation to the latest year of data we hold (2020/21) to identify any further or interim periods of study. Where graduates have engaged in multiple periods of study, we identify the single best matched record based on the provider information supplied, prioritising further study over interim and future study. Where there is either no match or multiple matches on provider, we use timing, level and intensity of study to identify the best match. Each row of data relates to a graduate and provides the analyst with the course characteristics (e.g. mode, level, provider, end date etc) of the best match record for their additional study.
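The matching priority described above can be sketched roughly as follows. This is an illustration only, not the implementation used for the analysis: the `StudyRecord` fields, the `timing` labels and the exact order of tie-breaks are hypothetical, chosen to mirror the stated rules (match on provider first, prefer further study over interim and future study, then fall back on timing, level and intensity).

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumed ranking (lower is preferred). The text says only that further study
# is prioritised over interim and future study; ranking interim above future
# is an assumption made for this sketch.
TIMING_PRIORITY = {"further": 0, "interim": 1, "future": 2}


@dataclass(frozen=True)
class StudyRecord:
    """One period of study found in administrative data (hypothetical fields)."""
    provider: str
    timing: str  # "further", "interim" or "future", relative to census week
    level: str   # e.g. "postgraduate taught"
    mode: str    # e.g. "full-time"


def best_match(
    records: Sequence[StudyRecord],
    reported_provider: str,
    reported_level: Optional[str] = None,
    reported_mode: Optional[str] = None,
) -> Optional[StudyRecord]:
    """Return the single best-matched record, or None if there are no records."""
    if not records:
        return None

    def sort_key(record: StudyRecord) -> tuple:
        # False sorts before True, so agreement with the survey is preferred.
        return (
            record.provider != reported_provider,
            TIMING_PRIORITY[record.timing],
            record.level != reported_level,
            record.mode != reported_mode,
        )

    return min(records, key=sort_key)
```

Because the provider comparison is the first element of the sort key, the timing, level and mode comparisons only decide the outcome when no record, or more than one record, matches the reported provider.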
Individuals who qualified in May-July of 2020 (cohort D of the 2019/20 Graduate Outcomes population) received the survey from HESA in the period September – November 2021. As a result, we are unable to assess their further study outcomes in this analysis, as this data will only emerge in HESA records from the academic year 2021/22 or after and will therefore not be available until the end of 2022 at the earliest. Given the majority of individuals in our sample graduate in this period (approximately 75%), our analysis relating to further study focuses only on the first two years of Graduate Outcomes data. For our interim study investigation, however, more complete information is available across the whole of the year 3 responding sample and so, where appropriate, we draw upon the full three years of data for this dimension of the work. We note that any study starting on or after 1 August 2021 would not be captured for 2019/20 graduates.
4: Key findings
4.1. Further study at fifteen months
At the start of the Graduate Outcomes survey, respondents are asked about all the activities they are undertaking during census week, with one of the options being ‘engaged in a course of further study, training or research’. Those who tick this box are then asked a series of extra questions about their further study activity later in the questionnaire. Before starting our investigation, we developed a variable that summarised the nature of the response provided by the graduate and what we anticipate we should find (if anything) when linking to HESA administrative records. For example, a graduate who indicates in the survey that one of their activities during census week was further study, training or research at a provider that appears in HESA records should be captured through linking to administrative data. In contrast, it should not be possible to find a link for an individual who states they have not undertaken any form of interim or further study (and are not due to start a course in the near future) at a provider that can be found within the HESA database.
Figure 1: UK-domiciled graduates who reported being in further study at a HESA provider during the census week in the survey – Type of match with HESA student record
Our first area of focus was those individuals who reported in the Graduate Outcomes survey that they were engaged in further study at the time of the census week within a provider that submits data to HESA. While the discussion that follows combines the first two years of data, we see similar trends when examining years 1 and 2 separately (see Figure 1 above). 70% of this particular sample were found in our administrative data to be studying at the provider they indicated in the survey during the census period. Approximately 17% of those found in HESA records had either completed their programme at the provider prior to the census week or were due to begin their course at a later date. 5% appeared to have responded in relation to the course they had completed fifteen months prior to the survey. For 6% of responses, no link could be found, with approximately half of these specifying that they were doing a professional or below higher education level qualification. A further 20% of graduates for whom we could not find a link stated they were in higher education level part-time study. Those studying via this mode will not always be found in our data in consecutive years, as they may choose to take a break before returning to their programme in the future. Indeed, when calculating the non-continuation UK Performance Indicators for part-time students, HESA have considered outcomes two years after entry for this reason.
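The match types discussed above could be derived along the following lines. This is a minimal sketch under stated assumptions: the function name, the category labels and the census week dates in the usage note are hypothetical, and a missing end date is treated here as study that is still ongoing.

```python
from datetime import date
from typing import Optional


def classify_match(
    study_start: Optional[date],
    study_end: Optional[date],
    census_start: date,
    census_end: date,
    is_original_course: bool = False,
) -> str:
    """Label a linked study record by its timing relative to census week."""
    if study_start is None:
        # No administrative record could be linked to the survey response.
        return "no link found"
    if is_original_course:
        # The linked record is the course completed before the survey.
        return "original course"
    if study_end is not None and study_end < census_start:
        return "ended before census week"
    if study_start > census_end:
        return "starts after census week"
    # Assumption: no recorded end date means the study is still ongoing.
    return "in study during census week"
```

For example, with an illustrative census week of 2 to 8 September 2019, a record running from 24 September 2018 to 30 September 2019 would be labelled "in study during census week", while one ending on 31 July 2019 would be labelled "ended before census week".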
As discussed above, the process we have used to track students through HESA academic years relies upon a multitude of variables, which makes the linking between years more robust. However, it is still feasible for two different people to have identical entries for fields such as name, date of birth and postcode. Furthermore, some individuals may change their name and postcode across years (e.g. due to getting married or moving house) and so we may be unable to successfully link data for these individuals across academic years. These limitations are another potential reason why no link may have been found.
Figure 2: UK-domiciled graduates who reported further study at a HESA provider during the census week by cohort – Alignment with census week according to HESA student record
When conducting an examination by cohort, it was evident that the issue of graduates referring to a period of study that had either ended or was still due to start predominantly occurred in cohort D (see Figure 2) and related to postgraduate qualifications (generally taught degrees or diplomas/certificates). Cohort D graduates are those who graduate between May and July. Consequently, they tend to be first degree qualifiers, who are surveyed between September and November of the following year.
Figure 3: UK-domiciled graduates who reported further study at a HESA provider during the census week – Number of weeks outside census week that a record was found in HESA data (excludes study found in census week)
Just under four-fifths of respondents who had stated that they were studying during the census period, but were found through administrative data to have actually finished or to be about to start their course, have end/start dates within 5 weeks of that timeframe (Figure 3). When cognitive testing of the questions in Graduate Outcomes was carried out by IFF before the launch of the survey, it was suggested that a prompt about census week should be present across all the questions in the study section. This recommendation was implemented, which reduces the likelihood that this disparity is being caused by graduates referring to a time point other than the census period. Instead, it is possible that graduates are interpreting their start or end date of a course differently to the provider. Graduates who say that they are in study but appear not to have started, for example, may have sent in their registration documents and therefore consider themselves to be enrolled, but they may not appear in their providers’ records until the official start of term.
Peaks can be seen 5 weeks prior to the census week and 3 weeks after this period too. This is likely to be dominated by cohort D - the largest of the cohorts - for which the peaks fall in July (at the end of standard academic year) and late September (when many courses are likely to commence).
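The distance from census week plotted in Figure 3 could be computed roughly as below. This is a sketch only: the sign convention (negative for study that ended before census week, positive for study that starts after it) and the rounding of part-weeks up to whole weeks are assumptions, since the text does not specify how weeks were counted.

```python
from datetime import date
from typing import Optional


def weeks_outside_census(
    study_start: date,
    study_end: Optional[date],
    census_start: date,
    census_end: date,
) -> int:
    """Whole weeks between a study period and census week (0 if they overlap)."""
    if study_end is not None and study_end < census_start:
        gap_days = (census_start - study_end).days
        # Round part-weeks up, and use a negative sign for "before census week".
        return -((gap_days + 6) // 7)
    if study_start > census_end:
        gap_days = (study_start - census_end).days
        return (gap_days + 6) // 7
    return 0
```

Tallying these values across respondents (for instance with `collections.Counter`) would give a distribution of the kind described, where the peaks before and after census week can be read off directly.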
Figure 4: UK-domiciled graduates who reported further study at a HESA provider during the census week and were found in HESA data at the same provider in the census week – Alignment of level of study between survey and administrative data
Respondents to the Graduate Outcomes survey who had indicated that they were in further study were also asked questions relating to the mode and level of their qualification. Concentrating on the 70% of graduates for whom there was a match between the survey and administrative records with regard to provider in census week, 98% who indicated full-time study and 88% who indicated part-time study were found on the same mode of study in the HESA records. We also find a high level of agreement (close to or above 90%) for level of study for those on postgraduate research/taught courses or pursuing a first degree. Around half of graduates who reported that they were studying for a Postgraduate diploma/certificate (including PGCE/PGDE) or Other undergraduate diploma/certificate were found to be on that level of study according to the HESA records. The survey, however, provides individuals with an option to select ‘professional qualification’ as the type of award they are aiming for, but no equivalent exists in the HESA Student data. Our analysis found that 67% of graduates who reported that they were studying for a professional qualification were enrolled on a postgraduate taught course according to the HESA Student record. When looking more closely at the nature of these postgraduate taught programmes, we found that a significant proportion were studying for a qualification linked to a profession (e.g. PGCE), which suggests that one of the reasons for the misalignment may relate to the interpretation of ‘professional qualification’.
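Agreement rates of the kind quoted above are, in effect, the share of respondents in each survey-reported category whose administrative record shows the same category. A minimal sketch of that calculation follows; the function name is hypothetical and the data in the usage note are made up purely to show the mechanics.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def agreement_by_reported(pairs: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Share of each survey-reported category that matches the admin record.

    `pairs` holds (survey_value, admin_value) for each linked respondent.
    """
    totals: Counter = Counter()
    matches: Counter = Counter()
    for survey_value, admin_value in pairs:
        totals[survey_value] += 1
        if survey_value == admin_value:
            matches[survey_value] += 1
    return {category: matches[category] / totals[category] for category in totals}
```

With the illustrative input `[("full-time", "full-time"), ("full-time", "part-time"), ("part-time", "part-time")]`, the function returns an agreement rate of one half for full-time and one for part-time. The same function applies to level of study, although a category such as ‘professional qualification’, which has no equivalent in the administrative data, will always show zero agreement and needs to be examined separately.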
In each of the first two years, just over 80% of graduates did not select the option of being in further study, training or research during the census week. We therefore examined the proportion of individuals who – despite making this choice in the questionnaire (i.e. slightly above four-fifths of respondents) - appeared to be in higher education study during census week at a provider that submits data to HESA. We found this to be approximately 5% in both years. It should be noted that the extent of mismatch could be higher, as there may be respondents who were in study, training or research during the census week which cannot be observed through HESA records. For example, there could be some graduates who are undertaking job-related study or training (e.g. trainee accountants), where the qualifications are awarded by professional bodies (e.g. Association of Chartered Certified Accountants) rather than higher education providers.
Across the first two years of Graduate Outcomes data, 71% of individuals whom we found in administrative data to be in study, training or research, but who did not say that they were in study in the survey, reported that they were in full-time or part-time employment. 11% of graduates in this group said they were unemployed, with 13% in both years noting they were doing something else.
Again, we found that the disparity predominantly occurred in cohort D and among those who were found in the HESA Student record on postgraduate taught courses. Focusing on this sub-group, further investigation of the course end dates for 2017/18 graduates revealed that over 50% were due to complete their programme in September or October 2019. No end date was specified for 13% of graduates – the majority of whom were either studying part-time or dormant. Among 2018/19 graduates, the proportion without an end date was 30%, with around one-quarter of these individuals shown as studying full-time. The remainder of those without an end date were either dormant or in part-time study. As with year 1, those with a valid end date were often found to have a completion date of September or October 2020. One potential cause of this inconsistency between survey and administrative data could therefore be that students had completed their dissertation/thesis by the census week in early September, even though the final submission date had yet to arrive, so that providers would deem the student to still be in study. Indeed, just under 50% of those we found in further study in the first two years of data, even though they reported otherwise, had indicated in their survey response that they had undertaken interim study. These were often graduates who formed part of cohort D and were still studying for a one-year postgraduate taught course in the HESA Student record.
To summarise this sub-section, we note, first, a fairly high success rate in linking survey responses to HESA records and, second, that we can explain a large proportion of the instances where no link was found. Furthermore, the vast majority of respondents who state they are not engaged in further study, training or research cannot be found in HESA records. Rather, most of the discrepancies between survey and administrative data seem likely to relate to potential differences in interpretation between graduates and official provider records as to when periods of study begin or end.
4.2. Interim study between graduation and survey point
The original rationale behind the inclusion of questions on interim study and work was to enable users of the data to develop a better understanding of the trajectories taken by graduates up to the 15-month survey point, as noted during the NewDLHE review period. During the evaluation of the survey last year, questions relating to interim work were removed due to their limited onward use, while those on study were retained with the intention that they would be evaluated further at a later date. At present, HESA include a ‘significant interim study’ tab on some of the tables we publish in our official statistics products – defined as those who have completed a full-time higher education level course (including a professional qualification) between graduation and the 15-month survey point. Utilising survey data alone, we find that 8% of respondents across the three years would be categorised as having undertaken ‘significant interim study’.
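The ‘significant interim study’ definition above amounts to a simple flag over the courses a graduate reports. The sketch below is illustrative only: the `InterimCourse` structure and the level labels are hypothetical stand-ins, not the coding used in the published data.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical labels for higher education level study; per the definition in
# the text, professional qualifications are included.
SIGNIFICANT_LEVELS = {
    "postgraduate research",
    "postgraduate taught",
    "postgraduate diploma/certificate",
    "first degree",
    "other undergraduate",
    "professional qualification",
}


@dataclass(frozen=True)
class InterimCourse:
    """One interim course reported in the survey (hypothetical fields)."""
    mode: str   # "full-time" or "part-time"
    level: str  # one of the labels above, or e.g. "below higher education"


def has_significant_interim_study(courses: Iterable[InterimCourse]) -> bool:
    """True if any reported interim course is full-time and at HE level."""
    return any(
        course.mode == "full-time" and course.level in SIGNIFICANT_LEVELS
        for course in courses
    )
```

A graduate who reports one part-time postgraduate taught course and one full-time professional qualification would be flagged, whereas a graduate reporting only a full-time course below higher education level would not.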
In this section, we explore the quality of the interim study data in more detail. Across the three years of data, 16% of respondents indicated in the survey that they had undertaken some form of interim study (irrespective of mode and level). Just under half of that 16% did so at a provider that was not identified as one from HESA records, while 44% of the 16% indicated that while they were not in further study at 15 months, they had engaged in interim study at a provider that submits data to HESA. 10% had taken part in interim study at a HESA provider and were also in further study during the census week based on their survey response. We also find that these percentages do not vary greatly by year of graduation.
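Because the percentages above are shares of the 16% who reported interim study, it can help to restate them as shares of all respondents. The arithmetic is sketched below; the 46% figure for providers not identified in HESA records is inferred here from "just under half" (100% minus 44% minus 10%) and is not stated directly in the text.

```python
def share_of_all(parent_pct: float, child_pct: float) -> float:
    """Convert a percentage of a sub-group into a percentage of everyone."""
    return parent_pct * child_pct / 100


INTERIM_ANY = 16.0  # % of respondents reporting any interim study

# Interim study at a HESA provider, not in further study at 15 months:
# roughly 7% of all respondents.
hesa_interim_only = share_of_all(INTERIM_ANY, 44.0)

# Interim study at a HESA provider and in further study during census week:
# under 2% of all respondents.
hesa_interim_and_further = share_of_all(INTERIM_ANY, 10.0)

# Interim study at a provider not identified in HESA records (inferred 46%):
# again roughly 7% of all respondents.
non_hesa_interim = share_of_all(INTERIM_ANY, 100.0 - 44.0 - 10.0)
```

On these figures, the group examined in the remainder of this section (interim study at a HESA provider, with no further study in census week) makes up roughly 7% of all respondents.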
As we highlight in the ‘Data and methodology’ section, our linking approach attempts to focus on a single instance of study. Hence, those who completed both interim and further study will have had their further study record prioritised where a match on provider has been found. In what follows, we therefore concentrate on those who pursued interim study at a provider that submits data to HESA, but were not in further study during the census week.
Aside from provider, the details requested from the graduate on interim study in the questionnaire relate to mode and level of study. Concentrating on the 68% of graduates for whom there was a match between the survey and administrative records with regard to provider prior to the census week, 97% who indicated full-time study and 88% who indicated part-time study were found on the reported mode of study in the HESA records. As with further study, we also find a high level of agreement for level of study for certain types of qualification (97% for those who reported that they were studying on postgraduate taught courses, 82% for those pursuing a first degree and 63% for those enrolled on a Postgraduate diploma/certificate (including PGCE/PGDE)). Our analysis found that 70% of graduates who reported that they were studying for a professional qualification were recorded as being on a postgraduate taught course according to the HESA Student record.
Where we do see a marked difference, compared with our further study analysis, is for postgraduate research courses. Of those graduates reporting postgraduate research interim study, 53% were found on a postgraduate research course in the HESA records and a further 41% were registered on a postgraduate taught course. Among those found on postgraduate taught courses, the majority were on masters level courses which do not meet the criteria for a research-based higher degree. The mismatch may therefore relate to graduates who intend to study for a research-based doctorate qualification being initially registered on a masters course.
We turn our attention now to those graduates who pursued significant (i.e. full-time) interim study at a provider that submits data to HESA, excluding those in further study during the census week. Of those graduates who indicated periods of interim (but not further) study at a provider that submits data to HESA, nearly three quarters indicated at least one period of full-time interim study. It is also important to remember that those who have done interim study are asked to provide information on up to three courses in the Graduate Outcomes survey, but we shall only capture one of these, based on the best match found.
Figure 5: UK-domiciled graduates engaged in full-time interim study at a HESA provider (and no further study in census week) – Type of match with HESA student record
Of those graduates who said in their survey response that they were engaged in full-time interim study at a HESA provider across the 2017/18 and 2018/19 data, 80% were found in HESA records to be studying at the provider they indicated, either prior to, during or after the census week. The percentage for 2019/20 was slightly higher at 82%, though this slight skew may be explained by the 2021/22 HESA data being unavailable. The percentage of graduates who appeared to have responded in relation to the course they had completed fifteen months prior to the survey varied across the years from 13% in 2017/18, 12% in 2018/19 to 10% in 2019/20. Across all years, no link could be found for around 6% of responses, with around 30% of these graduates specifying that they were doing a professional or below higher education level qualification which may not be included within HESA records.
5. Concluding remarks
Overall, in line with the preliminary investigation carried out in Spring 2021, we find that there is a high level of agreement between the Graduate Outcomes survey and the HESA Student record with regard to graduates in further study. The generally close alignment of the survey and administrative data is reflected in the figure below, which shows the distribution of graduates across categories based on either the survey or administrative data. The 2019/20 data is not included as we are not able to accurately identify those in study during the census week or any future study for cohort D (the largest of the cohorts). Moreover, where we can identify matches between survey responses and administrative data, we also find strong agreement in terms of the mode and level of further study.
Figure 6: UK-domiciled respondents by timing of further study
In most cases where the two records are not entirely aligned, it is possible to identify the reasons behind this. For the most part, these mismatches seem to stem from differences in interpretation. For example, many graduates studying more vocationally oriented qualifications, such as a PGCE, report in the Graduate Outcomes survey that they are engaged in a professional qualification; these students, however, appear in the HESA Student record as enrolled on a postgraduate taught course. We advise continuing to assess the alignment between survey and administrative data concerning the level of further study. If this should indicate that respondents do not always understand how different courses should be categorised, we would recommend exploring whether additional clarification in the survey regarding the distinction between postgraduate programmes and professional qualifications or modifying the question might help to diminish the gaps between survey and administrative data.
Differences in interpretation regarding the start and end dates of courses are another factor that seems to lead to discrepancies between the two records. We would therefore advise continuing to explore the misalignment between survey and administrative data with regard to whether a graduate was in study during census week. Understanding the prevalence of such mismatch and the possible reasons behind it will enable HESA to assess what changes, if any, may be required to the survey to improve the quality of the data on this topic.
At the outset of this project, we considered whether there might be a case for removing the questions on further study from the Graduate Outcomes survey, in line with the guidance published in the Code of Practice for Statistics, which states that producers of official statistics should aim to use existing data sources if they are suitable. Our investigation, however, has led us to conclude that this would not be an appropriate course of action at this time. Instead, we recommend retaining the further study questions in the Graduate Outcomes survey while we continue to assess the alignment of the survey and administrative records. Since we still aim to collect data on graduates in as efficient and proportionate a manner as possible, it may subsequently be appropriate to revisit our need to collect survey data on further study.
When we look at graduates who report that they have completed some interim study and for whom we obtain a successful link between survey and administrative records for interim study at the provider they have listed, we find, as we did for graduates in further study, that there is a high level of alignment between the two sources with regard to mode and level. As was the case with further study, areas of mismatch seemed for the most part to stem from potential misunderstandings or differences of interpretation on the part of graduates. Thus, a considerable proportion who reported interim study at the postgraduate research level appear instead to have been enrolled on postgraduate taught courses, potentially as a stepping stone towards a subsequent postgraduate research degree. Additionally, 10-13% of graduates reporting interim study, depending on survey year, seem to have answered the survey questions about interim study in relation to the qualification they completed 15 months ago. This needs to be tracked in forthcoming years, in order to help ascertain whether further guidance or question amendments are required to ensure graduates do not report data relating to their original course in the interim study section.
As stated in our discussion on interim study, the proportion of graduates who report having undertaken full-time interim higher education level study in the fifteen months after graduation is very low (8%). Figure 7 below shows the proportion of graduates who undertook full-time higher education level (including professional qualification) interim study by reported activity at 15 months. Of the 72% in employment and 5% unemployed among all Graduate Outcomes respondents, 7% and 16% respectively had undertaken this type of interim study. We observe similar percentages when we consider cohort D separately.
Figure 7: UK domiciled respondents by activity and proportion engaged in full-time interim study
We currently include a tab in many of our published open data charts allowing users to filter out graduates who have done significant interim study. When examining topics such as the activities of graduates, the small proportion of graduates in this category means that using this filter leads to only minimal changes in the displayed statistics. We therefore question the utility of this feature. We suggest engagement with users should be undertaken on the basis of the statistical evidence to determine whether the interim study filter serves any useful function. If it is concluded that it does not, we should consider discontinuing its use.
Linking Graduate Outcomes survey data to the administrative records we hold on students who have proceeded into subsequent study gives us important insights into the quality of our survey data. While we found that alignment between Graduate Outcomes survey data and the HESA Student record is high, on the whole, our investigation has allowed us to identify some of the most common areas of mismatch. Ongoing exploration of these areas of misalignment and their likely causes will help us to identify opportunities for improvement of the survey itself, the guidance we issue, and the data we collect.
 This is based on the Graduate Outcomes activity variable and therefore aligns with the field we use in our statistical bulletins.
Lucy Van Essen-Fishman