
Measurement error

Measurement error occurs when the true data values are not collected from respondents. Potential sources of measurement error in Graduate Outcomes are the survey instrument(s), the telephone interviewers, and the respondents themselves. This section of the report covers each of these in turn. The mode of data collection is also a source of measurement error, and we cover this in more detail in the next section.

Respondent error

The survey takes the following measures to minimise respondent error. We cognitively tested the survey questions prior to launch and adapted our questionnaire design in the light of the research findings. Information on cognitive testing is available in a technical report[1] and an outcomes report.[2] The survey questions were implemented in the survey instrument with expert input and testing from HESA and our suppliers, in order to proactively identify and overcome potential sources of respondent error.

The survey instrument is available in both English and Welsh languages. This allows respondents graduating from providers in Wales to use whichever language they prefer. This should reduce respondent error due to language issues.

The instrument is deployed online, and over the telephone, which offers respondents some choice over how to engage. Details about the implementation of the instrument can be found in the Survey methodology sections dealing with the online[3] and telephone[4] based aspects of our approach, and these materials also contain further information about how we seek to minimise respondent error. Online, we use a series of prompts to encourage the respondent to check the accuracy of their responses. Over the telephone, our interviewers’ script similarly prompts operatives to elicit accurate responses through checking understanding back with the respondent. (We will from now on refer to the computer-assisted telephone interviewing by its widely-accepted acronym – CATI.)

Some examples of respondent error we believe may occur are:

  • Information retrieval may be difficult for those respondents reporting several jobs. They may not remember precisely, or may not have access to, information about, for example, their previous earnings for a job they left months beforehand.
  • Brevity or lack of response to free text questions could lead to differences in SOC codes for graduates in similar jobs. This equally applies to other coded free-text data. However, the SOC coding process would be more sensitive to this sort of issue, than, for example, free text country data, as the input data is more extensive, and there is some degree of semantic overlap between the output codes.
  • Cases where respondents select unemployment and paid work simultaneously. During the first year of the survey, 950 of the respondents in paid work for an employer had also indicated that they were unemployed; of these, 270 had said that being unemployed was their most important activity. In the second year, 1,085 of the respondents in paid work for an employer had also indicated that they were unemployed; of these, 330 had said that being unemployed was their most important activity.[5]
  • Acquiescence bias (sometimes called agreement bias, ‘straight-lining’, or alternatively referred to as ‘yea-saying/nay-saying’) is where there is a tendency on the part of respondents to indicate positive (or negative) responses in a routine fashion, perhaps not reflecting their ‘true’ feelings. The design of the survey mitigates this by avoiding questions where this kind of response is easy to offer and HESA is continuously reviewing the impact of survey design on response distribution.
  • Social desirability bias occurs where respondents tend to give socially desirable responses instead of choosing responses that are reflective of their ‘true’ situation. Examples where this could occur might include reporting a higher salary, or a greater sense of subjective wellbeing (SWB). Other studies have indicated that this kind of bias may vary by mode of response.
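Contradictory combinations like the one described in the third bullet can be detected with a simple consistency check. The sketch below is illustrative only: the field name `ALLACT` follows the routing variables discussed later in this report, but the specific activity codes used here ('01' for paid work, '08' for unemployed) are assumptions for demonstration.

```python
# Illustrative consistency check: flag respondents who report both paid
# work and unemployment. The field name ALLACT and the codes '01' (paid
# work) and '08' (unemployed) are assumed for this sketch.
def find_contradictions(responses):
    """Return respondents whose activity set contains both
    paid work ('01') and unemployment ('08')."""
    return [r for r in responses
            if '01' in r['ALLACT'] and '08' in r['ALLACT']]

sample = [
    {'id': 1, 'ALLACT': {'01'}},        # paid work only
    {'id': 2, 'ALLACT': {'01', '08'}},  # contradictory combination
    {'id': 3, 'ALLACT': {'08'}},        # unemployed only
]
flagged = find_contradictions(sample)
```

Flagged records could then be reviewed, or handled by a published derivation rule such as the XACTIVITY specification mentioned in footnote [5].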

For details of our investigations into these forms of respondent error, readers are directed to the Reliability of sensitive data section, where we discuss our analysis of the data. While further work is required to investigate the extent of these forms of bias on the survey, we are able to show the current extent of our understanding of their effect.

In the dissemination section of the Graduate Outcomes Survey methodology, details are given about how HESA interprets and publishes responses.[6] In the section of the Survey methodology covering key data concepts and standards, explanations are given around the analysis that has been carried out on a number of key data items. In the section on salary, there is specific information about the approach HESA has taken to handling any potential respondent error. This includes an update to the approach we have taken in trimming the salaries to exclude outliers, and future corrective actions, including improvements to the instrument to reduce the risk of misunderstanding that leads to respondent error.
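HESA's exact trimming thresholds are not given in this section. The sketch below illustrates the general technique of percentile-based trimming only; the 1st/99th percentile bounds and the function name are chosen purely for demonstration and are not HESA's published method.

```python
# Illustrative percentile-based salary trimming. The bounds (1st/99th
# percentiles) are an assumption for demonstration, not HESA's actual
# published thresholds.
def trim_salaries(salaries, lower_pct=1, upper_pct=99):
    """Exclude salaries outside the given percentile bounds."""
    ordered = sorted(salaries)
    lo = ordered[int(len(ordered) * lower_pct / 100)]
    hi = ordered[min(int(len(ordered) * upper_pct / 100), len(ordered) - 1)]
    return [s for s in salaries if lo <= s <= hi]
```

Trimming of this kind reduces the influence of implausible outliers (e.g. monthly salaries reported as annual figures) on published salary statistics.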

One limitation on the respondent’s ability to correct their own errors is the unavailability of a ‘back’ button in the online survey. Respondents are therefore unable to go back and change their answers to previous questions. This is done largely for data protection reasons (this is covered at greater length in the section of the Survey methodology on the online survey design);[7] it also reduces the risk of ‘orphaned’ data occurring, where a respondent enters data that is not required when they subsequently return to an earlier point in the survey to make an alternative choice, which consequently alters their survey routing.

During the first few cohorts of year one, we noticed that some respondents indicated they believed they should not be in the sample because they had not graduated. This sometimes occurred when they had gone on to further study or had only completed part of their qualification but were still eligible to take part based on that component. We amended the introduction to the survey to allow interviewers more time to explain the eligibility criteria, if needed. We also amended our emails and other communications to highlight the eligibility criterion as having completed a sufficient component of an HE course. This amendment was implemented in cohort C of year one. Compared with the previous year, in year two 30% fewer respondents said they were not eligible to participate in the survey.

We are aware that more evidence needs to be gathered on whether respondent error represents a significant issue in the survey. For instance, for those who stated in the survey that they were undertaking further study in the UK HE sector, there is the potential to link their response to the HESA student record. This would offer the opportunity to evaluate the extent of measurement error in this part of the survey. Further investigations have been undertaken into this issue, and an interim digest of these is covered in the Graduate Outcomes and the HESA Student record section.

Survey instrument error

Significant effort is invested in reducing opportunities for instrument error, and the first element of this is the choice of platforms, partners, and personnel involved. HESA manages the survey and appoints the suppliers.[8] HESA’s procurement and supplier management approaches seek to ensure that suppliers deliver on the process quality requirements HESA imposes. Confirmit remains HESA’s feedback management solution supplier. Confirmit’s technology is widely used by leading sector bodies, including the Office for National Statistics, to conduct surveys, and also in market research contexts. It includes a smartphone-compatible online system. HESA’s current contact centre provider is IFF Research. IFF has previously worked with many individual providers on the delivery of Graduate Outcomes’ predecessor, DLHE, and was also the survey contractor for all six iterations of the Longitudinal DLHE survey.

The survey instrument is ultimately HESA’s responsibility, and HESA is an official statistics producer with a track record in delivering the DLHE and LDLHE (Longitudinal Destinations of Leavers from Higher Education) surveys for over twenty years as well as a successful launch of the Graduate Outcomes survey with ‘a range of positive features that demonstrate the trustworthiness, quality and value of the statistics’.[9] HESA’s staff are skilled across the range of statistical business processes, including developing the methodologies, procuring survey and coding services, developing and commissioning software systems, data processing and enrichment, quality assurance, conducting and commissioning research, analysis, dissemination, and undertaking reviews. Users can therefore trust that the survey is being delivered by an organisation with experience and skill in appropriate professional domains.

The instrument was tested thoroughly by staff from HESA, IFF, and Confirmit prior to deployment. However, the complexity of the survey routing meant that some less likely routing combinations were only tested to a limited extent. All problems discovered during testing were fixed prior to launch. We also note that Confirmit named HESA the judges’ choice in their ‘Achievement in Insight and Research’ awards in September 2019, in recognition of the high standards, creativity and innovation with which their platform is being used.

HESA demonstrates an evidence-based approach to operational data quality management, backed up by a clear governance approach. A log is kept of all instances of potential instrument error and a process is operated to investigate and assess each issue for the level of its impact. This approach is substantiated by regular progress updates, which explain these same issues to stakeholders.[10]

The survey instrument is generally of high quality. During the second year, the platform remained stable throughout operations, and performance remained consistent with the levels established in Cohort D of the first year. A catalogue of issues discovered during the first year of operations, and their treatment, is available in the previous edition of this report.

We summarise the main sources of potential instrument error relating to year two of the survey in the following subsections.

Survey routing issues

As noted above, respondents are not able to return to previous answers to amend them. While this could potentially increase respondent error, it reduces survey instrument error. However, CATI operatives retain access to a ‘back’ button (to maintain a good interviewer-respondent relationship). This means that there is still a small risk of processing error arising; this risk, however, is ameliorated considerably through CATI operative awareness and training, and by increasing the validation checks undertaken either automatically, or through analysis.

An error in survey routing discovered in year 2 meant that, for a subset of graduates in further study, information on the country of their provider is not available if they selected ‘other’ in provider name (instead of a named UK provider from the drop-down list) and did not enter the name of the provider when prompted.[11] This is likely to result in a higher number of records with missing country information where the provider was ‘other’. This routing error was fixed before the start of cohort C (18/19). All other similar instances in survey routing were checked to ensure this was not replicated elsewhere; no additional issues were found. Prior to Cohort C, 6,070 respondents selected ‘Other’ as their provider name; of these, 1,415 had a blank country field.

A further issue was discovered in the routing of the graduate voice questions, whereby some graduates who reported being in employment were routed to the ACTMEAN, ACTONTRACK, and ACTSKILLS variables, rather than WORKMEAN, WORKONTRACK, and WORKSKILLS.

Graduates are routed to WORKMEAN, WORKONTRACK, and WORKSKILLS only if their ALLACT response consists of a single option, and that option is one of the five possible 'work' options (ALLACT01, ALLACT02, ALLACT03, ALLACT04, or ALLACT05). Graduates who respond to ALLACT with ALLACT07, ALLACT08, ALLACT09, ALLACT10, or ALLACT11, or who provide multiple responses, are routed into ACTMEAN, ACTONTRACK, and ACTSKILLS. Graduates engaged in multiple activities and routed to ACTMEAN, ACTONTRACK, and ACTSKILLS may be doing multiple types of activity (work and study, work and something else, study and something else), or multiple types of work.[12]
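The routing rule above, together with footnote [12], can be sketched as follows. This is an illustrative reconstruction, not HESA's implementation; in particular, the precedence given to combinations that include ALLACT06 is inferred from the footnote rather than stated explicitly in the text.

```python
# Illustrative reconstruction of the graduate voice routing rule.
# ALLACT01-05 are the five 'work' options; ALLACT06 is study. The
# precedence for ALLACT06 combinations is inferred, not confirmed.
WORK_OPTIONS = {f'ALLACT{n:02d}' for n in range(1, 6)}
STUDY_OPTION = 'ALLACT06'

def graduate_voice_route(allact_responses):
    """Return which block of graduate voice questions a respondent sees."""
    responses = set(allact_responses)
    if len(responses) == 1 and responses <= WORK_OPTIONS:
        return 'WORK'   # WORKMEAN, WORKONTRACK, WORKSKILLS
    if STUDY_OPTION in responses:
        return 'STUDY'  # STUDYMEAN, STUDYONTRACK, STUDYSKILLS
    return 'ACT'        # ACTMEAN, ACTONTRACK, ACTSKILLS
```

Written this way, it is easy to see why a graduate in paid work who also selects a second activity falls into the ACT- block rather than the WORK- block.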

This routing issue poses a problem for analysis, since, when graduates engaged in multiple activities and routed to ACTMEAN, ACTONTRACK, and ACTSKILLS are asked, for example, 'the extent to which their current activity is meaningful', it is not possible to determine which of their multiple activities they have in mind when answering. Thus if a graduate is in paid employment while also acting as a carer, or in paid employment while running their own business, we cannot determine whether that graduate’s responses to ACTMEAN refer to the meaningfulness of their paid employment or the meaningfulness of their other activity.

In the Graduate Outcomes publications, responses to all three variations of the graduate voice questions (WORK-, ACT-, STUDY-) are combined and filtered by activity. As described above, the graduate voice data for graduates in employment may be based on responses to either the ACT- or the WORK- variables, and we cannot be certain that graduates in employment responding to the ACT- variables will have been thinking about their employment in their responses.

A review of the implementation of the questionnaire is currently being undertaken and will include consideration of the routing issues described above. The review will also include further testing of survey routing with a specific emphasis on the handling of unexpected or less expected responses. We will announce any further issues and corrective actions identified by this review in due course.

Survey alterations to increase retention

Relatively few alterations were required during the first year of the survey, and these are catalogued in the first edition of this report.[13]

During the second year, as the Coronavirus pandemic hit, we assessed the need for rapid changes to the survey. These are catalogued in detail in an online briefing within our Coronavirus update,[14] but in brief, we focussed on ensuring that graduates could self-administer the survey to reflect their personal situation. To ensure that respondents furloughed under the Government scheme (who remained technically employed) selected the correct option at ‘What activities were you doing in [CENSUS WEEK]’, we added text to code 01 (Paid work) to clarify that this does include furloughed employees. Furthermore, with higher levels of unemployment, volunteering, care-giving, and interruptions to further education, we believed a greater percentage of respondents would be likely to skip past the work and study sections and arrive at the wellbeing questions relatively early on in the survey. We felt that under the circumstances, to some individuals, this might have felt insensitive. We therefore added supportive text to the survey (in both the online and CATI versions) which signposts participants to mental health and wellbeing organisations across the world (the Samaritans, Befrienders Worldwide and Mind).

Email and SMS delivery

Where providers have supplied email addresses for graduates on their domain, e.g. [email protected][provider], they are advised to be mindful of the expiry period for these addresses. Some providers allow graduates to keep these addresses for life; others expire them after a fixed period (e.g. six months post-course completion). These email addresses should only be returned as valid graduate contact details for Graduate Outcomes while they are still live accounts on providers’ systems. Where providers are satisfied that the provider domain email address will be live at the point of HESA contact, we have suggested that providers allow-list the relevant email sender address, which will be [providername] This will help ensure these emails are delivered successfully. It is important that provider domain email addresses are still live, as this has an impact on HESA’s IP address reputation. Should provider domain email addresses be shut down at the start of the survey period, our emails may bounce and our IP address may be deny-listed. This would put a halt to HESA’s email capability, restricting our surveying to phone or SMS only. Providers are therefore further incentivised to pay attention to this quality factor.

At the start of the year two (18/19) survey, we tested three different email subject lines for invitation emails to identify specific themes that are more likely to encourage graduates to open their emails. Starting with an initial sample of 163,070, three equally sized randomised groups were created, each one receiving a different subject line. The content of the email was the same in all three cases. In order to get an unbiased representation of the population in each subset, graduates were distributed equally across the three groups by the following characteristics: domicile, level of study, and type of provider.
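The group allocation described above can be sketched as a stratified random split: graduates are first grouped into strata by the three named characteristics, and each stratum is then dealt evenly across the groups. This is a minimal illustration of the technique; the field names (`domicile`, `level`, `provider_type`) are hypothetical, not HESA's actual data model.

```python
import random
from collections import defaultdict

# Illustrative stratified randomisation into three subject-line groups.
# Stratification keys mirror those named in the text (domicile, level of
# study, type of provider); the field names themselves are hypothetical.
def assign_groups(graduates, n_groups=3, seed=0):
    rng = random.Random(seed)
    strata = defaultdict(list)
    for g in graduates:
        strata[(g['domicile'], g['level'], g['provider_type'])].append(g)
    groups = [[] for _ in range(n_groups)]
    for members in strata.values():
        rng.shuffle(members)
        # deal members round-robin so each stratum is split evenly
        for i, g in enumerate(members):
            groups[i % n_groups].append(g)
    return groups
```

Because every stratum is split evenly, differences in open rates between the groups can be attributed to the subject line rather than to the composition of the groups.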

Following analysis of paradata collected alongside the survey, we found that one subject line achieved a much higher open rate and click rate than the other two. While there was one clear winner, a sizeable proportion of graduates did respond to the other two subject lines, suggesting there is no perfect template for what works in this context. The use of multiple subject lines is likely to be the best course of action. This was further confirmed throughout the rest of the cohort, where changing subject lines every so often resulted in a sudden rise in open, click, and completion rates. We have continued with this approach subsequently.

Email delivery rates continue to be extremely high in every round of invitations, ranging from 97% to 99%. SMS delivery rates have also remained high, at 89.6% for the first SMS invitation (utilising all available UK mobile numbers). Completion via SMS link accounted for 40% of all the online survey responses received during cohort D.

Delivery rates are not directly correlated with response rates. Open rates are a more useful indicator of the likelihood of online survey participation. We do not have data for SMS open rates, but we do have information on email open rates. Open rates for emails fell in cohort D compared with the previous year, resulting in an overall lowering of online response rates. The online response rate in cohort D this year was 19.5%, compared with 22.3% in the previous year. A similar difference was also observed in cohort C, but not in cohorts A and B. While it is not possible to categorically attribute this observation to the Covid-19 pandemic, it is possible that a significant increase in online activity for most essential tasks has had a negative impact on individuals’ motivation to participate in what might seem like a non-essential online activity. (We are, however, seeing positive signs in online uptake in the first cohort of year three, alleviating immediate concerns of a downward trend.)

A more tangible explanation for low open rates, and therefore low response rates, lies in the type and quality of email addresses in the sample. Of all the graduates in cohort D, 1.2% did not have an email address. Of all the UK domiciled graduates, 7.4% were missing either an email address or a mobile number, reducing their chances of being contacted to complete the survey online. Our research suggests that the online response rate is much higher when both modes of contact (email and SMS) are available. A review of contact details was undertaken to investigate why some providers might have a lower response rate than others, and actions were taken with specific providers where our analysis indicated this would be fruitful.

Some graduates do not respond to any of the reminder emails. Of all the graduates with an email address, at least 2% had only a “” address, which has so far proven to be the least reliable for contacting graduates. For graduates with just one email address, it takes 3-4 emails on average to achieve a complete response, with a third of all online respondents completing after the first invitation. There is a clear indication that UK domiciled graduates require fewer reminders to complete the survey than non-UK graduates.

Call handling

There are numerous indicators suggesting that the telephone interviewing component of Graduate Outcomes, and the call handling approach described in the previous edition of this report, are now firmly established and delivering successful outcomes for the project. Responses to the telephone survey increased by 9% in cohort D and by 14% across the whole second year. This not only recovered the deficit in online response rates in the last two cohorts but also brought the overall rates closer to the target. More people refuse to take part in the survey over the telephone than online. While the percentage of refusals remained similar in the first three cohorts across the two years, the rate halved in the final cohort. Although landline numbers tend to perform worse than mobile numbers generally, in cohort D this year the completion rate on landline numbers was higher than expected: more graduates being at home because of the Covid-19 pandemic lockdowns meant that landline calls were generally more productive.

For the first time in the survey, we adopted a more focused data collection strategy aimed at graduates with UK home addresses. During November, we stopped calling non-UK graduates (except those with scheduled appointments) to focus our efforts on the UK group, as this was deemed to be the priority. Similarly, for the purpose of case prioritisation (an exercise aimed at reducing non-response bias), we exclusively targeted graduates with UK home addresses.

Towards the end of a cohort, we start to observe signs of the sample becoming tired, with low call pick-up and response rates. Cohort D was unusual, however, in that sufficient ‘live’ sample was still available in the final few weeks of the field period. Unfortunately, to keep the survey costs within the subscription cost limits, we had to ask our contact centre to significantly reduce call volumes in the final few days. Online data collection continued right through to the end of the 3-month period.

Interviewer error

Interviewer error is the effect of a human interviewer on the data gathering process. Graduate Outcomes uses many interviewers concurrently. CATI interviewers undergo training developed especially for the Graduate Outcomes survey, which focuses on the contextual knowledge interviewers need to perform their roles effectively. They are recruited and trained by IFF according to closely-monitored quality criteria. Quality assurance through call monitoring is also part of standard practice. All interviews are recorded digitally to keep an accurate record. A minimum of 5% of each interviewer’s calls are reviewed in full by a team leader, and quality control reviews are documented using a series of scores. Should an interviewer score below the acceptable level, this is discussed with them along with the issues raised, an action plan is agreed and signed, and their work is subject to further quality control. Information about this is covered in the data collection section of the Survey methodology.[15] Further details are given in the operational survey information section on the contact centre.[16]
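The review sampling described above (at least 5% of each interviewer's calls reviewed in full) can be sketched as follows. The data structures and function name are hypothetical; this simply illustrates per-interviewer proportional sampling with a minimum of one call.

```python
import math
import random

# Illustrative sketch: sample at least 5% of each interviewer's calls for
# full QA review. The input mapping and function name are hypothetical.
def sample_for_review(calls_by_interviewer, rate=0.05, seed=0):
    rng = random.Random(seed)
    selected = {}
    for interviewer, calls in calls_by_interviewer.items():
        n = max(1, math.ceil(len(calls) * rate))  # always review >= 1 call
        selected[interviewer] = rng.sample(calls, n)
    return selected
```

Sampling per interviewer, rather than across the whole call pool, ensures that every interviewer's work is subject to review in each period.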

CATI operatives utilise an adapted version of the same instrument as online respondents. This allows a further level of data quality checks to be performed, as CATI operatives receive similar feedback from the online instrument to online respondents, in addition to having their own quality processes built into the script. It also prevents any ‘clash’ or data problems occurring due to respondent mode switches. One difference is that a ‘back’ button is available to CATI operatives, which allows adjustments to be made if a respondent wishes to change an earlier answer in the light of a later question. Anecdotal feedback of this kind could help identify potential sources of respondent error, and HESA and IFF evaluate feedback from CATI operatives regularly to determine whether instrument improvements could offer marginal enhancements to data collection. While human error is always a potential factor, it is likely to manifest as random variance in keying errors. There is no evidence to suggest that interviewer error has had any significant impact on the conduct of the survey. Rather, CATI operatives are a useful source of quality improvement suggestions: fortnightly meetings are held where performance and survey issues are discussed, and recommendations logged for further assessment and action.

Next: Mode effects

[4] For telephone and contact centre aspects of the instrument, see

[5] For details of how HESA reflects this contradictory information in published outputs, see the XACTIVITY specification at:

[10] Readers wishing to understand these issues in detail, and in chronological order, are recommended to read the midpoint and end of cohort reviews, which are published at:

[11] Users may wish to note that this problem was discovered in the fix we had applied to address a previous data quality issue affecting 5,040 records, referred to in the previous iteration of this report.

[12] Similarly, graduates who ticked ALLACT06 (Engaged in a course of study, training or research), whether alone or with any other combination of options, were routed to the STUDYMEAN, STUDYONTRACK, and STUDYSKILLS questions.