Data collection

On this page: Online data collection | Telephone data collection | Postal data collection | Opt-outs | Case prioritisation | Welsh language requirements

Graduate Outcomes data for a given academic year is collected in four instalments, known as cohorts. Each cohort represents a group of graduates who completed their course during a certain period 15 months prior to the start of data collection. Figure 1 outlines the data collection plan for the 2018/19 collection year:

Figure 1: Data collection plan for the 18/19 collection

As not all graduates will have access to the internet (or a telephone), the survey adopts a mixed mode design to maximise contact with respondents. The primary modes of data collection in every cohort are web and telephone, with several strategies (outlined below) that look to maximise response rates. Postal surveys are also used for a small number of graduates with no other contact details except a residential address.

The two main modes of data collection interact with each other seamlessly, in that respondents who start the survey in one mode can finish it in another without having to start again from the beginning. They are also able to access the survey online multiple times until they reach the end and submit all of their responses. Respondents can choose not to complete the survey over the phone; in such instances, interviewers can transfer them to the online survey by immediately emailing a link to it.

Online data collection


Data collection commences at the start of a cohort with an invitation email that is sent to all graduates, using email addresses submitted by providers. The email contains a survey link that is unique to every graduate. This is followed by an SMS (usually the following day but it can take longer for larger cohorts) to UK mobile numbers only. All graduates therefore receive a form of invitation in the first week of data collection. Telephone follow-ups with all non-respondents commence in the second week. Respondents who only partially complete the survey online are given a few weeks to complete it online before they are contacted by telephone.

In year one, providers were asked to submit up to 10 email addresses and mobile numbers per graduate. This requirement is being revised for future years in light of evidence that most graduates have only one email address and mobile number, and that having more contact details does not have a significant impact on response rates. Every contact detail submitted and approved by providers is used to send emails and SMS messages.

During the entire 13-week field period in each cohort, up to five emails and SMS messages are sent to all non-responding graduates and those partially completing the survey. The exact timing of these reminders varies slightly from one cohort to another and is communicated on the engagement plan which is published for each cohort on our website.


The first year of Graduate Outcomes has seen the implementation of several enhancements during and in between cohorts. The objective of these enhancements has always been the improvement of data quality and/or effectiveness of the data collection instrument which in turn leads to higher response rates. Some of the enhancements include:

  • Trialling email and SMS delivery on different days and at different times of day, and using paradata to inform future deliveries.
  • Recognising respondents who may have partially completed the survey, through targeted emails and SMS messages.
  • Using SMS messages flexibly as a prompt or to encourage a direct response.

One of the main changes to our data collection strategy is the use of pre-notification or “warm-up” emails to prospective respondents before the start of data collection. This was implemented for the first time in cohort D in the 17/18 collection year. All graduates with approved contact details in this cohort received a pre-notification email at least a week before they received the first invitation. The purpose of this exercise was twofold: to improve the take-up of online completion, and to ‘warm up’ our IP addresses so that the security filters used by respondents’ email service providers (e.g. Gmail and Microsoft) are more likely to recognise them as legitimate.

We have taken steps to risk-assess these improvements prior to implementation to minimise any likely impact on bias in the survey. Balancing the potential improvements in response rates and data quality with assessed risk of bias has been a key consideration, but in the case of all improvements implemented, we believe the balance of benefits has been compelling.

View the emails used in the engagement strategy and survey materials on the HESA website.

Telephone data collection


Telephone interviewing usually commences in week two of fieldwork. For the larger cohorts, graduates with no email address but a valid phone number are called in the first week, as that is the only mode of data collection available for them.

Calls are handled using an auto-dialler that randomly selects respondents from the entire sample and connects them to an available interviewer. Depending on its outcome, each call is marked as complete, incomplete or refusal. An incomplete status is further classified by the nature of the call and its outcome, for example ‘no reply’, ‘busy’ or ‘answer phone’. To maximise response rates, interviewers are also able to book appointments with respondents who wish to be contacted on certain days or at certain times of day.

As with email addresses and mobile numbers, a graduate can have up to 10 UK landline and international numbers, although this is being reviewed for future years (see online data collection for more information). All numbers are used to contact respondents and collect a valid response. Once a number has been used to make direct contact with a graduate, it is marked as ‘successful’ and used in all subsequent attempts. As advised by our contact centre, a mobile number is more likely to be unique to a graduate, so mobile numbers are used before landline and international numbers.
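The ordering rule described above can be illustrated with a minimal Python sketch. The `PhoneNumber` class, the `KIND_RANK` table and the `dialling_order` function are hypothetical names introduced here, not part of the contact centre's system. The sketch also assumes that numbers already marked ‘successful’ are tried first and that landlines precede international numbers, neither of which the text states explicitly.

```python
from dataclasses import dataclass


@dataclass
class PhoneNumber:
    number: str
    kind: str                 # "mobile", "landline" or "international"
    successful: bool = False  # True once it has reached the graduate directly


# Assumed ranking: mobiles ahead of landline and international numbers.
# The relative order of landline and international is an illustrative guess.
KIND_RANK = {"mobile": 0, "landline": 1, "international": 2}


def dialling_order(numbers):
    """Return a graduate's numbers in the order they might be tried."""
    # Successful numbers sort first (False < True), then by kind of number.
    # sorted() is stable, so equal-ranked numbers keep their submitted order.
    return sorted(numbers, key=lambda n: (not n.successful, KIND_RANK[n.kind]))
```

For example, a landline that has previously reached the graduate would be tried before an untested mobile number under these assumptions.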


The contact centre operates using a geo-dialling system, whereby the area code of the telephone number displayed to graduates matches that of the location of their university. Graduates are presented with a telephone number that is more familiar to them, increasing the likelihood of them answering a call rather than ignoring or rejecting it as they might from an unknown or unrecognisable number. This approach is supported more generally by existing best practice within the market research sector. As well as increasing the likelihood of graduates picking up the phone, it also dilutes the risk of a single number becoming blacklisted.

Despite the benefits of a geo-dialling system, the use of phone numbers that are visible but unknown to respondents does increase the likelihood that they will repeatedly ignore or even bar the calls, especially where they are called multiple times from the same number. It was therefore vital to consider any steps that could reduce this behaviour and so increase levels of response. During the second half of year one, the approach to geo-dialling was further enhanced by changing the telephone numbers used during fieldwork, once or multiple times, whilst retaining the geographical link to the area of each higher education provider (HEP).
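As a rough illustration of geo-dialling with number changes, the sketch below keeps a pool of caller IDs per provider, all sharing that provider's area code, and moves to the next one each time the number is changed. Everything here is an assumption made for illustration: the provider names, the placeholder numbers, the `CALLER_ID_POOLS` table and the `caller_id` function are not taken from the contact centre's actual dialler.

```python
# Hypothetical pools: provider ID -> caller IDs sharing that provider's area code.
# The provider names and numbers are placeholders, not real contact centre lines.
CALLER_ID_POOLS = {
    "provider_a": ["0161 496 0001", "0161 496 0002"],
    "provider_b": ["029 2018 0001"],
}


def caller_id(provider, change_index, pools=CALLER_ID_POOLS):
    """Caller ID to display for a provider after `change_index` number changes."""
    numbers = pools[provider]
    # Wrap around so the pool can be reused if changes outnumber its entries.
    return numbers[change_index % len(numbers)]
```

Because every number in a pool carries the same area code, changing the displayed number never breaks the geographical link to the provider.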

Third-party interviewing

During the second half of the field period, interviewers are advised to collect responses from third parties, where possible, and where a suitable proxy respondent (defined as a partner, relative, carer or close friend) is available. Only the mandatory questions are asked and subjective questions are excluded.

Interviewer training and development

To minimise interviewer error, the contact centre undertakes an extensive training exercise to train their interviewers on Graduate Outcomes. HESA worked with them to compile a set of guidance notes and training materials on every question in the survey. The training covers practical, theoretical and technical aspects of the job requirements. For quality control purposes, team leaders provide ongoing support throughout, enhancing interviewer skills and coaching around areas for improvement. This is carried out through top-up sessions, structured de-briefs and shorter knowledge sharing initiatives about “what works”.

For Graduate Outcomes, all interviewers receive a detailed briefing upon commencing interviewing, covering the purpose of the survey, data requirements (for example level of detail needed in certain free-text questions), running through each survey question, and pointing out areas of potential difficulty so objections and questions can be handled appropriately and sensitively.

Making calls and scripting

Interviewers are randomly allocated to respondents by the telephone dialler. This reduces the risk of creating interviewer-respondent clusters based on common characteristics. The only exception to this rule is the employment of Welsh speaking interviewers who are allocated to Welsh speaking respondents only.

Interviewers introduce the Graduate Outcomes survey as the reason for the call and state they are calling on behalf of the provider for the particular graduate. If asked for further information, they will explain that they are from a research agency that has been appointed by HESA to carry out this work. If required, the interviewer can also advise that the survey has been commissioned by the UK higher education funding and regulatory bodies.

All interviews are recorded digitally to keep an accurate record. A minimum of 5% of each interviewer’s calls are reviewed in full by a team leader. Quality control reviews are all documented using a series of scores. Should an interviewer score below the acceptable level, this is discussed with them along with the issue raised, an action plan is agreed and signed, and their work is subject to further quality control. Team leaders rigorously check tone and technique, data quality, and conduct around data protection and information security.
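One way to draw a review sample that never falls below the 5% minimum is sketched below in Python. The function name `review_sample` and the use of simple random selection are assumptions for illustration; the text does not say how team leaders choose which calls to review.

```python
import random


def review_sample(call_ids, min_percent=5, seed=None):
    """Pick calls for full review: at least `min_percent`% of one interviewer's calls."""
    calls = list(call_ids)
    if not calls:
        return []
    # Integer ceiling division, so the sample never drops below the minimum share.
    size = (len(calls) * min_percent + 99) // 100
    # Simple random selection without replacement (an illustrative assumption).
    return random.Random(seed).sample(calls, size)
```

Rounding up matters for small workloads: an interviewer with 41 calls would have 3 reviewed rather than 2.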

Recontacting graduates

Some of the data collected in the survey is coded by an external supplier, using national industry and occupational coding frameworks. Where the supplier is unable to code verbatim responses, these are returned to the contact centre, which tries to supply more detailed responses by listening back to the interview and, where necessary, calling the graduate again.

HESA collects regular feedback from interviewers on the handling of different questions and respondents with the aim of identifying survey or script modifications.

Postal data collection

A third and final mode of data collection used in Graduate Outcomes is postal. Under exceptional circumstances, where a higher education provider is unable to supply email addresses or phone numbers for graduates, survey questionnaires are sent by post to the residential address supplied by the provider. The number of records with only residential addresses is not permitted to exceed 5% of a provider’s population in a given cohort.
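The 5% limit can be expressed as a simple check, sketched here in Python. The function name `postal_share_ok` and its parameters are hypothetical, and the sketch assumes the limit is applied per provider and per cohort as a share of record counts.

```python
def postal_share_ok(address_only_count, cohort_population, cap_percent=5):
    """True if address-only records are within the cap for one provider and cohort."""
    if cohort_population <= 0:
        raise ValueError("cohort_population must be positive")
    # Integer comparison avoids floating-point rounding at the boundary.
    return address_only_count * 100 <= cap_percent * cohort_population
```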

The postal survey is a much shorter questionnaire, containing only a subset of the core survey questions that are required as a minimum to produce the main outputs. This is largely done to keep the survey short and minimise the level of navigation required due to routing. So far, the requirement for postal surveys has been minimal across all cohorts and approximately 10% of recipients have returned a completed questionnaire. Data from completed surveys is manually entered into the system by HESA.


Opt-outs

Graduates are able to opt out of the survey and any further communication through a number of different channels. The email invitations and online survey instrument provide direct access to information on how to opt out. Respondents can contact HESA at any point to request an opt-out or the deletion of their survey data or contact details, as per their rights under GDPR (this extends beyond the close of the survey, up to a fixed point which is outlined in the privacy notice).

Respondents can also refuse to take part in the survey over the phone, and interviewers are trained to handle such requests. Graduates can also get in touch with their providers to request an opt-out, before or after we commence surveying. Such requests are redirected to HESA for formal action. Respondents who opt out are marked as such on the survey data collection system, and all future communications cease within five working days of HESA receiving the request.
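The five-working-day window can be worked out as in the following Python sketch. It assumes that working days are Monday to Friday, that counting starts on the day after receipt, and that public holidays are ignored; the function name `cease_by` is hypothetical.

```python
from datetime import date, timedelta


def cease_by(received, working_days=5):
    """Latest date by which communications should stop after an opt-out request."""
    current = received
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        # Monday is 0 and Sunday is 6, so values below 5 are weekdays.
        if current.weekday() < 5:
            remaining -= 1
    return current
```

Under these assumptions, a request received on a Friday gives a deadline of the following Friday, because the intervening weekend is skipped.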

Further information about opt-outs can be found in our data protection FAQs.

Case prioritisation

While achieving a higher response rate can improve the precision of estimates, the impact this will have on bias is ambiguous. The reason for this is that non-response bias depends not only on the level of response, but also the discrepancy between respondent and non-respondent values. As the latter component can continue to widen as more individuals complete the survey, a better response rate will not necessarily solve the problem of bias. It has generally been the case that the post-collection procedure of weighting is applied as a solution to this issue. However, rather than simply relying on this technique on its own, it was concluded that trying to additionally address bias during the data gathering phase could bring supplementary benefits (e.g. less variable weights).
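The two components of non-response bias can be made explicit with a standard textbook identity, which is not taken from the source. Assume the response rate is $r$, the mean of a survey variable among respondents is $\bar{Y}_R$, the mean among non-respondents is $\bar{Y}_{NR}$, and the mean across all graduates is $\bar{Y}$. The symbols are introduced here for illustration only.

```latex
\begin{align}
\bar{Y} &= r\,\bar{Y}_R + (1 - r)\,\bar{Y}_{NR} \\
\bar{Y}_R - \bar{Y} &= (1 - r)\,\bigl(\bar{Y}_R - \bar{Y}_{NR}\bigr)
\end{align}
```

The second line is the bias of the unweighted respondent mean. Raising $r$ shrinks the first factor, but if the gap $\bar{Y}_R - \bar{Y}_{NR}$ widens at the same time, the product need not fall.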

Consequently, a case prioritisation approach was introduced for cohorts C and D in year one. (Due to competing operational commitments necessary for firmly establishing Graduate Outcomes as a data collection service, case prioritisation could not be introduced until cohort C.) This involved developing a response propensity model around halfway through the collection period for a cohort. The dependent variable was a binary indicator of whether the individual had responded to the survey, while the independent variables all related to demographic characteristics (e.g. sex, disability and age) and course characteristics (e.g. level and mode of study) available through the HESA student record.

Following the creation of the logit model (see note 3), each individual was assigned their predicted probability of responding, and these probabilities were ranked in order. Among those who hadn’t submitted the survey, the quartile with the lowest propensity scores was selected to be given extra priority.
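The scoring and selection steps can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not HESA's implementation: the function names `propensity` and `priority_sample` are hypothetical, the model coefficients are assumed to have been estimated elsewhere, and the quartile is taken over the graduates who have not yet submitted, which is one reading of the description above.

```python
import math


def propensity(intercept, coefficients, features):
    """Predicted response probability from an already fitted logit model."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, features))
    # Inverse logit (logistic function) maps the linear predictor to (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))


def priority_sample(scores, responded):
    """Return IDs of the lowest-propensity quarter of graduates yet to submit.

    scores: dict of graduate ID -> predicted probability of responding
    responded: set of graduate IDs that have already submitted the survey
    """
    outstanding = [gid for gid in scores if gid not in responded]
    # Rank from least to most likely to respond; ties are broken by ID.
    outstanding.sort(key=lambda gid: (scores[gid], gid))
    # Floor division: with fewer than four outstanding cases nothing is selected.
    return outstanding[: len(outstanding) // 4]
```

With eight outstanding graduates, for instance, the two with the lowest predicted probabilities would be passed to the priority group of interviewers.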

The priority sample was identified on the Computer Assisted Telephone Interviewing (CATI) system and allocated to a group of interviewers for a few weeks, towards the end of field period. This was done to enable a more concentrated effort to contact non-respondents who are least likely to respond to the survey. In theory, this would not only result in more calls per graduate for this group but also a higher response rate than what would be achieved if they were part of the main sample. This approach was first used in cohort C with the aim of identifying operational improvements that were subsequently implemented in cohort D and continue in year two of the survey.

Welsh language requirements

HESA is committed to providing access to Graduate Outcomes in Welsh, recognising the importance of ensuring Welsh speakers are not treated disadvantageously in comparison to English speaking graduates. Working alongside the Welsh funding and regulatory body, we have contracted with a partner organisation to undertake all English to Welsh translation work for Graduate Outcomes. This includes the logo, Graduate Outcomes website, the survey, script, results, email and SMS text.

Following feedback from Welsh providers, HESA undertook a review of the approach to communication with graduates and it was agreed to adopt a nuanced approach based on Welsh language proficiency. We now offer all communications in Welsh, English or bilingual modes, depending on a graduate’s ability to speak fluently in Welsh.

3. A statistical technique that is used to investigate the relationship between the probability of a binary outcome and a set of explanatory variables.