
Data collection


Graduate Outcomes data for a given academic year is collected in four instalments, known as cohorts. Each cohort represents a group of graduates who completed their course during a certain period approximately 15 months prior to the start of data collection. Figure 1 outlines the data collection plan for the 2018/19 collection year:

Figure 1: Data collection plan for the 18/19 collection

Cohort   | End date of course       | Contact period (c. 15 months after the end date) | Census week (week commencing)
Cohort A | 2018-08-01 to 2018-10-31 | 2019-12-01 to 2020-02-29                         | 2019-12-02
Cohort B | 2018-11-01 to 2019-01-31 | 2020-03-01 to 2020-05-31                         | 2020-03-02
Cohort C | 2019-02-01 to 2019-04-30 | 2020-06-01 to 2020-08-31                         | 2020-06-01
Cohort D | 2019-05-01 to 2019-07-31 | 2020-09-01 to 2020-11-30                         | 2020-09-01
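
The mapping from a graduate's course end date to a cohort in Figure 1 can be sketched in a few lines of Python. This is an illustration only: the date windows are copied from Figure 1, while the names `COHORT_WINDOWS` and `cohort_for` are hypothetical and do not describe HESA's systems.

```python
from datetime import date

# Course end-date windows for the 2018/19 collection, copied from Figure 1.
COHORT_WINDOWS = {
    "A": (date(2018, 8, 1), date(2018, 10, 31)),
    "B": (date(2018, 11, 1), date(2019, 1, 31)),
    "C": (date(2019, 2, 1), date(2019, 4, 30)),
    "D": (date(2019, 5, 1), date(2019, 7, 31)),
}


def cohort_for(course_end: date) -> str:
    """Return the cohort letter whose window contains the course end date."""
    for cohort, (start, end) in COHORT_WINDOWS.items():
        if start <= course_end <= end:
            return cohort
    raise ValueError(f"{course_end} is outside the 2018/19 collection year")
```

For example, a course ending on 2019-03-15 falls in the cohort C window, so `cohort_for(date(2019, 3, 15))` returns `"C"`.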

As not all graduates will have access to the internet (or a telephone), the survey adopts a mixed mode design to maximise contact with respondents. The primary modes of data collection in every cohort are web and telephone, with several strategies (outlined below) that look to maximise response rates. Postal surveys are also used for a small number of graduates with no other contact details except a residential address.

The two main modes of data collection interact with each other seamlessly in that respondents starting the survey on one mode can easily finish it on another, without having to start at the beginning. They are also able to access the survey online, multiple times, until they reach the end and submit all of their responses. Respondents can choose not to complete the survey over the phone, and, in such instances, interviewers can transfer a respondent to the online survey by sending a link to the survey via email instantaneously.

Online data collection

About

Data collection commences a few weeks prior to the start of a cohort with a pre-notification email sent to all graduates with an email address. This strategy was implemented for the first time in year one cohort D. The aim is to introduce the survey to prospective respondents and encourage them to look out for an invitation email. Once the cohort opens, an invitation email is sent to all graduates, using email addresses submitted by providers. The email contains a survey link that is unique to every graduate. This is followed by an SMS (usually the following day but it can take longer for larger cohorts) to UK mobile numbers only. All graduates therefore receive a form of invitation in the first week of data collection. Telephone follow-ups with all non-respondents commence in the second week. Respondents who only partially complete the survey online are given a few weeks to complete it online before they are contacted by telephone.

In year one, providers were asked to submit up to a maximum of ten email addresses and mobile numbers per graduate. This requirement is being revised for years three and four in light of evidence that most graduates only have one email address and mobile number and having more contact details does not have a significant impact on response rates. Every single contact detail submitted and approved by providers is used to send emails and SMS messages.

During the entire 13-week field period in each cohort, five to eight emails and SMS messages are sent to all non-responding graduates and those partially completing the survey. The exact number and timing of these reminders vary slightly from one cohort to another and are communicated in the engagement plan, which is published for each cohort on our website.

Enhancements

The second year of Graduate Outcomes was largely consistent with the first, positively demonstrating a stabilisation of the methodology. The following enhancements were introduced during the year:

  • Introduction of a new postcode validator to improve the quality of data collected.
  • Improvements to the presentation of salary and well-being questions to collect more reliable data in response to sensitive questions.
  • Use of experiments to optimise engagement strategy.

As part of our ongoing evaluation of best practices to collect survey data, we explored the use of a single number for telephone interviews versus geo-referenced phone numbers to increase uptake of telephone calls. Having reviewed the limited amount of literature on this topic and considering the possible negative impact on existing awareness raising campaigns and the successful administration of existing CATI services, we have decided to continue with our existing approach which uses geolocated telephone numbers (based on provider’s location) as opposed to a single number for all graduates at all providers. We will revisit this position if CATI response rates experience a dramatic unexplained decline.

Telephone data collection

About

Telephone interviewing usually commences in week two of field work. For the larger cohorts, graduates with no email addresses but a valid phone number are called in the first week as that is the only mode of data collection available for them.

Calls are handled using an auto-dialler that randomly selects respondents from the entire sample and connects them to an available interviewer. Depending on the outcome of the call, it is marked as a complete, incomplete or refusal. An incomplete status is further classified according to the nature of the call and its outcome, for example, ‘no reply’, ‘busy’, ‘answer phone’, etc. To maximise response rates, interviewers are also able to book appointments with respondents who wish to be contacted on certain days or at certain times of day.

As with email addresses and mobile numbers, a graduate can have up to ten UK landline and international numbers, although this is being reviewed for future years (see online data collection for more information). All numbers are used to contact respondents and collect a valid response. Once a number has been used to make direct contact with a graduate, it is marked as ‘successful’ and used in all subsequent attempts. As advised by our contact centre, mobile numbers are more likely to be unique to the graduate, so they are used before landline and international numbers.
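
The ordering rules described above can be sketched as a simple sort. This is a minimal illustration under stated assumptions: the `PhoneNumber` class and `dialling_order` function are hypothetical names, and the relative order of landline and international numbers is not stated in this report, so they are left in the order submitted.

```python
from dataclasses import dataclass


@dataclass
class PhoneNumber:
    number: str
    kind: str                 # "mobile", "landline" or "international"
    successful: bool = False  # set once direct contact has been made


def dialling_order(numbers: list[PhoneNumber]) -> list[PhoneNumber]:
    """Order a graduate's numbers for the next contact attempt.

    Numbers already marked 'successful' come first, then mobiles, then
    everything else. sorted() is stable, so ties keep their submitted order.
    """
    return sorted(numbers, key=lambda n: (not n.successful, n.kind != "mobile"))
```

Given a landline, a mobile and an international number where only the international number has previously reached the graduate, the sketch would try the international number first, then the mobile, then the landline.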

Geo-dialling

The contact centre operates using a geo-dialling system, whereby the geographical location of providers is taken into consideration. Graduates are presented with a familiar area code, increasing the likelihood of them answering a call rather than ignoring or rejecting it as they might do from an unknown/unrecognisable number. This approach is supported more generally by existing best practice within the Market Research sector. As well as increasing the likelihood of graduates picking up the phone, it also dilutes the risk of a single number becoming blacklisted.

Despite the benefits of a geo-dialling system, the use of phone numbers that are visible but unknown to respondents does increase the likelihood that they will repeatedly ignore or even bar the calls, especially where they are called multiple times from the same number. It was therefore vital to consider any steps that could be taken to reduce this behaviour, with a view to increasing levels of response. During the second half of year one, the approach was accordingly enhanced by changing the telephone numbers used for some of the fieldwork period.

Third-party interviewing

During the second half of the field period, interviewers are advised to collect responses from third parties, where possible, and where a suitable proxy respondent (defined as a partner, relative, carer or close friend) is available. Only the mandatory questions are asked, and subjective questions are excluded.

Interviewer training and development

To minimise interviewer error, the contact centre runs an extensive training exercise for its interviewers on Graduate Outcomes. HESA worked with the contact centre to compile a set of guidance notes and training materials on every question in the survey. The training covers practical, theoretical and technical aspects of the job requirements. For quality control purposes, team leaders provide ongoing support throughout, enhancing interviewer skills and coaching around areas for improvement. This is carried out through top-up sessions, structured de-briefs and shorter knowledge sharing initiatives about ‘what works’.

All interviewers receive a detailed briefing upon commencing interviewing, covering the purpose of the survey, data requirements (for example level of detail needed in certain free-text questions), running through each survey question, and pointing out areas of potential difficulty so objections and questions can be handled appropriately and sensitively.

Making calls and scripting

Interviewers are randomly allocated to respondents by the telephone dialler. This reduces the risk of creating interviewer-respondent clusters based on common characteristics. The only exception to this rule is the employment of Welsh speaking interviewers who are allocated to Welsh speaking respondents only.

Interviewers introduce the Graduate Outcomes survey as the reason for the call and state they are calling on behalf of the provider for the particular graduate. If asked for further information, they will explain that they are from a research agency that has been appointed by HESA to carry out this work. If required, the interviewer can also advise that the survey has been commissioned by the UK higher education funding and regulatory bodies.

All interviews are recorded digitally to keep an accurate record of interviews. A minimum of 5% of each interviewer’s calls are reviewed in full by a team leader. Quality control reviews are all documented using a series of scores. Should an interviewer have below acceptable scores, this will be discussed with them, an action plan agreed and signed, and their work further quality controlled. Team leaders rigorously check for tone/technique, data quality, and conduct around data protection and information security.
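
As an illustration of the 5% minimum, the sketch below picks a review sample for one interviewer. The report does not say how calls are chosen for review, so random selection is an assumption here, and `qc_sample` is a hypothetical name. The sample size is rounded up with integer arithmetic so that the minimum is always met.

```python
import random


def qc_sample(call_ids, percent=5, seed=None):
    """Pick at least `percent`% of an interviewer's calls for full review.

    Assumption: calls are drawn at random; the report only states the minimum.
    """
    calls = list(call_ids)
    if not calls:
        return []
    # Ceiling of len(calls) * percent / 100, computed without floating point.
    k = (len(calls) * percent + 99) // 100
    return random.Random(seed).sample(calls, k)
```

With 101 recorded calls the sketch selects 6 for review, since 5 calls would fall just short of 5%.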

Recontacting graduates

Some of the data collected on the survey is coded by an external supplier, using national industry and occupational coding frameworks. Where the supplier is unable to code verbatim responses, these are returned to the contact centre, which tries to supply more detailed responses by listening back to the interview and, where necessary, calling the graduate again.

HESA collects regular feedback from interviewers on the handling of different questions and respondents with the aim of identifying survey or script modifications.

Postal data collection

A third and final mode of data collection used in Graduate Outcomes is postal. Under exceptional circumstances, where a higher education provider is unable to supply email addresses or phone numbers for graduates, survey questionnaires are sent by post to the residential address supplied by the provider. The number of records with only residential addresses is not permitted to exceed 5% of a provider’s population in a given cohort.

The postal survey is a much shorter questionnaire, containing only a subset of the core survey questions that are required as a minimum to produce the main outputs. This is largely done to keep the survey short and minimise the level of navigation required due to routing. So far, the requirement for postal surveys has been minimal across all cohorts and approximately 10% of recipients have returned a completed questionnaire. Data from completed surveys is manually entered into the system by HESA.

Opt-outs

Graduates are able to opt out from the survey and any further communication through a number of different channels. The email invitations and online survey instrument provide direct access to information on how to opt out. Respondents can contact HESA at any point to request an opt-out or deletion of their survey data or contact details as per their rights under GDPR (this extends after the survey closes up to a fixed point which is outlined on the privacy notice).

Respondents can also refuse to take part in the survey over the phone, and interviewers are trained to handle such requests.

Graduates can also get in touch with their providers to request an opt-out. Such requests are redirected to HESA for formal action. Respondents who opt out are marked as such on the survey data collection system, and all future communications cease within five working days from receipt of the request.
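
The five-working-day window can be made concrete with a short date calculation. This is a sketch under assumptions not stated in the report: working days are taken to be Monday to Friday, bank holidays are ignored, and counting starts on the day after the request is received. The function name `cease_by` is hypothetical.

```python
from datetime import date, timedelta


def cease_by(received: date, working_days: int = 5) -> date:
    """Latest date by which communications should stop after an opt-out request."""
    current = received
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4; weekends are skipped
            remaining -= 1
    return current
```

Under these assumptions, a request received on Monday 2020-03-02 gives a deadline of Monday 2020-03-09.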

Case prioritisation

While achieving a higher response rate can improve the precision of estimates, the impact this will have on bias is ambiguous. This is because non-response bias depends not only on the level of response, but also on the discrepancy between respondent and non-respondent values. As the latter component can continue to widen as more individuals complete the survey, a better response rate will not necessarily solve the problem of bias. It has generally been the case that the post-collection procedure of weighting is applied as a solution to this issue (for details on weighting, refer to the Data analysis section of this report). However, rather than relying on this technique alone, it was concluded that additionally addressing bias during the data gathering phase could bring supplementary benefits (e.g., less variable weights).
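
The dependence of bias on both components can be written down using the standard textbook expression for the bias of a respondent mean. It is not taken from this report, and the notation is introduced here for illustration: of N sampled graduates, R respond and M = N - R do not, with respective means \bar{y}_R and \bar{y}_M, and \bar{y}_N is the mean over all N.

```latex
\[
\operatorname{bias}(\bar{y}_R)
  = \bar{y}_R - \bar{y}_N
  = \frac{M}{N}\,\bigl(\bar{y}_R - \bar{y}_M\bigr),
\qquad
\bar{y}_N = \frac{R\,\bar{y}_R + M\,\bar{y}_M}{N}.
\]
```

As a purely hypothetical example, with a 50% response rate and means of 80% for respondents and 70% for non-respondents, the bias is 0.5 × 10 = 5 percentage points. If the response rate rises to 60% but the remaining non-respondents average 65%, the bias becomes 0.4 × 15 = 6 percentage points, so a higher response rate has not reduced it.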

In Year 2, we continued using a Case Prioritisation process to target those least likely to respond, thereby achieving a more representative sample across a range of groups. This involved developing a statistical model to determine the propensity of a graduate to respond to the survey around halfway through the collection period for a cohort. The dependent variable was a binary indicator of whether the individual had responded to the survey, while the independent variables related to demographic characteristics (e.g. sex, disability, age, etc.) and course characteristics (e.g. level and mode of study) available through the HESA student record. Those identified as being least likely to respond were given extra priority in the latter stages of the collection cycle.
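
A response propensity model of this general kind can be sketched with a small logistic regression. This is an illustration only: the report does not give the model specification, the estimation method or the cut-off used, so the gradient-descent fit, the `fraction` parameter and all function names below are assumptions.

```python
import math


def predict(w, x):
    """Estimated probability of response for feature vector x; w[0] is the intercept."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit a logistic regression by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xij in enumerate(xi):
                grad[j + 1] += err * xij
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def priority_sample(ids, X, responded, fraction=0.25):
    """Return IDs of the non-respondents with the lowest estimated propensity.

    The model is fitted on every graduate's response status at the mid-point
    of the field period; only those who have not yet responded are ranked.
    """
    w = fit_logistic(X, responded)
    pending = sorted(
        (predict(w, x), i) for i, x, r in zip(ids, X, responded) if not r
    )
    k = max(1, round(fraction * len(pending)))
    return [i for _, i in pending[:k]]
```

Here each row of `X` would hold numerically coded characteristics from the student record (for instance 0/1 flags for mode or level of study), and `responded` is the binary indicator described above. The IDs returned are the cases that would be passed to interviewers for extra effort.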

The priority sample was identified on the Computer Assisted Telephone Interviewing (CATI) system and allocated to a group of interviewers for a few weeks towards the end of the field period. This was done to enable a more concentrated effort to contact the non-respondents who are least likely to respond to the survey. In theory, this would result not only in more calls per graduate for this group but also in a higher response rate than would be achieved if they were part of the main sample.

Since cohort D of year two, we have focused solely on UK domiciled graduates in our case prioritisation work (excluding those who attended FECs due to limited data availability). This enables a greater breadth of variables to be utilised when determining those with the lowest propensity to respond to the survey. For example, where appropriate, we can now incorporate variables relating to entry qualifications and socioeconomic status. Provider of study is also taken into account when classifying those with the lowest likelihood of partaking in Graduate Outcomes. Furthermore, we are currently undertaking a programme of work aiming to enhance the breadth and depth of FEC data available to us, with a view to including such colleges in our case prioritisation process when circumstances allow.

While our case prioritisation process currently considers personal and course attributes only, we will be analysing the quality of our paradata in future years and assessing whether this could be introduced into the process.

Welsh language requirements

HESA is committed to providing access to Graduate Outcomes in Welsh, recognising the importance of ensuring Welsh speakers are not treated disadvantageously in comparison to English speaking graduates. Working alongside the Welsh funding and regulatory body, we have contracted with a partner organisation to undertake all English to Welsh translation work for Graduate Outcomes. This includes the logo, Graduate Outcomes website, the survey, script, results, email and SMS text. All communications are offered in Welsh, English or bilingual modes, depending on a graduate’s ability to speak fluently in Welsh.

Data collection and our response to the COVID-19 situation

While the first cases of COVID-19 were detected in the country during cohort A, the first set of national restrictions in the UK was introduced during cohort B. Given that cohort A was complete by the end of February and included a large international sample, and that cohort B was already in progress for much of the early onset of the pandemic in Europe and other parts of the world, the scope to change the survey significantly was limited during those initial stages.

Therefore, our focus was on ensuring that graduates could self-administer the survey to accurately reflect their personal situation and that interviewers could support participants sensitively and appropriately. The following survey changes were implemented for cohort C following approval by the Graduate Outcomes steering group:

Furloughed staff

To ensure that respondents who are furloughed under the new Government scheme (remaining technically employed) select the correct option at ‘What activities were you doing in [CENSUS WEEK]’, we added text to code 01 (Paid work) to clarify that this does include furloughed employees.

Subjective wellbeing

With higher levels of unemployment / volunteering / care-giving / interruptions to further education, a greater percentage of respondents are likely to skip past the work / study sections and arrive at the wellbeing questions relatively early on in the survey. Under the current circumstances, this may feel insensitive to some individuals. We have added supportive text to the survey (in both the online and CATI versions) which signposts participants to mental health and wellbeing organisations across the world (the Samaritans, Befrienders Worldwide and Mind).

Our contact centre provided special training and support to all interviewers, enabling them to handle sensitive conversations with respondents.
