
The sample

Graduate Outcomes is a population-scale survey (or colloquially, a census[1]). Our goal is to contact the entire sampling frame. The sampling frame and the sample are therefore largely synonymous.

A marker was developed to identify the sampling frame from within the HESA Student record(s), and appropriate file(s) were extracted. Similar logic was applied by the suppliers of the college HE data not collected by HESA. The datasets were then combined – no matching or linking was required.

‘Base population’ is our term for the dataset that comprises the entire sampling frame. It includes every graduate who falls within our coverage statement, including those for whom we have inadequate, ineffective, or missing contact details, for whatever reason[2]. Hence, the survey sample is identical to the sampling frame. Graduates who exercise their right to opt out of the survey also remain in the denominator for response rates.
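The effect of this denominator choice can be illustrated with a minimal sketch. It is not HESA's production calculation: the `Graduate` record, its field names, and the four example graduates are hypothetical, and what counts as a response is left as a simple flag. The sketch only shows that graduates without usable contact details, and graduates who opt out, still count in the denominator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Graduate:
    """Hypothetical record for one member of the base population."""
    graduate_id: str
    has_usable_contact_details: bool
    opted_out: bool
    responded: bool


def response_rate(base_population):
    """Respondents divided by the whole base population.

    Nobody is removed from the denominator: graduates with missing
    contact details and graduates who opted out are still counted.
    """
    if not base_population:
        raise ValueError("base population is empty")
    respondents = sum(1 for g in base_population if g.responded)
    return respondents / len(base_population)


# Illustrative data only, not real survey figures.
base_population = [
    Graduate("A", True, False, True),    # contacted and responded
    Graduate("B", True, False, False),   # contacted, no response
    Graduate("C", False, False, False),  # no usable contact details
    Graduate("D", True, True, False),    # opted out
]

rate = response_rate(base_population)
print(f"{rate:.0%}")  # 1 respondent out of 4 graduates -> 25%
```

Excluding graduates C and D from the denominator would raise the figure to 50%, so response rates calculated against the full base population are the more conservative measure.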

Response rate targets form part of the survey design. The targets are high because many users want to analyse smaller sub-samples, which requires keeping unit non-response to a minimum. Targets were set in October 2018, and further information on these is available in the Survey methodology[3]. HESA’s engagement strategy is the main tool for seeking high response rates[4].

Progress towards these targets (along with updates on the operational management of the survey) has previously been reported in a series of end of cohort reviews, published regularly on the HESA website up until the end of the second year of surveying[5]. Since then, a summary of response rates has been released at the end of each cohort in the weekly newsletter issued to the entire HE sector. An end of collection infographic is also published at the end of each collection year, containing provisional response rates and operational metrics. Final response rates, by domicile and mode of study, are published in the Statistical Bulletin, with response rates by provider, domicile, level of qualification, and subject of study included in the subsequent Open Data.

We cover issues related to non-response in the next two sections.


[1] Sometimes Graduate Outcomes is referred to as a “census”. Strictly, a census enumerates a population, which is the central function of the HESA Student record. We use our pre-existing census data from the Student record(s) to construct a sampling frame for the Graduate Outcomes survey. We make no attempt to gather survey responses from graduates outside the sampling frame. However, there is no standard statistical term to describe a survey of (effectively) a whole population. It is fine to call Graduate Outcomes a census in everyday usage, but we hope the term “population-scale survey” gets the same point across without falling into error.

[2] Our approach to collecting contact details means we may still manage to contact these graduates, if adequate contact details are supplied during the period of fieldwork.

[4] We do not publish the full engagement strategy. Instead, we provide an outline plan for each cohort, updated quarterly here:
For an example of a more discursive account of the kinds of activities involved, see this blog post:

[5] For a full list of mid-point and end of cohort reviews from the 2018/19 cycle, with infographics, see: