
Sampling error and non-response error

Sampling error is the difference between a population value and an estimate based on a sample, and is one of the components of total survey error. It is normal for a quality report on a sample survey to offer a caveat explaining that, in principle, many random samples could be drawn and each would give different results, because each sample would be made up of different people, who would give different answers to the questions asked. The spread of these results is the sampling variability. However, sampling error occurs because estimates are based on a sample rather than a census. As we have previously demonstrated, Graduate Outcomes is a population-scale survey[1] where the sample is identical with the sampling frame, and the sampling frame resembles the population of interest very closely. While we know that the quality and availability of contact details must affect the response rate we can achieve from the sample, developing a comprehensive measure of quality is a complex exercise in the absence of a perfect and accessible descriptor of quality. We are, however, making significant improvements in our understanding of the various facets of quality, as described in the Sampling frame data based on HESA data collections section. We aspire to provide response rates not just as a proportion of the target population but also as a proportion of the contactable population. For now, the response rate achieved is itself our best indicator of the quality of contact details. Hence, our analytical focus in this section is on the extent to which the achieved sample is representative of the population, that is, on non-response error.

This section comprises two subsections, which cover the strategies HESA has followed to limit the practical effects of missing responses. In conducting a survey, one of the main types of non-sampling error that can arise is that resulting from non-response. Whilst a lower level of response causes a reduction in the precision of obtained estimates, the impact of response rates on bias is ambiguous[2]. The two types of error in this category are unit non-response[3] and item non-response[4]. We cover issues related to these in the next two sections.

Unit non-response error

Unit non-response occurs where a graduate does not respond to the survey. A poor response rate will result in less precision in any estimates we generate. Its effect on bias is less certain. Bias is determined by two components[5]: the response rate, and the difference between respondent and non-respondent values. Hence, a better response rate can be associated with increased bias if the discrepancy between those who respond to the survey and those who do not grows larger. Consequently, attempting to maximise response rates will not necessarily minimise non-response bias[6].
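The interaction of the two components can be illustrated with a short calculation. The sketch below is not HESA's method; it uses the standard identity that the respondent mean differs from the full-population mean by the non-response rate multiplied by the gap between respondent and non-respondent means. The function name and all figures are hypothetical, chosen only to show that a higher response rate can coincide with a larger bias.

```python
def nonresponse_bias(response_rate, respondent_mean, nonrespondent_mean):
    """Bias of the respondent mean relative to the full-population mean.

    Follows from: population mean =
        response_rate * respondent_mean
        + (1 - response_rate) * nonrespondent_mean
    """
    if not 0.0 < response_rate <= 1.0:
        raise ValueError("response_rate must be in (0, 1]")
    nonresponse_rate = 1.0 - response_rate
    return nonresponse_rate * (respondent_mean - nonrespondent_mean)


# Hypothetical figures: proportion of graduates in some outcome category.
# Scenario A: 50% response, respondents and non-respondents are similar.
bias_a = nonresponse_bias(0.50, respondent_mean=0.70, nonrespondent_mean=0.68)
# Scenario B: 60% response, but the two groups differ much more.
bias_b = nonresponse_bias(0.60, respondent_mean=0.75, nonrespondent_mean=0.60)

print(f"Scenario A (50% response): bias = {bias_a:+.3f}")
print(f"Scenario B (60% response): bias = {bias_b:+.3f}")
```

In practice the non-respondent mean is unknown, which is why the comparison of sample and population characteristics described later in this section is used instead.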

A number of elements of the survey design are intended to maximise response rates, and an overview is offered in the operational survey information on the HESA website[7]. These include:

  • A website aimed at respondents to reinforce the legitimacy and credentials of the survey[8]
  • A smartphone-optimised survey
  • Allowing the survey to be completed in more than one stage, whether online, by telephone, or using a mixture of both modes
  • Bespoke email invitations and reminders that include the name of the graduate and their provider
  • A dynamic engagement strategy informed by best practice and survey paradata
  • Using a data collection platform that seamlessly integrates all modes together
  • The adoption of a concurrent mixed-mode design (computer-assisted telephone interviewing (CATI) starts a week after the online system opens, and those who start online are not followed up until much later in the field period)
  • Increasing the convenience of responding for graduates, by making appointments for telephone interviews at times that suit them
  • Collecting proxy responses from half-way through the fieldwork period.

For the rest of this section we cover the specifics of our approach where non-response bias is concerned. Root cause remediation is one of the practices HESA adopts to proactively manage data quality[9]. In this case, our goal was to reduce data quality issues arising during collection. Historically, organisations that have administered surveys have relied upon methods executed after collection (i.e. weighting) to deal with the challenge of non-response. Yet, over the last decade, those working in this area have increasingly looked at whether anything can also be done during the data gathering phase. Work by the Netherlands’ official statistics agency[10] points to the advantages in attempting to do this, such as improved precision due to less variable weights. In trying to reduce non-response bias, other authors highlight the potential benefit of developing propensity models and subsequently diverting more attention, in the latter stages of the collection process, to those individuals with a lower likelihood of responding[11]. An adaptive survey design methodology was therefore designed and implemented from cohort C of the first year of the survey onwards. It was subject to a quarterly refinement process in which opportunities for improvements to the response propensity model were identified and, where possible, implemented by analysts. Whilst the premise is well established and could, in theory, have been effective, subsequent review of case prioritisation indicated to the survey data collection team that our approach to prioritisation was ineffective and burdensome. Further details of the findings from the last three years, and the concerns highlighted as a result, are covered in the section of the Survey methodology covering data collection[12]. Regardless of the steps taken during the data collection stages, the resulting data must be assessed and, if necessary, action taken to address bias. This is referred to as “weighting” the survey.

The overarching objective of weighting is to enable the sample to be adjusted such that it is more representative of the population[13]. Most surveys are weighted following collection. However, the Graduate Outcomes survey has some unusual features, such as a large sample size, an adaptive survey design, and a concurrent mixed-mode data collection approach. Over the last few years HESA, along with academic partners, has undertaken various investigations into the application of weights to the survey estimates and their impact. The conclusion of every assessment has been the same: there is no evidence of bias relating to mismatch between the achieved sample and graduate population characteristics in any direction at sector level. Indeed, when analysing across a range of demographic and course variables, we found a high level of similarity between the sample and population distributions. We trialled various weighting methods, and these did not improve the quality of our estimates. Overall, across the breadth of HESA variables analysed, we generally observe close resemblance between the sample and the population, reducing concerns over potential bias. For a summary of our research and the findings, see the Survey methodology section on data analysis[14].
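To make the idea of adjusting a sample to known population totals concrete, the following is a minimal sketch of simple post-stratification weights: for each category, the population share divided by the sample share. It is a generic illustration, not one of the weighting methods HESA trialled, and the category labels and counts are hypothetical. When the sample distribution already matches the population, every weight is 1 and weighting changes nothing, which is the situation the findings above describe.

```python
def poststratification_weights(population_counts, sample_counts):
    """Return a weight per category: population share / sample share."""
    pop_total = sum(population_counts.values())
    sample_total = sum(sample_counts.values())
    weights = {}
    for category, pop_n in population_counts.items():
        sample_n = sample_counts.get(category, 0)
        if sample_n == 0:
            # A category with no respondents cannot be weighted up.
            raise ValueError(f"no respondents in category {category!r}")
        weights[category] = (pop_n / pop_total) / (sample_n / sample_total)
    return weights


# Hypothetical counts by mode of study.
population = {"full-time": 8000, "part-time": 2000}
representative_sample = {"full-time": 4000, "part-time": 1000}
skewed_sample = {"full-time": 4500, "part-time": 500}

print(poststratification_weights(population, representative_sample))
print(poststratification_weights(population, skewed_sample))
```

In the skewed case the under-represented category receives a weight above 1 and the over-represented category a weight below 1; more variable weights of this kind reduce precision.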

Some statistics published from the Graduate Outcomes survey are at a very granular level, e.g. activity by provider, domicile, level of qualification and mode of qualification. In some cases, the sample size for such statistics may be small. In these cases, the statistics may be subject to high levels of variability and a lack of statistical precision. Confidence intervals on these statistics (ranges within which we have a high level of confidence that the equivalent whole-population parameter would fall, where a narrow range indicates greater precision and a wide range indicates less precision) are, for key tables, published alongside the data.
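The relationship between sample size and interval width can be shown with a short sketch. The method HESA uses for its published confidence intervals is not stated here, so the Wilson score interval for a proportion is used below purely as an assumption, and the counts are hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical: 60% of respondents in a given activity, at two sample sizes.
small = wilson_interval(12, 20)       # e.g. a very granular breakdown
large = wilson_interval(1200, 2000)   # e.g. a sector-level figure

print(f"n=20:   {small[0]:.3f} to {small[1]:.3f}")
print(f"n=2000: {large[0]:.3f} to {large[1]:.3f}")
```

The same observed proportion yields a much wider interval at the smaller sample size, which is why granular statistics need to be read alongside their intervals.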

In addition, for some statistics, it may be necessary to introduce publication thresholds whereby statistics based on very small sample sizes and/or lower response rates are suppressed – this will be explained in any statistical releases where this decision is taken[15].

Research to date therefore indicates there is no evidence of measurable non-response bias in the data. We are fortunate to be able to link to good data on population characteristics to support these assessments. The risk of non-response bias appears to have been minimised by features such as the relatively high response rates. Despite this, it is not easy to quantify the extent to which non-response bias remains a problem. There may be variables that we are not currently measuring that are more strongly correlated with unit non-response. The Longitudinal Educational Outcomes data offers a suitable external source for analysis of bias, and undertaking this work forms part of our future plans. Survey paradata may also prove useful in this respect in future. Users of Graduate Outcomes microdata may wish to conduct their own analyses to ensure the Graduate Outcomes data supports their analytical objectives. However, users should be reassured that there is no evidence to suggest that measurable non-response bias is present in the Graduate Outcomes survey data.

Item non-response error

Item non-response occurs where a value for a particular variable is missing for a graduate, in a case where this observation was expected. In our survey, this typically occurs when respondents decline to answer particular questions. No single graduate is expected to answer all available survey questions. A routing structure directs respondents to particular sets of questions that are most relevant to their circumstances[16]. Furthermore, optional questions will not be presented to all respondents. So, some data will not be present, but this does not mean it is missing – it may never have been sought, as it was not relevant to be asked in that case. In HESA’s publications, these issues will be made clear in the data and the notes, for example by indicating the sample used to produce a table or chart in its title, and by enumerating the unknown values. Researchers and other microdata users in particular will need to note this feature of the survey.

A derived field (ZRESPSTATUS[20]) describes the status of response to the Graduate Outcomes survey for each graduate for whom some (however minimal) results data has been received. A core set of mandatory questions[21] are required to be completed for a response to be marked as completed. This field classifies responses into categories denoting various states of completeness. The terms ‘complete’ and ‘full response’[22] are used interchangeably to refer to those cases where all the questions requiring a response have been completed and are populated with an answer. In addition to responses classified as ‘survey completed’,[23] a status of ‘partially completed’[24] has been assigned where some of the core questions are missing but the first two questions have been answered.[25] Although partially completed responses do not contribute to the survey’s response rate targets, partially complete responses are used alongside ‘survey completed’ responses in statistical outputs. Again, data from such responses will appear in published statistics in the following ways: in tables with numbers, unknown values are shown for questions that were not answered. Wherever we display % values, we exclude unknowns from the calculations. The sample used will be clear in the title or accompanying text.
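For microdata users, the convention of showing unknowns in counts but excluding them from percentages can be sketched as follows. The category labels, counts and function names are hypothetical and do not reflect the actual coding frame; the item response rate here is simply the share of the stated base with a known answer.

```python
def percentages_excluding_unknown(counts, unknown_label="Unknown"):
    """Percentage per category, with unknown values left out of the base."""
    known = {k: v for k, v in counts.items() if k != unknown_label}
    known_total = sum(known.values())
    if known_total == 0:
        raise ValueError("no known values to base percentages on")
    return {k: 100.0 * v / known_total for k, v in known.items()}


def item_response_rate(counts, unknown_label="Unknown"):
    """Percentage of the base (everyone routed to the question) who answered."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty base")
    return 100.0 * (total - counts.get(unknown_label, 0)) / total


# Hypothetical counts for one question, including unanswered cases.
counts = {"Permanent contract": 600, "Fixed-term contract": 300, "Unknown": 100}

for category, pct in percentages_excluding_unknown(counts).items():
    print(f"{category}: {pct:.1f}%")
print(f"Item response rate: {item_response_rate(counts):.1f}%")
```

Percentages computed this way sum to 100 across the known categories, so they should always be read together with the stated sample and the count of unknowns.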

Just as unit non-response has the potential to introduce bias into overall survey results, item non-response can also introduce bias into estimates based on responses to specific questions which experience a relatively high proportion of survey drop-out. Where this non-response is non-randomly distributed for reasons such as question sensitivity and social desirability bias, it is important that patterns of non-response are well understood.[26] This would enable us to implement treatment plans to reduce non-response and therefore the risk of bias.

So far, we have observed a high completion rate and a very low drop-out rate in Graduate Outcomes. Most people (more than 90%) who start responding to the Graduate Outcomes survey complete it. This not only reduces the risk of item non-response, but also reduces the requirement for interventions. HESA has started a programme of work aimed at better understanding the characteristics of, and reasons behind, unit and item non-response, leading to the development and implementation of treatment plans where necessary and possible.

With regard to item non-response, in year two we prioritised the most sensitive questions in the survey, which are prone to higher drop-out rates compared with other questions. For year three we turned our attention to questions which had undergone noticeable change in their wording, routing or presentation. As committed to in the report last year, we have also created a comprehensive report on item non-response for the questionnaire. Additionally, we have introduced flags into the survey that allow us to track item non-response more accurately, and we are working on improving these flags to ensure that they are reliable for all of the data items. This has helped us to continue tracking item non-response in the fourth year of the survey and has allowed us to put action plans in place to improve response levels to specific questions if needed.

The following table contains response rates for some of the questions assessed using the 2020/21 survey data. Further detail can be found in later sections of the Survey Quality Report.

Table: Response rates for some of the questions assessed in year four

| Question | Response rate | Base description |
| Job title (employment) | | Graduates in or due to start employment who answered employment intensity |
| Employment basis | | Graduates in or due to start employment who answered job duties |
| | | Graduates in employment or self-employment who answered currency as UK £ |

As indicated by the response rates in the table above, item non-response levels in the survey are generally low, even for the most sensitive questions such as salary and job title. Item non-response to salary appears to be the highest in the table, which is not surprising given that it is an optional survey question and that income is often found to be a particularly sensitive survey topic. Indeed, the levels of item non-response to salary in Graduate Outcomes are lower than the levels often seen in surveys. However, we are always aiming to improve response levels, and have seen an improvement of 2.8 percentage points on the rate reported last year. Further detail on this question is laid out in the section on reliability of sensitive data. Item non-response continues to be monitored to help determine the impact of existing changes and to identify further interventions that may improve response levels.


[1] See footnote 1 on The sample section.

[2] As Koch and Blohm (2016) note.

[3] This is where we are missing all observations for a case – this would mainly happen in situations where we are unable to elicit any response from a graduate.

[4] This is where we are missing some observations for a case – a common situation might be a graduate who answers the survey, but does not wish to answer some questions in the survey. We explain more about how we handle this sort of issue, in the following section.

[5] As Groves (2004) illustrates.

[6] Keeter et al (2000) and Curtin et al (2000) are examples of previous studies that have demonstrated the phenomenon of achieving both higher response rates and bias.

[7] See

[8] See

[9] Addressing quality issues closest to their source is generally the most efficient approach, and follows established data quality management principles (Data Management Association, 2017, p. 453).

[10] (Schouten & Shlomo, 2017)

[11] See Rosen et al. (2014) for details. The use of this approach has also been applied in a similar fashion by Peytchev et al (2010) and Wagner (2013).

[12] See (particularly the section on case prioritisation).

[13] The creation of weights can comprise several components. First, the base weight refers to the probability that an individual is selected into the sample given the design of the survey. In Graduate Outcomes, we aim to send the survey to everyone in the sampling frame. We have not quantified how many people actually receive the survey. Second, a (unit) non-response weight may be generated, which seeks to account for the fact that participation may vary among different groups. In instances where information is available on the entire population, a final step would be to ensure that the weights allow the sample data to match known population totals for a chosen set of categories.

[14] See

[15] Where suppression is applied, this will be done in line with the prevailing HESA statistical confidentiality policy (see and the associated rounding and suppression approach: (summarised in the Confidentiality and disclosure control section of this report).

[16] A flow diagram showing the survey response record fields produced for each survey routing is available in the coding manual:

[20] See the derived field specification at:

[21] Details of mandatory questions can be found as a PDF download from:

[22] See



[25] The observations gathered from the first two survey questions permit the derived field XACTIVITY to be produced – see . Since ‘activity’ is the Graduate Outcomes survey’s central concept, these responses are often partly usable.

[26] (De Leeuw, Hox and Huisman, 2003)