Sampling error and non-response error
Sampling error is the difference between a population value and an estimate based on a sample, and is one of the components of total survey error. A quality report on a sample survey would normally caveat that, in principle, many random samples could be drawn and each would give different results, because each sample would be made up of different people, who would give different answers to the questions asked. The spread of these results is the sampling variability. Sampling error, however, arises only because estimates are based on a sample rather than a census. As we have previously demonstrated, Graduate Outcomes is a population-scale survey[1] in which the sample is identical to the sampling frame, and the sampling frame resembles the population of interest very closely. While we know that the quality and availability of contact details must affect the response rate we can achieve from the sample, developing a comprehensive measure of that quality is a complex exercise in the absence of a perfect and accessible descriptor of it. We are, however, making significant improvements in our understanding of the various facets of quality, as described in the Sampling frame data based on HESA data collections section. We aspire to provide response rates not just as a proportion of the target population but also as a proportion of the contactable population. For now, the response rate achieved is itself our best indicator of the quality of contact details. Our analytical focus in this section is therefore on the extent to which the achieved sample is representative of the population; that is, we focus on non-response error.
This section comprises two subsections, which cover the strategies HESA has followed to limit the practical effects of missing responses. Non-response is one of the main types of non-sampling error that can arise in conducting a survey. While a lower level of response reduces the precision of the estimates obtained, the impact of response rates on bias is ambiguous[2]. The two types of error in this category are unit non-response[3] and item non-response[4]; we cover issues relating to each in the following two subsections.
Unit non-response error
Unit non-response occurs where a graduate does not respond to the survey. A poor response rate will result in less precision in any estimates we generate. Its effect on bias is less certain. Bias is determined by two components[5]: the response rate, and the difference between respondent and non-respondent values. Hence, a better response rate can be associated with increased bias, if the discrepancy between those who respond to the survey and those who do not grows larger. Consequently, attempting to maximise response rates will not necessarily minimise non-response bias[6].
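One common way of expressing this decomposition (following Groves, 2004) is sketched below; the notation is illustrative rather than taken from the source.

$$\operatorname{Bias}(\bar{y}_r) \approx \frac{m}{n}\left(\bar{y}_r - \bar{y}_m\right)$$

Here $\bar{y}_r$ is the mean among respondents, $\bar{y}_m$ the mean among non-respondents, $m$ the number of non-respondents and $n$ the total eligible sample. The bias in the respondent mean is the product of the non-response rate and the respondent/non-respondent gap, which is why a higher response rate alone does not guarantee lower bias.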
A number of elements of the survey design are intended to maximise response rates, and an overview is offered in the operational survey information on the HESA website[7]. These include:
- A website aimed at respondents to reinforce the legitimacy and credentials of the survey[8]
- A smartphone-optimised survey
- Allowing the survey to be completed in more than one stage, whether online, over the telephone, or using a mixture of both modes
- Bespoke email invitations and reminders that include the name of the graduate and their provider
- A dynamic engagement strategy informed by best practice and survey paradata
- Using a data collection platform that seamlessly integrates all modes together
- The adoption of a concurrent mixed-mode design (computer-assisted telephone interviewing (CATI) starts a week after the online system opens, and those who start online are not followed up until much later in the field period)
- Increasing the convenience of responding for graduates, by making appointments for telephone interviews at times that suit them
- Collecting proxy responses from half-way through the fieldwork period.
For the rest of this section we cover the specifics of our approach where non-response bias is concerned. Root cause remediation is one of the practices HESA adopts to proactively manage data quality[9]. In this case, our goal was to reduce data quality issues arising during collection. Historically, organisations administering surveys have relied upon methods executed after collection (i.e. weighting) to deal with the challenge of non-response. Yet, over the last decade, those working in this area have increasingly looked at whether anything can also be done during the data gathering phase. Work by the Netherlands’ official statistics agency[10] points to the advantages of attempting this, such as improved precision due to less variable weights. In trying to reduce non-response bias, other authors highlight the potential benefit of developing propensity models and subsequently diverting more attention to those individuals with a lower likelihood of responding in the latter stages of the collection process[11]. An adaptive survey design was therefore developed and implemented from cohort C of the first year of the survey onwards. This is subject to a quarterly refinement process in which analysts identify opportunities to improve our response propensity model and, where possible, implement them. Details of the practical approach to case prioritisation we take (based on our response propensity model) are covered in the section of the Survey methodology covering data collection[12]. In summary, approximately halfway through a collection cycle, a logit model (consisting of student and course characteristics as independent variables) is created to generate individual response propensities. Additional resource and effort are then allocated to obtaining responses from those graduates identified as being least likely to partake in the survey. The objective of this exercise is not only to secure higher response rates, but also to reduce possible non-response bias by achieving a more representative sample.
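As an illustration only, the sketch below shows the general shape of such a response propensity model: a logistic regression fitted part-way through fieldwork, with the lowest-propensity outstanding cases flagged for extra contact effort. The column names, input file and quartile cut-off are hypothetical and are not the production specification described in the Survey methodology.

```python
# Illustrative response-propensity sketch (hypothetical data and column names).
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

frame = pd.read_csv("graduates_midpoint.csv")  # hypothetical extract taken mid-way through fieldwork

characteristics = ["sex", "age_band", "subject_area", "level_of_qualification", "mode_of_study"]
X = frame[characteristics]
y = frame["responded"]  # 1 if a response has been received so far, else 0

# Logit model with student and course characteristics as independent variables.
model = Pipeline([
    ("encode", ColumnTransformer([("onehot", OneHotEncoder(handle_unknown="ignore"), characteristics)])),
    ("logit", LogisticRegression(max_iter=1000)),
])
model.fit(X, y)

# Predicted probability of responding for every graduate, then restrict to
# those who have not yet responded.
frame["propensity"] = model.predict_proba(X)[:, 1]
outstanding = frame[frame["responded"] == 0]

# Flag (for example) the lowest-propensity quartile for additional follow-up.
cut_off = outstanding["propensity"].quantile(0.25)
priority_cases = outstanding[outstanding["propensity"] <= cut_off]
print(f"{len(priority_cases)} cases flagged for extra contact attempts")
```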
We cannot, however, simply assume that the adaptive survey design will achieve its objective. The resulting data must be assessed and, if necessary, action taken to address bias. This is referred to as “weighting” the survey. The overarching objective of weighting is to adjust the sample so that it is more representative of the population[13]. Most surveys are weighted following collection. However, the Graduate Outcomes survey has some unusual features, such as a large sample size, an adaptive survey design, and a concurrent mixed-mode data collection approach. Over the last few years HESA, along with academic partners, has undertaken various investigations into the application of weights to the survey estimates and their impact. The conclusion of every assessment has been the same: there is no evidence of bias arising from a mismatch between the achieved sample and graduate population characteristics, in any direction, at sector level. Indeed, when analysing across a range of demographic and course variables, we found a high level of similarity between the sample and population distributions. We trialled various weighting methods, and these did not improve the quality of our estimates. Overall, across the breadth of HESA variables analysed, we generally observe close resemblance between the sample and the population, reducing concerns over potential bias. For a summary of our research and the findings, see the Survey methodology section on data analysis[14].
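For readers unfamiliar with the mechanics, the sketch below illustrates one generic form of post-collection adjustment (a weighting-class non-response adjustment applied to a unit base weight, along the lines of the components described in the footnote on weighting). It is a simplified illustration with assumed column names and grouping variables, not the weighting methods HESA trialled.

```python
# Generic weighting-class non-response adjustment (hypothetical columns).
import pandas as pd

population = pd.read_csv("population.csv")  # hypothetical: one row per graduate in the population
respondents = population[population["responded"] == 1].copy()

# 1. Base weight: the survey approaches the whole sampling frame, so the
#    selection probability is 1 and the base weight is 1 for every graduate.
respondents["base_weight"] = 1.0

# 2. Non-response adjustment: within weighting classes (here sex by level of
#    qualification), inflate respondents by the inverse of the class response rate.
classes = ["sex", "level_of_qualification"]
class_rates = (population.groupby(classes)["responded"].mean()
               .rename("class_response_rate").reset_index())
respondents = respondents.merge(class_rates, on=classes)
respondents["nr_weight"] = respondents["base_weight"] / respondents["class_response_rate"]

# 3. Sanity check: the weighted respondent total should match the population count.
print(round(respondents["nr_weight"].sum()), "weighted respondents vs", len(population), "graduates")
```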
Some statistics published from the Graduate Outcomes survey are at a very granular level, e.g. activity by provider, domicile, level of qualification and mode of qualification. In some cases, the sample size for such statistics may be small, and the statistics may therefore be subject to high levels of variability and a lack of statistical precision. For key tables, confidence intervals on these statistics (ranges within which we have a high level of confidence that the equivalent whole-population parameter would fall, where a narrow range indicates greater precision and a wide range less precision) are published alongside the data.
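As a simple illustration of what such an interval looks like for a proportion estimated from a small cell, the sketch below computes a Wilson score interval; the counts are invented, and this is not necessarily the exact method used for HESA's published intervals.

```python
# Wilson score interval for a proportion estimated from a small cell (hypothetical counts).
from statsmodels.stats.proportion import proportion_confint

cell_respondents = 42   # respondents in one provider/domicile/level/mode cell
in_activity = 27        # of whom this many report the activity of interest

estimate = in_activity / cell_respondents
lower, upper = proportion_confint(in_activity, cell_respondents, alpha=0.05, method="wilson")
print(f"estimate {estimate:.1%}, 95% confidence interval [{lower:.1%}, {upper:.1%}]")
```

With a base of this size the interval is noticeably wide, illustrating the limited precision of small cells.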
In addition, for some statistics, it may be necessary to introduce publication thresholds whereby statistics based on very small sample sizes and/or lower response rates are suppressed – this will be explained in any statistical releases where this decision is taken[15].
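A minimal sketch of how a publication threshold of this kind might operate is shown below; the threshold value, column names and figures are hypothetical, and the actual rules are those set out in the HESA statistical confidentiality policy referenced in the footnote.

```python
# Hypothetical publication threshold: suppress percentages whose unweighted base is too small.
import pandas as pd

MIN_BASE = 25  # illustrative threshold, not HESA's published rule

table = pd.DataFrame({
    "provider": ["A", "B", "C"],
    "respondents": [310, 18, 57],
    "pct_in_work": [71.3, 66.7, 74.2],
})
table.loc[table["respondents"] < MIN_BASE, "pct_in_work"] = None  # suppressed cell
print(table)
```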
Research to date therefore indicates there is no evidence of measurable non-response bias in the data. We are fortunate to be able to link to good data on population characteristics to support these assessments. The risk of non-response bias appears to have been minimised by the combination of relatively high response rates and the adaptive survey design. Despite this, it is not easy to quantify the extent to which non-response bias remains a problem. There may be variables that we are not currently measuring that are more strongly correlated with unit non-response. The Longitudinal Education Outcomes data offers a suitable external source for analysis of bias, and undertaking this work forms part of our future plans. Survey paradata may also prove useful in this respect in future. Users of Graduate Outcomes microdata may wish to conduct their own analyses to ensure the Graduate Outcomes data supports their analytical objectives. However, users should be reassured that there is no evidence to suggest that measurable non-response bias is present in the Graduate Outcomes survey data.
Item non-response error
Item non-response occurs where a value for a particular variable is missing for a graduate in a case where this observation was expected. In our survey, this typically occurs when respondents decline to answer particular questions. No single graduate is expected to answer all available survey questions. A routing structure directs respondents to particular sets of questions that are most relevant to their circumstances[16]. Furthermore, optional questions will not be presented to all respondents. So, some data will not be present, but this does not mean it is missing; it may never have been sought, because the question was not relevant in that case. In HESA’s publications, these issues will be made clear in the data and the notes, for example by indicating the sample used to produce a table or chart in its title, and by enumerating the unknown values. Researchers and other microdata users in particular will need to note this feature of the survey.
A derived field (ZRESPSTATUS[20]) describes the status of response to the Graduate Outcomes survey for each graduate for whom some (however minimal) results data has been received. A core set of mandatory questions[21] must be completed for a response to be marked as complete. This field classifies responses into categories denoting various states of completeness. The terms ‘complete’ and ‘full response’[22] are used interchangeably to refer to those cases where all the questions requiring a response have been completed and are populated with an answer. In addition to responses classified as ‘survey completed’,[23] a status of ‘partially completed’[24] has been assigned where some of the core questions are missing but the first two questions have been answered.[25] Although partially completed responses do not contribute to the survey’s response rate targets, they are used alongside ‘survey completed’ responses in statistical outputs. Data from such responses appear in published statistics in the following ways: in count tables, unknown values are shown for questions that were not answered; wherever we display percentage values, we exclude unknowns from the calculations; and the sample used will be clear in the title or accompanying text.
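The treatment of unknowns described above can be illustrated with a small sketch: counts include an ‘Unknown’ category, while percentages are calculated only over those who answered. The response values and figures below are invented for illustration.

```python
# Hypothetical illustration: unknowns appear in count tables but are excluded
# from the denominator when percentages are calculated.
import pandas as pd

answers = pd.Series(["Yes", "No", "Yes", None, "Yes", None, "No"])  # None = not answered

counts = answers.fillna("Unknown").value_counts()                   # counts, including unknowns
percentages = answers.dropna().value_counts(normalize=True) * 100   # unknowns excluded

print(counts)
print(percentages.round(1))
```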
Just as unit non-response has the potential to introduce bias into overall survey results, item non-response can also introduce bias into estimates based on responses to specific questions which experience a relatively high proportion of survey drop-out. Where this non-response is non-randomly distributed for reasons such as question sensitivity and social desirability bias, it is important that patterns of non-response are well understood.[26] This would enable us to implement treatment plans to reduce non-response and therefore the risk of bias.
So far, we have observed a high completion rate and a very low drop-out rate in Graduate Outcomes. Most people (more than 90%) who start responding to the Graduate Outcomes survey go on to complete it. This not only reduces the risk of item non-response, but also reduces the requirement for interventions. HESA has started a programme of work aimed at better understanding the characteristics of, and reasons behind, unit and item non-response, leading to the development and implementation of treatment plans where necessary and possible.
With regard to item non-response, in year two we prioritised the most sensitive questions in the survey, which are prone to higher drop-out rates than other questions. For year three we turned our attention to questions which had undergone noticeable change in their wording, routing or presentation. We are currently preparing a comprehensive report on item non-response for the entire questionnaire, covering every data item.
The following table contains response rates for each of the sensitive questions assessed using the 2019/20 survey data.
Table 2: Response rates for revised questions, year three
Question/topic | Response rate | Base description
---|---|---
Job title (employment) | 96.9% | Graduates in or due to start employment who answered employment intensity (not including information copied over from same activity)
Job title (self-employment, business, portfolio) | 96.2% | Graduates in self-employment, running a business or developing a portfolio who answered employment intensity (not including information copied over from same activity)
Employer’s name (employment) | 98.5% | Graduates in or due to start employment who answered employment basis and, if relevant, salary and currency
Employer’s name (self-employment, business) | 97.8% | Graduates in self-employment or running a business who answered the last relevant mandatory question (job duties)
Salary | 82.8% | Graduates in employment, self-employment or due to start work who answered currency as UK £
As indicated by the response rates in the table above, item non-response levels in the survey are generally low, even for the most sensitive questions. Item non-response to salary appears to be the highest in the table, which is not surprising considering that it is an optional survey question and that income is often found to be a particularly sensitive survey topic. Indeed, the levels of item non-response to salary in Graduate Outcomes are lower than those often seen in surveys; however, we are always aiming to improve response levels. During year three we introduced additional guidance on the questions in the table with the aim of encouraging graduates to respond, and detailed analysis of the impact so far is laid out in the section on reliability of sensitive data. Item non-response continues to be monitored to help determine the impact of existing changes and to identify further interventions that may improve response levels.
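The rates in the table are, in effect, the share of graduates who were routed to a question (the base described in the right-hand column) and went on to answer it. A minimal sketch of that calculation, using invented flags, is shown below.

```python
# Hypothetical item response rate: answered cases as a share of those routed to the question.
import pandas as pd

df = pd.DataFrame({
    "routed_to_salary": [True, True, True, False, True, True],
    "salary_answered":  [True, False, True, False, True, True],
})

base = df["routed_to_salary"].sum()
answered = (df["routed_to_salary"] & df["salary_answered"]).sum()
print(f"Item response rate: {answered / base:.1%}")
```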
[1] See footnote 1 on The sample section.
[2] As Koch and Blohm (2016) note.
[3] This is where we are missing all observations for a case – this would mainly happen in situations where we are unable to elicit any response from a graduate.
[4] This is where we are missing some observations for a case – a common situation might be a graduate who answers the survey, but does not wish to answer some questions in the survey. We explain more about how we handle this sort of issue, in the following section.
[5] As Groves (2004) illustrates.
[6] Keeter et al. (2000) and Curtin et al. (2000) are examples of previous studies demonstrating that higher response rates do not necessarily imply lower non-response bias.
[7] See https://www.hesa.ac.uk/definitions/operational-survey-information#contact-centre-methodology
[8] See https://www.graduateoutcomes.ac.uk/
[9] Addressing quality issues closest to their source is generally the most efficient approach, and follows established data quality management principles (Data Management Association, 2017, p. 453).
[10] (Schouten & Shlomo, 2017)
[11] See Rosen et al. (2014) for details. A similar approach has also been applied by Peytchev et al. (2010) and Wagner (2013).
[12] See https://www.hesa.ac.uk/data-and-analysis/graduates/methodology/data-collection (particularly the section on case prioritisation).
[13] The creation of weights can comprise several components. First, the base weight refers to the probability that an individual is selected into the sample given the design of the survey. In Graduate Outcomes, we aim to send the survey to everyone in the sampling frame. We have not quantified how many people actually receive the survey. Second, a (unit) non-response weight may be generated, which seeks to account for the fact that participation may vary among different groups. In instances where information is available on the entire population, a final step would be to calibrate the weights so that the sample data match known population totals for a chosen set of categories.
[14] See https://www.hesa.ac.uk/data-and-analysis/graduates/methodology/data-analysis
[15] Where suppression is applied, this will be done in line with the prevailing HESA statistical confidentiality policy (see https://www.hesa.ac.uk/about/regulation/official-statistics/confidentiality) and the associated rounding and suppression approach: https://www.hesa.ac.uk/about/regulation/data-protection/rounding-and-suppression-anonymise-statistics (summarised in the Confidentiality and disclosure control section of this report).
[16] A flow diagram showing the survey response record fields produced given each survey routing, is available in the coding manual: https://www.hesa.ac.uk/collection/c19072/download/GO_SurveyRouting_19072.pdf
[20] See the derived field specification at: https://www.hesa.ac.uk/collection/c19072/derived/zrespstatus
[21] Details of mandatory questions can be found as a PDF download from: https://www.hesa.ac.uk/innovation/outcomes/survey
[22] See https://www.hesa.ac.uk/definitions/glossary#F
[23] ZRESPSTATUS=04
[24] ZRESPSTATUS=03
[25] The observations gathered from the first two survey questions permit the derived field XACTIVITY to be produced – see https://www.hesa.ac.uk/collection/c19072/derived/xactivity. Since ‘activity’ is the Graduate Outcomes survey’s central concept, these responses are often partly usable.
[26] (De Leeuw, Hox and Huisman, 2003)