HESA has agreed with the Graduate Outcomes Steering Group that weighting will not be applied to any statistics published by HESA for this first year (2017/18) of survey data. Our analysis of the survey data has not identified any evidence of bias, in any direction, arising from a mismatch between the characteristics of the achieved sample and those of the graduate population at sector level. Indeed, across a range of demographic and course variables, we see a high level of similarity between the sample and population distributions.
The weighting approaches we have developed as part of this analysis have shown little, if any, divergence between weighted and unweighted estimates at sector level. At a more granular level, some effects have been seen when applying particular weighting methodologies, especially for small sample sizes (e.g. statistics disaggregated by provider and subject with very small numbers of responses), but these are subject to high variability. Taking these results together, we have been unable to identify any weighting approach that consistently and materially improves the quality of estimates.
The following assessment further explains our conclusion on this issue. For those with greater expertise in this field, a more comprehensive technical description of the analysis and conclusions on the weighting methodology will be published shortly.
As described in previous sections, Graduate Outcomes aims to survey all (with a small number of necessary exceptions) individuals who qualified from higher education in each academic year. With participation being voluntary, non-response is recognised as one of the factors that could affect the quality of the collected data, both in terms of potential non-response bias and the precision of the resulting estimates. By non-response bias, we mean that estimates generated from the sample do not accurately reflect the outcomes of the wider population. In Graduate Outcomes, this is likely to occur if the composition of the sample differs from that of the population.
Greater precision, meanwhile, means that we can be confident that an estimate derived from our achieved sample of respondents fairly reflects, within a small margin of error, the true statistic for the whole population. For example, if, based on the sample of responses, we estimated that the percentage of the whole population in employment for a given university was 81.5%, with a margin of error of plus or minus 0.5 percentage points, that would be a relatively precise estimate. If, on the other hand, that 81.5% was subject to a margin of plus or minus 10 percentage points, it would not be precise.
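As an illustration only, the relationship between the number of responses and the margin of error can be sketched with the standard normal approximation for a proportion. The function name and the sample sizes below are hypothetical, chosen purely to reproduce margins of roughly the size described above; the sketch ignores the finite population correction and is not a description of the method HESA uses.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p based on n responses.

    Uses the normal approximation z * sqrt(p * (1 - p) / n) and ignores any
    finite population correction (an assumption made for simplicity).
    """
    return z * math.sqrt(p * (1.0 - p) / n)


# Hypothetical numbers of responses; the 81.5% figure is the example in the text.
for n in (60, 23000):
    moe = margin_of_error(0.815, n)
    print(f"n={n}: 81.5% plus or minus {100 * moe:.1f} percentage points")
```

With these assumed sample sizes, the smaller sample gives a margin close to 10 percentage points and the larger one a margin close to 0.5 percentage points.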
Higher response rates have the advantage of generating more precise estimates; this can be interpreted (loosely) as more ‘reliable’ estimates at granular levels such as by HE provider and subject, mainly because higher response rates provide greater numbers of graduate responses at these levels. Non-response bias, however, is determined by a combination of the response rate and the difference between respondents and non-respondents for any given statistic of interest. Consequently, a higher response rate does not guarantee a reduction in non-response bias, as the most persistent non-respondents may be the ones who differ most from those who respond. It is entirely feasible for unbiased statistics to be derived from survey data based on relatively low response rates if appropriate survey design and operation have been deployed and, where required, approaches such as weighting have been applied.
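The dependence of non-response bias on both the response rate and the respondent/non-respondent difference can be made concrete with a small sketch. If the population mean is written as r times the respondent mean plus (1 - r) times the non-respondent mean, where r is the response rate, then the bias of the respondent mean is (1 - r) times the difference between the two groups. The figures below are invented for illustration; in practice the non-respondent mean is unknown.

```python
def nonresponse_bias(response_rate: float,
                     mean_respondents: float,
                     mean_nonrespondents: float) -> float:
    """Bias of the respondent mean relative to the whole-population mean.

    population mean = r * respondents + (1 - r) * non-respondents,
    so bias = respondent mean - population mean
            = (1 - r) * (respondents - non-respondents).
    """
    return (1.0 - response_rate) * (mean_respondents - mean_nonrespondents)


# Scenario A (hypothetical): 50% response, groups differ by 2 percentage points.
bias_a = nonresponse_bias(0.50, 0.82, 0.80)

# Scenario B (hypothetical): 70% response, but the remaining non-respondents
# differ from respondents by 10 percentage points.
bias_b = nonresponse_bias(0.70, 0.82, 0.72)

print(f"Scenario A bias: {bias_a:.3f}")
print(f"Scenario B bias: {bias_b:.3f}")
```

In this invented example, scenario B has the higher response rate but also the larger bias, because the remaining non-respondents are much less like the respondents.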
As previously described, HESA has used a case prioritisation process as part of the response-chasing operation to try to balance response rates across a range of groups. This technique involves identifying those considered least likely to respond and giving such individuals a higher priority in our engagement strategy during the latter stages of the fieldwork. Such an approach aims to mitigate possible bias resulting from non-response, rather than simply ensuring high response rates are achieved.
Use of weighting
One of the techniques commonly deployed in surveys post-collection is the use of weighting. This involves the use of ‘scaling factors’ (e.g. a factor of 0.75 applied to a response would reduce its relative weight) applied to each survey response in an attempt to make the sample more representative of the population. Weighting seeks to mitigate the impact of non-response bias and, under certain conditions, can also improve the precision of estimates.
Survey weighting is almost always used when the survey is designed around a ‘structured sample’ (a specific subset of a population designed to conform to certain characteristics) but Graduate Outcomes is not designed in this way – it is a census survey. Even with a census survey, once the resulting responses are analysed, they can be found to show materially different characteristics from the population and the application of weighting can ‘correct’ for this imbalance.
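To show how scaling factors of this kind can arise, the following is a minimal sketch of post-stratification, one common weighting method. It is not a description of the models HESA tested, which are not specified here. The group names, population counts and responses are all hypothetical. Each response receives a weight equal to its group's share of the population divided by its group's share of the sample, so over-represented groups are scaled down and under-represented groups are scaled up.

```python
from collections import Counter


def poststratification_weights(population_counts, sample_groups):
    """Return one weight per response: population share / sample share of its group."""
    sample_counts = Counter(sample_groups)
    n = len(sample_groups)
    big_n = sum(population_counts.values())
    return [
        (population_counts[g] / big_n) / (sample_counts[g] / n)
        for g in sample_groups
    ]


# Hypothetical population: 80% full-time, 20% part-time graduates.
population = {"full-time": 800, "part-time": 200}

# Hypothetical achieved sample: part-time graduates are under-represented (10%).
sample_groups = ["full-time"] * 90 + ["part-time"] * 10
in_employment = [1] * 70 + [0] * 20 + [1] * 9 + [0] * 1  # 1 = in employment

weights = poststratification_weights(population, sample_groups)

unweighted = sum(in_employment) / len(in_employment)
weighted = sum(w * y for w, y in zip(weights, in_employment)) / sum(weights)

print(f"unweighted estimate: {unweighted:.3f}")
print(f"weighted estimate:   {weighted:.3f}")
```

Here full-time responses receive a weight below 1 and part-time responses a weight of 2, and the weighted estimate equals the population-share-weighted average of the two group rates.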
HESA analysts have worked in collaboration with analysts from the Office for Students and with advice from experts at the Office for National Statistics to undertake extensive analysis of the first year of Graduate Outcomes survey data (for academic year 2017/18). This work has focused on assessing the extent to which the achieved sample for the survey shows similar characteristics to the population of all graduates, deriving and implementing a number of different statistical models for weighting and then testing to assess the impact of each weighting model through comparing weighted and unweighted data.
It is important to note here that HESA holds data on the population of graduates through the HESA Student Record and associated census records for HE taught in Further Education Colleges, so it is possible to compare demographics, study, qualifications and HE provider characteristics between the achieved sample of respondents in the survey and the entire population. HESA does not hold population data on outcome characteristics, such as nature of employment or other outcome activities (though future work might provide some insight into this missing element, such as use of Longitudinal Educational Outcomes (LEO) data). Consequently, we have only been able to make inferences about bias using the data that we hold. We cannot definitively know anything about the responses to the survey that would have been provided by those who chose not to respond (without identifying alternative sources of these data).
Notwithstanding the above caveat on our analysis, we found little difference between the achieved sample and the population across the range of demographic, study, qualification and HE provider characteristics that were examined. Having applied a variety of different weighting models, we found that the weighted and unweighted estimates (such as percentages of those in employment or study) were very similar, and this often held at sub-sample level as well (for example, by subject and/or provider). The largest differences we observed between weighted and unweighted estimates were most commonly found in instances of small sample sizes, which are estimated less precisely in any case. We note that, in general, weighting led to less precise estimates than unweighted data.
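One way to see why weighting can reduce precision is Kish's approximate effective sample size, (sum of weights) squared divided by the sum of squared weights. This is a widely used rule of thumb, offered here as an illustration rather than as the measure used in the analysis described above; the weights below are hypothetical.

```python
def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Hypothetical weights for 100 responses.
equal_weights = [1.0] * 100                       # unweighted analysis
unequal_weights = [0.8 / 0.9] * 90 + [2.0] * 10   # one group scaled up, one down

print(kish_effective_sample_size(equal_weights))
print(kish_effective_sample_size(unequal_weights))
```

With equal weights the effective sample size equals the actual number of responses. With the unequal weights it falls below 100, so the same 100 responses behave, for precision purposes, like a smaller sample.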
The above findings have led HESA to agree with the Graduate Outcomes Steering Group that weighting will not be applied to any statistics published by HESA for this first year of survey data.
The position regarding use of weighting in future years of the survey remains under review. HESA is planning some additional exploration of more nuanced approaches to weighting through the remainder of 2020 (and early 2021), which will utilise data from the second year of the survey once available. If an approach to weighting can be identified at that time that can be shown to improve the quality of statistics derived from the surveys, then this will be applied from 2018/19 Graduate Outcomes data onwards and will also be retrospectively applied to key data outputs from the 2017/18 survey to enable comparisons across years.
Some statistics published from the Graduate Outcomes survey will be at a very granular level, e.g. employment rates by HE provider and subject. In some cases, the sample of respondents for such statistics may be small and/or the response rate for that sample may be lower than the overall survey response rate. In these cases, the statistics may be subject to high levels of variability and a lack of statistical precision. HESA intends to publish confidence intervals on these statistics (ranges within which we have a high level of confidence that the equivalent whole-population statistic would fall, where a narrow range indicates greater precision and a wide range indicates less precision).
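As a sketch of how such a range behaves, the following computes a Wilson score interval for a proportion. The choice of the Wilson method is an assumption made for illustration, since the text does not state which interval method will be used, and the response counts are hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    p_hat = successes / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical cells with the same observed rate (80% in employment).
small_cell = wilson_interval(16, 20)      # e.g. one provider and subject
large_cell = wilson_interval(800, 1000)   # e.g. a much larger group

print("small cell:", small_cell)
print("large cell:", large_cell)
```

Both cells have the same observed rate, but the interval for the small cell is far wider, which is the loss of precision the paragraph above describes.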
In addition, for some statistics, it may be necessary to introduce publication thresholds whereby statistics based on very small sample sizes and/or lower response rates are suppressed. The actual decisions on use of these techniques will be clearly explained in each HESA statistical release.