Dissemination - Approach to data concepts and standards
On this page: Classes of usage
HESA follows well-managed third-party data and metadata standards where we can and we manage our own where no suitable standards exist. This helps us to improve consistency across disparate processes or sources, reduce the costs of integration between sources, and manage data quality. Data and metadata standards help our users by providing stable and predictable approaches that are applied across different years with harmonised definitions between different datasets.
HESA’s longstanding practice is to use existing national and international data standards where possible and appropriate in all of our data collection and dissemination activities. Graduate Outcomes is no exception. As an official statistics producer, where possible, we prefer to align with standards published by the Government’s Analysis Function. Among the most prominent of these are SOC2020 and SIC2007, which describe occupations and industries, respectively. However, HESA data often covers topic areas where applicable data standards do not exist or are unsuitable (e.g. standards developed for other industrial sectors that apply poorly to the HE sector).
In such cases, HESA may establish its own approach by creating a data definition or derivation. Of course, that does not automatically result in a new data standard, since data standards only result from widespread adoption of a definition – but often we have seen HESA definitions being adopted for a range of purposes.
Where HESA creates its own standard, we usually attempt to align it to the maximum reasonable extent with other data standards. For instance, the age distribution of the student population is right-skewed, as students are more likely to be young than the general population. To give greater insight into the ages of students and graduates, we use more categories than those available in the Analysis Function’s Age and date of birth harmonised standard. However, we ensure that our additional age categories nest within the categories in the harmonised standard, so that users of aggregate statistics can compare HESA data with data from other sources using this standard, without reference to microdata (where actual ages are available).
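The nesting property described above can be illustrated with a minimal Python sketch. The band boundaries and counts below are hypothetical, chosen only for illustration; they are not the actual HESA categories or those of the harmonised standard. The sketch checks that each finer band sits wholly inside one broader band, then rolls fine-grained counts up to the broader bands without needing individual ages.

```python
# Hypothetical age bands (inclusive lower and upper bounds), for illustration only.
# These are NOT the actual HESA or harmonised-standard categories.
BROAD_BANDS = [(16, 24), (25, 34), (35, 49)]
FINE_BANDS = [(16, 17), (18, 20), (21, 24), (25, 29), (30, 34), (35, 49)]


def nests_within(fine, broad):
    """Return True if every fine band lies wholly inside exactly one broad band."""
    return all(
        sum(lo >= b_lo and hi <= b_hi for b_lo, b_hi in broad) == 1
        for lo, hi in fine
    )


def roll_up(fine_counts, broad):
    """Aggregate counts keyed by fine band into counts keyed by broad band."""
    totals = {band: 0 for band in broad}
    for (lo, hi), count in fine_counts.items():
        for b_lo, b_hi in broad:
            if lo >= b_lo and hi <= b_hi:
                totals[(b_lo, b_hi)] += count
                break
        else:
            # A band that straddles two broad bands cannot be rolled up.
            raise ValueError(f"band {lo}-{hi} does not nest in any broad band")
    return totals


# Made-up aggregate counts per fine band.
fine_counts = {
    (16, 17): 5,
    (18, 20): 120,
    (21, 24): 200,
    (25, 29): 60,
    (30, 34): 25,
    (35, 49): 40,
}
broad_counts = roll_up(fine_counts, BROAD_BANDS)
```

Because the finer bands nest, the roll-up needs only the published aggregate counts. A band such as 20 to 26, which straddles two broader bands, would fail the check and could not be compared in this way.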
In other sections of the Survey methodology we explain the wide range of expected uses of Graduate Outcomes data by those seeking to understand graduate outcomes and destinations. These uses will necessitate different sub-sets of survey data, ‘cut’ and presented in different ways. There are various ways to categorise the range of uses but one categorisation in particular influences the derivation of data definitions or standards – that is between uses for broadly descriptive purposes and uses for regulatory or performance assessment applications:
- By ‘descriptive purposes’, we mean uses where the primary aim is to understand and generate insight into the experiences of graduates after they gain qualifications.
- By ‘regulatory and performance uses’, we mean applications of data to assess the characteristics and performance of HE providers in delivering on national policy imperatives or providing high quality education for their students.
It is reasonable to expect that in some cases, choices of data definitions for descriptive statistical uses may differ from those used for regulatory and performance assessment applications – a definition that may be helpful in explaining the experiences of graduates may not provide a fair and reasonable basis for assessing HE provider performance.
An example of this may be in the use of survey questions about ‘main activity’ versus ‘all activities’ when assessing percentages of graduates entering ‘highly-skilled’ jobs. From a descriptive statistics perspective it is interesting for users to be able to understand the job types of graduates in all circumstances but as a measure of HE provider performance in preparing graduates with the skills to undertake specific professions, it may (arguably) be fairer to only consider the jobs that graduates report as their ‘main activity’.
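The difference between the two definitions can be made concrete with a small Python sketch. The record layout, the field names `is_main` and `highly_skilled`, and the data are all hypothetical; this is not HESA's actual derivation, only an illustration of how the choice of definition changes the resulting percentage.

```python
# Hypothetical survey records, for illustration only. Each graduate reports one
# or more activities; exactly one is flagged as the main activity.
graduates = [
    {"id": 1, "activities": [{"highly_skilled": True, "is_main": True}]},
    {"id": 2, "activities": [{"highly_skilled": False, "is_main": True},
                             {"highly_skilled": True, "is_main": False}]},
    {"id": 3, "activities": [{"highly_skilled": False, "is_main": True}]},
    {"id": 4, "activities": [{"highly_skilled": True, "is_main": True},
                             {"highly_skilled": False, "is_main": False}]},
]


def pct_highly_skilled(records, main_only):
    """Percentage of graduates with at least one highly skilled job.

    If main_only is True, only the activity flagged as main is considered;
    otherwise every reported activity counts.
    """
    hits = 0
    for graduate in records:
        jobs = [a for a in graduate["activities"] if a["is_main"] or not main_only]
        if any(a["highly_skilled"] for a in jobs):
            hits += 1
    return 100 * hits / len(records)


pct_all = pct_highly_skilled(graduates, main_only=False)   # all activities
pct_main = pct_highly_skilled(graduates, main_only=True)   # main activity only
```

In this made-up sample, graduate 2 holds a highly skilled job only as a secondary activity, so the figure is 75% under the "all activities" definition and 50% under the "main activity" definition.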
Recognising the differences between these main categories of use, HESA has choices to make about how we approach data definitions and presentation of our statistics. One choice could be to narrowly prescribe definitions driven primarily by expected uses in regulatory and performance measures – accepting that the external organisations that are largely responsible for such measures may choose not to adopt HESA definitions in any case.
The alternative approach, and our preference, is to structure definitions and data within our main statistical products in flexible ways that are capable of being used for a wide range of purposes.
We believe that users of our data products should be empowered to make informed choices about how they use the data we publish. We give them choices through interactive and dynamic data publication (e.g. data filters on tabulations and charts) and through services supported by highly trained analysts. We help them make well-informed choices by providing comprehensive explanations of data and definitions, along with information on the options they have in using the data.
A full description of data standards used within HESA’s Graduate Outcomes statistical products can be found in the definitions that accompany each product and the Graduate Outcomes Quality Report.