Anonymisation: Processing personal data so that individuals cannot be identified from it. (see ICO – What is personal data?)
Aggregation: A process to combine smaller numbers (such as small populations of students or graduates) to express these as one larger number. This can then be used to produce useful statistics where, for example, percentages or averages of smaller populations would not be meaningful, or would pose a data protection risk. Examples include grouping postcodes into regions, or subjects into subject areas.
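As a minimal sketch of aggregation, the snippet below sums small per-subject counts into subject-area totals. The counts and the subject-to-area mapping are illustrative assumptions, not an official HESA grouping, and the function name `aggregate` is hypothetical.

```python
from collections import defaultdict


def aggregate(counts, group_of):
    """Sum counts for small categories into their larger parent groups."""
    totals = defaultdict(int)
    for item, n in counts.items():
        totals[group_of[item]] += n
    return dict(totals)


# Illustrative data only: neither the counts nor the mapping are real.
subject_counts = {"English": 3, "French": 4, "Chinese": 2, "Physics": 7}
subject_area = {
    "English": "Languages",
    "French": "Languages",
    "Chinese": "Languages",
    "Physics": "Physical sciences",
}

area_counts = aggregate(subject_counts, subject_area)
```

Statistics such as percentages can then be calculated on the larger groups rather than on the small per-subject populations.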
Approval: The action of a defined role (for example a student returns officer) approving an in-Reference period data delivery to a HESA data customer on behalf of the HE provider, e.g. an ITT return to DfE.
Approved: A state of data that has undergone approval and is still open to amendment during the Reference period.
Base population (Graduate Outcomes survey): All the graduates that are within the population calculated from the relevant student collection. This identifies all graduates who are eligible to complete the survey in the relevant cohort.
Brand guidelines (Graduate Outcomes survey): The guidelines underpinning all promotional activity relating to the survey.
CATI: Computer Assisted Telephone Interviewing
Census week (Graduate Outcomes survey): A census week replaces a census date (as used in DLHE and LDLHE). The census week is the first week of the contact period.
Coding frame: Table of values and labels that specifies the valid entries for a particular field.
Coding manual: The suite of material that supports HE providers and stakeholders in their submission and understanding of HESA data collections.
Cogs (Graduate Outcomes survey): These are the statuses identified for each graduate in the provider portal progress bar.
Cohort (Graduate Outcomes survey): Graduates are split across the year into four cohorts which regulate when a graduate will be surveyed. The cohort a graduate is assigned to relates to the month the graduate completes their course. Specific detail about these time periods can be found in the relevant collection schedule.
Collection specification: The suite of material that supports HE providers. It includes the data model definition, coding frame definition, guidance definition, quality ruleset, derived fields and entities, and the corresponding XSD file.
Competition Law: HESA evaluates all dissemination practices to ensure compliance with UK Competition Law. As part of HESA’s compliance, where information is approved for sharing, we seek to make it equally accessible to all potential competitors.
Completion period (Graduate Outcomes survey): The period when the relevant graduate completes their studies. The date of completion determines which cohort the graduate falls into.
Contact period (Graduate Outcomes survey): A period of time during which HESA may contact a graduate. The contact period assigned to a graduate is determined by which cohort they sit in. There are four contact periods in a year: A (December-February), B (March-May), C (June-August), and D (September-November) - in each case these are inclusive of the months referred to.
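The month ranges above can be expressed as a small lookup. This is only a sketch: the function name `contact_period` is hypothetical, and it maps the calendar month in which contact takes place to the period letter. It does not assign graduates to cohorts, which depends on the course completion date and the relevant collection schedule.

```python
def contact_period(month: int) -> str:
    """Return the contact period letter (A-D) for a calendar month (1-12).

    A = December-February, B = March-May, C = June-August,
    D = September-November, all inclusive.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    # month % 12 moves December to position 0, so each block of
    # three consecutive positions corresponds to one contact period.
    return "ABCD"[(month % 12) // 3]
```

For example, `contact_period(12)` and `contact_period(2)` both return `"A"`.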
Core questions (Graduate Outcomes survey): The questions forming the core of the survey and which are to be asked of all graduates (depending upon routing).
Data minimisation: Processing the minimum amount of personal data required to fulfil a purpose. (see ICO – Data minimisation).
Data model: Definition and visual representation of all structural elements of a collection specification including relationships between entities to inform the data collection system.
Data sharing agreement: Defines the purposes for which data may be used over a pre-agreed period of time, after which data must be destroyed.
Data standards: Help to improve consistency across disparate processes or sources, to reduce the costs of integration between sources, and to manage data quality. They help HESA users through stable and predictable approaches that are applied across different years and by supporting harmonised definitions between different datasets.
Derived entity: An entity created by HESA to aid either collection or analytical activities.
Derived field: A field created by HESA to aid either collection or analytical activities.
Discrete collection: A collection in which data is returned to reflect the status at a point in time (in this case the end of the Reference period), with each collection operating independently of any others.
Dissemination point: The specified date, after the end of a Reference period, by which signed-off data will be extracted and supplied to HESA's data customers. Data disseminated at the Dissemination point will be used for official accounts of the higher education provider’s activity for statistical, regulatory, and public information purposes.
Disaggregation: The opposite of aggregation. Breaking up larger populations into their constituent smaller populations to create more detailed statistics. For example, breaking the ‘Languages’ subject area down into English, French, Chinese, etc.
DLHE: Destinations of Leavers from Higher Education (survey of graduates that existed prior to Graduate Outcomes).
DLSG: Data Landscape Steering Group. The data landscape governance body charged with maintaining the HE sector’s data language and supporting logical model, and with promoting good practice for data collections.
Engagement strategy (Graduate Outcomes survey): The strategy HESA uses to secure responses to the survey.
Enhanced coding frame: A coding frame with further information regarding the valid entries recorded in additional columns, for example applicability of valid entries or categories for onward use.
Enrolment: The act of a student committing to undertake a course at a provider.
Eudemonic wellbeing: In contrast to hedonic wellbeing, eudemonic wellbeing questions attempt to measure human flourishing in a more evaluative and reflective way.
Exception processing: A mechanism for handling historical amendments to provider data.
Experimental statistics: Experimental statistics are an existing class or subset of official statistics and are defined as newly developed or innovative official statistics undergoing evaluation. See 2020 blog post ‘Why Graduate Outcomes statistics are experimental’.
Foreign key: A field on one entity that refers to the primary key of another to allow linking between entities. For example, StudentCourseSession.COURSEID links the StudentCourseSession to the relevant Course.
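A rough illustration of how a foreign key is resolved against a primary key, using plain dictionaries. Only the field name `COURSEID` comes from the example above. The other field names, the sample records and both function names are hypothetical.

```python
def index_by_key(records, key):
    """Build a lookup from primary key value to record, rejecting duplicates."""
    index = {}
    for record in records:
        value = record[key]
        if value in index:
            raise ValueError(f"duplicate primary key {value!r}")
        index[value] = record
    return index


def link_sessions_to_courses(sessions, courses):
    """Pair each session with its course via the COURSEID foreign key.

    Returns (linked, orphans); orphans are sessions whose COURSEID
    does not match any course.
    """
    courses_by_id = index_by_key(courses, "COURSEID")
    linked, orphans = [], []
    for session in sessions:
        course = courses_by_id.get(session["COURSEID"])
        if course is None:
            orphans.append(session)
        else:
            linked.append((session, course))
    return linked, orphans


# Hypothetical sample records.
courses = [{"COURSEID": "C1", "TITLE": "History"}]
sessions = [
    {"SESSIONREF": "S1", "COURSEID": "C1"},
    {"SESSIONREF": "S2", "COURSEID": "C9"},  # no matching course
]
linked, orphans = link_sessions_to_courses(sessions, courses)
```

Here `linked` contains the S1 session paired with the History course, and `orphans` contains the S2 session.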
FPE: Full-person equivalent.
Freedom Of Information (FOI): HESA is not covered by this legislation as we are not a Public Authority under the Freedom of Information Act 2000. However, HESA aims to support openness and transparency in its operations and we publish extensive information via our website as well as responding promptly to all reasonable requests for information.
FTE: Full-time equivalent.
Full response (Graduate Outcomes survey): A survey response where all the questions requiring a response have been completed and are populated with a valid answer.
Fuzzy matching: A data linking technique that links two data sources using non-exact matches on one or more data items common to both.
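One simple way to sketch fuzzy matching is with string similarity scores from Python's standard `difflib` module. This is an illustration only, not a description of any linking method HESA uses. The function names, the sample names and the 0.85 threshold are all assumptions.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity between two strings, from 0.0 (different) to 1.0 (identical)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def fuzzy_link(left, right, threshold=0.85):
    """Link each value in `left` to its most similar value in `right`.

    A link is only made when the best score reaches `threshold`
    (an assumed cut-off that would need tuning on real data).
    """
    links = {}
    for value in left:
        best = max(right, key=lambda candidate: similarity(value, candidate), default=None)
        if best is not None and similarity(value, best) >= threshold:
            links[value] = best
    return links


links = fuzzy_link(["Jon Smith", "Alice Brown"], ["John Smith", "Jane Doe"])
```

With these sample values, "Jon Smith" is linked to "John Smith", while "Alice Brown" has no sufficiently close candidate and is left unlinked.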
Graduate Outcomes steering group: A group that has been established to advise HESA through the implementation phase of the Graduate Outcomes survey, and into business as usual. It comprises representatives from HESA, the funding bodies, providers and other stakeholders.
Graduate Outcomes survey: This is the name of the survey. When referring to the record, the survey or the data collected by the record, capitalise Graduate Outcomes. If you are referring to general data, use lower case. Arolwg Hynt Graddedigion is the Welsh version. We do not foresee a need to shorten Graduate Outcomes, and so we are not going to introduce a new acronym. Where a shortening is absolutely necessary, it should be sufficient to use just ‘the record’, ‘the survey’, or ‘the data’.
HECoS: Higher Education Classification of Subjects.
HEDIIP: Higher Education Data and Information Improvement Programme (2015-16).
Hedonic wellbeing: Measures of hedonic wellbeing aim to understand positive and negative affect, using questions that promote recall of recent experience of feelings. The ideal data collection instrument for hedonic wellbeing would therefore be something akin to a brain scan. See Eudemonic wellbeing.
HEI: Higher education institution.
HEP: Higher education provider.
heidi: Higher Education Information Database for Institutions (replaced by Heidi Plus, which is not an acronym).
HERA: Higher Education and Research Act 2017.
HESA: Higher Education Statistics Agency.
HESA Data Platform (HDP): Technology platform to collect, assure and disseminate data.
HESPA: Higher Education Strategic Planners' Association.
Historical amendment: A data submission following the closure of a collection to correct errors in the original submission.
Identifiable personal data: Information relating to an identified or identifiable person (see ICO – What is personal data?). We implement a rounding and suppression strategy in HESA publications designed to prevent the disclosure of personal information about any individual. This strategy involves rounding all numbers to the nearest multiple of 5 and suppressing percentages and averages based on small populations.
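The rounding and suppression steps described above might look like the following sketch. The glossary does not say what counts as a "small" population, so the `min_population` default of 25 is a hypothetical placeholder, as are the function names.

```python
from typing import Optional


def round_to_5(count: int) -> int:
    """Round a non-negative whole-number count to the nearest multiple of 5."""
    # Integer arithmetic avoids any floating-point rounding surprises.
    return (count + 2) // 5 * 5


def safe_percentage(numerator: int, denominator: int,
                    min_population: int = 25) -> Optional[float]:
    """Return a percentage, or None when it must be suppressed.

    `min_population` is an assumed threshold, not a published figure.
    """
    if denominator < min_population:
        return None  # suppressed: population too small
    return round(100 * numerator / denominator, 1)
```

For example, `round_to_5(12)` gives 10 and `round_to_5(13)` gives 15, while `safe_percentage(3, 10)` returns `None` because the population falls below the assumed threshold.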
Imputation: A statistical technique by which missing or incorrect values are repaired using information available elsewhere, either within the same source or a different source altogether.
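A minimal sketch of imputation, assuming records are dictionaries with a hypothetical `id` field. A missing value is filled first from a second source where one is available, and otherwise from the most common value observed within the same source. Real imputation methods are usually more sophisticated, so this is only an illustration of the idea.

```python
from collections import Counter


def impute(records, field, other_source=None):
    """Return copies of `records` with missing `field` values filled in.

    `other_source` maps record id -> value from a different data source.
    When it has no entry, the most common observed value is used instead.
    """
    other_source = other_source or {}
    observed = [r[field] for r in records if r.get(field) is not None]
    fallback = Counter(observed).most_common(1)[0][0] if observed else None

    result = []
    for record in records:
        if record.get(field) is None:
            value = other_source.get(record["id"], fallback)
            record = {**record, field: value}  # copy, leave the input untouched
        result.append(record)
    return result


# Illustrative data only.
records = [
    {"id": 1, "region": "Wales"},
    {"id": 2, "region": None},
    {"id": 3, "region": "Wales"},
    {"id": 4, "region": None},
]
completed = impute(records, "region", other_source={2: "Scotland"})
```

Record 2 takes its value from the second source, and record 4 falls back to the most common value in the original data.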
In-scope period: The duration for which each entity is relevant for sign-off by an HE provider. Data submitted prior to the in-scope period will not require sign-off until it becomes in scope. Submission after this period will be subject to exception processing.
ITT: Initial Teacher Training.
JACS: Joint Academic Coding System, replaced by Higher Education Classification of Subjects (HECoS).
Jisc: Joint Information Systems Committee (note: while Jisc was originally an acronym, they no longer use the expanded title).
KPIs: Key Performance Indicators
LEO: Longitudinal Education Outcomes. An administrative dataset that links education, benefit and tax records, and which has become the primary source for understanding the trajectory of graduate earnings. (From the return to a degree report.)
Logical model: A data model which displays the logical structure of how various elements of HE data are associated and interact in the real world.
Longitudinal DLHE: One of Graduate Outcomes’ predecessors, this was a follow up to the Destinations of Leavers from Higher Education (DLHE) survey that aimed to find out what leavers were doing a further three years on.
Metadata: Metadata is simply a card catalogue (like those used in a library) implemented in a managed data environment. It is therefore a very wide-ranging term that can include many component areas, including business architecture, rules and definitions, data governance, integration and quality, document content, information technology and all sorts of data and process models.
Microdata: data at a very detailed level that could pose risks to confidentiality of individuals.
National communications plan (Graduate Outcomes survey): The plan HESA will create to raise the profile of the survey. This includes a website, social media presence and an additional suite of materials for providers.
Non-response bias: A bias which means that any estimates generated from the sample will not accurately reflect the outcomes of the wider population. In Graduate Outcomes, this is likely to occur if the composition of the sample differs from that of the population.
Non-surveyable (Graduate Outcomes survey): Graduates recorded as Graduate.GRADSTATUS 01 (deceased) or 03 (serious illness).
NSS: National Student Survey.
Official Statistics: HESA is a designated producer of Official Statistics under the Statistics and Registration Service Act 2007 and associated Official Statistics Orders. See UK Statistics Authority - What is an official statistic?
OfS: Office for Students.
Open data: Defined as data that’s available to everyone to access, use and share. HESA’s open data releases provide a wide range of in-depth interactive tables and charts with accompanying data downloads. For Student, Staff and Graduate Outcomes data these releases build on the Statistical Bulletins and include data disaggregated by individual HE Providers. Release dates are listed on the Upcoming data releases page and on the UK Statistics publication hub at least four weeks prior to release. Release happens at 09:30 on the chosen day.
Opt-in question banks (Graduate Outcomes survey): Paid-for banks of questions that can be opted in to by one or more providers, statutory customers or public purpose customers.
PI: Performance Indicators.
Primary key: The field or fields that uniquely identify each entity within a provider’s submission, e.g. Student.SID uniquely identifies a student.
Progress bar (Graduate Outcomes survey): This is located across the ‘Contact details’ tab within the provider portal and informs the provider of the status of their survey collection.
Provider portal (Graduate Outcomes survey): The provider portal is a purpose-built system where providers will upload and quality assure their contact details data, add personalisation to the survey, find information on their response rates and download survey data. The HESA Identity System (IDS) enables users to access the portal. Users are assigned IDS roles and each role has different access rights within the portal. The provider can also personalise aspects of their Graduate Outcomes survey using this portal.
Provider questions (Graduate Outcomes survey): Paid-for questions which providers ask HESA to include in the survey and which align with the provider terms and conditions. These questions would appear in a separate section of the survey and cannot be included within the core or opt-in question sections. HESA took the decision to postpone the inclusion of the provider questions in the survey due to the legal, financial and operational complexities which surround these and to allow providers to manage the potential demand for these. This decision will be reviewed as necessary.
Pseudonymisation: Processing personal data so that individuals cannot be identified without the use of additional information. (see ICO – What is personal data?)
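One common way to illustrate pseudonymisation is a keyed hash, where the secret key plays the role of the "additional information" and is held separately from the data. The sketch below uses Python's standard `hmac` and `hashlib` modules. It is an assumption-laden example rather than a description of HESA practice, and the function name and sample identifier are hypothetical.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 digest (hex string).

    The same identifier and key always give the same pseudonym, so
    records can still be linked, but the key must be stored separately.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical identifier and key, for illustration only.
pseudonym = pseudonymise("S1234567", b"keep-this-key-separate")
```

Anyone holding only the pseudonymised data cannot readily recover the identifier, whereas the key holder can recompute pseudonyms to link records for the same person.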
QTS: Qualified Teacher Status.
REF: Research Excellence Framework.
Reference period: A fixed period of time for which data is provided, the end of which aligns to when HESA’s statutory and public purpose customers require sector-wide data and information.
Reference period data: Data relating to a specific reference period.
Registration: A student registration is a binding agreement between a student and an organisation for the delivery of educational services, within the meaning of "Stage 3: enrolment stage" in the Competition and Markets Authority's advice for HE providers on consumer protection law.
Regulation Population: identifies which regulators have responsibility for which providers and students. Every provider has a primary regulator who has an interest in all HE students at that provider. In addition, some specialist regulators have an interest in students studying specific types of course.
Sampling frame: The list of all cases that can be included in a survey. In any population survey, it is inevitable that there is some under-coverage: for example, in Graduate Outcomes, some graduates cannot be surveyed, perhaps because their contact details are unavailable, or because they are seriously ill or have died.
Sector data: The data from more than one provider.
SFR: Statistical First Release.
SIC: Standard Industrial Classification.
Signed-off: A state of data that has undergone sign-off and may therefore be used in public purpose/statutory and analytical output.
Sign-off: The process of a defined role (for example Vice Chancellor) making a formal declaration that the data submitted to HESA represents an honest, impartial, and rigorous account of the HE provider’s events up to the end of the reference period.
Sign-off period: The period of time between the end of the reference period and the dissemination point. During this window providers are expected to perform final quality assurance activities and then sign off their data ready for dissemination.
SOC: Standard Occupational Classification.
Statistical Bulletin: The first release of data from a particular collection or stream. These publications include highlights and summary analysis drawn from the data, with associated commentary to give context to the data presented. They contain interactive tables and charts with accompanying data downloads. Release dates are listed on the Upcoming data releases page and on the UK Statistics publication hub at least four weeks prior to release. Release happens at 09:30 on the chosen day.
Submission: Process of supplying data to HESA.
Submitted: A state of data that has undergone submission but has not been signed off.
Surveyable population (Graduate Outcomes survey): This is the base population minus graduates who are recorded as dead or seriously ill by the provider (where contact details are not required).
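As a sketch, the surveyable population can be derived by filtering the base population on the `GRADSTATUS` codes listed under Non-surveyable (01 deceased, 03 serious illness). The record layout, the `GRADID` field name and the placeholder status `XX` are assumptions made for illustration.

```python
# Codes taken from the Non-surveyable glossary entry.
NON_SURVEYABLE_STATUSES = {"01", "03"}  # 01 = deceased, 03 = serious illness


def surveyable_population(base_population):
    """Return the graduates who are not recorded as deceased or seriously ill."""
    return [
        graduate for graduate in base_population
        if graduate.get("GRADSTATUS") not in NON_SURVEYABLE_STATUSES
    ]


# Hypothetical records; "XX" stands in for any other status value.
base = [
    {"GRADID": "G1", "GRADSTATUS": "XX"},
    {"GRADID": "G2", "GRADSTATUS": "01"},
    {"GRADID": "G3", "GRADSTATUS": "03"},
    {"GRADID": "G4"},  # no status recorded
]
surveyable = surveyable_population(base)
```

Here only G1 and G4 remain in the surveyable population.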
Survey platform (Graduate Outcomes survey): The online platform which houses the online survey provided by Confirmit.
Survey year (Graduate Outcomes survey): The twelve-month period running from 1 December to 30 November. There are four contact periods per survey year.
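The survey year boundaries can be worked out from any date with a few lines of code. The function name `survey_year_bounds` is hypothetical, and the sketch relies only on the 1 December to 30 November definition above.

```python
from datetime import date
from typing import Tuple


def survey_year_bounds(day: date) -> Tuple[date, date]:
    """Return the first and last day of the survey year containing `day`."""
    # A survey year starts on 1 December, so December dates belong to
    # the survey year that starts in their own calendar year.
    start_year = day.year if day.month == 12 else day.year - 1
    return date(start_year, 12, 1), date(start_year + 1, 11, 30)
```

For example, 15 January 2021 falls in the survey year running from 1 December 2020 to 30 November 2021.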
Tolerance: Represents the permissible quality threshold for a rule.
Weighting: This process involves the use of ‘scaling factors’ (e.g. a factor of 0.75 applied to a response would reduce its relative weight) applied to a survey response in an attempt to make the sample more representative of the population.
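To make the idea of scaling factors concrete, the sketch below uses one simple and widely used approach: each group's factor is its share of the population divided by its share of the sample. This is not necessarily the method applied to Graduate Outcomes. The group labels, counts and function names are illustrative assumptions.

```python
from collections import Counter


def scaling_factors(population_counts, sample_groups):
    """Factor per group = population share / sample share."""
    sample_counts = Counter(sample_groups)
    n_population = sum(population_counts.values())
    n_sample = len(sample_groups)
    return {
        group: (population_counts[group] * n_sample) / (count * n_population)
        for group, count in sample_counts.items()
    }


def weighted_mean(values, groups, factors):
    """Mean of `values` where each response is weighted by its group's factor."""
    total_weight = sum(factors[g] for g in groups)
    return sum(v * factors[g] for v, g in zip(values, groups)) / total_weight


# Illustrative data: group "UG" is over-represented in the sample.
population = {"UG": 600, "PG": 400}
groups = ["UG"] * 8 + ["PG"] * 2
values = [1.0] * 8 + [0.0] * 2  # e.g. 1.0 = in employment

factors = scaling_factors(population, groups)
estimate = weighted_mean(values, groups, factors)
```

In this example the over-represented group receives a factor of 0.75 and the under-represented group a factor of 2.0, which moves the estimate from an unweighted 0.8 to a weighted 0.6.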