Student record - Data Futures

Data migration overview

Version 1.0.0 Produced 2018-11-20

Background

As part of the Data Futures programme, HESA will be migrating data from the Student and AP Student records from their current format into the new data model. The data migration will provide a basis for returning data, support credibility reporting and assist continuity checking.

For the Student record, data will be migrated from the 2013/14 academic year onwards. For the AP Student record, data will be migrated from the 2014/15 academic year onwards. So far we have migrated data up to and including 2016/17. We will migrate 2017/18 and 2018/19 data once it has been submitted and signed off.

The migrated data will be used as a starting point for continuing students, courses and modules. Data previously returned for these students will be migrated, and providers will only be required to return new data. We will require submission of all data for new entrants from 2019/20 onwards. Migrating this data will reduce the burden on providers of submitting historical data to the new model when completing the first Data Futures submission.

We recognise that submitting against migrated data will require a detailed understanding of the migration specification and corresponding data. We are aiming to publish a detailed migration specification as soon as possible, which will be used for the Beta phase. We will refine this specification following feedback from Beta participants.

All providers will be able to download their migrated data in Summer 2019 to give early sight before the mandatory trial submission. This will also allow time for reconciliation activity. The details of this reconciliation activity will be confirmed in due course.

Entity mapping

HESA will not be editing or cleansing previously submitted data; it will simply be translated and migrated as directly as possible. However, because of changes in the approach to collecting some of the data, it will not be possible to populate all fields and entities in the Data Futures model with migrated data. Providers will need to take action to populate the missing data.

The table below shows, at entity level, the data which will be migrated. We have indicated which data can and cannot be migrated; data that cannot be migrated may need to be submitted by the provider to complete a data return.

Entity | Will this be migrated? | Comments
AwardingBodyRole | No | No data will be migrated, as there is no equivalent field in the current model.
CareLeaver | Yes | Values will be migrated and mapped to new coding frame to include changes over time.
Carer | Yes | Values will be migrated and mapped to new coding frame to include changes over time.
CollaborativeProvision | Yes | This can be partially migrated where current data exists.
ContactPurpose | Yes | Applicable values from DOMICILE and TTPCODE will be migrated.
CourseDelivery | Yes | Applicable values for each unique course in a given year will be migrated. COURSEDELTYPEID will be defaulted to 01.
CourseDeliveryInitiative | No | No data will be migrated, as the field in the current model is at student level.
CourseDeliveryRole | Yes | Applicable values for each unique course in a given year will be migrated, with ROLETYPE and CDRPROPORTION defaulted to 401 and 100.
CourseDeliveryVenue | Yes | Applicable values for each unique course in a given year will be migrated.
CourseDeliveryReference | Yes | Applicable values for each unique course in a given year will be migrated, with COURSEREFRNCIDTYPE defaulted to 01.
CourseSession | Yes | Applicable values for each unique course of each mode in a given year will be migrated.
CurriculumAccreditation | Yes | This entity can be partially migrated using unique values contained within REGBODY, REGBODY2, TQSSEC and COURSEAIM.
Dependant | Yes | Most recent known values will be migrated from NIDEPEND and SDEPEND.
Disability | Yes | Values will be migrated and mapped to new coding frame to include changes over time.
Engagement | Yes | Each unique instance will be migrated to form an engagement.
EntryQualificationAward | Yes | Each unique qualification on entry record will be migrated.
EntryQualificationSubject | Yes | Each unique qualification on entry subject record will be migrated.
Ethnicity | Yes | Values will be migrated and mapped to new coding frame to include changes over time.
FeeInvoiceAmount | Yes | Applicable values from NETFEE and MSTUFEE will be migrated.
FinancialSupportOffer | Yes | Applicable values will be migrated from where DISALL=04, FEEWAIVETYPE=01-03 and where FINTYPE=01-04.
FinancialSupportScheme | Yes | Applicable values will be migrated from where DISALL=04, FEEWAIVETYPE=01-03 and where FINTYPE=01-04.
FullTimeEquivalence | No | No data will be migrated, as there is no equivalent field in the current model.
FundingBody | Yes | Applicable values will be migrated where EMPFUND, FUNDCODE or RCSTDNT exists.
FundingAndMonitoring | Yes | Applicable values will be migrated for instances where EMPFUND, FUNDCODE or RCSTDNT exists.
GenderIdentity | Yes | Values will be migrated and mapped to new coding frame to include changes over time.
ITTEngagement | No | Initial teacher training (ITT) data will not be migrated.
LanguageProficiency | Yes | Applicable values from WELSSP will be migrated and LEVEL will be defaulted to 02.
Leaver | Yes | Applicable values will be migrated for all instances where RSNEND, ENDDATE or COLTODATE are not null.
MaritalStatus | Yes | Values will be migrated and mapped to new coding frame to include changes over time.
Module | Yes | Applicable values will be migrated for each module with a unique module identifier.
ModuleDelivery | Yes | Applicable values will be migrated for each module with a unique module identifier in a given year. This assumes each year of a module is a new delivery.
ModuleDeliveryLocation | No | No data will be migrated, as there is no equivalent field in the current model.
ModuleDeliveryRole | Yes | Applicable values will be migrated for unique non-franchised modules. ROLETYPE will be defaulted to 202.
ModuleCostCentre | Yes | Applicable cost centre values will be migrated.
ModuleInstance | Yes | Applicable values will be migrated for unique modules which have students studying within a given year.
ModuleOutcome | Yes | Applicable values will be migrated for unique modules within a given year where an outcome has been returned.
ModuleSubject | No | This cannot be migrated due to differing coding frames, but a legacy entity containing the JACS code will be produced.
NationalIdentity | Yes | Values will be migrated and mapped to new coding frame to include changes over time.
Nationality | Yes | Values will be migrated and mapped to new coding frame to include changes over time.
OffVenueActivity | Yes | Applicable values will be migrated from the Mobility entity.
OffVenueActivityLocation | Yes | Applicable values will be migrated from the Mobility entity.
ParentalEducation | Yes | Values will be migrated and mapped to new coding frame to include changes over time.
PersonAddress | Yes | Applicable values will be migrated from POSTCODE, DOMICILE and TTPCODE, with CONTACTTYPEID defaulted to 01, 01 and 02 respectively.
PersonIdentifier | Yes | Applicable values will be migrated from UCASAPPID, DHREGREF, RCSTDID, TREFNO, ORCID, OWNSTU, HUSID, SCN, UCASPERID, ULN and SSN, and IDTYPECODE will default to the relevant values.
Qualification | Yes | Applicable values will be migrated for all unique qualifications awarded based on COURSEID.
QualificationAwardAccreditation | Yes | Applicable values will be migrated for TQGSEC, OUTCOME and QUAL.
QualificationAwarded | Yes | Applicable values will be migrated and NUMREG will be defaulted to 01.
QualificationSubject | No | This cannot be migrated due to differing coding frames, but a legacy entity containing the JACS code will be migrated to support quality analysis.
Religion | Yes | Most recent known values will be migrated from the RELBLF field.
ReligiousBackground | Yes | Most recent known values will be migrated from the RELIGION field.
ServiceLeaver | No | No data will be migrated, as there is no equivalent field in the current model.
SessionStatus | Yes | Applicable records with an MCDATE will be migrated using the value returned in MODE to define STATUSCHANGEDTO.
SexualOrientation | Yes | Values will be migrated and mapped to new coding frame to include changes over time.
SocioEconomicClassification | Yes | Values will be migrated and mapped to new coding frame to include changes over time.
StandardOccupationalClassification | Yes | Values will be migrated and mapped to new coding frame to include changes over time.
Student | Yes | All applicable unique values will be migrated based on names.
StudentAccreditationAim | No | No data will be migrated, as data is currently only available at course level.
StudentCourseSession | Yes | Applicable values will be migrated for each unique active instance.
StudentCourseSessionMode | Yes | Applicable values will be migrated.
StudentFee | Yes | Applicable values will be migrated based on GROSSFEE values.
StudentModuleFee | No | No data will be migrated, as fee data is not currently returned at module level.
StudentInitiatives | Yes | Applicable values will be migrated from INITIATIVES1, INITIATIVES2, ITTSCHMS and ARTICLN.
StudentFinancialSupport | Yes | Applicable values will be migrated from FEEWAIVEAMT and FINTYPE.
StudentRegistration | Yes | Applicable values will be migrated for each unique instance.
StudyLocation | Yes | Applicable values will be migrated where CAMPID or LOCATION 01-06 are not null.
SupervisorAllocation | Yes | Applicable values will be returned based on REFDATA.
Venue | Yes | Applicable values will be migrated based on Provider Profile data where applicable or LOCATION 01-06.
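
Several rows in the table describe simple conditions or defaults. As an illustration only, the Python sketch below expresses two of them: the Leaver condition (RSNEND, ENDDATE or COLTODATE not null) and the PersonAddress defaults for CONTACTTYPEID (01, 01 and 02 for POSTCODE, DOMICILE and TTPCODE). It assumes each legacy record is a plain dictionary keyed by field name, with None standing for null. The function names, the output row layout and the sample values are hypothetical and are not taken from the migration specification.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]

# CONTACTTYPEID defaults stated in the PersonAddress row of the table.
CONTACT_TYPE_DEFAULTS = {"POSTCODE": "01", "DOMICILE": "01", "TTPCODE": "02"}


def has_leaver_data(record: Record) -> bool:
    """True when any of RSNEND, ENDDATE or COLTODATE is populated."""
    return any(record.get(f) is not None for f in ("RSNEND", "ENDDATE", "COLTODATE"))


def person_addresses(record: Record) -> List[Dict[str, str]]:
    """Build one (hypothetical) address row per populated source field."""
    rows = []
    for field, contact_type in CONTACT_TYPE_DEFAULTS.items():
        value = record.get(field)
        if value is not None:
            rows.append({"SOURCE": field, "VALUE": value, "CONTACTTYPEID": contact_type})
    return rows


# Illustrative legacy record; the values are made up for this sketch.
legacy = {"POSTCODE": "AB1 2CD", "DOMICILE": "XF", "TTPCODE": None, "ENDDATE": "2017-06-30"}
print(has_leaver_data(legacy))
print(person_addresses(legacy))
```

The actual rules will be set out in the detailed migration specification; this sketch only shows how the entity-level comments might be read.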

Timeline

This timeline indicates high-level activity; we will provide further details on anticipated activity and the feedback process when it is available.

January 2019 – Full migration specification for Beta published – Data migrated ready for Beta phase.

During Beta – Beta providers can review all in-scope migrated data and give feedback on issues.

Summer 2019 – Publication of updated migration specification and provision of migrated data to all providers to give early sight and allow reconciliation activity to begin.

January 2020 – Data migrated (to include 2018/19 data) ready for mandatory trial submission.

Further information on the data migration

Keys

With the introduction of a large number of new entities and relationships between these entities, to aid migration we will be creating composite keys where no identifiers currently exist. The keys will be created using information from the existing data to clarify how the migrated data has been created. These keys will need to be carried forward and used to update any of these entities or maintain existing relationships within the new model.

An example of a composite key is the Module Delivery Identifier: Module.MODID, Module.UKPRN and the start date of the HESA reporting year will be concatenated to create a unique identifier, e.g. ABCD1000457820160801.
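
A minimal sketch of how such a key could be assembled, assuming the three parts are joined in the order listed with no separators and the date is written as YYYYMMDD. Splitting the example value into a MODID of ABCD and a UKPRN of 10004578 is our reading of it, and the function name and parameters are hypothetical rather than part of the migration specification.

```python
from datetime import date


def module_delivery_id(modid: str, ukprn: str, reporting_year_start: date) -> str:
    """Concatenate MODID, UKPRN and the reporting-year start date (YYYYMMDD)."""
    return f"{modid}{ukprn}{reporting_year_start:%Y%m%d}"


# Reproduces the example key for a reporting year starting on 1 August 2016.
example_key = module_delivery_id("ABCD", "10004578", date(2016, 8, 1))
print(example_key)  # ABCD1000457820160801
```

Because the key is built only from values already in the existing data, a provider could regenerate it from their own records to check how a migrated entity was created.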

Dates

Moving from a retrospective to an in-year reporting model means more dates are captured than have been returned previously. This creates a challenge when migrating data into the new model, as the required dates cannot be derived; in many cases we can only make assumptions based on the reporting period in which the data was returned. Our current thinking is to populate dates using the method below; however, it may be preferable to leave dates blank for providers to complete.

Example: Modules were previously returned without dates, so when we migrate module data to the new model, which includes from and to dates, we have to assume that previous years' modules spanned the entire reporting period, i.e. 1 August to 31 July. As a result, providers will need to treat the migrated dates with caution and amend them where possible so that the data is as accurate as possible. In addition, fields such as AWARDDATE for Entry Qualifications now require a more accurate date than just the year previously returned. With this in mind, we have defaulted all available migrated values to 1 January of the year provided. Where possible this should be amended to a more accurate date.
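
The two defaults in this example can be sketched as follows, assuming a reporting year is identified by the calendar year in which it starts. The function names are hypothetical and the sketch is illustrative only, not the migration logic itself.

```python
from datetime import date
from typing import Tuple


def reporting_period(start_year: int) -> Tuple[date, date]:
    """Assumed module from/to dates: the whole reporting year, 1 August to 31 July."""
    return date(start_year, 8, 1), date(start_year + 1, 7, 31)


def default_award_date(award_year: int) -> date:
    """Placeholder AWARDDATE: 1 January of the year originally returned."""
    return date(award_year, 1, 1)


start, end = reporting_period(2016)  # the 2016/17 reporting year
print(start, end)                    # 2016-08-01 2017-07-31
print(default_award_date(2012))      # 2012-01-01
```

Dates produced this way are placeholders, which is why providers are asked to replace them with more accurate values where they can.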

Granularity

When migrating the data, we are able to aggregate current data to populate the new model. However, for the most part the Data Futures model collects data at a more granular level than the current model, for example Thesis data. This means we do not always have sufficient information to populate the new model in this way. This is also the case for fields where the valid entries may have changed, such as Language Proficiency or Module data.

Also, for fields containing subject data, with the move towards the Higher Education Coding of Subjects (HECoS) coding frame, JACS codes will be migrated and held in a legacy field for information, but providers will be required to populate HECoS values for new or continuing activity. Historical activity will not need populating.

Stated coverage

Each field contains a coverage statement. Where previously collected data falls outside this stated coverage, it will not be migrated. Similarly, to ensure General Data Protection Regulation (GDPR) compliance, we will not accept any data submitted outside the stated coverage of the new model.


Need help?

Contact Liaison by email or on 01242 211144.