Student record - Data Futures
Deleting data in a continuous collection cycle
Version 1.1.0 Produced 2017-12-14
The eagle-eyed amongst you may have noticed that in our October release we published an XSD that declares three different actions: delete, upsert and identify. We would like to explain a little more about how we envisage the deletions process working, as this is an area on which we have received a few questions from both providers and software suppliers.
The current HESA Student and AP student collections operate on a single file submission basis; the sending of a new file automatically overwrites the previous one. In the Data Futures model we are moving to a multiple file submission basis, which is more akin to the ITT collection but not quite the same: uploaded files could range in scope from a single entity for one student to the full dataset for the entire student population, and everything in between.
It will not be possible to retract or effectively 'undo' a file that has been submitted. Instead, a further file would need to be sent declaring which part(s) of the previously submitted file or files should be deleted. An important point to note is that a deletion is a deletion, not a roll-back: it will remove the relevant data altogether rather than reset it to the previous position. So, how would this work in practice?
Our current thinking is that providers can delete a record they have submitted by sending in an XML file that includes a record with the same primary key information where the action attribute for that record is set to "delete" (i.e. the XML file contains an element of the form '<EntityName action="delete">').
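As a rough illustration only, the sketch below builds an element of that form using Python's standard library. The helper name build_delete is hypothetical, and treating DISABILITY as the primary key of a Disability record is an assumption made for the example; in a real file the element would sit inside its parent entity.

```python
import xml.etree.ElementTree as ET

def build_delete(entity_name, key_fields):
    """Return <EntityName action="delete"> containing only the given key fields."""
    element = ET.Element(entity_name, {"action": "delete"})
    for field_name, value in key_fields.items():
        child = ET.SubElement(element, field_name)
        child.text = str(value)
    return element

# Assumption: DISABILITY is the primary key of a Disability record.
disability = build_delete("Disability", {"DISABILITY": 57})
xml_text = ET.tostring(disability, encoding="unicode")
print(xml_text)
```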
Where a record to be deleted appears as a nested element in the XML file, the provider can supply the full content of the outer elements with the action attribute on those outer elements set to "upsert". In this case the system will perform upsert operations for those outer elements, inserting or updating them as appropriate.
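To make the nested case concrete, here is a minimal sketch that reads a file fragment and lists the action declared on each entity, outermost first. The field names COURSEDELIVERYID and ROLEID and the function planned_operations are hypothetical placeholders, not taken from the published XSD.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: outer element upserted, nested element deleted.
SAMPLE = """
<CourseDelivery action="upsert">
  <COURSEDELIVERYID>CD1</COURSEDELIVERYID>
  <CourseDeliveryRole action="delete">
    <ROLEID>R1</ROLEID>
  </CourseDeliveryRole>
</CourseDelivery>
"""

def planned_operations(xml_text):
    """List (entity name, action) for every element carrying an action attribute."""
    root = ET.fromstring(xml_text.strip())
    # iter() walks the tree in document order, so outer elements come first.
    return [(el.tag, el.get("action")) for el in root.iter() if el.get("action")]

operations = planned_operations(SAMPLE)
print(operations)
```

Plain data fields carry no action attribute, so only the two entities appear in the result.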
On deleting a record, the provider can choose to include or omit the non-primary key fields for that record. Any details that are provided within non-primary key fields will be ignored when processing the deletion.
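A small sketch of this behaviour, under stated assumptions: the PRIMARY_KEYS table, the deletion_key function and the NOTE field are all invented for illustration, and DISABILITY is assumed to be the only key field of a Disability record. It shows that a sparse and a fully populated delete resolve to the same target.

```python
import xml.etree.ElementTree as ET

# Assumption: DISABILITY is the only primary-key field of Disability.
PRIMARY_KEYS = {"Disability": ["DISABILITY"]}

def deletion_key(element):
    """Read only the primary-key fields; anything else in the element is ignored."""
    return tuple(element.findtext(name) for name in PRIMARY_KEYS[element.tag])

minimal = ET.fromstring(
    '<Disability action="delete"><DISABILITY>57</DISABILITY></Disability>'
)
# NOTE is a hypothetical non-key field, included to show it has no effect.
verbose = ET.fromstring(
    '<Disability action="delete"><DISABILITY>57</DISABILITY>'
    '<NOTE>sent in error</NOTE></Disability>'
)

print(deletion_key(minimal) == deletion_key(verbose))
```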
Each time an action attribute of "delete" is specified, this will cause only one record to be deleted; there is no notion of deletions "cascading", either to nested records within the XML file or to other records that reference the record being deleted. This is particularly important when deleting something like a course delivery: deleting course delivery 'X' only deletes the course delivery record itself. If it is referenced elsewhere, for example where students are associated with that course delivery or there is an accreditation associated with it, further work is required to update those associated records. Orphaned records will usually be highlighted through validation and we are looking at what we can do to help providers manage the process and keep a handle on their submitted data.
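The non-cascading behaviour can be sketched with a simple in-memory model. This is not how the HESA system is implemented; the store layout, the StudentAssociation entity and the course_delivery field are hypothetical stand-ins for records that reference a course delivery.

```python
# In-memory stand-in for submitted data: (entity name, key) -> stored fields.
store = {
    ("CourseDelivery", "X"): {},
    ("CourseDelivery", "Y"): {},
    ("StudentAssociation", "S1"): {"course_delivery": "X"},
    ("StudentAssociation", "S2"): {"course_delivery": "Y"},
}

def delete_record(store, entity, key):
    """Remove exactly one record; nothing that references it is touched."""
    store.pop((entity, key), None)

def orphans(store):
    """Find records whose referenced course delivery no longer exists."""
    result = []
    for (entity, key), fields in store.items():
        ref = fields.get("course_delivery")
        if ref is not None and ("CourseDelivery", ref) not in store:
            result.append((entity, key))
    return result

delete_record(store, "CourseDelivery", "X")
orphaned = orphans(store)
print(orphaned)
```

After the delete, the association S1 is still present but points at nothing, which is the kind of orphan that validation would be expected to flag.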
Crucially, deleting a record can change the validity of any other records whose validity depends on its contents. For instance:
- Deleting a CourseSession record will invalidate any StudentCourseSession records that reference that CourseSession.
- Deleting a CourseDelivery record without also deleting its nested elements (CourseDeliveryReference, CurriculumAccreditation, CourseDeliveryFinancialSupportOffer, CourseDeliveryInitiative, CourseDeliveryRole, CourseDeliveryVenue) will invalidate all those nested elements: deleting the parent does not delete the children.
- Deleting the only QualificationSubject record within a Qualification will invalidate the Qualification since each Qualification needs at least one QualificationSubject.
- Deleting any item will almost certainly have an effect on Quality Rules and Credibility Reports.
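The QualificationSubject case above can be sketched as a simple check. The identifiers (Q1, S-A and so on) and the function invalid_qualifications are made up for illustration; the real rule would be applied by HESA's validation, not by code like this.

```python
def invalid_qualifications(qualifications, subjects):
    """Return ids of qualifications left with no QualificationSubject."""
    with_subject = {qual_id for qual_id, _ in subjects}
    return sorted(q for q in qualifications if q not in with_subject)

# Hypothetical data: Q1 has one subject, Q2 has two.
qualifications = {"Q1", "Q2"}
subjects = {("Q1", "S-A"), ("Q2", "S-B"), ("Q2", "S-C")}

before = invalid_qualifications(qualifications, subjects)

# Delete the only subject of Q1; the Qualification itself is left in place.
subjects_after = subjects - {("Q1", "S-A")}
after = invalid_qualifications(qualifications, subjects_after)
print(before, after)
```

Deleting one of Q2's two subjects instead would leave both qualifications valid under this rule.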
As an example of a deletion in action, let's say I have inadvertently uploaded a file for student 'X' with two disabilities: a specific learning disability (DISABILITY = 51) and a serious hearing impairment (DISABILITY = 57). The hearing impairment was sent in error and the student only ever declared one disability. I need to delete the erroneous Disability record by resending the Student entity and the entity that I want to delete, so I might send:
<Student action="identify">
  <SID>190000001234561</SID>
  <Disability action="delete">
    <DISABILITY>57</DISABILITY>
  </Disability>
</Student>
However, I could also choose to send the full details of the parent entity. These additional data items will be ignored when processing the delete.
<Student action="identify">
  <SID>190000001234561</SID>
  <BIRTHDTE>1987-09-17</BIRTHDTE>
  <FNAMES>Joe</FNAMES>
  <SURNAME>Bloggs</SURNAME>
  <Disability action="delete">
    <DISABILITY>57</DISABILITY>
  </Disability>
</Student>
In theory, deletions should be used infrequently and, where they occur, should be at lower levels (as in the example above) rather than relating to the deletion of a student or course delivery record. The continuous collection cycle means extra care needs to be taken because if the record for a continuing student, for example, is deleted, this will have implications for data previously signed off. A deletion does not apply from the point in time it is declared to HESA; it is retrospectively applied to the data from the first point at which we were aware of it.
During the Alpha and Beta pilots we will be putting the deletions process to the test, and if that results in changes to our implementation we will of course let you know as soon as possible.
Contact Liaison by email or on 01242 211144.