Help with validation kits

 

Downloading a validation kit

Set up a folder on your C drive to save the validation kit into.

Select the validation kit required from validation kits and save it into the folder you created.

You may need to unzip the kit if this action does not trigger automatically.

If you are using more than one kit (e.g. Staff Person Table, Staff Contract Table and Staff Grade Table), set up a separate folder for each one.

Using a validation kit on a Windows computer

If you are using a computer running Microsoft Windows then start a DOS session. In Windows 2000 you would do this by clicking Start, Programs, Accessories, Command Prompt.

A DOS window will appear.

Change directory to where you saved the zip file. In the example shown below, the command cd (change directory) is used to go to c:\vkit_013.

 
    Microsoft Windows 2000 [Version 5.00.2195]
    (C) Copyright 1985-2000 Microsoft Corp.

    C:\>cd vkit_013

    C:\Vkit_013>

Then use PKUNZIP or WINZIP to extract the files from the zip file.
You can obtain the PKUNZIP utility from http://www.pkware.com/downloads.

    C:\Vkit_013>pkunzip 05013.zip

    PKUNZIP (R) FAST! Extract Utility Version 2.04g 02-01-93
    Copr. 1989-1993 PKWARE Inc. All Rights Reserved. Registered version
    PKUNZIP Reg. U. S. Pat and Tm. Off.

    80486 CPU detected.
    XMS version 2.00 detected.
    DPMI version 0.90 detected.

    Searching ZIP: 05013.ZIP
      Inflating: 05013.EXE
      Inflating: 05013.BIN
      Inflating: POSTCODE.CSV
      Inflating: HESACODE.CSV
      Inflating: 05013RL.CSV
      Inflating: 05013VA.CSV
      Inflating: VALUES.C
      Inflating: VMAIN.C
      Inflating: FILEHELP.C
      Inflating: HELPER.C
      Inflating: HESACODE.C
      Inflating: NPROFILE.C
      Inflating: VLTDN.H
      Inflating: HELPER.H
      Inflating: VMACROS.H
      Inflating: README.TXT
      Inflating: 05013.C
      Inflating: 05013MAK
      Inflating: 05013FL.RTF

    C:\Vkit_013>

One of the files extracted will be the EXE (executable program) file, in this example 05013.EXE.
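
If neither PKUNZIP nor WINZIP is available, the extraction step can also be done with a short script. The following is a minimal sketch only, assuming Python is installed on the computer; the function name extract_kit is hypothetical and is not part of the validation kit.

```python
import zipfile
from pathlib import Path


def extract_kit(zip_path, dest_dir):
    """Extract every file in a validation kit ZIP into dest_dir.

    Returns the sorted list of file names held in the archive.
    """
    dest = Path(dest_dir)
    # Create the destination folder if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())
```

For the example above, the call would be extract_kit(r"c:\vkit_013\05013.zip", r"c:\vkit_013"), after which 05013.EXE should be present in the folder.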

Usage instructions

For usage instructions, run the EXE without any arguments, as shown below:

 
    C:\Vkit_013>05013.exe
    HESA Record Validation Program. Build 2.28
    Validation Rules Version 1.0
    Copyright (C) The Higher Education Statistics Agency 1994. 1995
    Usage: C:\VKIT_013\05013.EXE <filename> [switches]
    Switches are:
    -b       Wait for keypress before exiting
    -e<n>    Maximum number of errors (Default 1000)
    -g<n>    Country of Institution : England(g1),N.Ireland(g4),Scotland(g2),Wales(g3)
    -i<nnn>  Dont validate field <nnn>
    -j<nnn>  Ignore rule <nnn>
    -n       Ignore all fields EXCEPT those given by -i
    -p<dir>  look for hesacode,postcode in <dir>
    -q       Quiet Mode
    -h       Error report in database input format
    -r       Raw Output
    -tc      Input is Comma Separated
    -tf      Input is Fixed Length
    -uc      Output in comma separated format, including full English text
    -uf      Output in fixed length format, including full English text
    -v       Verbose Output
    -w<nn>   Set warning level (-w0) to include warnings in error file
    -x       Explain Errors (Wide Output)
    -y       Allows Data Structure Errors in data

    C:\Vkit_013>

To validate your data you must include the following switches:

either

  • -tc (if data is in comma separated format)
    or
  • -tf (fixed-length format)

and you must include the country code applicable for your institution:

  • -g1, -g2, -g3 or -g4 (England, Scotland, Wales or N.Ireland respectively)

We also recommend one of the following:

  • -uc : Output in comma separated format, including full English text
    or
  • -uf : Output in fixed length format, including full English text
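
If you run the kit from a script, the required and recommended switches above can be checked before the program is started. The following Python sketch shows one possible way of doing this; the function name build_command and the keyword values "csv" and "fixed" are hypothetical names chosen for illustration, and only the switches themselves come from the kit's usage text.

```python
# Switches taken from the usage text; the dictionary keys are illustrative names.
INPUT_SWITCHES = {"csv": "-tc", "fixed": "-tf"}
COUNTRY_SWITCHES = {"England": "-g1", "Scotland": "-g2", "Wales": "-g3", "N.Ireland": "-g4"}
OUTPUT_SWITCHES = {"csv": "-uc", "fixed": "-uf"}


def build_command(exe, data_file, input_format, country, output_format=None):
    """Return the argument list for one run of a validation kit.

    input_format and country are mandatory; output_format is optional
    but recommended.
    """
    if input_format not in INPUT_SWITCHES:
        raise ValueError("input_format must be 'csv' (-tc) or 'fixed' (-tf)")
    if country not in COUNTRY_SWITCHES:
        raise ValueError("country must be one of: " + ", ".join(COUNTRY_SWITCHES))
    args = [exe, data_file, INPUT_SWITCHES[input_format], COUNTRY_SWITCHES[country]]
    if output_format is not None:
        if output_format not in OUTPUT_SWITCHES:
            raise ValueError("output_format must be 'csv' (-uc) or 'fixed' (-uf)")
        args.append(OUTPUT_SWITCHES[output_format])
    return args
```

The returned list could then be passed to subprocess.run, from the Python standard library, in the folder where the kit was extracted.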

Example

 

Here is a sample command used to run a validation kit.

    05013.exe c:\hesaval\data\modules05.csv -tc -g1 -uf -w0

An explanation of each part of this command is as follows:

05013.exe - the validation kit executable program.

c:\hesaval\data\modules05.csv - the location of the data file being tested.

-tc : tells the program that the data is in CSV format. You must specify either -tc or -tf (for fixed length data).

-g1 : tells the program that the institution is in England. You must enter the correct code for your country, i.e. -g1 (England), -g2 (Scotland), -g3 (Wales) or -g4 (N.Ireland).

-uf : tells the program to output the error file in fixed length format with full English text. This is optional but we recommend either -uf or -uc.

-w0 : shows warnings as well as errors. (Note: 0 is the digit zero.)

What it looks like when run:

 
    C:\Vkit_013>05013.exe c:\hesaval\data\modules05.csv -tc -g1 -uf -w0
    HESA Record Validation Program C:\VKIT_013\05013.EXE. Build 2.28 Validation Rules Version 1.0
    Loading HESACODEs
    Finished Reading 2542 HESACODEs.
    HESACODEs need 16523 bytes of memory. 19 Reallocations were made
    Loading Valid Entries
    Finished Reading 865 Valid Entries.
    Valid Entries need 7350 bytes of memory. 5 Reallocations were made
    Processing data file c:\hesaval\data\modules05.csv.
    00000000: .X.X.......X
    Complete. 12 lines from c:\hesaval\data\modules05.csv were processed
    0 seconds elapsed. 12 records/second.
    3 Errors in 05013.err
    0 Warnings in 05013.err

    C:\Vkit_013>

In the above example it found 3 errors and no warnings.
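
If the kit is run from a script, the error and warning counts can be picked out of the screen output. The Python sketch below assumes that the summary lines always have the form shown in the example run ("3 Errors in 05013.err"); the handling of a singular form such as "1 Error" is a guess, and the function name parse_summary is hypothetical.

```python
import re

# Matches summary lines such as "3 Errors in 05013.err".
# The optional "s" is an assumption; the example run only shows plural forms.
SUMMARY_LINE = re.compile(r"(\d+)\s+(Error|Warning)s?\s+in\s+\S+", re.IGNORECASE)


def parse_summary(console_text):
    """Return the error and warning counts reported at the end of a run."""
    counts = {"errors": 0, "warnings": 0}
    for number, kind in SUMMARY_LINE.findall(console_text):
        counts[kind.lower() + "s"] = int(number)
    return counts
```

A script could use the result to decide whether the ERR file needs to be inspected at all.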

Within the folder where the executable program was running, i.e. c:\vkit_013, it will have created the following files:

05013.INF - information about how the program ran
05013.ERR - the errors generated. See next section

(where 05013 is the record type being tested)

The Error file

The ERR file lists each error on a separate line. For each field processed, it will output the first error encountered and then move on to the next field in the record.

The columns output depend upon the switches that you have used when calling the validation kit.

 

We recommend using either

  • -uc : Output in comma separated format, including full English text
    or
  • -uf : Output in fixed length format, including full English text

The columns output when using either of these are as follows:

Record - The number of the record within your data file.
ID - The field identifying the record, e.g. the HUSID in the case of student data or STAFFID for staff data.
Ownstu - The contents of the OWNSTU field. Student kit only.
Ownpsd - The contents of the OWNPSD field. Student kit only.
Fld - The field number to which the error relates.
Abbrev - The abbreviation of the field name.
Error - The number of the rule which gave rise to the error or warning.
Sev. - The severity. A value of 10 or more represents an error; a value of less than 10 represents a warning.
Data - The contents of the field which caused the error. This is enclosed in angle brackets.
English Text of Rule - The full specification of the rule.
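
The severity rule can be applied automatically when reading an error file produced with -uc. The Python sketch below is illustrative only. It assumes that the file begins with a header row and that the severity column is headed "Sev."; check both against a real error file before relying on it. The function names classify and count_by_kind are hypothetical.

```python
import csv
import io

# A severity of 10 or more is an error; less than 10 is a warning.
ERROR_THRESHOLD = 10


def classify(severity):
    """Return 'error' or 'warning' for a severity value."""
    return "error" if int(severity) >= ERROR_THRESHOLD else "warning"


def count_by_kind(csv_text, severity_column="Sev."):
    """Count errors and warnings in comma separated error file text.

    Assumes a header row that contains severity_column.
    """
    counts = {"error": 0, "warning": 0}
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[classify(row[severity_column])] += 1
    return counts
```

To use it, read the contents of the ERR file into a string and pass that string to count_by_kind.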

Deprecated output options

Several other output options predate -uf and -uc. They are not recommended, but may still be used if required.

-h

will produce the error file in fixed length format. All columns are in fixed positions, with the 'data' column occupying a fixed length of 100 characters and right justified, within angled brackets.

-x

will output an error file consisting of Record, ID, Fld, Abbrev, Error (i.e. rule number), Error Text, Data (enclosed within angled brackets), and in the case of Student data, OWNSTU and OWNPSD.
The column headed 'Error text' consists of a short explanation of the rule condensed into 32 characters. Because this is not always possible it may say something like: 'See 05013rl.csv for full rule'. The csv file referred to here is one of those extracted from the ZIP file. It contains a list of rules, sorted in rule number order, and is in comma-separated format suitable for viewing in Excel or a similar spreadsheet.
A drawback of this format is that the position of OWNSTU and OWNPSD will vary depending upon the length of the data.

-r

represents raw format. No headings will appear at the top of the output file. The output will consist of Record, ID, Record type (e.g. 11 for student combined record), Fld, Abbrev, Rule Number, Severity, Data (enclosed within angled brackets), and in the case of Student data, OWNSTU and OWNPSD.
A drawback of this format is that the position of OWNSTU and OWNPSD will vary depending upon the length of the data.

default

The default format for the error file, i.e. the format generated if you do not select any specific output format, will create an error file consisting of:
Record, ID, Fld, Abbrev, Error (i.e. rule number), Data (enclosed within angled brackets), and in the case of Student data, OWNSTU and OWNPSD.
A drawback of this format is that the position of OWNSTU and OWNPSD will vary depending upon the length of the data.


Using a validation kit on a non-Windows computer

If you have a non-Windows computer, then instead of using the EXE (executable program) file provided by us, you will need to compile a program on your own computer using the source files included within the ZIP file.

The source files are written in ANSI compliant 'C' code and should compile with little or no difficulty, but the procedure for compiling the executable program will vary from site to site, and you may need to obtain help from someone who is familiar with compilation on your computer.