Data linkage

What is data linkage?

Data linkage brings together multiple sources of data that relate to the same person. When a person attends different educational institutions, say a course at TAFE or an undergraduate degree at university, information on their participation is collected. Linking this information with current LSAY survey responses has the potential to transform the value of LSAY for both survey participants and the research and policy community that uses the data. It will add value to LSAY by:

  • improving the quality of the data by providing more detailed, accurate and objective data about educational participation and attainment
  • increasing the richness and depth of information by linking to data that might be outside the scope of LSAY
  • providing an opportunity to remove some of the more detailed questions from the survey and ask more engaging, attitudinal items instead. This reduces the effort required to complete the survey and improves the survey experience for participants.

Current projects

Currently LSAY is conducting a series of data linkage projects aimed at linking data from the most recent LSAY cohort (Y15) to various educational data sets. These projects include:

  • Linking LSAY to school results: LSAY participants were asked permission to link survey data with their school results. This means linking NAPLAN scores from Years 3, 5, 7 and 9 and senior secondary school subject results (Years 11 and 12) to the survey data. For more information please read our fact sheet on linking school results to LSAY data.
  • Linking LSAY to VET: When LSAY participants indicate during their survey that they have taken part in vocational education and training (VET), they are asked for permission to link their survey responses to data from the National VET Provider Collection. For more information please read our fact sheet on linking VET records to LSAY data.
  • Linking LSAY to Higher Education (university): LSAY participants taking part in Higher Education (university) study are asked for their permission to link their survey responses to data from the Higher Education Information Management System (HEIMS). For more information please read our fact sheet on linking Higher Education (university) records to LSAY data.

How are data linked?

Participants are asked for their consent to link their LSAY records to other educational data sets, and only those who consent will have their data linked. Participants can withdraw their consent at any time; however, any data that have already been linked will be retained and will continue to be available to data users.

LSAY data will be linked via deterministic linking, which compares an identifier or a group of identifiers across databases; a link is made when these identifiers match. Types of identifiers used to match LSAY data and other administrative data include contact details (name, address, date of birth), school information (name, suburb, postcode) or a participant’s unique student identifier (USI).
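As an illustration only, deterministic linking can be sketched in Python as follows. The field names (survey_id, admin_id, usi, name, dob), the sample records, the normalisation step and the rule of skipping ambiguous matches are all assumptions made for this example; they do not describe the procedure actually used for LSAY.

```python
from typing import Optional


def normalise(value: str) -> str:
    """Lower-case and collapse whitespace so formatting differences do not block a match."""
    return " ".join(value.lower().split())


def match_key(record: dict, fields: tuple) -> Optional[tuple]:
    """Build a comparison key from the chosen identifiers; None if any identifier is missing."""
    values = [record.get(f) for f in fields]
    if any(v is None or str(v).strip() == "" for v in values):
        return None
    return tuple(normalise(str(v)) for v in values)


def deterministic_link(survey: list, admin: list, fields: tuple) -> list:
    """Return (survey_id, admin_id) pairs whose identifiers agree exactly on every field."""
    index = {}
    for rec in admin:
        key = match_key(rec, fields)
        if key is not None:
            index.setdefault(key, []).append(rec["admin_id"])

    links = []
    for rec in survey:
        key = match_key(rec, fields)
        candidates = index.get(key, []) if key is not None else []
        if len(candidates) == 1:  # assumption: only unambiguous matches are linked
            links.append((rec["survey_id"], candidates[0]))
    return links


# Hypothetical records, invented for this example.
survey = [
    {"survey_id": "S1", "usi": "ABC123", "name": "Alex Lee", "dob": "2000-01-15"},
    {"survey_id": "S2", "usi": None, "name": "Sam  Nguyen", "dob": "2000-06-30"},
    {"survey_id": "S3", "usi": None, "name": "Jo Brown", "dob": "2000-03-02"},
]
admin = [
    {"admin_id": "A9", "usi": "abc123", "name": "Alexander Lee", "dob": "2000-01-15"},
    {"admin_id": "A7", "usi": None, "name": "sam nguyen", "dob": "2000-06-30"},
]

links_by_usi = deterministic_link(survey, admin, ("usi",))
links_by_name_dob = deterministic_link(survey, admin, ("name", "dob"))
print(links_by_usi)       # match on a single identifier
print(links_by_name_dob)  # match on a group of identifiers
```

In this sketch the first call matches on a single identifier and the second on a group of identifiers. A record links only when every identifier in the group agrees, so "Alex Lee" and "Alexander Lee" link by USI but not by name and date of birth.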

Privacy and data linkage

Personal information is handled in the strictest confidence in accordance with the Australian Privacy Principles. Respondent contact details are only ever held by Wallis (the LSAY fieldwork contractor) or the agencies authorised to do the linkage, and are stored on secure servers located within Australia.

All LSAY linkage projects are conducted using the separation principle, which keeps personal identifying information (e.g. names, addresses and unique identifiers) separate from administrative and survey data. A linkage key is developed to link the datasets, so at no point are survey data and identifying information held in the same file. For more information on the linkage process specific to each project see the fact sheets above.
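The idea behind the separation principle can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the LSAY process itself: the function names, field names, random key format and sample values are hypothetical, and the actual steps for each project are set out in the fact sheets.

```python
import secrets


def build_key_maps(survey_ids: list, admin_ids: list, id_field: str = "usi"):
    """Linkage step: sees identifiers only, never survey or administrative content."""
    admin_by_identifier = {r[id_field]: r["admin_id"] for r in admin_ids}
    survey_map, admin_map = {}, {}
    for rec in survey_ids:
        admin_id = admin_by_identifier.get(rec[id_field])
        if admin_id is not None:
            key = secrets.token_hex(8)  # random linkage key carrying no personal information
            survey_map[rec["survey_id"]] = key
            admin_map[admin_id] = key
    return survey_map, admin_map


def attach_key(content: list, key_map: dict, record_id_field: str) -> list:
    """Custodian step: swap the internal record id for the linkage key."""
    out = []
    for rec in content:
        key = key_map.get(rec[record_id_field])
        if key is not None:
            rest = {k: v for k, v in rec.items() if k != record_id_field}
            out.append({"linkage_key": key, **rest})
    return out


def merge_on_key(survey_content: list, admin_content: list) -> list:
    """Merge step: join de-identified content files on the linkage key alone."""
    admin_by_key = {r["linkage_key"]: r for r in admin_content}
    return [
        {**s, **admin_by_key[s["linkage_key"]]}
        for s in survey_content
        if s["linkage_key"] in admin_by_key
    ]


# Identifier files (hypothetical): no survey or administrative content.
survey_ids = [{"survey_id": "S1", "usi": "ABC123"}, {"survey_id": "S2", "usi": "XYZ789"}]
admin_ids = [{"admin_id": "A9", "usi": "ABC123"}]

# Content files (hypothetical): no names, addresses or USIs.
survey_content = [
    {"survey_id": "S1", "job_satisfaction": 4},
    {"survey_id": "S2", "job_satisfaction": 2},
]
admin_content = [{"admin_id": "A9", "vet_qualification": "Certificate III"}]

survey_map, admin_map = build_key_maps(survey_ids, admin_ids)
linked = merge_on_key(
    attach_key(survey_content, survey_map, "survey_id"),
    attach_key(admin_content, admin_map, "admin_id"),
)
print(linked)
```

The identifier files and the content files never pass through the same function, and the merged records carry only the linkage key alongside the survey and administrative content.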

De-identified linked data are stored on secure servers by NCVER and the Australian Data Archive (ADA), where they are made available to researchers. The data may be used for research purposes only and cannot be used for commercial or financial gain. When researchers use the data, information is always grouped together to ensure no individual can be identified. See our privacy notice for more information.

Overview of the separation principle for LSAY data linkage

How to access linked data

LSAY records for the Y15 cohort have now been linked to the following data sources:

  • ACARA My School data
  • National Assessment Program — Literacy and Numeracy (NAPLAN)
  • National VET Provider Collection.

Access to the linked data is restricted and is available via a formal request and registration process managed by the Australian Data Archive (ADA). For information on how to access the data see How to access LSAY data.

For detailed information about the linked data, methodology and additional resources available, refer to the 'Data linkage' section of the LSAY Y15 user guide.