Data Linking Toolkit: Step 1 – Check Data

Figure: Step 1: Check Data

Step 1 is critical for all data linking partnership configurations. In Step 1, data linking partners determine whether the data exist and whether their quality is adequate for successful linking. Without high-quality data, there is little to no value in linking data. The table below displays the roles of team members potentially involved in Step 1 activities.

Figure: Step 1: Team Members

Activity 1a: Check if data are available

The first step in data linking, usually taken soon after a question arises that requires linked data to answer, is for each data linking partner (or staff in a single program, if applicable) to determine whether the desired data are available. However, determining the availability of data is not as simple as answering yes or no.

The infographic shows a decision tree of questions that data linking partners must answer about the availability of data that might be linked. First, do the data under consideration for data linking exist within each data system? That is, are they collected and stored so that the desired data can be extracted and formatted for eventual data linking?

Figure: Data Availability

Second, are the data being considered for data linking of high enough quality to answer the question with confidence? Confirming quality helps mitigate some of the previously mentioned limitations and potential risks. If a significant amount of the data to be linked is of poor quality (incomplete, inaccurate, or untimely), linking the data will not provide a confident answer.

Third, are the data allowed to be shared? Depending on circumstances, available data might not be sharable. For example, personally identifiable information such as Social Security numbers, insurance numbers, and family income may be collected for Part C billing and might be extractable. However, these data would not typically be shared with another program in another agency. Additionally, program policies or participant release statements may limit whether data, or what data, can be shared outside of a program.

Finally, if data are available and can be shared, is a data sharing agreement required? If partners already share data, an alternative question is “Does an existing data sharing agreement allow for the newly requested data to be shared for linking?”

TIP: Sharing record-level data containing personally identifiable information almost always requires a formal data sharing agreement. Sharing de-identified record-level data most likely requires a formal data sharing agreement. (See Step 3 for more details about data sharing agreements.)
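The four availability questions above can be sketched as a short decision function. This is only an illustration of the order of the questions, not part of the toolkit: the class, field, and function names are hypothetical, and each yes/no answer is assumed to come from the partners' own review rather than from an automated test.

```python
from dataclasses import dataclass
from enum import Enum


class Availability(Enum):
    """Possible outcomes of the availability questions (labels are illustrative)."""
    NOT_AVAILABLE = "data do not exist or cannot be extracted"
    POOR_QUALITY = "data exist but quality is insufficient"
    NOT_SHARABLE = "data exist but are not allowed to be shared"
    NEEDS_AGREEMENT = "data can be shared once an agreement covers them"
    READY = "data are available and sharing is covered"


@dataclass
class DataCheck:
    # Hypothetical field names; each holds one partner's answer to a question.
    exists_and_extractable: bool   # Question 1: collected, stored, extractable?
    quality_sufficient: bool       # Question 2: quality high enough?
    sharing_allowed: bool          # Question 3: allowed to be shared?
    agreement_required: bool       # Question 4: data sharing agreement required?
    agreement_in_place: bool = False  # Does an existing agreement cover these data?


def check_availability(check: DataCheck) -> Availability:
    """Walk the questions in order and stop at the first one that fails."""
    if not check.exists_and_extractable:
        return Availability.NOT_AVAILABLE
    if not check.quality_sufficient:
        return Availability.POOR_QUALITY
    if not check.sharing_allowed:
        return Availability.NOT_SHARABLE
    if check.agreement_required and not check.agreement_in_place:
        return Availability.NEEDS_AGREEMENT
    return Availability.READY
```

Each partner would record its own answers, for example check_availability(DataCheck(True, True, True, agreement_required=True)), which returns Availability.NEEDS_AGREEMENT. Stopping at the first failed question mirrors the decision tree: there is little point in negotiating an agreement for data that do not exist or are of poor quality.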

Activity 1b: Confirm data quality

Before proceeding to Step 2, each data linking partner needs to review their data to determine whether they are of sufficient quality for analysis to answer the question of interest. It is important to confirm the validity, reliability, completeness, and timeliness of each partner’s data so that there is confidence in the quality of the linked data and the results of the analysis. Before linking, Part C and Part B 619 program staff and their partners should include data stewards and other subject matter experts (as needed) when they investigate and jointly discuss the quality of their data. Given the significant time and effort required for data linking, if either partner finds their data to be of poor or insufficient quality, the data linking partnership discussions should be postponed until necessary actions are taken to address the data quality issues. (If Part C or Part B 619 data quality is an issue, contact DaSy for assistance.)
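Two of the quality dimensions named above, completeness and timeliness, lend themselves to a simple automated first pass. The sketch below is one possible way to compute them, assuming records are held as a list of dictionaries. The function names, field names, one-year recency window, and 95 percent threshold are all hypothetical placeholders that partners would set for themselves. Validity and reliability are not covered here; they generally need review by data stewards and subject matter experts.

```python
from datetime import date


def completeness(records, field):
    """Share of records (0.0 to 1.0) with a non-empty value for `field`."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def timeliness(records, date_field, as_of, max_age_days):
    """Share of records whose `date_field` falls within `max_age_days` before `as_of`."""
    if not records:
        return 0.0
    recent = sum(
        1 for r in records
        if r.get(date_field) is not None
        and 0 <= (as_of - r[date_field]).days <= max_age_days
    )
    return recent / len(records)


def quality_report(records, required_fields, date_field, as_of,
                   max_age_days=365, threshold=0.95):
    """Summarize completeness per field and overall timeliness.

    `max_age_days` and `threshold` are illustrative defaults, not recommendations.
    """
    per_field = {f: completeness(records, f) for f in required_fields}
    timely = timeliness(records, date_field, as_of, max_age_days)
    passes = timely >= threshold and all(v >= threshold for v in per_field.values())
    return {"completeness": per_field, "timeliness": timely, "passes": passes}
```

A partner might call quality_report(records, ["child_id", "birth_date"], "last_updated", as_of=date(2022, 7, 1)) on an extract before discussions begin. A report where "passes" is False is a prompt for a conversation about data quality, not a verdict on whether linking can go ahead.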

TIP: The DaSy Data System Framework supports the design of data systems that collect high-quality Part C and Part B 619 data.

Published July 2022.