Top Spinal Cord Injury Rehabilitation

Year, Volume, Issue, Page(s):

, 26, 4, 221-231


Background: Linking records from the National Spinal Cord Injury Model Systems (SCIMS) database to the National Trauma Data Bank (NTDB) provides a unique opportunity to study early variables in predicting long-term outcomes after traumatic spinal cord injury (SCI). The public use data sets of SCIMS and NTDB are stripped of protected health information, including dates and zip code.

Objectives: To develop and validate a probabilistic algorithm linking data from an SCIMS center and its affiliated trauma registry.

Method: Data on SCI admissions from 2011 to 2018 were retrieved from an SCIMS center (n = 302) and its affiliated trauma registry (n = 723); 202 records shared the same medical record number. The SCIMS records were divided equally into two data sets, one for algorithm development and one for validation. We used a two-step approach: blocking, followed by weight generation for the linking variables (race, insurance, height, and body weight).
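The two-step approach can be illustrated with a minimal sketch in the style of Fellegi-Sunter probabilistic linkage. It is not the authors' implementation. The blocking keys (sex, age, injury year) are taken from the Results, and the linking variables from the Method. The field names, the m/u probabilities, the threshold, and the toy records are all hypothetical values chosen for illustration.

```python
import math
from collections import defaultdict

BLOCK_KEYS = ("sex", "age", "injury_year")               # blocking variables
LINK_VARS = ("race", "insurance", "height", "weight")    # linking variables

# Hypothetical probabilities, NOT taken from the study:
# m = P(values agree | true match), u = P(values agree | non-match).
M_U = {
    "race": (0.95, 0.40),
    "insurance": (0.85, 0.30),
    "height": (0.80, 0.10),
    "weight": (0.75, 0.05),
}

def block(scims, trauma):
    """Step 1: yield only the pairs that agree on every blocking key."""
    index = defaultdict(list)
    for t in trauma:
        index[tuple(t[k] for k in BLOCK_KEYS)].append(t)
    for s in scims:
        for t in index.get(tuple(s[k] for k in BLOCK_KEYS), []):
            yield s, t

def match_weight(s, t):
    """Step 2: sum log2 agreement or disagreement weights over linking variables."""
    total = 0.0
    for var in LINK_VARS:
        m, u = M_U[var]
        if s[var] == t[var]:
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total

def link(scims, trauma, threshold=0.0):
    """Keep the highest-weight candidate per SCIMS record at or above the threshold."""
    best = {}
    for s, t in block(scims, trauma):
        w = match_weight(s, t)
        if w >= threshold and (s["id"] not in best or w > best[s["id"]][1]):
            best[s["id"]] = (t["id"], w)
    return best

# Toy records (fabricated for illustration only).
scims = [{"id": "S1", "sex": "M", "age": 34, "injury_year": 2015,
          "race": "White", "insurance": "Private", "height": 180, "weight": 82}]
trauma = [
    {"id": "T1", "sex": "M", "age": 34, "injury_year": 2015,
     "race": "White", "insurance": "Private", "height": 180, "weight": 82},
    {"id": "T2", "sex": "M", "age": 34, "injury_year": 2015,
     "race": "Black", "insurance": "Medicaid", "height": 170, "weight": 95},
    {"id": "T3", "sex": "F", "age": 34, "injury_year": 2015,
     "race": "White", "insurance": "Private", "height": 180, "weight": 82},
]
links = link(scims, trauma)
```

In this toy example, T3 is excluded by blocking because its sex differs, and T2 stays in the same cluster as T1 but receives a negative total weight. In practice the m and u probabilities would be estimated from the development set rather than fixed by hand.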

Results: In the development set, 257 SCIMS-trauma pairs shared the same sex, age, and injury year across 129 clusters, of which 91 were true matches. The probabilistic algorithm identified 65 of the 91 true matches (sensitivity, 71.4%) with a positive predictive value (PPV) of 80.2%. In the validation set, covering 282 SCIMS-trauma pairs across 127 clusters, the algorithm had a sensitivity of 73.7% and a PPV of 81.1%. Post hoc analysis showed that adding injury date and zip code improved the specificity from 57.9% to 94.7%.
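As a worked check of the development-set figures, the short calculation below recomputes sensitivity from the reported counts. The abstract does not state how many links the algorithm declared, so the count used for PPV is inferred from the reported 80.2% and should be read as an assumption.

```python
true_matches = 91      # true-match records in the development set (reported)
true_positives = 65    # true matches identified by the algorithm (reported)
reported_ppv = 0.802   # reported positive predictive value

# Sensitivity = true positives / all true matches.
sensitivity = true_positives / true_matches

# Assumption: the number of declared links is not reported; it is inferred
# here as the integer most consistent with the reported PPV.
implied_links = round(true_positives / reported_ppv)
ppv = true_positives / implied_links

print(f"sensitivity = {sensitivity:.1%}")
print(f"implied declared links = {implied_links}, PPV = {ppv:.1%}")
```

Under that assumption, roughly 81 declared links with 65 correct is consistent with both reported percentages.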

Conclusion: We demonstrate the feasibility of probabilistic linkage between SCIMS and trauma records, although the approach needs further refinement and validation. Gaining access to injury date and zip code would improve record linkage significantly.


Yuying Chen, Huacong Wen, Russel Griffin, Mary Joan Roach, Michael L Kelly