Investigating the optimal handling of uncertain pregnancy episodes in the CPRD GOLD Pregnancy Register: a methodological study using UK primary care data

Jennifer Campbell, Krishnan Bhaskaran, Sara Thomas, Rachael Williams, Helen I McDonald, Caroline Minassian

Research output: Contribution to journal, Article, peer-review



OBJECTIVES: To investigate why episodes of pregnancy identified from electronic health records may be incomplete or conflicting (overlapping), and provide guidance on how to handle them.

SETTING: Pregnancy Register generated from the Clinical Practice Research Datalink (CPRD) GOLD UK primary care database.

PARTICIPANTS: Female patients with at least one pregnancy episode in the Register (01 January 1937 to 31 December 2017) that had no recorded outcome or conflicted with another episode.

DESIGN: We identified multiple scenarios potentially explaining why uncertain episodes occur. Criteria were established and systematically applied to determine whether episodes had evidence of each scenario. Linked Hospital Episode Statistics were used to identify pregnancy events not captured in primary care.
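As a rough illustration of the two kinds of uncertain episode described above, the following minimal sketch flags episodes with no recorded outcome and pairs of episodes for the same patient whose dates overlap. The `Episode` fields and function names are hypothetical and are not taken from the CPRD Pregnancy Register schema; the study's own scenario criteria are more detailed than this.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class Episode:
    # Hypothetical fields, not the Register's actual variable names.
    patient_id: int
    start: date
    end: date
    outcome: Optional[str]  # None when no outcome is recorded


def episodes_without_outcome(episodes: List[Episode]) -> List[int]:
    """Return the indices of episodes with no recorded outcome."""
    return [i for i, ep in enumerate(episodes) if ep.outcome is None]


def conflicting_pairs(episodes: List[Episode]) -> List[Tuple[int, int]]:
    """Return index pairs of same-patient episodes whose date ranges overlap."""
    by_patient: Dict[int, List[int]] = {}
    for i, ep in enumerate(episodes):
        by_patient.setdefault(ep.patient_id, []).append(i)

    pairs: List[Tuple[int, int]] = []
    for idxs in by_patient.values():
        # Sort each patient's episodes by start date.
        idxs.sort(key=lambda i: episodes[i].start)
        for pos, a in enumerate(idxs):
            for b in idxs[pos + 1:]:
                if episodes[b].start > episodes[a].end:
                    break  # later episodes start even later, so none overlap a
                pairs.append((a, b))
    return pairs
```

An episode can fall into both groups, so the two flags are best kept separate rather than merged into a single category.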

RESULTS: Of 5.8 million pregnancy episodes in the Register, 932 604 (16%) had no recorded outcome, and 478 341 (8.5%) conflicted with another episode (251 026 distinct conflicting pairs of episodes among 210 593 women). 826 146 (89%) of the episodes without an outcome recorded in primary care and 215 577 (86%) of the conflicting pairs were consistent with one or more of our proposed scenarios. For 689 737 (74%) of the episodes without a recorded outcome, and for at least one episode in 215 544 (86%) of the conflicting pairs, supportive evidence (eg, antenatal records, linked hospital records) suggested they were true and current pregnancies. Furthermore, 516 818 (55%) and 160 936 (64%), respectively, were during research quality follow-up time. For a sizeable proportion of uncertain episodes, there is evidence to suggest that historical outcomes being recorded by the general practitioner during an ongoing pregnancy may offer an explanation (349 874 (37.5%) of the episodes without a recorded outcome and 73 208 (29.2%) of the conflicting pairs).
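The percentages above can be recomputed from the stated counts. This short sketch assumes the denominators are the 932 604 episodes without a recorded outcome and the 251 026 conflicting pairs. The variable and function names are illustrative only. Under that assumption, the final pair of figures matches when 73 208 is taken over the conflicting pairs and 349 874 over the episodes without a recorded outcome.

```python
NO_OUTCOME_EPISODES = 932_604   # episodes with no recorded outcome
CONFLICTING_PAIRS = 251_026     # distinct conflicting pairs of episodes


def pct(numerator: int, denominator: int, digits: int = 0) -> float:
    """Percentage of denominator, rounded to the given number of decimals."""
    return round(100 * numerator / denominator, digits)


# Consistent with one or more proposed scenarios
scenario_no_outcome = pct(826_146, NO_OUTCOME_EPISODES)   # reported as 89%
scenario_pairs = pct(215_577, CONFLICTING_PAIRS)          # reported as 86%

# Supportive evidence of a true and current pregnancy
true_no_outcome = pct(689_737, NO_OUTCOME_EPISODES)       # reported as 74%
true_pairs = pct(215_544, CONFLICTING_PAIRS)              # reported as 86%

# Historical outcome recorded during an ongoing pregnancy
hist_pairs = pct(73_208, CONFLICTING_PAIRS, 1)            # reported as 29.2%
hist_no_outcome = pct(349_874, NO_OUTCOME_EPISODES, 1)    # reported as 37.5%
```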

CONCLUSIONS: This work provides insight to users of the CPRD Pregnancy Register on why uncertain pregnancy episodes exist and indicates that most of these episodes are likely to be real pregnancies. Guidance is given to help researchers consider whether to include/exclude uncertain pregnancies from their studies, and how to tailor approaches to minimise underestimation and bias.

Original language: English
Pages (from-to): e055773
Journal: BMJ Open
Issue number: 2
Publication status: Published - 22 Feb 2022
Externally published: Yes


  • Databases, Factual
  • Electronic Health Records
  • Female
  • Hospitals
  • Humans
  • Male
  • Pregnancy
  • Primary Health Care
  • United Kingdom

