Abstract
Objective. To test intra- and interobserver variability among clinicians with an interest in systemic sclerosis (SSc) in defining digital ulcers.
Methods. Thirty-five images of finger lesions, incorporating a wide range of abnormalities at different sites, were duplicated, yielding a data set of 70 images. Physicians with an interest in SSc were invited to take part in the Web-based study, which involved looking through the images in a random sequence. The sequence differed for individual participants and prevented cross-checking with previous images. Participants were asked to grade each image as depicting "ulcer" or "no ulcer," and if "ulcer," then either "inactive" or "active." Images of a range of exemplar lesions were available for reference purposes while participants viewed the test images. Intrarater reliability was assessed using a weighted kappa coefficient with quadratic weights. Interrater reliability was estimated using a multirater weighted kappa coefficient.
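A minimal sketch of how the intrarater statistic could be computed, assuming the three grades form an ordinal scale (0 = "no ulcer", 1 = "inactive ulcer", 2 = "active ulcer") and using scikit-learn's quadratically weighted kappa; the ratings below are hypothetical, and the paper's multirater interrater statistic is not reproduced here.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical grades from one rater: each duplicated image is seen twice
# within the randomized sequence of 70 images, so intrarater agreement is
# the agreement between the two passes over the same underlying lesions.
first_pass  = [0, 2, 1, 2, 0, 1, 2, 0]
second_pass = [0, 2, 1, 1, 0, 1, 2, 0]

# Quadratic weights penalize two-step disagreements (0 vs 2) more heavily
# than one-step disagreements (0 vs 1 or 1 vs 2).
kappa = cohen_kappa_score(first_pass, second_pass, weights="quadratic")
print(f"quadratically weighted kappa = {kappa:.2f}")
```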
Results. Fifty individuals (most of them rheumatologists) from 15 countries participated in the study. There was a high level of intrarater reliability, with a mean weighted kappa value of 0.81 (95% confidence interval [95% CI] 0.77, 0.84). Interrater reliability was poorer (weighted kappa = 0.46 [95% CI 0.35, 0.57]).
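The abstract does not state how the 95% confidence intervals were derived; one common approach is a percentile bootstrap over images, sketched below with hypothetical data as an illustration only.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(0)
# Hypothetical paired grades on the 0/1/2 ordinal scale described above.
y1 = np.array([0, 2, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1])
y2 = np.array([0, 2, 1, 1, 0, 1, 2, 0, 2, 2, 0, 1])

boot = []
for _ in range(2000):
    # Resample images with replacement and recompute the statistic.
    idx = rng.integers(0, len(y1), size=len(y1))
    k = cohen_kappa_score(y1[idx], y2[idx], weights="quadratic")
    if not np.isnan(k):  # skip degenerate resamples with a single category
        boot.append(k)

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"bootstrap 95% CI for kappa: ({lo:.2f}, {hi:.2f})")
```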
Conclusion. The poor interrater reliability suggests that if digital ulceration is to be used as an end point in multicenter clinical trials of SSc, then strict definitions must be developed. The present investigation also demonstrates the feasibility of Web-based studies, for which large numbers of participants can be recruited over a short time frame.
| Original language | English |
|---|---|
| Pages (from-to) | 878-882 |
| Number of pages | 5 |
| Journal | Arthritis & Rheumatism |
| Volume | 60 |
| Issue number | 3 |
| DOIs | |
| Publication status | Published - 2009 |