World-first detailed examination of radiologist interactions with lung CT of patients with suspected COVID-19

  • March 4, 2021
  • Chest
  • News

A multinational collaboration of scientists, clinicians and industry, led by DetectedX, has established a novel intelligent educational platform to enhance radiologic detection of COVID-19 appearances on computed tomography (CT) images of the lung. Put simply, readers anywhere can log on and attempt to diagnose lung CT cases for which the truth is known. The tool was developed in four weeks by redeploying technology used for breast cancer detection and is now available free of charge to clinicians worldwide at detectedx.com.

The tool has so far been used by 1300 clinicians from 147 countries, who have examined 105 lung CT cases, some with the appearances of COVID-19 and some without. All radiologic-image interactions have been recorded, resulting in the world’s largest database on clinician efficiency when interacting with lung CT demonstrating COVID-19 appearances. Each clinician (reader) was asked to indicate whether each case contained the appearances of ground glass opacity, crazy paving/mosaic attenuation and/or consolidation, and to give the location of any perceived appearance. The reader then gave an overall confidence score from 0 to 5 on whether the case was COVID-19 positive. Each reader score was compared against the expert consensus of four senior respiratory radiologists, who provided the truth rankings.

This white paper, with its interactive diagrams, offers early observations on reader efficiency at detecting COVID-19 in CT lung cases by summarising details on:

  • Truth data: distribution of the characteristics of all cases, and of cases given a truth ranking of 3-5, indicating a positive COVID-19 case
  • Discrepancies between reader and expert observations
    • Overall confidence scores for positive and normal cases
    • Detection of individual image presentations typical of COVID-19

OBSERVATIONS

1a. Truth data. Distribution of the characteristics of all cases.

This figure presents the truth data by linking the overall truth score of each patient case, from 0 to 5 (right vertical column, where 0 = definitely no COVID-19 and 5 = definite COVID-19), to each of the three main CovED appearances. For each of the three appearances, 0 = appearance absent and 1 = appearance present. The graph is interactive: clicking anywhere highlights the number of cases meeting a specific appearance pattern. For example, in the figure below, the highlighted area (see cursor) indicates that 4 cases had an overall COVID-19 confidence rating of 3 and demonstrated ground glass opacity (1) and consolidation (1) but no crazy paving (0).

Overall, this graph demonstrates that the 105 cases used to date in the CovED software show a good distribution of truths across the overall ratings and individual appearances.
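The tally behind a pattern view of this kind can be sketched in a few lines of Python. This is a minimal illustration only, not the CovED implementation: the `TruthCase` record, the `pattern_counts` helper and the toy data are all hypothetical and are not taken from the study.

```python
from collections import Counter
from typing import NamedTuple


class TruthCase(NamedTuple):
    """Hypothetical record of the expert truth for one case."""
    overall: int        # overall confidence score, 0-5
    ground_glass: int   # 0 = absent, 1 = present
    crazy_paving: int   # 0 = absent, 1 = present
    consolidation: int  # 0 = absent, 1 = present


def pattern_counts(cases):
    """Count how many cases share each (overall, appearance) pattern."""
    return Counter(cases)


# Made-up toy data, for illustration only.
toy_cases = [TruthCase(3, 1, 0, 1)] * 4 + [TruthCase(0, 0, 0, 0)] * 2
counts = pattern_counts(toy_cases)
print(counts[TruthCase(3, 1, 0, 1)])
```

Looking up a pattern in the returned counter corresponds to clicking one band of the interactive graph.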

Figure 1. Distribution of truth cases according to expert ratings.

1b. Truth data. Distribution of cases given a truth ranking of 3-5 indicating a positive COVID-19 case

This sunburst graph shows the cases positive for COVID-19, defined as those given an overall confidence score of 3-5 by the expert panel, allocated by the location of appearances. The inner circle shows the overall confidence score given by the experts, whilst the outer two rings indicate the peripheral/central and anterior/posterior location of the appearances. Hovering over or clicking on any point on the inner circle or rings reveals more information. For example, hovering on 4 in the inner circle shows that 15 cases were given a score of 4; clicking on 4 shows the location of those cases more clearly.

These graphs indicate that 13, 15 and 16 cases were given a score of 3, 4 and 5 respectively by the expert panel, and that the majority of cases given the higher scores of 4 and 5 demonstrated lesions in both the anterior and posterior portions of the lung, as well as in both peripheral and central locations. Among the cases given the lower (less confident) score of 3, a higher proportion had appearances allocated to only one location, i.e. central or peripheral, and anterior or posterior. A small minority of cases given a score of 3 could not be allocated to a specific location on either the peripheral/central or the anterior/posterior axis (see the zero scores in the outer rings).

Figure 2. Allocation of positive truth cases according to location of lesion appearances. The graph on the left demonstrates the distribution across all cases scored 3-5, whereas on the right we can see the distribution of cases given an overall confidence score of 4.

2. Discrepancies between the reader and expert observations

2a. Overall confidence scores

Positive cases:
This interactive graph shows, for cases given an expert overall confidence score of 3, 4 or 5, the score awarded by the majority of readers. For example, in the following figure we can see that, of the cases given an expert confidence score of 3 (n=13), six were given an overall score of 4 by the majority of readers, whilst one was given a score of 0. This indicates that certain images with COVID-19 appearances are sufficiently challenging to be awarded a score representing extremely high confidence of normality. The graph also indicates that for the higher expert scores of 4 and 5, a higher proportion of images was given a majority score of 3-5.
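One way to arrive at a per-case "majority" reader score and cross-tabulate it against the expert score is sketched below in Python. It assumes that "majority" means the most common (modal) reader score for a case and that ties go to the lower score; both are assumptions made for illustration, as are the function names and the input layout. The source does not describe how the platform computes this.

```python
from collections import Counter, defaultdict


def modal_scores(readings):
    """Return the most common reader score per case.

    readings: iterable of (case_id, reader_score) pairs, scores 0-5.
    Ties are resolved towards the lower score (an assumption).
    """
    by_case = defaultdict(list)
    for case_id, score in readings:
        by_case[case_id].append(score)

    result = {}
    for case_id, scores in by_case.items():
        counts = Counter(scores)
        top = max(counts.values())
        result[case_id] = min(s for s, c in counts.items() if c == top)
    return result


def tally_by_expert_score(modal, expert):
    """Cross-tabulate modal reader scores against expert scores.

    Returns {expert_score: Counter({modal_reader_score: n_cases})}.
    """
    table = defaultdict(Counter)
    for case_id, reader_mode in modal.items():
        table[expert[case_id]][reader_mode] += 1
    return table
```

With real data, `tally_by_expert_score(...)[3][4]` would give the number of expert-score-3 cases whose most common reader score was 4, which is the kind of count quoted in the example above.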

Figure 3. Reader scores allocated to the expert scorings of 3, 4 and 5 (positive cases). The captions for each item refer to readers' interactions with images given each of the expert ratings.

Normal cases:
In this graph, the allocation of reader scores is shown for the non-COVID-19 cases given a score of 0-2 by the experts. In the example below we can see that, of the 19 cases scored 1 by the experts, the majority of readers gave 10 images a score of 0 and 3 images a score of 3. For all the expert ratings shown for these normal images, the majority of readers gave a reasonable proportion of the images a positive COVID-19 score of 3 or more. As expected, cases given a lower expert confidence score received more correct reader scores.

Figure 4. Reader scores allocated to the expert scorings of 0, 1 and 2 (negative cases). The captions for each item refer to readers' interactions with images given each of the expert ratings.

2b. Individual presentation scores

These interactive diagrams focus only on the cases that the experts agreed demonstrated ground glass opacification, crazy paving or consolidation. In the centre of each diagram, the numbers indicate the overall confidence score given by the experts, and the outer ring shows the number of cases with that expert confidence rating that were awarded a positive score for that appearance by 0-10%, 10-30%, 30-50%, 50-70%, 70-90% or above 90% of readers. For example, in the following figure, of the ground glass opacity positive cases with an expert confidence rating of 2 (n=16), 10 cases were given a positive rating by approximately 60% of readers, meaning that for those same 10 cases approximately 40% of readers scored ground glass as absent. For all three appearance types, many images were given a normal rating even though, according to the experts, the appearance was present.
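The percentage bands used in the outer ring can be reproduced with a small binning helper. The Python sketch below is illustrative only: the function name and input format are hypothetical, and the treatment of boundary values (for example, exactly 10% falling into the 10-30% band) is an assumption, since the source does not say which band a boundary value belongs to.

```python
from bisect import bisect_right

# Upper edges of the first five bands, as fractions of readers.
EDGES = [0.10, 0.30, 0.50, 0.70, 0.90]
LABELS = ["0-10%", "10-30%", "30-50%", "50-70%", "70-90%", "above 90%"]


def reader_agreement_band(reader_marks):
    """Return the band for the share of readers marking an appearance present.

    reader_marks: non-empty list of 0/1 values, one per reader
    (1 = reader marked the appearance as present).
    """
    if not reader_marks:
        raise ValueError("at least one reader mark is required")
    fraction = sum(reader_marks) / len(reader_marks)
    # bisect_right places a boundary value in the higher band (assumption).
    return LABELS[bisect_right(EDGES, fraction)]
```

For a case marked positive by 6 of 10 readers, `reader_agreement_band([1] * 6 + [0] * 4)` falls in the 50-70% band, matching the "approximately 60% of readers" example above.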

Figure 5. These graphs show the ground glass opacity, crazy paving and consolidation positive cases. The centre shows the overall confidence scores given to those images (so a case can be COVID-19 normal, with a score of 0-2 in the inner circle, and still demonstrate ground glass opacities). The outer ring shows the proportion of images given a positive score by (where relevant) 0-10%, 10-30%, 30-50%, 50-70% and 70-90% of readers. Please note that no category of above 90% of readers exists, demonstrating that at no time did all readers correctly identify any of the three appearance types.

For each of the appearances, the overall performance of readers is shown in the following figure. Each of these matrices shows how reader scores matched the overall truth. From this figure it can be seen that readers generally performed better with the normal (absent) than with the abnormal (present) cases. In particular, it is worth noting that for crazy paving and consolidation the majority of readers (users) marked the appearance as absent in appearance-positive cases, suggesting that many clinicians are failing to recognise key radiologic features associated with COVID-19.
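A matrix of this kind can be built by counting, for each expert judgement (absent or present), the share of reader judgements that fall into each category. The Python sketch below assumes one (expert, reader) pair per reading with 0 = absent and 1 = present; the function name and data layout are hypothetical and the source does not describe how the published matrices were computed.

```python
def agreement_matrix(pairs):
    """Row-normalised 2x2 matrix of reader judgements against expert truth.

    pairs: iterable of (expert, reader) values, each 0 (absent) or 1 (present).
    Returns {expert_value: [share_reader_absent, share_reader_present]}.
    A row with no readings is returned as [0.0, 0.0].
    """
    counts = {0: [0, 0], 1: [0, 0]}
    for expert, reader in pairs:
        counts[expert][reader] += 1

    matrix = {}
    for expert, row in counts.items():
        total = sum(row)
        matrix[expert] = [n / total if total else 0.0 for n in row]
    return matrix
```

In the returned structure, `matrix[0][0]` is the share of readings that correctly marked an absent appearance as absent, and `matrix[1][1]` is the share that correctly marked a present appearance as present; comparing the two gives the "better with normal than abnormal" pattern described above.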

These matrices for each of the three appearance types demonstrate the proportion of readers who matched the judgements of the Expert Panel.

Overall Conclusions:

  • A good array of COVID-19 cases has been presented to readers, demonstrating all typical COVID-19 positive appearances;
  • COVID-19 appearances are being judged as normal with high degrees of confidence;
  • Cases without COVID-19 appearances are being judged as positive;
  • For all three main appearance types, many images were given a normal rating even though, according to the experts, the appearance was present;
  • Readers generally performed better with the normal (absent) than with the abnormal (present) cases;
  • Clinicians are failing to recognise key radiologic features associated with COVID-19, highlighting the need for effective and available educational solutions.