A Survey of Crowdsourcing in Medical Image Analysis

Silas Nyboe Ørting; Andrew Doyle; Arno van Hilten; Matthias Hirth; Oana Inel; Christopher R Madan; Panagiotis Mavridis; Helen Spiers; Veronika Cheplygina

doi:10.15346/hc.v7i1.1

Authors

Silas Nyboe Ørting, University of Copenhagen, http://orcid.org/0000-0002-3081-1547
Andrew Doyle, McGill Centre for Integrative Neuroscience
Arno van Hilten, Erasmus Medical Center
Matthias Hirth, Technische Universität Ilmenau
Oana Inel, Vrije Universiteit Amsterdam
Christopher R Madan, University of Nottingham
Panagiotis Mavridis, Delft University of Technology
Helen Spiers, University of Oxford, Zooniverse
Veronika Cheplygina, Eindhoven University of Technology

Keywords:

Medical Imaging, Crowdsourcing, Citizen Science, Machine Learning

Abstract

Rapid advances in image processing capabilities have been seen across many domains, fostered by the application of machine learning algorithms to big data. Within medical image analysis, however, progress has been curtailed, in part by the limited availability of large-scale, well-annotated datasets. One of the main reasons is the high cost of producing large amounts of high-quality metadata. Recently, there has been growing interest in applying crowdsourcing for this purpose, a technique that has proven effective for creating large-scale datasets across a range of disciplines, from computer vision to astrophysics. Despite the growing popularity of this approach, there has not yet been a comprehensive literature review to guide researchers considering crowdsourcing methodologies for their own medical image analysis. In this survey, we review studies applying crowdsourcing to the analysis of medical images, published prior to July 2018. We identify common approaches, challenges and considerations, providing practical guidance for researchers adopting this approach. Finally, we discuss future opportunities for development within this emerging domain.
