Accurator: Nichesourcing for Cultural Heritage

Chris Dijkshoorn, Victor de Boer, Lora Aroyo, Guus Schreiber


As more cultural heritage data is published online, its usefulness in this open context hinges on the quality and diversity of the descriptions of collection objects. In many cases, existing descriptions are not sufficient for retrieval and research tasks, which creates a need for more specific annotations. Eliciting such annotations is a challenge, however, because it often requires domain-specific knowledge. While crowdsourcing can be used successfully for simple annotation tasks, identifying people with the required expertise can prove difficult for more complex, domain-specific tasks. Nichesourcing addresses this problem by tapping into the expert knowledge available in niche communities. This paper presents Accurator, a methodology for conducting nichesourcing campaigns for cultural heritage institutions by addressing communities, organizing events, and tailoring a web-based annotation tool to a domain of choice. The contribution of this paper is fourfold: 1) a nichesourcing methodology, 2) an annotation tool for experts, 3) validation of the methodology in three case studies, and 4) a dataset including the obtained annotations. The three case studies cover the domains of birds on art, Bible prints, and fashion images. We compare the quality and quantity of the annotations obtained in the three case studies, showing that the nichesourcing methodology, in combination with the image annotation tool, can be used to collect high-quality annotations in a variety of domains. A user evaluation indicates that the tool is suitable and usable for domain-specific annotation tasks.


Crowdsourcing; Nichesourcing; Methodology; Knowledge-intensive Tasks; Cultural Heritage





