Accurator: Nichesourcing for Cultural Heritage
Keywords: Crowdsourcing, Nichesourcing, Methodology, Knowledge-intensive Tasks, Cultural Heritage
Abstract: With the increase of cultural heritage data published online, the usefulness of data in this open context hinges on the quality and diversity of descriptions of collection objects. In many cases, existing descriptions are not sufficient for retrieval and research tasks, resulting in the need for more specific annotations. However, eliciting such annotations is a challenge, since it often requires domain-specific knowledge. Whereas crowdsourcing can be used successfully for simple annotation tasks, identifying people with the required expertise can prove difficult for more complex, domain-specific tasks. Nichesourcing addresses this problem by tapping into the expert knowledge available in niche communities. This paper presents Accurator, a methodology for conducting nichesourcing campaigns for cultural heritage institutions by addressing communities, organizing events and tailoring a web-based annotation tool to a domain of choice. The contribution of this paper is fourfold: 1) a nichesourcing methodology, 2) an annotation tool for experts, 3) validation of the methodology in three case studies and 4) a dataset including the obtained annotations. The three domains of the case studies are birds on art, Bible prints and fashion images. We compare the quality and quantity of the annotations obtained in the three case studies, showing that the nichesourcing methodology, in combination with the image annotation tool, can be used to collect high-quality annotations in a variety of domains. A user evaluation indicates that the tool is suitable and usable for domain-specific annotation tasks.