Crowdsourcing and the Semantic Web: A Research Manifesto

Authors

  • Cristina Sarasua, University of Koblenz-Landau
  • Elena Simperl, University of Southampton
  • Natasha F. Noy, Google Inc.
  • Abraham Bernstein, University of Zurich
  • Jan Marco Leimeister, University of St. Gallen and University of Kassel

DOI:

https://doi.org/10.15346/hc.v2i1.2

Keywords:

Crowdsourcing, Semantic Web, Human Computation, Ontologies, Machine-readable metadata

Abstract

Our goal with this research manifesto is to define a roadmap to guide the evolution of the new research field that is emerging at the intersection between crowdsourcing and the Semantic Web. We analyze the confluence of these two disciplines by exploring their relationship. First, we focus on how the application of crowdsourcing techniques can enhance the machine-driven execution of Semantic Web tasks. Second, we look at the ways in which machine-processable semantics can benefit the design and management of crowdsourcing projects. As a result, we are able to describe a list of successful or promising scenarios for both perspectives, identify scientific and technological challenges, and compile a set of recommendations to realize these scenarios effectively.

Published

2015-08-10

How to Cite

Sarasua, C., Simperl, E., Noy, N. F., Bernstein, A., & Leimeister, J. M. (2015). Crowdsourcing and the Semantic Web: A Research Manifesto. Human Computation, 2(1). https://doi.org/10.15346/hc.v2i1.2

Section

Opinions