Crowds of Crowds: Performance-based Modeling and Optimization over Multiple Crowdsourcing Platforms
DOI: https://doi.org/10.15346/hc.v2i1.6

Keywords: Algorithms, Scheduling, Modeling, Performance, Crowdsourcing

Abstract
The dynamic nature of crowdsourcing platforms poses interesting problems for users who wish to schedule large batches of tasks on these platforms. Of particular interest is the problem of scheduling the right number of tasks at the right price at the right time, in order to achieve the best performance with respect to accuracy and completion time. Research results, however, have shown that the performance exhibited by online platforms is both dynamic and largely unpredictable. This is primarily because, unlike a traditional organizational workforce, a crowd platform is inherently composed of a dynamic set of workers with varying performance characteristics. Thus, any effort to optimize performance must be complemented by a deep understanding of, and robust techniques to model, the behaviour of the underlying platform(s). To this end, the research in this paper studies these interrelated facets of crowdsourcing in two parts. The first part comprises manual and automated statistical modeling of the crowd-workers' performance; the second part deals with optimization via intelligent scheduling over multiple platforms. Detailed experimentation with competing techniques, under varying operating conditions, validates the efficacy of our proposed algorithms when posting tasks either on a single crowd platform or on multiple platforms. Our research has led to the development of a platform recommendation tool that is now being used by a large enterprise for performance optimization of voluminous crowd tasks.
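The abstract gives no implementation details; purely as an illustration of the multi-platform scheduling problem it describes, the sketch below treats each crowd platform as an arm of a multi-armed bandit and uses Thompson sampling to decide where to post each successive batch of tasks. The platform names, the Beta priors, and the simulate_batch() feedback stub are hypothetical stand-ins for demonstration, not the authors' algorithm or data.

import random

# Illustrative sketch only: a Thompson-sampling scheduler that picks which
# crowd platform should receive the next batch of tasks, treating each
# platform as a bandit arm whose unknown success rate is the fraction of
# tasks completed accurately and on time.

PLATFORMS = ["platform_a", "platform_b", "platform_c"]  # hypothetical names

# Beta(1, 1) priors over each platform's success rate.
successes = {p: 1.0 for p in PLATFORMS}
failures = {p: 1.0 for p in PLATFORMS}

def choose_platform():
    # Sample a plausible success rate for each platform from its posterior
    # and post the next batch to the platform with the highest sample.
    samples = {p: random.betavariate(successes[p], failures[p])
               for p in PLATFORMS}
    return max(samples, key=samples.get)

def update(platform, completed_ok, total):
    # Posterior update from the observed batch outcome.
    successes[platform] += completed_ok
    failures[platform] += total - completed_ok

def simulate_batch(platform, total):
    # Stand-in for real platform feedback; in practice this would be the
    # observed count of accurate, on-time completions for the batch.
    true_rates = {"platform_a": 0.6, "platform_b": 0.8, "platform_c": 0.5}
    return sum(random.random() < true_rates[platform] for _ in range(total))

if __name__ == "__main__":
    batch_size = 20
    for _ in range(200):
        p = choose_platform()
        ok = simulate_batch(p, batch_size)
        update(p, ok, batch_size)
    # After enough batches, most tasks should flow to the best platform.
    for p in PLATFORMS:
        posted = successes[p] + failures[p] - 2.0
        rate = successes[p] / (successes[p] + failures[p])
        print(f"{p}: posted ~{int(posted)} tasks, estimated success rate {rate:.2f}")

Under these assumptions the scheduler converges to routing most batches to the strongest platform while still occasionally probing the others, which is the exploration-exploitation trade-off that scheduling over dynamic, unpredictable platforms entails.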