Crowds of Crowds: Performance-based Modeling and Optimization over Multiple Crowdsourcing Platforms
DOI: https://doi.org/10.15346/hc.v2i1.6

Keywords: Algorithms, Scheduling, Modeling, Performance, Crowdsourcing

Abstract
The dynamic nature of crowdsourcing platforms poses interesting problems for users who wish to schedule large batches of tasks on these platforms. Of particular interest is the problem of scheduling the right number of tasks at the right price at the right time, in order to achieve the best performance with respect to accuracy and completion time. Research results, however, have shown that the performance exhibited by online platforms is both dynamic and largely unpredictable. This is primarily because, unlike a traditional organizational workforce, a crowd platform is inherently composed of a dynamic set of workers with varying performance characteristics. Thus, any effort to optimize performance must be complemented by a deep understanding of, and robust techniques to model, the behaviour of the underlying platform(s). To this end, the research in this paper studies these interrelated facets of crowdsourcing in two parts. The first part comprises manual and automated statistical modeling of the crowd-workers' performance; the second part deals with optimization via intelligent scheduling over multiple platforms. Detailed experimentation with competing techniques, under varying operating conditions, validates the efficacy of our proposed algorithms when posting tasks either on a single crowd platform or on multiple platforms. Our research has led to the development of a platform recommendation tool that is now being used by a large enterprise for performance optimization of voluminous crowd tasks.
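The abstract gives no implementation details; purely as an illustration of the multi-platform scheduling problem it describes, the sketch below treats each crowd platform as an arm of a multi-armed bandit and uses Thompson sampling to decide where to post each successive batch of tasks. The platform names, the Beta priors, and the simulate_batch() feedback stub are hypothetical stand-ins for demonstration, not the authors' algorithm or data.

import random

# Illustrative sketch only: a Thompson-sampling scheduler that picks which
# crowd platform should receive the next batch of tasks, treating each
# platform as a bandit arm whose unknown success rate is the fraction of
# tasks completed accurately and on time.

PLATFORMS = ["platform_a", "platform_b", "platform_c"]  # hypothetical names

# Beta(1, 1) priors over each platform's success rate.
successes = {p: 1.0 for p in PLATFORMS}
failures = {p: 1.0 for p in PLATFORMS}

def choose_platform():
    # Sample a plausible success rate for each platform from its posterior
    # and post the next batch to the platform with the highest sample.
    samples = {p: random.betavariate(successes[p], failures[p])
               for p in PLATFORMS}
    return max(samples, key=samples.get)

def update(platform, completed_ok, total):
    # Posterior update from the observed batch outcome.
    successes[platform] += completed_ok
    failures[platform] += total - completed_ok

def simulate_batch(platform, total):
    # Stand-in for real platform feedback; in practice this would be the
    # observed count of accurate, on-time completions for the batch.
    true_rates = {"platform_a": 0.6, "platform_b": 0.8, "platform_c": 0.5}
    return sum(random.random() < true_rates[platform] for _ in range(total))

if __name__ == "__main__":
    batch_size = 20
    for _ in range(200):
        p = choose_platform()
        ok = simulate_batch(p, batch_size)
        update(p, ok, batch_size)
    # After enough batches, most tasks should flow to the best platform.
    for p in PLATFORMS:
        posted = successes[p] + failures[p] - 2.0
        rate = successes[p] / (successes[p] + failures[p])
        print(f"{p}: posted ~{int(posted)} tasks, estimated success rate {rate:.2f}")

Under these assumptions the scheduler converges to routing most batches to the strongest platform while still occasionally probing the others, which is the exploration-exploitation trade-off that scheduling over dynamic, unpredictable platforms entails.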