Read-Agree-Predict: A Crowdsourced Approach to Discovering Relevant Primary Sources for Historians
Keywords: Applications, Techniques, Algorithms
Abstract

Historians spend significant time evaluating the relevance of primary sources that they encounter in digitized archives and through web searches. One reason this task is time-consuming is that historians' research interests are often highly abstract and specialized. Such topics are unlikely to be manually indexed and are difficult to identify with automated text analysis techniques. In this article, we investigate the potential of a new crowdsourcing model in which the historian delegates to a novice crowd the task of evaluating the relevance of primary sources with respect to her unique research interests. The model employs a novel crowd workflow, Read-Agree-Predict (RAP), that allows novice crowd workers to perform as well as expert historians. As a useful byproduct, RAP also reveals and prioritizes crowd confusions as targeted learning opportunities. We demonstrate the value of our model with two experiments with paid crowd workers (n=170), with the future goal of extending our work to classroom students and public history interventions. We also discuss broader implications for historical research and education.