Applying machine learning to automatically assess scientific models

Bibliographic Details
Published in: Journal of Research in Science Teaching, Vol. 59, No. 10, pp. 1765-1794
Main Authors: Zhai, Xiaoming; He, Peng; Krajcik, Joseph
Format: Journal Article
Language: English
Published: Hoboken, USA: John Wiley & Sons, Inc., 01.12.2022
ISSN: 0022-4308, 1098-2736
DOI: 10.1002/tea.21773

Summary: Involving students in the practice of scientific modeling is one of the most effective approaches to achieving the learning goals of next generation science education. Given the complexity and multirepresentational features of scientific models, scoring student-developed models is time- and cost-intensive and remains one of the most challenging assessment practices in science education. More importantly, teachers who rely on timely feedback to plan and adjust instruction are reluctant to use modeling tasks because they cannot return feedback to learners quickly. This study used machine learning (ML), a state-of-the-art form of artificial intelligence (AI), to develop an approach that automatically scores student-drawn models and the written descriptions of those models. We developed six modeling assessment tasks for middle school students that integrate disciplinary core ideas and crosscutting concepts with the modeling practice. For each task, we asked students to draw a model and write a description of that model, which gave students with diverse backgrounds an opportunity to represent their understanding in multiple ways. We then collected student responses to the six tasks and had human experts score a subset of those responses. We used the human-scored responses to develop ML algorithmic models (AMs) and train the scoring system. Validation on new data suggests that the machine-assigned scores achieved robust agreement with human consensus scores. Qualitative analysis of student-drawn models further revealed five characteristics that may affect machine scoring accuracy: alternative expressions, confusing labels, inconsistent size, inconsistent position, and redundant information. We argue that these five characteristics should be considered when developing machine-scorable modeling tasks.
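The record does not disclose the authors' actual features or algorithms. As a rough illustration of the supervised workflow the summary describes (human experts score a subset of responses, an ML model is trained on those scores, and machine-human agreement is validated on new data), the hedged Python sketch below trains a simple bag-of-words classifier on the written-description half of such a task and reports Cohen's kappa, a standard machine-human agreement statistic in automated scoring. All data, rubric levels, and model choices here are hypothetical stand-ins; scoring the drawn models themselves would additionally require image features not shown here.

```python
# Illustrative sketch only: the paper's actual pipeline is not given in
# this record. Toy data, rubric levels (0-2), and model choices are
# hypothetical stand-ins.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Stand-ins for students' written descriptions of their drawn models,
# each paired with a hypothetical human rubric score
# (0 = beginning, 1 = developing, 2 = proficient).
descriptions = [
    "particles move faster when the water is heated",
    "the smell spreads because particles travel through the air",
    "water disappears when it gets hot",
    "molecules spread out and bump into each other",
    "the odor moves from high to low concentration as particles diffuse",
    "heat makes the liquid go away",
    "tiny particles of the substance mix with the air particles",
    "the ice melts because it is warm",
]
human_scores = [2, 2, 0, 1, 2, 0, 1, 0]

# Hold out part of the human-scored subset to check machine-human
# agreement, mirroring the validation step the summary describes.
X_train, X_test, y_train, y_test = train_test_split(
    descriptions, human_scores, test_size=0.25, random_state=0
)

# A common text-scoring baseline: TF-IDF features + linear classifier.
scorer = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
scorer.fit(X_train, y_train)

machine_scores = scorer.predict(X_test)
print("Cohen's kappa (machine vs. human):",
      cohen_kappa_score(y_test, machine_scores))
```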
Bibliography: Funding information: National Science Foundation, Grant/Award Numbers 2101104 and 2100964; Lappan-Phillips Chair in the College of Natural Science at Michigan State University