Generative AI in Writing Evaluation: Where We Stand and What Lies Ahead

Academic Archives of Yamaguchi Prefectural University Volume 19 Page 485-494
published_at 2026-03-31
01. grad27_IWANAKA.pdf
[fulltext] 3.17 MB
Title
Generative AI in Writing Evaluation: Where We Stand and What Lies Ahead
Abstract
Generative artificial intelligence is reshaping educational assessment; however, high-stakes evaluations of student writing remain contentious. This study proposes an LLM-derived similarity metric—cosine similarity between essay-level embedding vectors of student essays and expert model texts (e.g., instructor-written benchmark essays)—as an automated indicator of L2 English writing proficiency. Using a longitudinal design, about 35 Japanese university students will produce argumentative essays at three time points over a 15-week semester. Essays will be scored by trained human raters and analyzed for linguistic features, including lexical diversity, syntactic complexity, and cohesion. The author will examine (a) convergent validity via correlations between the similarity metric and human scores, (b) sensitivity to developmental change using repeated-measures models, and (c) incremental predictive validity through hierarchical regression by adding the similarity metric to models based on surface linguistic features. It is hypothesized that the similarity metric will show strong positive associations with human ratings, detect significant longitudinal gains, and explain unique variance beyond traditional feature-based predictors. If validated, this approach could support scalable diagnostics that complement human judgment and improve the reliability and pedagogical utility of L2 writing assessment.
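The similarity metric described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the embedding model or any implementation details, so the function name `cosine_similarity` and the toy four-dimensional vectors below are hypothetical stand-ins, not the author's procedure. Embeddings from an LLM would typically have hundreds or thousands of dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical embeddings (assumption: invented values for illustration only).
student_essay = [0.2, 0.7, 0.1, 0.4]   # stand-in for a student essay embedding
model_text = [0.25, 0.6, 0.05, 0.5]    # stand-in for an expert model text embedding

score = cosine_similarity(student_essay, model_text)
print(f"similarity = {score:.3f}")
```

Values closer to 1 indicate that the two vectors point in nearly the same direction. Under the study's design, one such score per essay would then be compared with the human ratings.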
Creators IWANAKA Takahiro
Source Identifiers [EISSN] 2189-4825
Resource Type departmental bulletin paper
Date Issued 2026-03-31
File Version Version of Record
Access Rights open access