Construction of a Graded Corpus of Spoken Interaction in Mandarin Chinese

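This paper describes the construction of a graded corpus of spoken interaction in Mandarin Chinese. The corpus is the first graded corpus of Chinese spoken interaction in China and records how students actually interact orally under test conditions. It comprises recorded dialogues of more than 1,200 students, totaling more than 3,000 minutes, with samples ranging from the first grade of primary school to the third year of senior high school. The corpus gives researchers of spoken interaction transcribed and annotated authentic material, and surveys of this material make quantitative analysis of spoken interaction possible. In addition, rather than grading by task difficulty, as is commonly done, the corpus grades interaction by conversational features, for researchers' reference. This should also inform the establishment of grading standards for spoken interaction and the compilation of interaction-oriented teaching materials.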

Bibliographic Details
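Published in: 计算机工程与科学 (Computer Engineering & Science), Vol. 38, no. 2, pp. 395-400
Main Author: WANG Yue-long (王跃龙)
Format: Journal Article
Language: Chinese
Published: 2016 (author affiliation: College of Humanities, Huaqiao University, Quanzhou, Fujian 362021, China)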
ISSN: 1007-130X
DOI: 10.3969/j.issn.1007-130X.2016.02.029

More Information
Bibliography: WANG Yue-long (College of Humanities, Huaqiao University, Quanzhou 362021, China)
This paper presents a spoken interaction (SI) corpus of Mandarin Chinese, the first graded SI corpus in China. The corpus records spoken interactions among more than 1,200 students, with a total duration of more than 3,000 minutes. The students range from Grade 1 of primary school to Grade 3 of senior high school. The corpus provides researchers with transcribed and annotated materials that support quantitative analysis of SI. In addition, the materials are graded according to Conversation Analysis (CA) rather than task difficulty, providing a reference for researchers. Textbook compilation and the establishment of SI grading standards can both benefit from this corpus.
CN: 43-1258/TP
Keywords: spoken interaction; graded corpus; grading standard