对外汉语教学阅读文本素材库的选材及标注
Material Selection and Labeling Principles for a Reading Material Corpus
【Author】 陈俊;
【Advisor】 郭曙纶;
【Author information】 Shanghai Jiao Tong University, Linguistics and Applied Linguistics, 2011, Master's
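【Summary】 This thesis consists of five parts. Chapter 1 is the introduction, which outlines the reasons for and purpose of building a text material corpus for teaching Chinese as a foreign language, together with a review of the related literature. Building such a corpus is highly necessary. The external reason is the rapid growth of textbooks for teaching Chinese as a foreign language; the internal reasons are the strong influence that material selection has on textbook compilation and the positive role that such a corpus can play. Chapter 2 presents the material selection principles. It outlines five principles: completeness, moderation, practicality, interest, and non-timeliness. Of these, completeness, moderation, and practicality are the main criteria for selecting material, while interest and non-timeliness are auxiliary principles consulted during selection. Chapter 3 presents the labeling principles. It outlines three principles: standardization, distinctiveness, and systematicity. Systematicity is the main principle to consider when labeling material, while standardization and distinctiveness must be considered when establishing the labeling system. Chapter 4 describes how the corpus is labeled, in two parts: manual labeling and machine labeling. For manual labeling, the thesis mainly covers 13 attributes, including title, author, author's nationality, abstract, keywords, genre, subject matter, source, suitable nationality, suitable age, whether the text suits only learners of Chinese descent, and interest level. For machine labeling, it mainly covers 11 attributes: title, author, author's nationality, abstract, average sentence length, text length, number of distinct characters, new-word rate (new-word density), difficulty level, new-word recurrence rate, and out-of-syllabus words. Chapter 5 gives the conclusions and outlook. It summarizes the results and shortcomings of the study and looks ahead to future research. Using corpus, statistical, and comparative research methods, the thesis builds and processes a reading text material corpus of more than 70,000 characters for teaching Chinese as a foreign language, and on that basis analyzes and summarizes material selection and labeling. It provides a large amount of data and material resources that teachers can consult when compiling textbooks for teaching Chinese as a foreign language.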
【Abstract】 This thesis is divided into five parts. The first chapter is the introduction. It briefly explains the main reasons for building a Chinese material corpus and reviews the relevant literature. Such a corpus is essential because of the rapid development of textbooks for teaching Chinese and the strong influence that material selection has on textbook compilation. The second chapter presents the material selection principles for the corpus. Five principles are described: completeness, moderation, practicality, interest, and non-timeliness. Of these, completeness, moderation, and practicality are the main criteria for selecting Chinese material, while interest and non-timeliness are auxiliary principles for reference. The third chapter presents the labeling principles. Three principles are described: standardization, distinctiveness, and systematicity. Systematicity is the key principle to consider while labeling, whereas standardization and distinctiveness should be considered when setting up the labeling system. The fourth chapter reviews the labeling work the author has done, which consists of two parts: manual labeling and machine labeling. The manually labeled attributes include title, author, author's nationality, abstract, keywords, genre, subject matter, source, suitable nationality, suitable age, suitability for ethnic Chinese learners only, and interest level. The machine-labeled attributes are title, author, author's nationality, abstract, average sentence length, text length, number of distinct characters, new-word density, difficulty level, new-word recurrence rate, and out-of-syllabus words. The final chapter gives the conclusions and outlook. The author summarizes the achievements and shortcomings of the research and looks ahead to future work.
Using corpus, statistical, and comparative research methods, this thesis builds a News Reading Corpus of more than 70,000 characters for Chinese language teaching, and on that basis analyzes material selection and labeling. It provides a large amount of data and material that can serve as a reference for compiling Chinese teaching textbooks in the future.
【Key words】 Corpus; Teaching Chinese for Foreign Learners; Reading Corpus; Material Selection; Labeling;
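As a rough illustration of the machine-labeled attributes named in the abstract, the Python sketch below computes a few of them for a single text. The abstract does not give the thesis's formulas, so everything here is an assumption made for illustration: the function name `machine_attributes`, the choice to count only Han characters, the sentence delimiters, the definition of new-word density as the share of tokens outside a given syllabus word list, and the expectation that the text has already been segmented into words.

```python
import re

# Assumed sentence delimiters; the thesis may use a different set.
SENTENCE_END = re.compile(r"[。！？!?]+")
# Han characters in the basic CJK Unified Ideographs block.
HAN = re.compile(r"[\u4e00-\u9fff]")


def machine_attributes(text, tokens, syllabus):
    """Compute illustrative machine-labeled attributes for one text.

    text     -- raw text of the reading passage
    tokens   -- the same text already segmented into words (assumed input)
    syllabus -- set of words the learner is assumed to know (hypothetical)
    """
    han_chars = HAN.findall(text)
    # Keep only fragments that contain at least one Han character.
    sentences = [s for s in SENTENCE_END.split(text) if HAN.search(s)]
    new_tokens = [t for t in tokens if t not in syllabus]
    return {
        "text_length": len(han_chars),
        "distinct_characters": len(set(han_chars)),
        "average_sentence_length": (
            len(han_chars) / len(sentences) if sentences else 0.0
        ),
        "new_word_density": len(new_tokens) / len(tokens) if tokens else 0.0,
        "out_of_syllabus_words": sorted(set(new_tokens)),
    }


if __name__ == "__main__":
    sample_text = "我爱学习汉语。汉语很有趣！"
    sample_tokens = ["我", "爱", "学习", "汉语", "汉语", "很", "有趣"]
    sample_syllabus = {"我", "爱", "学习", "汉语", "很"}
    for name, value in machine_attributes(
        sample_text, sample_tokens, sample_syllabus
    ).items():
        print(name, value)
```

Word segmentation is left outside the function on purpose, because the choice of segmenter strongly affects word-level figures such as new-word density, and the abstract does not say which tool was used.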