Building a Corpus of 2L English for Automatic Assessment: the CLEC Corpus

Identifiers
URI: http://hdl.handle.net/10498/17823
DOI: 10.1016/j.sbspro.2015.07.474
URL: http://www.sciencedirect.com/science/article/pii/S1877042815044754
Date
2015-01-01
Department
Filología Francesa e Inglesa
Source
Procedia. Social and Behavioral Sciences 198 (2015) 515-525
Abstract
In this paper we describe the CLEC corpus, an ongoing project at the University of Cádiz that aims to build a large corpus of English as a 2L, classified according to CEFR proficiency levels and designed to train statistical models for automatic proficiency assessment. The corpus serves two purposes: on the one hand, it will be used as a data resource for developing automatic text classification systems; on the other, it has been used as a means of teaching innovation techniques.