Building a Corpus of 2L English for Automatic Assessment: the CLEC Corpus
Department: Filología Francesa e Inglesa
Source: Procedia. Social and Behavioral Sciences 198 (2015) 515-525
In this paper we describe the CLEC corpus, an ongoing project at the University of Cádiz that aims to build a large corpus of English as a second language (2L), classified according to CEFR proficiency levels and designed to train statistical models for automatic proficiency assessment. The corpus serves a twofold purpose: on the one hand, it will be used as a data resource for developing automatic text classification systems; on the other, it has been used as a vehicle for teaching innovation.