Book sections

Methodology and steps towards the construction of EPEC, a corpus of written Basque tagged at morphological and syntactic levels for the automatic processing

Abstract : This article describes the steps in the construction of EPEC (Reference Corpus for the Processing of Basque). EPEC is a corpus of standard written Basque that has been manually tagged at several levels (morphology, surface syntax, phrases) and is currently being hand-tagged at the deep syntax level following the Dependency Structure-based Scheme. It is intended to serve as a "reference" corpus for developing and improving NLP tools for Basque. The corpus has already been used to build tools such as a morphological analyser, a lemmatiser, and a shallow syntactic analyser.
https://artxiker.ccsd.cnrs.fr/artxibo-00080508
Contributor : Izaskun Aldezabal
Submitted on : Monday, June 19, 2006 - 11:43:52 AM
Last modification on : Thursday, June 27, 2019 - 10:20:07 AM
Long-term archiving on : Monday, April 5, 2010 - 9:32:57 PM

Identifiers

  • HAL Id : artxibo-00080508, version 1

Citation

Itziar Aduriz, Maxux Aranzabe, Jose Maria Arriola, Atziber Atutxa, Díaz de Ilarraza A., et al. Methodology and steps towards the construction of EPEC, a corpus of written Basque tagged at morphological and syntactic levels for the automatic processing. 56, Rodopi. Book series: Language and Computers, pp. 1-15, 2006. ⟨artxibo-00080508v1⟩
