Methodology and steps towards the construction of EPEC, a corpus of written Basque tagged at morphological and syntactic levels for the automatic processing - Artxiker
Book chapter: Corpus Linguistics Around the World.
Ed. Andrew Wilson, Paul Rayson, and Dawn Archer
Year: 2006

Itziar I. Aduriz
  • Role: Author
Maxux M. Aranzabe
  • Role: Author
Jose Maria J. Arriola
  • Role: Author
Atziber A. Atutxa
  • Role: Author
Nerea N. Ezeiza
  • Role: Author
Koldo K. Gojenola
  • Role: Author
Maite M. Oronoz
  • Role: Author
Aitor A. Soroa
  • Role: Author
Ruben R. Urizar
  • Role: Author

Abstract

This article describes the steps in the construction of EPEC (Reference Corpus for the Processing of Basque). EPEC is a corpus of standard written Basque that has been manually tagged at several levels (morphology, surface syntax, phrases) and is currently being hand-tagged at the deep syntax level following the Dependency Structure-based Scheme. It is intended to serve as a "reference" corpus for the development and improvement of several NLP tools for Basque. The corpus has already been used to build tools such as a morphological analyser, a lemmatiser, and a shallow syntactic analyser.
Main file
CLAW2006.pdf (310.78 KB)

Dates and versions

artxibo-00080508, version 1 (19-06-2006)
artxibo-00080508, version 2 (22-06-2006)

Identifiers

  • HAL Id: artxibo-00080508, version 2

Cite

Itziar I. Aduriz, Maxux M. Aranzabe, Jose Maria J. Arriola, Atziber A. Atutxa, Arantza Díaz de Ilarraza, et al. Methodology and steps towards the construction of EPEC, a corpus of written Basque tagged at morphological and syntactic levels for the automatic processing. 56, Rodopi. Book series: Language and Computers, pp. 1-15, 2006. ⟨artxibo-00080508v2⟩
