Version 6.3.0

Building a corpus-specific dataset

TL;DR

  • build an initial parser from existing datasets
  • modify or add vocabulary and inflectional rules by editing delimited-text files

Fuller contents

  • defining an orthography (alphabet.fst)
  • organizing tabular files
  • tabulae’s grammatical vocabulary for each analytical type (“part of speech”)

Table of contents