Kanones

☛ Build customized morphological parsers for ancient Greek

  • all data managed in delimited-text tables you can modify or add to with any text editor
  • rigorously defined orthography. Example parsers include one for the standard orthography of printed editions of literary Greek, and one for epigraphic Greek in the alphabet used by Athens before 403 BCE.
  • implemented entirely in the Julia language (no other technical prerequisites):
    • platform independent
    • fast: on consumer-grade hardware, a parser built with Kanones can typically parse a token in 1-9 milliseconds
    • code library can be used from the command line, in scripts, in web apps, or directly in an editor such as Visual Studio Code
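
The per-token timing quoted above can be checked on your own hardware. The following is a minimal sketch using only Julia's standard `@elapsed` macro. The names `mean_ms_per_token`, `toylexicon` and `toyparse` are hypothetical and are not part of Kanones; the dictionary lookup merely stands in for a real parser so that the sketch runs on its own.

```julia
# Hypothetical helper (not part of Kanones): average wall-clock time,
# in milliseconds, that `parsefn` takes per token.
function mean_ms_per_token(parsefn, tokens)
    isempty(tokens) && return 0.0
    parsefn(first(tokens))   # warm-up call, so JIT compilation is not timed
    elapsed = @elapsed for t in tokens
        parsefn(t)
    end
    1000 * elapsed / length(tokens)
end

# Stand-in for a real parser: a plain dictionary lookup (assumption for this sketch).
toylexicon = Dict("ἀνθρώπῳ" => ["noun: masculine dative singular"])
toyparse(token) = get(toylexicon, token, String[])

tokens = fill("ἀνθρώπῳ", 1_000)
ms = mean_ms_per_token(toyparse, tokens)
println("mean time per token: ", ms, " ms")
```

With a parser `p` built as in the quick example, you would pass `t -> parsetoken(t, p)` in place of `toyparse`.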

Quick example

Building a parser

Load a dataset and build a parser from it. All the examples in this documentation use the literarygreek-rules dataset found in the datasets directory of the Kanones GitHub repository.

using Kanones, CitableParserBuilder
# `repo` is assumed to hold the path to your local clone of the Kanones repository
srcdir = joinpath(repo, "datasets", "literarygreek-rules")
kds = dataset([srcdir])
p = stringParser(kds)
p isa StringParser
true
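
One way to guard against a wrong path before calling `dataset` is to check that the dataset directory exists. This is only a sketch using Base Julia: the helper name `datasetdir` is hypothetical and not part of Kanones, and the temporary directory stands in for a real clone of the repository.

```julia
# Hypothetical helper (not part of Kanones): resolve a dataset directory
# inside a local clone of the repository, failing early if it is missing.
function datasetdir(repo::AbstractString, name::AbstractString)
    dir = joinpath(repo, "datasets", name)
    isdir(dir) || throw(ArgumentError("no dataset directory at $dir"))
    dir
end

# Demonstration with a temporary directory standing in for a clone.
demo = mktempdir() do fakerepo
    mkpath(joinpath(fakerepo, "datasets", "literarygreek-rules"))
    basename(datasetdir(fakerepo, "literarygreek-rules"))
end
println(demo)
```

With a real clone, `srcdir = datasetdir(repo, "literarygreek-rules")` could then be passed to `dataset([srcdir])` as in the example above.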

Interactive parsing

Parse a string: in this case, there is only one result.

s = "ἀνθρώπῳ"
parses = parsetoken(s, p)
1-element Vector{CitableParserBuilder.Analysis}:
 CitableParserBuilder.Analysis("ἀνθρώπῳ", lsj.n8909, forms.2010001300, nounstems.n8909, nouninfl.os_ou3)

Extract a GreekMorphologicalForm from each analysis, and apply the label function to each:

parses .|> greekForm .|> label
1-element Vector{String}:
 "noun: masculine dative singular"
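
The `.|>` operator used above is standard Julia: it broadcasts the pipe operator, applying a function to every element of a collection. As a minimal sketch with Base functions only (the sample words are arbitrary, and no Kanones functions are involved), chaining `.|>` gives the same result as mapping the composed function:

```julia
words = ["alpha", "beta"]

# Apply `uppercase` to each word, then `length` to each result.
piped = words .|> uppercase .|> length

# The same computation written with `map` and function composition.
mapped = map(length ∘ uppercase, words)

println(piped == mapped)
```

By the same reasoning, `parses .|> greekForm .|> label` could also be written as `map(label ∘ greekForm, parses)`.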

Extract URNs for the lexeme from each analysis:

lexemelist = parses .|> lexemeurn
1-element Vector{CitableParserBuilder.LexemeUrn}:
 lsj.n8909

After downloading LSJ labels for lexemes in the lsj collection, you can label each lexeme with its LSJ lemma as well as its URN:

lsj = lemmatadict()
lemmalabel(lexemelist[1], dict = lsj)
"lsj.n8909@ἄνθρωπος"
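
The label shown above joins the lexeme identifier and the LSJ lemma with an `@` sign. If you need the two parts separately, Base Julia's `split` is enough. This sketch assumes that the `identifier@lemma` layout seen in this one example holds for every label, which is an inference from the output rather than a documented guarantee.

```julia
labelled = "lsj.n8909@ἄνθρωπος"

# Split on the first "@" only, so any later "@" would stay in the lemma part.
urn, lemma = split(labelled, "@"; limit = 2)

println(urn)
println(lemma)
```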

Which sections of these pages to read

Practical information:

  • build a parser from an existing dataset and parse forms interactively: this page
  • understand what Kanones is about: Background
  • manage or modify a Kanones dataset, and build a new parser: User's guide to Kanones data
  • use Kanones analyses in Julia code: Using morphological objects in Julia

Details of implementation:

  • see how Kanones builds a dataset from files: Implementation: building a dataset
  • see how Kanones generates forms: Implementation: generating forms

Reference documentation: API docs