Kanones
☛ Build customized morphological parsers for ancient Greek
- all data managed in delimited-text tables you can modify or add to with any text editor
- rigorously defined orthography. Example parsers include standard orthography of printed editions of literary Greek, and a parser for epigraphic Greek in the alphabet used by Athens before 403 BCE.
- implemented entirely in the Julia language (no other technical prerequisites):
  - platform independent
  - fast: a parser built with Kanones on consumer-grade hardware can typically parse a token in 1-9 milliseconds.
  - code library can be used from the command line, in scripts, in web apps, or directly in an editor like Visual Studio Code
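To give a sense of what working with a delimited-text table involves, here is a minimal sketch that reads one with plain Julia. The pipe delimiter, the column names and the sample row are assumptions made for illustration, not a specification of the Kanones file format, and `readtable` is a hypothetical helper rather than part of the Kanones API.

```julia
# Hypothetical table contents: a header line plus one data row.
# Delimiter, column names and values are illustrative assumptions.
sample = """
StemUrn|LexicalEntity|Stem|Gender|InflClass
nounstems.n8909|lsj.n8909|ἀνθρωπ|masculine|os_ou
"""

# Read delimited text into a vector of NamedTuples keyed by the header row.
function readtable(text::AbstractString; delimiter = "|")
    lines = filter(!isempty, strip.(split(text, "\n")))
    header = Symbol.(split(lines[1], delimiter))
    map(lines[2:end]) do ln
        cols = String.(split(ln, delimiter))
        NamedTuple{Tuple(header)}(Tuple(cols))
    end
end

rows = readtable(sample)
println(rows[1].LexicalEntity, " has stem ", rows[1].Stem)
```

Because each row is an ordinary line of text, adding an entry is a matter of appending a line in a text editor; the sketch only shows how such a line could be split back into named fields.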
Quick example
Building a parser
Load a dataset and build a parser from it. All the examples in this documentation use the literarygreek-rules dataset found in the datasets directory of the Kanones GitHub repository.
using Kanones, CitableParserBuilder
# `repo` must already be set to the path of a local clone of the Kanones repository
srcdir = joinpath(repo, "datasets", "literarygreek-rules")
kds = dataset([srcdir])
p = stringParser(kds)
p isa StringParser
true
Interactive parsing
Parse a string: in this case, there is only one result.
s = "ἀνθρώπῳ"
parses = parsetoken(s, p)
1-element Vector{CitableParserBuilder.Analysis}:
CitableParserBuilder.Analysis("ἀνθρώπῳ", lsj.n8909, forms.2010001300, nounstems.n8909, nouninfl.os_ou3)
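Each identifier in the analysis above is an abbreviated URN that appears to follow the pattern collection.object (this page itself refers to `lsj` as a collection). The sketch below splits such strings with plain Julia so the parts can be inspected separately. `spliturn` is a hypothetical helper written for this illustration and is not part of Kanones or CitableParserBuilder; it assumes the first `.` separates the two parts.

```julia
# Hypothetical helper: split an abbreviated URN like "lsj.n8909"
# into a collection name and an object identifier at the first ".".
function spliturn(s::AbstractString)
    parts = split(s, "."; limit = 2)
    length(parts) == 2 || throw(ArgumentError("expected collection.object, got: $s"))
    (collection = String(parts[1]), object = String(parts[2]))
end

# The four identifiers from the analysis shown above, copied as strings.
ids = ["lsj.n8909", "forms.2010001300", "nounstems.n8909", "nouninfl.os_ou3"]
for id in ids
    u = spliturn(id)
    println(rpad(u.collection, 10), " ", u.object)
end
```

Read this way, the analysis pairs the token with a lexeme, a form, a stem and an inflectional rule, each identified within its own collection.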
Extract a GreekMorphologicalForm from each analysis, and apply the label function to each:
parses .|> greekForm .|> label
1-element Vector{String}:
"noun: masculine dative singular"
Extract URNs for the lexeme from each analysis:
lexemelist = parses .|> lexemeurn
1-element Vector{CitableParserBuilder.LexemeUrn}:
lsj.n8909
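The steps shown so far handle one token at a time, but the same pipeline can be applied to a list of tokens. The sketch below is an illustration under stated assumptions: `labelsfor` is a hypothetical helper (not part of Kanones) that accepts any function mapping a token to a vector of label strings. With a parser `p` built as above, that function could be `t -> parsetoken(t, p) .|> greekForm .|> label`; a stub stands in for it here so the block runs without a dataset.

```julia
# Hypothetical helper: pair each token with the labels of its analyses.
# `analyze` is any function taking a token and returning a vector of strings.
function labelsfor(tokens, analyze)
    [(token = t, labels = analyze(t)) for t in tokens]
end

# Stub standing in for a real parser, so this sketch runs on its own.
# With Kanones loaded and a parser `p` available, one might pass instead:
#   t -> parsetoken(t, p) .|> greekForm .|> label
stub(t) = t == "ἀνθρώπῳ" ? ["noun: masculine dative singular"] : String[]

results = labelsfor(["ἀνθρώπῳ", "xyz"], stub)
for r in results
    shown = isempty(r.labels) ? "(no analysis)" : join(r.labels, "; ")
    println(r.token, " => ", shown)
end
```

Passing the analysis function as an argument keeps the helper independent of any particular parser, which also makes it easy to test with a stub like the one above.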
After downloading LSJ labels for lexemes in the lsj collection, you can label lexemes with their LSJ lemmata included:
lsj = lemmatadict()
lemmalabel(lexemelist[1], dict = lsj)
"lsj.n8909@ἄνθρωπος"
What sections of these pages to read
Practical information:
- build a parser from an existing dataset and parse forms interactively: this page
- understand what Kanones is about: Background
- manage or modify a Kanones dataset, and build a new parser: User's guide to Kanones data
- use Kanones analyses in Julia code: Using morphological objects in Julia
Details of implementation:
- see how Kanones builds a dataset from files: Implementation: building a dataset
- see how Kanones generates forms: Implementation: generating forms
Reference documentation: API docs