API documentation

Working with vectors of `AnalyzedToken`s

Given a vector of AnalyzedTokens and an index of tokens in a corpus, construct a dictionary keyed by lexeme. Each lexeme maps to a further dictionary that maps surface forms to the passages where they occur.

lexemedictionary(parses, tokenindex)
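
The nested shape of the result can be illustrated without the package. The sketch below is not the library's implementation. It assumes each parse is reduced to a hypothetical `(surface form, lexeme)` pair of strings and that the token index is a plain `Dict` from token strings to vectors of passage identifiers; the function name `nestedlexemedict` and all sample values are invented for illustration.

```julia
# Hypothetical stand-ins: (surface form, lexeme id) tuples instead of AnalyzedTokens,
# and a Dict of token => passages instead of the package's token index.
function nestedlexemedict(parses, tokenindex)
    result = Dict{String,Dict{String,Vector{String}}}()
    for (form, lexeme) in parses
        # Fetch the inner dictionary for this lexeme, creating it on first sight.
        forms = get!(result, lexeme, Dict{String,Vector{String}}())
        # Map the surface form to the passages recorded for it (empty if unindexed).
        forms[form] = get(tokenindex, form, String[])
    end
    result
end

parses = [("canis", "lex.n1"), ("canem", "lex.n1"), ("et", "lex.n2")]
tokenindex = Dict(
    "canis" => ["text:1.1"],
    "canem" => ["text:1.2", "text:2.4"],
    "et" => ["text:1.1"],
)
lexdict = nestedlexemedict(parses, tokenindex)
```

With this sample data, `lexdict["lex.n1"]` holds two surface forms, and `lexdict["lex.n1"]["canem"]` lists the two passages indexed for that form.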

Working with `AbbreviatedUrn`s

CitableParserBuilder.abbreviate — Function

Constructs an AbbreviatedUrn string from a Cite2Urn.

abbreviate(urn)

Example:

julia> abbreviate(Cite2Urn("urn:cite2:kanones:lsj.v1:n123"))
"lsj.n123"

Example: a pipeline abbreviating a Cite2Urn and forming a LexemeUrn from the abbreviated string value.

julia> Cite2Urn("urn:cite2:kanones:lsj.v1:n123") |> abbreviate |> LexemeUrn
LexemeUrn("lsj", "n123")
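
The examples above suggest that abbreviation keeps the collection identifier without its version, then appends the object identifier after a period. A minimal plain-string sketch of that transformation follows. It is an assumption about the behavior inferred from the example, not the package's code, and the helper name `abbreviatestring` is hypothetical; the package itself operates on `Cite2Urn` objects.

```julia
# Plain-string sketch of the abbreviation shown in the documented example.
function abbreviatestring(urn::AbstractString)
    parts = split(urn, ":")
    length(parts) == 5 || throw(ArgumentError("expected 5 colon-delimited components in $urn"))
    # Drop the version from the collection component: "lsj.v1" -> "lsj".
    collection = first(split(parts[4], "."))
    # Join collection and object identifier with a period.
    string(collection, ".", parts[5])
end

abbreviated = abbreviatestring("urn:cite2:kanones:lsj.v1:n123")
```

For the URN used in the documented example, this sketch yields the same string, `"lsj.n123"`.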

CitableParserBuilder.expand — Function

Constructs a Cite2Urn from an AbbreviatedUrn and a dictionary mapping the collection identifiers used in AbbreviatedUrns to full Cite2Urns for a versioned collection.
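
The role of the dictionary can be sketched with plain strings. This is an illustration under stated assumptions, not the package's implementation: the helper name `expandstring` is hypothetical, and the registry is assumed to map each collection identifier to a versioned collection URN string ending in a colon. The real function works with `AbbreviatedUrn` and `Cite2Urn` values, and its registry format may differ.

```julia
# Plain-string sketch of expanding "collection.objectid" with a registry.
# Assumption: registry values are collection URNs with a trailing colon.
function expandstring(abbr::AbstractString, registry::Dict{String,String})
    parts = split(abbr, ".")
    length(parts) == 2 || throw(ArgumentError("expected collection.objectid, got $abbr"))
    collection = String(parts[1])
    objectid = String(parts[2])
    haskey(registry, collection) || throw(KeyError(collection))
    # Append the object identifier to the registered collection URN.
    string(registry[collection], objectid)
end

registry = Dict("lsj" => "urn:cite2:kanones:lsj.v1:")
expanded = expandstring("lsj.n123", registry)
```

Under these assumptions, expansion reverses the abbreviation of `urn:cite2:kanones:lsj.v1:n123`, and an identifier missing from the registry raises a `KeyError` rather than producing a malformed URN.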

CitableParserBuilder.fstsafe — Function

Compose SFST representation of an AbbreviatedUrn.

fstsafe(au)
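
As a rough illustration of what composing an SFST-safe string involves, the sketch below escapes the period and wraps the result in `<u>` tags, which is the shape the documented example for `fstsafe` displays. It is an assumption drawn from that single example: the helper name `fstsafestring` is hypothetical, it takes a plain string rather than an `AbbreviatedUrn`, and the real function may escape further characters.

```julia
# Plain-string sketch: escape the period for SFST and wrap in <u>...</u> tags.
# Only the period is handled here; other reserved characters are not considered.
function fstsafestring(abbr::AbstractString)
    escaped = replace(abbr, "." => "\\.")
    string("<u>", escaped, "</u>")
end

safe = fstsafestring("lexicon.lex123")
println(safe)
```

The printed value contains a single backslash before the period, since `"\\."` in Julia source denotes one backslash followed by a period.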

Example:

julia> LexemeUrn("lexicon.lex123") |> fstsafe
"<u>lexicon\.lex123</u>"

Working with `Stem`s and `Rule`s

CitableParserBuilder.lexeme — Function

Function required to get lexeme value of a Stem implementation.

CitableParserBuilder.id — Function

Function required to get ID value of a Stem implementation.

Function required to get ID value of a Rule implementation.

CitableParserBuilder.inflectiontype — Function

Function required to get string value for inflection class of a Stem implementation.

Function required to get string value for inflection class of a Rule implementation.

Serialization

CitableParserBuilder.readfst — Function

Read SFST output from file f and parse it into a dictionary mapping each token to a (possibly empty) array of SFST strings.

readfst(f)
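
The token-to-analyses mapping can be sketched as follows. This is not the package's parser. It assumes the output layout of SFST's `fst-infl` tool, where each token is introduced by a line beginning with `> `, followed either by one analysis per line or by a `no result for ...` line; the helper name `parsefstlines` and the sample analysis string are hypothetical.

```julia
# Sketch: group SFST output lines by the token they analyze.
# Assumes "> token" header lines and "no result for token" failure lines.
function parsefstlines(lines)
    results = Dict{String,Vector{String}}()
    current = ""
    for ln in lines
        if startswith(ln, "> ")
            # A new token: register it with an empty list of analyses.
            current = String(strip(ln[3:end]))
            results[current] = String[]
        elseif startswith(ln, "no result for") || isempty(strip(ln))
            # Failed analyses and blank lines leave the token's list untouched.
            continue
        elseif !isempty(current)
            push!(results[current], String(ln))
        end
    end
    results
end

sample = [
    "> canis",
    "<u>lex.n1</u>canis<noun>",   # invented analysis string, for illustration only
    "> xyz",
    "no result for xyz",
]
parsed = parsefstlines(sample)
```

A token with no analyses still appears as a key with an empty array, which matches the "possibly empty" wording above. To apply the sketch to a file, pass `readlines(f)` as the argument.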

CitableParserBuilder.relationsblock — Function

Compose a CEX relationset block for a set of analyses.

relationsblock(urn, label, v)
relationsblock(urn, label, v, delim; registry)

CitableParserBuilder.delimited — Function

Serialize an Analysis to delimited text. Abbreviated URNs are expanded to full CITE2 URNs using registry as the expansion dictionary.

delimited(a; delim, registry)

Serialize a Vector of Analysis objects as delimited text.

delimited(v; delim, registry)

Serialize a single AnalyzedToken as one or more lines of delimited text.

delimited(at; delim, registry)

Serialize a Vector of AnalyzedTokens as delimited text.

delimited(v; delim, registry)

Serialize an AnalyzedTokens object as delimited text (required for Citable interface).

delimited(atcollection; delim, registry)

Uses abbreviated URNs. These can be expanded to full CITE2 URNs when read back with a URN registry, or the delimited function can be used with a URN registry to write full CITE2 URNs.

Missing docstring.

Missing docstring for cex. Check Documenter's build log for details.
