API documentation

Structures

CitableParserBuilder.Analysis - Type

Citable analysis of a string value.

An Analysis has five members: a token string value, and four abbreviated URNs, one each for the lexeme, form, rule and stem.
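
As a rough illustration of the five-member layout described above, the following sketch defines a stand-in type. The name SketchAnalysis, its field names, and the sample values are assumptions made for this example only; they are not taken from the package source.

```julia
# Hypothetical stand-in for the five-member structure described above.
# Field names are illustrative assumptions, not the package's own.
struct SketchAnalysis
    token::String   # surface string that was analyzed
    lexeme::String  # abbreviated URN for the lexeme
    form::String    # abbreviated URN for the form
    rule::String    # abbreviated URN for the rule
    stem::String    # abbreviated URN for the stem
end

# Placeholder values in "collection.objectid" shape; not real data.
a = SketchAnalysis("token", "lexicon.lex123", "forms.form1", "rules.rule1", "stems.stem1")
println(a.lexeme)
```

Keeping the four URNs as abbreviated strings keeps each record compact; a registry can expand them later if full CITE2 URNs are needed.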

Parsing

CitableParserBuilder.parsetoken - Function

Delegate to a specific function based on the value of the type's citable trait.

parsetoken(s, x; data)

It is an error to invoke parsetoken with a type that is not a parser.

parsetoken(, s, x; data)

Citable parsers must implement parsetoken.

parsetoken(, s, x; data)

Parse String s by looking it up in a given dictionary.
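
The dictionary lookup described above might look roughly like the following sketch. The function name sketchparse, the shape of the dictionary (token to vector of analysis strings), and the choice to return an empty vector for unknown tokens are all assumptions for illustration, not the package's documented behavior.

```julia
# Hypothetical lookup table: token => vector of analysis strings.
# The entries are placeholders, not real data.
lookup = Dict(
    "et" => ["lexicon.lex1"],
)

# Return every analysis recorded for token `s`,
# or an empty vector when the token is not in the dictionary.
sketchparse(s::AbstractString, dict::Dict{String,Vector{String}}) = get(dict, s, String[])

println(sketchparse("et", lookup))
println(sketchparse("unknown", lookup))
```

Returning an empty vector rather than throwing keeps the caller's loop over a corpus simple, since unanalyzed tokens need no special handling.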

CitableParserBuilder.parsepassage - Function

Parse a CitablePassage with text for a single token with a CitableParser.

parsepassage(cn, p; data)

Returns a single AnalyzedToken.

Parse a CitablePassage with text for a single token with a CitableParser.

parsepassage(ct, p; data)

Returns a single AnalyzedToken.

CitableParserBuilder.parsecorpus - Function

Use a CitableParser to parse a CitableTextCorpus in which each citable node contains a single token of type LexicalToken.

parsecorpus(c, p; data, countinterval)

Returns an AnalyzedTokens object.

Working with vectors of AnalyzedTokens

CitableParserBuilder.lexemes - Function

Extract a list of lexemes from a Vector of Analysis objects.

lexemes(v)

Extract a list of lexemes from a Vector of AnalyzedToken objects.

lexemes(v)

Extract a list of lexemes from an AnalyzedTokens object.

lexemes(atokens)

CitableParserBuilder.lexemedictionary - Function

From a vector of AnalyzedTokens and an index of tokens in a corpus, construct a dictionary keyed by lexeme. Each value is a further dictionary mapping surface forms to the passages where they occur.

lexemedictionary(parses, tokenindex)
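
The nested structure described above can be sketched as follows. This is an illustration under simplifying assumptions: parses are reduced to (token, lexeme) string pairs, the token index maps each token to a vector of passage references, and sketchlexemedict is a hypothetical name. The real function operates on the package's own types.

```julia
# Build lexeme => (surface form => passages) from simplified inputs.
function sketchlexemedict(parses::Vector{Tuple{String,String}},
                          tokenindex::Dict{String,Vector{String}})
    result = Dict{String,Dict{String,Vector{String}}}()
    for (token, lexeme) in parses
        # Fetch the inner dictionary for this lexeme, creating it if needed.
        forms = get!(result, lexeme, Dict{String,Vector{String}}())
        # Record the passages where this surface form occurs.
        forms[token] = get(tokenindex, token, String[])
    end
    result
end

# Placeholder data: two surface forms of one lexeme.
parses = [("amat", "lexicon.lex1"), ("amant", "lexicon.lex1")]
tokenindex = Dict("amat" => ["1.1", "2.3"], "amant" => ["1.4"])
d = sketchlexemedict(parses, tokenindex)
println(d["lexicon.lex1"]["amat"])
```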

Working with AbbreviatedUrns

CitableParserBuilder.abbreviate - Function

Constructs an AbbreviatedUrn string from a Cite2Urn.

abbreviate(urn)

Example:

julia> abbreviate(Cite2Urn("urn:cite2:kanones:lsj.v1:n123"))
"lsj.n123"

Example: a pipeline abbreviating a Cite2Urn and forming a LexemeUrn from the abbreviated string value.

julia> Cite2Urn("urn:cite2:kanones:lsj.v1:n123") |> abbreviate |> LexemeUrn
LexemeUrn("lsj", "n123")

CitableParserBuilder.expand - Function

Constructs a Cite2Urn from an AbbreviatedUrn and a dictionary mapping the collection identifiers of AbbreviatedUrns to full Cite2Urns for a versioned collection.
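
To make the expansion step concrete, here is a string-level sketch that reverses the abbreviate example shown earlier. The function name sketchexpand and the registry layout (collection identifier mapped to a collection URN string ending in a colon) are assumptions for illustration; the package works with its own URN types rather than plain strings.

```julia
# Expand "collection.objectid" to a full URN string using a registry.
function sketchexpand(abbr::AbstractString, registry::Dict{String,String})
    parts = split(abbr, "."; limit = 2)
    length(parts) == 2 || throw(ArgumentError("expected collection.objectid, got $abbr"))
    collection, objectid = String(parts[1]), String(parts[2])
    haskey(registry, collection) ||
        throw(ArgumentError("no registry entry for collection $collection"))
    # Assumed registry values end with ':' so the object id can be appended.
    string(registry[collection], objectid)
end

registry = Dict("lsj" => "urn:cite2:kanones:lsj.v1:")
println(sketchexpand("lsj.n123", registry))
```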

CitableParserBuilder.fstsafe - Function

Compose the SFST representation of an AbbreviatedUrn.

fstsafe(au)

Example:

julia> LexemeUrn("lexicon.lex123") |> fstsafe
"<u>lexicon\.lex123</u>"
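
Judging only from the example output above, the transformation escapes the period and wraps the result in <u> tags. The sketch below reproduces that on a plain string. It assumes the period is the only character that needs escaping, which may not hold for the real function, and sketchfstsafe is a hypothetical name.

```julia
# Escape the period separating collection and object id,
# then wrap the result in <u> tags.
sketchfstsafe(abbr::AbstractString) = string("<u>", replace(abbr, "." => "\\."), "</u>")

println(sketchfstsafe("lexicon.lex123"))
```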

Working with Stems and Rules

Serialization

CitableParserBuilder.readfst - Function

Read SFST output from file f, and parse into a dictionary keying tokens to a (possibly empty) array of SFST strings.

readfst(f)
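
The following sketch shows one way such a dictionary could be built. It assumes the output follows the usual fst-infl layout, where a line beginning with "> " introduces a token, the lines after it are analyses, and a "no result for" line marks a token with no analyses. The function name sketchreadfst is hypothetical, and it takes lines already in memory rather than a file name.

```julia
# Group SFST output lines into token => vector of analysis strings.
function sketchreadfst(lines)
    results = Dict{String,Vector{String}}()
    current = ""
    for ln in lines
        if startswith(ln, "> ")
            # New token header: start an empty list of analyses.
            current = String(ln[3:end])
            results[current] = String[]
        elseif isempty(ln) || isempty(current) || startswith(ln, "no result for")
            # Skip blanks, stray leading lines, and failure notices.
            continue
        else
            push!(results[current], String(ln))
        end
    end
    results
end

# Placeholder output, not real SFST analyses.
sample = [
    "> token1",
    "analysis-a",
    "analysis-b",
    "> token2",
    "no result for token2",
]
parsed = sketchreadfst(sample)
println(parsed["token1"])
```

To read from a file instead, the same function could be called as sketchreadfst(eachline(f)).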

CitableParserBuilder.delimited - Function

Serialize an Analysis to delimited text. Abbreviated URNs are expanded to full CITE2 URNs using registry as the expansion dictionary.

delimited(a; delim, registry)
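
A minimal sketch of the serialization idea: join the five members of an analysis with a delimiter, and split on the same delimiter to read them back. The function name sketchdelimited, the column order, and the default delimiter "|" are assumptions for illustration. Expansion through a registry is left out here, so the URNs stay abbreviated.

```julia
# Join a token and its four abbreviated URNs into one delimited line.
function sketchdelimited(token::AbstractString, urns::Vector{String}; delim = "|")
    length(urns) == 4 || throw(ArgumentError("expected four URNs, got $(length(urns))"))
    join([token, urns...], delim)
end

# Placeholder values, not real data.
line = sketchdelimited("token", ["lexicon.lex123", "forms.form1", "rules.rule1", "stems.stem1"])
println(line)

# Reading the line back recovers the five columns.
columns = split(line, "|")
println(length(columns))
```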

Serialize a Vector of Analysis objects as delimited text.

delimited(v; delim, registry)

Serialize a single AnalyzedToken as one or more lines of delimited text.

delimited(at; delim, registry)

Serialize a Vector of AnalyzedTokens as delimited text.

delimited(v; delim, registry)

Serialize an AnalyzedTokens object as delimited text (required for Citable interface).

delimited(atcollection; delim, registry)

Uses abbreviated URNs. These can be expanded to full CITE2 URNs when read back with a URN registry, or the delimited function can be used with a URN registry to write full CITE2 URNs.
