The Lexicon type

The Lexicon type wraps a Vector of LexiconArticles. You can load the Liddell-Scott Greek Lexicon from Christopher Blackwell's GitHub repository with lsj(), and Lewis and Short's Latin Dictionary with lewis_short(). Both functions download their data, so load time depends primarily on the throughput of your internet connection.

Both repositories are licensed under the CC BY-NC-SA 3.0 license. If you have already downloaded a copy of one of these files, you can load it from the local file with the lexicon function. The following code block loads the copy of the LSJ lexicon in the test/resources directory of the SimpleLexica.jl GitHub repository.

using SimpleLexica
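# `root` must point at the top-level SimpleLexica.jl directory. Using pkgdir
# here is an assumption about your setup; substitute the path to your own checkout.
root = pkgdir(SimpleLexica)
f = joinpath(root, "test", "resources", "lsj_chicago.cex")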
greeklex = lexicon(f)
Lexicon with 116854 articles.

The Lexicon type is an iterable table, so you can directly apply functions like map to it.

map(entry -> titlecase(lemma(entry)), greeklex)
116854-element Vector{String}:
 "Α Α"
 "Ἀ-"
 "Ἀ-"
 "Ἆ"
 "Ἃ Ἃ"
 "Ἄα·"
 "Ἀάατος"
 "Ἀάβακτοι·"
 "Ἀᾱγής"
 "Ἄαδα·"
 ⋮
 "Χειρόπουν"
 "Χειροτεχνκή"
 "Χηρδύπτης"
 "Κιθωνίσκος"
 "Ἐπὶ3"
 "Χρυσαορικὸν"
 "Ψαμμῖτις"
 "Ψέλλιον1"
 "Ὠκυτοκεύς2"

Looking up articles

The lookup function returns a single LexiconArticle or nothing. It uses URN containment to find an article in the lexicon, so you can refer to articles with version-independent URNs.

using CitableObject
beer_urn = Cite2Urn("urn:cite2:hmt:lsj:n46358")
lookup(beer_urn, greeklex)
<urn:cite2:hmt:lsj.chicago_md:n46358> ζῦθος
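The idea behind URN containment can be sketched on plain strings. The following is only an illustration, not the CitableObject implementation: the helper names dropversion_sketch and contains_sketch are hypothetical, and the sketch assumes a five-part Cite2 URN whose collection component may carry a version after a period.

```julia
# Hypothetical helpers for illustration only; they are not part of
# CitableObject or SimpleLexica.

# Remove a version suffix such as ".chicago_md" from the collection component.
function dropversion_sketch(urn::AbstractString)
    parts = split(urn, ":")
    # parts: ["urn", "cite2", namespace, collection[.version], objectid]
    parts[4] = first(split(parts[4], "."))
    join(parts, ":")
end

# A versioned URN is "contained" by a query that either names the same
# version or omits the version altogether.
function contains_sketch(query::AbstractString, candidate::AbstractString)
    query == candidate || query == dropversion_sketch(candidate)
end

contains_sketch("urn:cite2:hmt:lsj:n46358", "urn:cite2:hmt:lsj.chicago_md:n46358")  # true
contains_sketch("urn:cite2:hmt:lsj:n46358", "urn:cite2:hmt:lsj.chicago_md:n46356")  # false
```

This is why the version-independent URN in the example above retrieves the article whose full URN includes the chicago_md version.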

Searching

By default, the search function looks for a given string in both the lemma and the article body of each entry. Matching ignores diacritics (see "Optimizing searches"), so the query below is written without accents.

beer = search(greeklex, "ζυθος")
Lexicon with 4 articles.

You can limit the search to one or the other field by setting the searchscope parameter to SimpleLexica.LEMMA or SimpleLexica.ARTICLE.

search(greeklex, "ζυθος", searchscope=SimpleLexica.LEMMA)
Lexicon with 1 article.
search(greeklex, "ζυθος", searchscope=SimpleLexica.ARTICLE)
Lexicon with 4 articles.

You can use the articles, lemmata, and urns functions to extract a list of those fields from a lexicon's entries.

articles(beer)
4-element Vector{SubString{String}}:
 "**ζύθιον**, τό, Dim. of ζῦθος, " ⋯ 82 bytes ⋯ "n *Ind.Lect.Rost.* 1892/3p.12.)"
 "**ζῦθος**, ου, ὁ (also -εος, τό" ⋯ 432 bytes ⋯ " D.Chr. 32.82, Colum. 10.116.)"
 "**ζῦτος**, ὁ, = ζῦθος, *PCair.Z" ⋯ 192 bytes ⋯ "), Glauc. ap. *POxy.* 1802.42."
 "**χίθος**, = `A` **cilicia**, Gloss. (also written λίθος and ζύθος, ib.)."
lemmata(beer)
4-element Vector{SubString{String}}:
 "ζύθιον"
 "ζῦθος"
 "ζῦτος"
 "χίθος"
urns(beer)
4-element Vector{CitableObject.Cite2Urn}:
 urn:cite2:hmt:lsj.chicago_md:n46356
 urn:cite2:hmt:lsj.chicago_md:n46358
 urn:cite2:hmt:lsj.chicago_md:n46379
 urn:cite2:hmt:lsj.chicago_md:n113982

Optimizing searches

SimpleLexica searches a simplified, parallel copy of the lexicon, then returns the matching entries from the original, fully formatted lexicon. If you do not supply a parallel lexicon, it creates one by reducing the text of lemmata and article bodies to alphabetic characters with all diacritics removed. You can create such a stripped-down parallel lexicon yourself with the simplify function.

searchable = simplify(greeklex)
Lexicon with 116854 articles.
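To give a sense of what this kind of stripping does to a single string, here is a minimal sketch using Julia's Unicode standard library. The helper strip_sketch is hypothetical, and SimpleLexica's own simplify may differ in detail. In particular, keeping whitespace is an assumption made here so that the result stays readable.

```julia
using Unicode

# Hypothetical helper for illustration; not SimpleLexica's implementation.
function strip_sketch(s::AbstractString)
    # Drop combining marks such as accents and breathings.
    nomarks = Unicode.normalize(s, stripmark = true)
    # Keep letters only (plus whitespace, an assumption of this sketch).
    filter(c -> isletter(c) || isspace(c), nomarks)
end

strip_sketch("**ζῦθος**, ου, ὁ")   # "ζυθος ου ο"
```

After this reduction, an unaccented query such as "ζυθος" matches the accented lemma ζῦθος.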

Because simplifying a lexicon as large as Liddell-Scott can take a full second or more on a consumer-level laptop, you can build the searchable lexicon once and reuse it by passing it as the optional simplified parameter. This avoids repeating the simplification on every search and can greatly improve performance.

search(greeklex, "ζυθος", simplified = searchable)
Lexicon with 4 articles.
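The pattern of searching a precomputed parallel copy and returning the originals can be sketched with plain vectors of strings. Everything here (plain, search_sketch, and the two vectors) is a hypothetical stand-in for illustration, not SimpleLexica's types or functions.

```julia
using Unicode

# Hypothetical stand-in for simplification: strip marks, keep letters only.
plain(s) = filter(isletter, Unicode.normalize(s, stripmark = true))

originals = ["ζύθιον", "ζῦθος", "ζῦτος", "χίθος"]
# Built once, index-aligned with `originals`, and reused for every query.
parallel = map(plain, originals)

# Match against the simplified copy, but return the fully formatted originals.
function search_sketch(query, originals, parallel)
    hits = findall(s -> occursin(plain(query), s), parallel)
    originals[hits]
end

search_sketch("ζυθ", originals, parallel)   # ["ζύθιον", "ζῦθος"]
```

Because the parallel vector is computed only once, each additional query costs a scan over already simplified strings rather than a fresh simplification of every entry.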