The Lexicon type
The Lexicon type wraps a Vector of LexiconArticles. You can load Liddell-Scott's Greek Lexicon from Christopher Blackwell's GitHub repository with lsj() and Lewis and Short's Latin Dictionary with lewis_short(). Because both functions fetch the data over the network, load time depends primarily on the throughput of your internet connection.
Both repositories are licensed under the terms of the CC BY-NC-SA 3.0 license. If you have downloaded a copy of one of these files, you can load it from a local file with the lexicon function. The following code block loads a copy of the LSJ lexicon from the test/resources directory of the SimpleLexica.jl GitHub repository.
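using SimpleLexica
# root is the top-level directory of the SimpleLexica.jl repository.
# pkgdir locates it for an installed package (assumes test/ is present there).
root = pkgdir(SimpleLexica)
f = joinpath(root, "test", "resources", "lsj_chicago.cex")
greeklex = lexicon(f)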
Lexicon with 116854 articles.
The Lexicon type is an iterable table, so you can directly apply functions like map to it.
map(entry -> titlecase(lemma(entry)), greeklex)
116854-element Vector{String}:
"Α Α"
"Ἀ-"
"Ἀ-"
"Ἆ"
"Ἃ Ἃ"
"Ἄα·"
"Ἀάατος"
"Ἀάβακτοι·"
"Ἀᾱγής"
"Ἄαδα·"
⋮
"Χειρόπουν"
"Χειροτεχνκή"
"Χηρδύπτης"
"Κιθωνίσκος"
"Ἐπὶ3"
"Χρυσαορικὸν"
"Ψαμμῖτις"
"Ψέλλιον1"
"Ὠκυτοκεύς2"
Looking up articles
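The lookup function returns a single LexiconArticle or nothing. It uses URN containment to find an article in the lexicon, so you can refer to articles with version-independent URNs. In the example below, the URN names the lsj collection without a version, and the article found belongs to the lsj.chicago_md version.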
using CitableObject
beer_urn = Cite2Urn("urn:cite2:hmt:lsj:n46358")
lookup(beer_urn, greeklex)
<urn:cite2:hmt:lsj.chicago_md:n46358> ζῦθος
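To make the idea of URN containment concrete, here is a minimal sketch in plain Julia that works on strings only. The function name urn_contains is hypothetical, and this is not how CitableObject implements containment; it assumes the five-part form urn:cite2:namespace:collection[.version]:object and simply treats a collection without a version as containing every version of that collection.

```julia
# Hypothetical helper for illustration; not the CitableObject implementation.
# Assumes the five-part form urn:cite2:namespace:collection[.version]:object.
function urn_contains(query::AbstractString, candidate::AbstractString)
    q = split(query, ":")
    c = split(candidate, ":")
    (length(q) == 5 && length(c) == 5) || return false
    # "urn", "cite2" and the namespace must match exactly.
    q[1:3] == c[1:3] || return false
    # Collection component: "lsj" contains "lsj.chicago_md", but not the reverse.
    qcoll = split(q[4], ".")
    ccoll = split(c[4], ".")
    length(qcoll) <= length(ccoll) || return false
    qcoll == ccoll[1:length(qcoll)] || return false
    # The object identifier must match exactly.
    return q[5] == c[5]
end

general = "urn:cite2:hmt:lsj:n46358"
versioned = "urn:cite2:hmt:lsj.chicago_md:n46358"
println(urn_contains(general, versioned))
println(urn_contains(versioned, general))
```

The asymmetry is the point of the sketch: the less specific URN matches the more specific one, which is why a version-independent reference can locate an article in a versioned lexicon.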
Searching
By default, the SimpleLexica package searches both the lemma and the article body for entries matching a given string.
beer = search(greeklex, "ζυθος")
Lexicon with 4 articles.
You can limit the search to one or the other field by setting the searchscope parameter to SimpleLexica.LEMMA or SimpleLexica.ARTICLE.
search(greeklex, "ζυθος", searchscope=SimpleLexica.LEMMA)
Lexicon with 1 article.
search(greeklex, "ζυθος", searchscope=SimpleLexica.ARTICLE)
Lexicon with 4 articles.
You can use the articles, lemmata, and urns functions to extract a list of those fields from a lexicon's entries.
articles(beer)
4-element Vector{SubString{String}}:
"**ζύθιον**, τό, Dim. of ζῦθος, " ⋯ 82 bytes ⋯ "n *Ind.Lect.Rost.* 1892/3p.12.)"
"**ζῦθος**, ου, ὁ (also -εος, τό" ⋯ 432 bytes ⋯ " D.Chr. 32.82, Colum. 10.116.)"
"**ζῦτος**, ὁ, = ζῦθος, *PCair.Z" ⋯ 192 bytes ⋯ "), Glauc. ap. *POxy.* 1802.42."
"**χίθος**, = `A` **cilicia**, Gloss. (also written λίθος and ζύθος, ib.)."
lemmata(beer)
4-element Vector{SubString{String}}:
"ζύθιον"
"ζῦθος"
"ζῦτος"
"χίθος"
urns(beer)
4-element Vector{CitableObject.Cite2Urn}:
urn:cite2:hmt:lsj.chicago_md:n46356
urn:cite2:hmt:lsj.chicago_md:n46358
urn:cite2:hmt:lsj.chicago_md:n46379
urn:cite2:hmt:lsj.chicago_md:n113982
Optimizing searches
SimpleLexica runs each search against a simplified, parallel lexicon, then returns the matching entries from the original, fully formatted lexicon. If no parallel lexicon is provided, it creates one by stripping the text of lemmata and article bodies to alphabetic characters with all diacritics removed. You can create a parallel lexicon stripped down in this way for your own use with the simplify function.
searchable = simplify(greeklex)
Lexicon with 116854 articles.
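The kind of text reduction described above can be approximated with Julia's standard Unicode library. The sketch below is an illustration only: strip_to_letters is a hypothetical name, not a SimpleLexica function, and the choice to keep whitespace so that words stay separated is an assumption rather than documented behavior.

```julia
using Unicode

# Hypothetical stand-in for the text reduction described above.
# Assumption: whitespace is kept so that words stay separated.
function strip_to_letters(s::AbstractString)
    # stripmark=true drops combining marks such as accents and breathings.
    unmarked = Unicode.normalize(s, stripmark=true)
    # Keep letters and whitespace; drop punctuation, digits and markup.
    return filter(c -> isletter(c) || isspace(c), unmarked)
end

println(strip_to_letters("ζῦθος"))
println(strip_to_letters("**ζύθιον**, τό, Dim. of ζῦθος"))
```

Reducing both the lexicon text and the query to this form is what lets an unaccented query string match accented entries.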
Since simplifying a lexicon as large as Liddell-Scott can take a full second or more on a consumer-level laptop, you can build the searchable lexicon once and reuse it by passing it in the optional simplified parameter. This avoids repeating the simplification on every search and can greatly increase search performance.
search(greeklex, "ζυθος", simplified = searchable)
Lexicon with 4 articles.
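The benefit of reusing a simplified lexicon comes from separating the expensive step (simplifying every entry) from the cheap one (scanning the simplified text). The following sketch shows that pattern on four lemmata taken from the results above. The names plain and search_parallel are hypothetical and do not reflect SimpleLexica's internals.

```julia
using Unicode

# Hypothetical helpers for illustration; not SimpleLexica's internals.
plain(s) = filter(isletter, Unicode.normalize(s, stripmark=true))

# Find matching positions in the simplified vector, return the originals.
function search_parallel(originals, simplified, query)
    hits = findall(s -> occursin(query, s), simplified)
    return originals[hits]
end

lemmas = ["ζύθιον", "ζῦθος", "ζῦτος", "χίθος"]

# The expensive step: done once, outside any individual search.
simplified = map(plain, lemmas)

# The cheap step: reused for any number of queries.
println(search_parallel(lemmas, simplified, "ζυθος"))
println(search_parallel(lemmas, simplified, "θος"))
```

Because the two vectors are index-aligned, a hit at position i in the simplified vector maps directly to the fully formatted entry at position i.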