This page specifies two systems for directly transcribing characters used in the Attic writing system as Unicode code points in Normalization Form KC (NFKC).
Until the archonship of Euclid in 403 BCE, official Athenian inscriptions were regularly written with 21 case-insensitive alphabetic characters. Twenty of them correspond to characters in the Ionic alphabet and are mapped to Unicode characters in the upper-case ASCII range as specified in the following table:
Correspondence in Ionic literary orthography | Typical visual display | ASCII representation |
---|---|---|
alpha (long or short vowel) | α | A |
beta | β | B |
gamma | γ | G |
delta | δ | D |
short epsilon, long eta, or diphthong ei | ε | E |
zeta | ζ | Z |
theta | θ | Q |
iota (long or short vowel) | ι | I |
kappa | κ | K |
lamda | λ | L |
mu | μ | M |
nu | ν | N |
short omicron, long omega, or diphthong ou | ο | O |
pi | π | P |
rho | ρ | R |
sigma | σ | S |
tau | τ | T |
upsilon (long or short vowel) | υ | U |
phi | φ | F |
chi | χ | X |
The Attic alphabet has a further alphabetic character for the aspirate corresponding to the rough breathing mark in Ionic. This is conventionally transcribed with h in print editions of Attic inscriptions.
Correspondence in Ionic literary orthography | Typical visual display | ASCII representation |
---|---|---|
rough breathing (aspirate) | h | H |
Version 1.4.0 of this specification recognizes two punctuation characters, mapped to the following ASCII characters:
Usage | ASCII representation | Unicode code point (decimal) |
---|---|---|
A major break, or full stop; form can resemble two dots or a colon | . | 46 |
A less significant break; form can resemble two or three vertical dots | : | 58 |
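The ASCII system described so far amounts to a small closed character set, which can be checked mechanically. The following is a minimal sketch, not part of the specification: the names `ATTIC_LETTERS`, `ASPIRATE`, `PUNCTUATION`, and `invalid_characters` are hypothetical, and the decisions to allow whitespace between words and to reject lower-case input are assumptions made for illustration.

```python
# The twenty letters shared with the Ionic alphabet, in their ASCII mapping.
ATTIC_LETTERS = frozenset("ABGDEZQIKLMNOPRSTUFX")
# The aspirate, the twenty-first alphabetic character.
ASPIRATE = "H"
# The two punctuation characters recognized in version 1.4.0.
PUNCTUATION = frozenset(".:")

ALLOWED = ATTIC_LETTERS | {ASPIRATE} | PUNCTUATION


def invalid_characters(text: str) -> list[tuple[int, str]]:
    """Return (index, character) pairs for characters outside the ASCII mapping.

    Assumption: whitespace is tolerated as a word separator, and lower-case
    letters are reported as invalid because the mapping uses upper-case ASCII.
    """
    return [
        (i, ch)
        for i, ch in enumerate(text)
        if ch not in ALLOWED and not ch.isspace()
    ]


if __name__ == "__main__":
    print(invalid_characters("EDOXSEN TEI BOLEI"))
    print(invalid_characters("LOGW"))
```

Under these assumptions, `W` is reported as invalid because the Attic alphabet has no separate character for omega; long o is written with `O`.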
The same twenty characters listed above are mapped to lower-case characters in the Greek range of Unicode as specified in the following table. All values are in Unicode NFKC form.
Correspondence in Ionic literary orthography | Mapping to Greek range of Unicode | Mapping to ASCII |
---|---|---|
alpha (long or short vowel) | α | A |
beta | β | B |
gamma | γ | G |
delta | δ | D |
short epsilon, long eta, or diphthong ei | ε | E |
zeta | ζ | Z |
theta | θ | Q |
iota (long or short vowel) | ι | I |
kappa | κ | K |
lamda | λ | L |
mu | μ | M |
nu | ν | N |
short omicron, long omega, or diphthong ou | ο | O |
pi | π | P |
rho | ρ | R |
sigma | ς | S |
tau | τ | T |
upsilon (long or short vowel) | υ | U |
phi | φ | F |
chi | χ | X |
The mapping of sigma to the Greek range of Unicode depends on context. When sigma terminates a token, as defined in the specification of Attic Greek string values, it is mapped to (lower-case) terminal sigma (code point 962 decimal); otherwise it is mapped to (lower-case) initial/medial sigma (code point 963 decimal).
As listed above, an isolated sigma is represented by S in the ASCII mapping and by the terminal form of sigma, ς, in the Unicode mapping. By contrast, the cluster sigma+tau is represented by ST in the ASCII mapping and by στ in the Unicode mapping.
See further examples in the specification for Attic Greek string values.
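The context rule for sigma can be sketched as a conversion from the ASCII mapping to the Greek-range mapping. This is an illustration only: the function name `ascii_to_greek` is hypothetical, and because the token definition lives in the separate specification for Attic Greek string values, the sketch assumes that a token ends at the end of the string or before any character that is not one of the alphabetic characters. Characters with no letter mapping in the table above (including `H` and punctuation) are passed through unchanged.

```python
import unicodedata

# ASCII -> Greek-range mapping for the letters whose form is context-free.
ASCII_TO_GREEK = {
    "A": "α", "B": "β", "G": "γ", "D": "δ", "E": "ε",
    "Z": "ζ", "Q": "θ", "I": "ι", "K": "κ", "L": "λ",
    "M": "μ", "N": "ν", "O": "ο", "P": "π", "R": "ρ",
    "T": "τ", "U": "υ", "F": "φ", "X": "χ",
}
TERMINAL_SIGMA = chr(962)  # ς
MEDIAL_SIGMA = chr(963)    # σ

# Assumption: these characters continue a token; anything else ends it.
# Treating the aspirate H as token-internal is also an assumption.
TOKEN_LETTERS = set(ASCII_TO_GREEK) | {"S", "H"}


def ascii_to_greek(text: str) -> str:
    """Convert letters in the ASCII mapping to the Greek-range mapping."""
    out = []
    for i, ch in enumerate(text):
        if ch == "S":
            # Sigma is terminal unless another token letter follows it.
            followed = i + 1 < len(text) and text[i + 1] in TOKEN_LETTERS
            out.append(MEDIAL_SIGMA if followed else TERMINAL_SIGMA)
        else:
            # Unmapped characters are left as they are.
            out.append(ASCII_TO_GREEK.get(ch, ch))
    # The specification requires NFKC; these code points are already stable.
    return unicodedata.normalize("NFKC", "".join(out))


if __name__ == "__main__":
    for sample in ("S", "ST", "DEMOS", "EDOXSEN TEI BOLEI"):
        print(sample, "->", ascii_to_greek(sample))
```

The two cases named in the text, an isolated `S` and the cluster `ST`, exercise both branches of the rule.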
In the mapping to the Greek range of Unicode, the two punctuation marks supported in version 1.4.0 are mapped to the period and the middle dot characters as follows:
Usage | Unicode representation | Unicode code point (decimal) |
---|---|---|
A major break, or full stop; form can resemble two dots or a colon | . | 46 |
A less significant break; form can resemble two or three vertical dots | · | 183 |
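The relationship between the two punctuation tables can be expressed as a small translation table. The sketch below is illustrative rather than normative: `ASCII_TO_UNICODE_PUNCT` and `convert_punctuation` are hypothetical names, and the code points are those given in the tables (46, 58, and 183 decimal).

```python
import unicodedata

# ASCII punctuation -> punctuation in the Greek-range mapping.
# The period (46) is the same in both systems; the colon (58)
# corresponds to the middle dot (183).
ASCII_TO_UNICODE_PUNCT = str.maketrans({chr(46): chr(46), chr(58): chr(183)})


def convert_punctuation(text: str) -> str:
    """Replace ASCII-mapping punctuation with its Greek-range counterpart."""
    return text.translate(ASCII_TO_UNICODE_PUNCT)


if __name__ == "__main__":
    converted = convert_punctuation("A:B.")
    print(converted, [ord(ch) for ch in converted])
    # The middle dot has no decomposition, so NFKC leaves it unchanged.
    print(unicodedata.normalize("NFKC", converted) == converted)
```

Only punctuation is touched here; letters pass through unchanged, so this step can be combined with a separate letter conversion.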