
Representation of the characters in the Attic writing system

This page specifies two systems for directly transcribing the characters of the Attic writing system as Unicode code points in normalization form NFKC.

Encoding Attic characters in the ASCII range of Unicode

Until the archonship of Euclid in 403 BCE, official Athenian inscriptions were regularly written with 21 case-insensitive alphabetic characters. The twenty that correspond to characters in the Ionic alphabet are mapped to upper-case characters in the ASCII range of Unicode, as specified in the following table:

Correspondence in Ionic literary orthography	Typical visual display	ASCII representation
alpha (long or short vowel)	α	A
beta	β	B
gamma	γ	G
delta	δ	D
short epsilon, long eta, or diphthong ei	ε	E
zeta	ζ	Z
theta	θ	Q
iota (long or short vowel)	ι	I
kappa	κ	K
lamda	λ	L
mu	μ	M
nu	ν	N
short omicron, long omega, or diphthong ou	ο	O
pi	π	P
rho	ρ	R
sigma	σ	S
tau	τ	T
upsilon (long or short vowel)	υ	U
phi	φ	F
chi	χ	X

The Attic alphabet has a further alphabetic character for the aspirate corresponding to the rough breathing mark in Ionic. This is conventionally transcribed with h in print editions of Attic inscriptions.

Correspondence in Ionic literary orthography	Typical visual display	ASCII representation
rough breathing (aspirate)	h	H

Version 1.4.0 of this specification recognizes two punctuation characters, mapped to the following ASCII characters:

Usage	ASCII representation	Unicode code point (decimal)
A major break, or full stop; form can resemble two dots or a colon	.	46
A less significant break; form can resemble two or three vertical dots	:	58
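
The ASCII inventory defined by the tables above (twenty letters, the aspirate, and two punctuation characters) can be checked mechanically. The following is a minimal Python sketch, not part of the library itself: the names `ATTIC_LETTERS`, `VALID_ASCII`, and `invalid_chars` are hypothetical, and the sketch assumes that whitespace separates tokens and can be ignored during validation.

```python
# Character inventory taken from the tables above; all names are illustrative only.
ATTIC_LETTERS = "ABGDEZQIKLMNOPRSTUFX"  # the twenty letters shared with Ionic
ASPIRATE = "H"                          # the 21st alphabetic character
PUNCTUATION = ".:"                      # code points 46 and 58

VALID_ASCII = set(ATTIC_LETTERS + ASPIRATE + PUNCTUATION)


def invalid_chars(text: str) -> list[str]:
    """Return the characters of `text` that fall outside the ASCII encoding.

    Assumption: whitespace only separates tokens, so it is skipped here.
    """
    return [c for c in text if not c.isspace() and c not in VALID_ASCII]
```

Under these assumptions, `invalid_chars("HO DEMOS.")` returns an empty list, while lower-case input such as `"ho"` is reported as invalid because the encoding uses only the upper-case ASCII range.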

Encoding Attic characters in the Greek range of Unicode

The same twenty characters listed above are mapped to lower-case characters in the Greek range of Unicode as specified in the following table. All values are in Unicode NFKC form.

Correspondence in Ionic literary orthography	Mapping to Greek range of Unicode	Mapping to ASCII
alpha (long or short vowel)	α	A
beta	β	B
gamma	γ	G
delta	δ	D
short epsilon, long eta, or diphthong ei	ε	E
zeta	ζ	Z
theta	θ	Q
iota (long or short vowel)	ι	I
kappa	κ	K
lamda	λ	L
mu	μ	M
nu	ν	N
short omicron, long omega, or diphthong ou	ο	O
pi	π	P
rho	ρ	R
sigma	ς	S
tau	τ	T
upsilon (long or short vowel)	υ	U
phi	φ	F
chi	χ	X

The mapping of sigma to the Greek range of Unicode depends on context. When sigma terminates a token, as defined in the specification of Attic Greek string values, it is mapped to (lower-case) terminal sigma (code point 962 decimal); otherwise it is mapped to (lower-case) initial/medial sigma (code point 963 decimal).

Example

As listed above, an isolated sigma is represented by S in the ASCII mapping and by the terminal form of sigma, ς, in the Unicode mapping. By contrast, the cluster sigma+tau is represented by ST in the ASCII mapping and by στ in the Unicode mapping.

See further examples in the specification for Attic Greek string values.
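
The context rule for sigma can be illustrated with a short Python sketch that converts the ASCII encoding to the Greek range. This is a hedged illustration rather than the library's implementation: the names `ASCII_TO_GREEK` and `to_greek` are hypothetical. Because the definition of a token lives in a separate specification, the sketch assumes that a sigma is terminal when it is the last character or is followed by anything other than one of the 21 alphabetic characters. Characters with no Greek-range mapping on this page, including the aspirate H, are passed through unchanged.

```python
# Letter table from this page; names and the token-boundary rule are assumptions.
ASCII_TO_GREEK = dict(zip("ABGDEZQIKLMNOPRSTUFX", "αβγδεζθικλμνοπρστυφχ"))
ALPHABETIC = set(ASCII_TO_GREEK) | {"H"}  # the 21 alphabetic characters

TERMINAL_SIGMA = "\u03c2"  # ς, code point 962 decimal


def to_greek(ascii_text: str) -> str:
    """Map ASCII-encoded Attic text to lower-case Greek-range characters."""
    out = []
    for i, c in enumerate(ascii_text):
        following = ascii_text[i + 1] if i + 1 < len(ascii_text) else ""
        if c == "S" and following not in ALPHABETIC:
            # Assumed rule: sigma ends the token, so use the terminal form.
            out.append(TERMINAL_SIGMA)
        else:
            # Unmapped characters (e.g. H, spaces) pass through unchanged.
            out.append(ASCII_TO_GREEK.get(c, c))
    return "".join(out)
```

With this sketch, `to_greek("S")` yields the terminal form ς and `to_greek("ST")` yields στ, matching the example above.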

The two punctuation marks supported in version 1.4.0 are mapped to the period and the mid-dot characters as follows:

Usage	Unicode representation	Unicode code point (decimal)
A major break, or full stop; form can resemble two dots or a colon	.	46
A less significant break; form can resemble two or three vertical dots	·	183
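
The punctuation mappings of the two systems can be related with a small lookup table. The Python sketch below is illustrative only, and the names `PUNCT_ASCII_TO_GREEK` and `PUNCT_GREEK_TO_ASCII` are hypothetical. It also shows one practical effect of requiring NFKC: the Greek ano teleia (U+0387), which is easily typed on a Greek keyboard, normalizes to the middle dot (U+00B7, 183 decimal) used in the table above.

```python
import unicodedata

# ASCII punctuation -> punctuation used alongside the Greek-range letters.
PUNCT_ASCII_TO_GREEK = {
    ".": ".",       # major break: code point 46 in both systems
    ":": "\u00b7",  # lesser break: ASCII colon (58) -> middle dot (183)
}
# Inverse table for converting back to the ASCII encoding.
PUNCT_GREEK_TO_ASCII = {v: k for k, v in PUNCT_ASCII_TO_GREEK.items()}

# NFKC folds GREEK ANO TELEIA (U+0387) into MIDDLE DOT (U+00B7).
ano_teleia = "\u0387"
normalized = unicodedata.normalize("NFKC", ano_teleia)
```

Here `normalized` is the single character U+00B7, so text normalized to NFKC before lookup will match the key in `PUNCT_GREEK_TO_ASCII`.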