Orthography
An orthography is defined by the following functional requirements. It is possible to:
- enumerate its complete character set
- determine whether a sequence of characters is orthographically valid
- enumerate a set of token types
- parse a stream of valid characters into a sequence of classified tokens, associating a substring of the character stream and a token type
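The four requirements above can be sketched as a small Python class. This is only an illustration under stated assumptions: the class `ToyOrthography`, its method names, the three token types, and the character set (lowercase ASCII letters, digits, a few punctuation marks, and the space) are all hypothetical and do not come from any particular library.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class TokenType(Enum):
    # Hypothetical token types for a toy orthography.
    WORD = "word"
    NUMBER = "number"
    PUNCTUATION = "punctuation"


@dataclass(frozen=True)
class Token:
    text: str        # substring of the character stream
    type: TokenType  # classification of that substring


class ToyOrthography:
    # Assumed character classes; a real orthography would define its own.
    LETTERS = set("abcdefghijklmnopqrstuvwxyz")
    DIGITS = set("0123456789")
    PUNCT = set(".,;?")
    SPACE = set(" ")

    def charset(self) -> set[str]:
        """Requirement 1: enumerate the complete character set."""
        return self.LETTERS | self.DIGITS | self.PUNCT | self.SPACE

    def is_valid(self, s: str) -> bool:
        """Requirement 2: check that every character belongs to the set."""
        allowed = self.charset()
        return all(ch in allowed for ch in s)

    def token_types(self) -> list[TokenType]:
        """Requirement 3: enumerate the token types."""
        return list(TokenType)

    def tokenize(self, s: str) -> list[Token]:
        """Requirement 4: parse valid characters into classified tokens."""
        if not self.is_valid(s):
            raise ValueError(f"not valid in this orthography: {s!r}")
        tokens: list[Token] = []
        i = 0
        while i < len(s):
            ch = s[i]
            if ch in self.SPACE:
                # Spaces separate tokens but are not tokens themselves.
                i += 1
            elif ch in self.PUNCT:
                # Each punctuation mark is a token of its own.
                tokens.append(Token(ch, TokenType.PUNCTUATION))
                i += 1
            else:
                # Consume a maximal run of letters or of digits.
                is_letter = ch in self.LETTERS
                group = self.LETTERS if is_letter else self.DIGITS
                j = i
                while j < len(s) and s[j] in group:
                    j += 1
                kind = TokenType.WORD if is_letter else TokenType.NUMBER
                tokens.append(Token(s[i:j], kind))
                i = j
        return tokens
```

For example, `ToyOrthography().tokenize("ab 12, c.")` yields five tokens: two words, one number, and two punctuation marks. Rejecting invalid input inside `tokenize` is a design choice of this sketch; the definition itself only requires that parsing work on valid characters.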
This implies that an orthography can also parse a text that is citable at the level of the token (i.e., a text whose canonical citation hierarchy has been extended by one level) into a series of classified tokens, associating a token type with each citable token. This definition is generic enough to apply to many languages (or perhaps any language?).
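One possible reading of token-level citation is sketched below in Python. It assumes that each passage carries a dotted reference string such as `1.1`, and that extending the hierarchy means appending a token index to that reference. The names `CitableToken`, `TOKEN_PATTERN`, and `tokenize_citable`, the reference format, and the three token types are hypothetical choices for illustration only.

```python
import re
from typing import NamedTuple


class CitableToken(NamedTuple):
    ref: str   # passage reference extended by one level (token index)
    text: str  # the token's substring
    type: str  # the token type assigned by the orthography


# Hypothetical toy orthography: lowercase words, digit runs, four punctuation marks.
TOKEN_PATTERN = re.compile(
    r"(?P<word>[a-z]+)|(?P<number>[0-9]+)|(?P<punctuation>[.,;?])"
)


def tokenize_citable(passages):
    """Turn (reference, text) pairs into classified, citable tokens.

    Assumes each text has already been validated against the orthography;
    characters matching no pattern (such as spaces) are skipped.
    """
    result = []
    for ref, text in passages:
        for n, match in enumerate(TOKEN_PATTERN.finditer(text), start=1):
            # match.lastgroup is the name of the alternative that matched.
            result.append(CitableToken(f"{ref}.{n}", match.group(), match.lastgroup))
    return result
```

Under these assumptions, the passage `("1.1", "ab 12.")` produces tokens cited as `1.1.1`, `1.1.2`, and `1.1.3`, classified as a word, a number, and a punctuation mark respectively.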