Esperanto orthography
Esperanto is written in a Latin-script alphabet of twenty-eight letters, with upper and lower case. This is supplemented by punctuation marks and by various logograms, such as the digits 0–9, currency signs such as $, and mathematical symbols. The creator of Esperanto, L. L. Zamenhof, declared a principle of "one letter, one sound", though this general guideline is not strictly followed.[1]
Twenty-two of the letters are identical in form to letters of the English alphabet (q, w, x, and y being omitted). The remaining six have diacritic marks, ĉ, ĝ, ĥ, ĵ, ŝ, and ŭ (that is, c, g, h, j, and s circumflex, and u breve).
In handwritten Esperanto, the diacritics pose no problem. However, since they do not appear on standard alphanumeric keyboards, various alternative methods have been devised for representing them in printed and typed text. The original method was a set of digraphs now known as the "h-system", but with the rise of computer word processing, the so-called "x-system" has become equally popular. These systems are described below. However, with the advent of Unicode, the need for such workarounds has lessened.
Alphabet
Sound values
The letters have approximately the sound values of the IPA, with the exception of c [t͡s] and the letters with diacritics: ĉ [t͡ʃ], ĝ [d͡ʒ], ĥ [x], ĵ [ʒ], ŝ [ʃ], ŭ [u̯]. J transcribes two allophonic sounds, consonantal [j] (the English y sound) and vocalic [i̯].[1]
Majuscule forms (also called uppercase or capital letters):
A | B | C | Ĉ | D | E | F | G | Ĝ | H | Ĥ | I | J | Ĵ | K | L | M | N | O | P | R | S | Ŝ | T | U | Ŭ | V | Z |
Minuscule forms (also called lowercase or small letters):
a | b | c | ĉ | d | e | f | g | ĝ | h | ĥ | i | j | ĵ | k | l | m | n | o | p | r | s | ŝ | t | u | ŭ | v | z |
IPA value:
a | b | t͡s | t͡ʃ | d | e | f | ɡ | d͡ʒ | h | x | i | j, i̯ | ʒ | k | l | m | n | o | p | r | s | ʃ | t | u | u̯ | v | z |
There is a nearly one-to-one correspondence of letter to sound. For those who consider /d͡z/ to be a phoneme, Esperanto contains one digraph, ⟨dz⟩.[2] Besides the dual use of ⟨j⟩, allophony is found in place assimilation of /m/ and /n/; the latter, for example, is frequently pronounced [ŋ] before g and k.
Phonemic change is perhaps limited to voicing assimilation, as in the sequence kz of ekzemplo ("example"), which is frequently pronounced /ɡz/. In Zamenhof's writing, obstruents with different voicing do not meet in compound words, but are instead separated by an epenthetic vowel such as o to avoid this effect.
Non-Esperantized names are given an Esperanto approximation of their original pronunciation, at least by speakers without command of the original language. Hard ⟨c⟩ is read as k, ⟨qu⟩ as kv, ⟨w⟩ as v, ⟨x⟩ as ks, and ⟨y⟩ as j if a consonant, or as i if a vowel. The English digraph ⟨th⟩ is read as t. When there is no close equivalent, the difficult sounds may be given the Esperanto values of the letters in the orthography or roman transcription, accommodating the constraints of Esperanto phonology. So, for example, Winchester (the English city) is pronounced (and may be spelled) Vinĉester /vint͡ʃester/, as Esperanto ŭ tends not to be found at the beginning of words.[3] Changzhou generally becomes Ĉanĝo /t͡ʃand͡ʒo/, as Esperanto has no ng or ou sound. There are no strict rules, however; speakers may try for greater authenticity, for example by pronouncing the g and u in Changzhou: Ĉangĝoŭ /t͡ʃaŋɡd͡ʒou̯/. The original stress may be kept, if it is known.
Origin
The script resembles Western Slavic Latin alphabets but uses circumflexes instead of carons for the letters ĉ, ĝ, ĥ, ĵ, and ŝ. Also, the non-Slavic bases of the letters ĝ and ĵ, rather than Slavic dž and ž, help preserve the printed appearance of Latinate and Germanic vocabulary such as ĝenerala "general" (adjective) and ĵurnalo "journal". The letter v stands for either v or w of other languages. The letter ŭ of the diphthongs aŭ and eŭ resembles the ŭ of the Belarusian Łacinka alphabet.
Zamenhof took advantage of the fact that typewriters for the French language (which, in his lifetime, still served as an international lingua franca for educated people) have a dead key for the circumflex and diaeresis diacritics: thus, anyone with access to a French typewriter could type ĉ ĝ ĥ ĵ ŝ and their uppercase counterparts without difficulty. French typewriters also include the letter ù, which Francophone Esperantists have long used as an approximation of Esperanto ŭ. French-language computer keyboards still have a dead key for ^, but whether it can be used to type the Esperanto consonants depends on the underlying software. This choice of accented letters was also familiar to speakers of some Slavic languages: in Czech and Slovak, the sounds of Esperanto ĉ and ŝ are represented by the letters č and š respectively, and in Belarusian, Cyrillic ў bears the same relation to у as Esperanto ŭ bears to u.
Geographic names diverge from English especially for the English x, w, qu and gu, as in Vaŝingtono "Washington, D.C.", Meksiko "Mexico", or Gvatemalo "Guatemala". Other spelling differences appear when Esperanto spelling is based on the pronunciation of English names which have undergone the Great Vowel Shift, as in Brajtono for Brighton.
Names of the letters of the alphabet
Zamenhof simply tacked an -o onto each consonant to create the name of the letter, with the vowels representing themselves: a, bo, co, ĉo, do, e, fo, etc. The diacritics are frequently mentioned overtly. For instance, ĉ may be called ĉo ĉapela or co ĉapela, from ĉapelo (a hat), and ŭ may be called ŭo luneta or u luneta, from luno (a moon) plus the diminutive -et-. This is the only system that is widely accepted and in practical use.
The letters of the ISO basic Latin alphabet not found in the Esperanto alphabet have distinct names, much as letters of the Greek alphabet do. ⟨q⟩, ⟨x⟩, ⟨y⟩ are kuo, ikso, ipsilono; ⟨w⟩ has been called duobla vo (double V), vavo (using Waringhien's name of va below), vuo (proposed by Sergio Pokrovskij), germana vo (German V), and ĝermana vo (Germanic V).[4]
However, while this is fine for initialisms such as ktp [kotopo] for etc., it can be problematic when spelling out names. For example, several consonantal distinctions are difficult for many nationalities, who normally rely on the fact that Esperanto seldom uses these sounds to distinguish words (that is, they do not form many minimal pairs). Thus the pairs of letter names ĵo–ĝo, ĥo–ho (or ĥo–ko), co–ĉo (or co–so, co–to), lo–ro, and ŭo–vo (or vo–bo) are problematic. In addition, over a noisy telephone connection, it quickly becomes apparent that voicing distinctions can be difficult to make out: noise confounds the pairs po–bo, to–do, ĉo–ĝo, ko–go, fo–vo, so–zo, ŝo–ĵo, as well as the nasals mo–no.
There have been several proposals to resolve this problem. Gaston Waringhien proposed changing the vowel of voiced obstruents to a, so that at least voicing is not problematic. Also changed to a are h, n, r, distinguishing them from ĥ, m, l. The result is perhaps the most common alternative in use:
- a, ba, co, ĉo, da, e, fo, ga, ĝa, ha, ĥo, i, jo, ĵa, ko, lo, mo, na, o, po, ra, so, ŝo, to, u, ŭo, va, za
However, this still requires overt mention of the diacritics, and even so does not reliably distinguish ba–va, co–so, ĉo–ŝo, or ĝa–ĵa.
The proposal closest to international norms (and thus the easiest to remember) that clarifies all the above distinctions is a modification of a proposal by Kálmán Kalocsay. As with Zamenhof, vowels stand for themselves, but it follows the international standard of placing vowel e after a consonant by default (be, ce, de, ge), but before sonorants (el, en) and voiceless fricatives (ef, es). The vowel a is used for ⟨h⟩ and the voiceless plosives ⟨p⟩, ⟨t⟩, ⟨k⟩, after the international names ha for ⟨h⟩ and ka for ⟨k⟩; the French name ĵi is used for ⟨ĵ⟩, the Greek name ĥi (chi) for ⟨ĥ⟩, and the English name ar for ⟨r⟩. The letter ⟨v⟩ has the i vowel of ĵi, distinguishing it from ⟨b⟩, but the other voiced fricative, ⟨z⟩, does not, to avoid the problem of it palatalizing and being confused with ĵi. The diphthong offglide ⟨ŭ⟩ is named eŭ, the only real possibility given Esperanto phonotactics besides aŭ, which as the word for "or" would cause confusion. The letter ⟨m⟩ is called om to distinguish it from ⟨n⟩; the vowel o alliterates well in the alphabetical sequence el, om, en, o, pa. There are other patterns to the vowels in the ABC rhyme: The lines start with a i a i and finish with a a e e. The letters with diacritics are placed at the end of the rhyme, taking the place of w, x, y in other Latin alphabets, so as not to disrupt the pattern of letters many people learned as children. All this makes the system more easily memorized than competing proposals. The modified Kalocsay abecedary is:
- a, be, ce, de, e, ef, ge, ha,
- i, je, ka, el, om, en, o, pa,
- ar, es, ta, u, vi, ĉa, ĝe,
- ĥi kaj ĵi, eŝ, eŭ kaj ze,
- plus ku', ikso, ipsilono,
- jen la abece-kolono.
(kaj means "and". The last line reads: here is the ABC column)
Where letters are still confused, such as es vs eŝ or a vs ha, mention can be made of the diacritic (eŝ ĉapela) or of the manner of articulation of the sound (ha brueta "breathy aitch"). Quite commonly, however, people use the "aitch as in house" strategy familiar from English. Another possibility is to use a spelling alphabet (literuma alfabeto), which substitutes ordinary words for letters. The following words are sometimes seen:[5]
- Asfalto, Barbaro, Centimetro, Ĉefo, Doktoro, Elemento, Fabriko, Gumo, Ĝirafo, Hotelo, Ĥaoso, Insekto, Jubileo, Ĵurnalo, Kilogramo, Legendo, Maŝino, Naturo, Oktobro, Papero, Rekordo, Salato, Ŝilingo, Triumfo, Universo, Universo-hoko (ŭ), Vulkano, Zinko.[note 1]
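Spelling a word out with this alphabet amounts to a table lookup. The following Python sketch is illustrative only: the function name spell_out is hypothetical, and the word list is simply the one given above.

```python
# Spelling alphabet from the list above, keyed by lowercase letter.
SPELLING_WORDS = {
    "a": "Asfalto", "b": "Barbaro", "c": "Centimetro", "ĉ": "Ĉefo",
    "d": "Doktoro", "e": "Elemento", "f": "Fabriko", "g": "Gumo",
    "ĝ": "Ĝirafo", "h": "Hotelo", "ĥ": "Ĥaoso", "i": "Insekto",
    "j": "Jubileo", "ĵ": "Ĵurnalo", "k": "Kilogramo", "l": "Legendo",
    "m": "Maŝino", "n": "Naturo", "o": "Oktobro", "p": "Papero",
    "r": "Rekordo", "s": "Salato", "ŝ": "Ŝilingo", "t": "Triumfo",
    "u": "Universo", "ŭ": "Universo-hoko", "v": "Vulkano", "z": "Zinko",
}


def spell_out(word):
    """Return the spelling word for each letter; other characters pass through."""
    return [SPELLING_WORDS.get(ch, ch) for ch in word.lower()]


print(spell_out("ĉu"))
print(spell_out("aŭ"))
```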
ASCII transliteration systems
There are two alternative orthographies in common use, which replace the circumflex letters with either h digraphs or x digraphs. Other systems are occasionally proposed, such as the "yw-system", which uses y for the palatal consonants (e.g. syanco for ŝanco) and w for ŭ or the "q-system" which replaces the h or x with q, but none appear to have caught on.[6] There are also work-arounds such as approximating the circumflexes with carets.
H-system
The original method of working around the diacritics was developed by the creator of Esperanto himself, L. L. Zamenhof. He recommended using u in place of ŭ, and using digraphs with h for the circumflex letters. For example, ŝ is represented by sh, as in shi for ŝi (she), and shanco for ŝanco (chance).
Unfortunately, this method suffers from several problems:
- h is already a consonant in the language, so the digraphs occasionally make words ambiguous, especially hh (though kh can be substituted for it). The usual example is flughaveno (airport), which in the h-system could be misread as *fluĝaveno;
- when ŭ is changed to u, there is not only the occasional ambiguity, but a naive reading may also place the stress on the wrong syllable (though w can simply be substituted instead);
- simplistic ASCII-based sorting rules fail badly for h-digraphs, because lexicographically words in ĉ should follow all words in c and precede words in d. The word ĉu should be placed after ci, but in the h-system chu sorts before ci.
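The first problem can be seen with a deliberately naive converter. The sketch below is not any particular tool's algorithm; the name naive_from_h_system is hypothetical. It treats every c, g, h, j or s followed by h as a digraph, which decodes shanco correctly but also mangles flughaveno.

```python
# Naive h-system decoder: any c/g/h/j/s followed by h is read as a digraph.
H_DIGRAPHS = {"ch": "ĉ", "gh": "ĝ", "hh": "ĥ", "jh": "ĵ", "sh": "ŝ"}


def naive_from_h_system(text):
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2].lower()
        if pair in H_DIGRAPHS:
            letter = H_DIGRAPHS[pair]
            # Keep the case of the base letter.
            out.append(letter.upper() if text[i].isupper() else letter)
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(naive_from_h_system("shanco"))      # intended reading: ŝanco
print(naive_from_h_system("flughaveno"))  # misread: the g and h belong to different roots
```

A converter that avoids this needs extra knowledge, such as a word list or morpheme boundaries, which plain h-system text does not carry.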
X-system
A more recent system for typing in Esperanto is the so-called "x-system", which uses x instead of h for the digraphs, including ux for ŭ. For example, ŝ is represented by sx, as in sxi for ŝi and sxanco for ŝanco.
X-digraphs solve those problems of the h-system:
- x is not a letter in the Esperanto alphabet, so its use introduces no ambiguity.
- The digraphs are now nearly always correctly sorted after their single-letter counterparts; for example, sxanco (for ŝanco) comes after super, while h-system shanco comes before it. The sorting only fails in the infrequent case of a z in compound or unassimilated words; for example, the compound word reuzi ("to reuse") would be sorted after reuxmatismo (for reŭmatismo "rheumatism").
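The sorting behaviour can be checked directly. In the Python sketch below, plain sorted() compares code points, which for ASCII text matches the simplistic sorting discussed here; esperanto_key is a hypothetical helper that ranks letters by their position in the 28-letter alphabet, for text written with the proper diacritics.

```python
ALPHABET = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}


def esperanto_key(word):
    """Sort key following Esperanto alphabetical order; unknown characters sort last."""
    return [RANK.get(ch, len(ALPHABET)) for ch in word.lower()]


# h-system: chu wrongly sorts before ci.
print(sorted(["ci", "chu"]))
# x-system: cxu correctly sorts after ci.
print(sorted(["cxu", "ci"]))
# x-system failure case: reuxmatismo (reŭmatismo) sorts before reuzi.
print(sorted(["reuzi", "reuxmatismo"]))
# With real diacritics and an alphabet-aware key, the order is as expected.
print(sorted(["ŝanco", "super", "reŭmatismo", "reuzi", "ĉu", "ci"], key=esperanto_key))
```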
The x-system has become as popular as the h-system, but it has long been perceived as being contrary to the Fundamento de Esperanto. However, in a 2007 decision, the Akademio de Esperanto issued general permission for the use of surrogate systems for the representation of the diacritical letters of Esperanto, under the condition that this is done only when the circumstances do not permit the use of proper diacritics, and when due to a special need the h-system fixed in the Fundamento is not convenient.[7] This provision covers situations such as using the x-system as a technical solution (to store data in plain ASCII) while still displaying proper Unicode characters to the end user.
A practical problem of digraph substitution that the x-system does not completely resolve is in the complication of bilingual texts. Ux for ŭ is especially problematic when used alongside French text, because many French words end in aux or eux. Aux, for example, is a word in both languages (aŭ in Esperanto). Any automatic conversion of the text will alter the French words as well as the Esperanto. A few English words like "auxiliary" and "Euxine" can also suffer from such search-and-replace routines. One common solution, such as the one used in Wikipedia's MediaWiki software since the intervention of Brion Vibber in January 2002, is to use xx to escape the ux to ŭ conversion, e.g. "auxx" produces "aux".[8][9] A few people have also proposed using "vx" instead of "ux" for ŭ to resolve this problem, but this variant of the system is rarely used.
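A converter with this kind of escape can be sketched in a few lines of Python. The code below is an illustration under stated assumptions, not MediaWiki's actual implementation: the function name from_x_system and the exact rule (a doubled x after a base letter yields the base letter plus a literal x) are chosen here only to reproduce the "auxx" to "aux" behaviour described above.

```python
import re

X_MAP = {"c": "ĉ", "g": "ĝ", "h": "ĥ", "j": "ĵ", "s": "ŝ", "u": "ŭ"}
# A base letter, an x, and an optional second x acting as the escape.
_PATTERN = re.compile(r"([cghjsu])(x)(x?)", re.IGNORECASE)


def from_x_system(text):
    def repl(match):
        base, x, escape = match.groups()
        if escape:
            # "uxx" -> "ux": keep the digraph as literal text.
            return base + x
        accented = X_MAP[base.lower()]
        return accented.upper() if base.isupper() else accented

    return _PATTERN.sub(repl, text)


print(from_x_system("sxanco"))   # Esperanto word, converted
print(from_x_system("aux"))      # read as Esperanto aŭ
print(from_x_system("auxx"))     # escaped: stays as the French word aux
```

The escape only helps if the author of the text applies it; unmarked French or English words are still converted.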
Graphic work-arounds
There are several ad hoc workarounds used in email or on the internet, where the proper letters are often not supported, as seen also in non-ASCII orthographies such as German. These "slipped-hat" conventions make use of the caret (^) or greater than sign (>) to represent the circumflex. For example, ŝanco may be written ^sanco, s^anco, or s>anco.[10] However, they have generally fallen out of favor. Before the internet age, Stefano la Colla had proposed shifting the caret onto the following vowel, since French circumflex vowels are supported in printing houses. That is, one would write ehôsângôj cîujâude for eĥoŝanĝoj ĉiuĵaŭde.[11] However, this proposal has never been adopted.
Reform proposals
Some people presented proposals for reformed systems for spelling Esperanto. These are often proposals to remove the diacritics from Esperanto. In addition, there is also a systematic way to transliterate Esperanto to the Cyrillic alphabet and adaptations of the Shavian alphabet to write Esperanto.
The following table shows various ways to write Esperanto, namely the standard way, IPA - transcription, qwxy-system, h-system, x-system, 4 reform proposals and Cyrillic transcription:
ISO 15924 | System | Letters or digraphs | | | | | | | | | | | | | | | | | | | | | | | | | | | | Number of letters | Note |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Latn | Standard Esperanto alphabet | a | b | c | ĉ | d | e | f | g | ĝ | h | ĥ | i | j | ĵ | k | l | m | n | o | p | r | s | ŝ | t | u | ŭ | v | z | 28 | 6 diacritics (5 circumflexes, 1 breve) |
Latn | IPA | ä | b | t͡s | t͡ʃ | d | e̞ | f | ɡ | d͡ʒ | h | x | i | j, i̯ | ʒ | k | l | m | n | o̞ | p | r | s | ʃ | t | u | w, u̯ | v | z | 28 (or 25) | 1 affricate = 1 letter |
Latn | IPA approximate | a | b | ts | tʃ | d | e | f | g | dʒ | h | x | i | j | ʒ | k | l | m | n | o | p | r | s | ʃ | t | u | w | v | z | 25 | 2 non-ASCII letters |
Latn | QWXY-system | a | b | c | tx | d | e | f | g | dy | h | q | i | j | y | k | l | m | n | o | p | r | s | x | t | u | w | v | z | 26 | 2 digraphs |
Latn | H-system | a | b | c | ch | d | e | f | g | gh | h | hh | i | j | jh | k | l | m | n | o | p | r | s | sh | t | u | u | v | z | 22 | 5 digraphs |
Latn | X-system | a | b | c | cx | d | e | f | g | gx | h | hx | i | j | jx | k | l | m | n | o | p | r | s | sx | t | u | ux | v | z | 23 | 6 digraphs |
Latn | W-system [12] | a | b | c | ĉ | d | e | f | g | ĝ | h | ĥ | i | j | ĵ | k | l | m | n | o | p | r | s | ŝ | t | u | w | v | z | 28 | 5 diacritics (5 circumflexes) |
Latn | International | a | b | c | q | d | e | f | g | j | h | hh | i | y | zz | k | l | m | n | o | p | r | s | x | t | u | w | v | z | 26 | 2 digraphs |
Latn | Nova Help-Alfabeto | a | b | c | ch | d | e | f | g | j | h | kh | i | y | zh | k | l | m | n | o | p | r | s | sh | t | u | w/u | v | z | 24 | |
Latn | Sefosas[13] | a | b | ts | tc | d | e | f | g | dj | h | x | i | y | j | k | l | m | n | o | p | r | s | c | t | u | w | v | z | 25 | 3 non-IPA letters (y,j,c) |
Cyrl | Cyrillic (upper case) | А | Б | Ц | Ч | Д | Е | Ф | Г | Џ | Һ | Х | И | Ј | Ж | К | Л | М | Н | О | П | Р | С | Ш | Т | У | Ў | В | З | 28 | |
Cyrl | Cyrillic (lower case) | а | б | ц | ч | д | е | ф | г | џ | һ | х | и | ј | ж | к | л | м | н | о | п | р | с | ш | т | у | ў | в | з | 28 |
English is unusual among European languages in being written in the Latin alphabet almost entirely without diacritical marks. That, along with political reasons, is why it is the most widely used language in electronic data processing. Virtually all systems in the world have an ASCII-derived input system and therefore allow the letters of the English alphabet to be entered. Looking for ways to make Esperanto writable on more systems, people began creating transliteration systems that write Esperanto in the English alphabet, many of them with the aim of reforming the language permanently:
Nova Help-Alfabeto
The New Help Alphabet, or Nova Help-Alfabeto (NHA), was promoted by André Albault, who spent the rest of his life trying to reform Esperanto spelling along its lines. It uses letters and digraphs that make sense phonetically in light of other languages written in the Latin alphabet and of romanization systems for Asian languages:
ĉ → ch
ĝ → j
ĥ → kh
j → y
ĵ → zh
ŝ → sh
ŭa → wa
ŭe → we
ŭo → wo
aŭ → au
eŭ → eu
oŭ → ou
The proposal drew criticism. It uses 4 digraphs while leaving 2 letters of the English alphabet (q and x) unused. The digraphs it uses exclude the phonetic sequences t͡sh, kh, zh, sh, äu̯, e̞u̯ and o̞u̯ from the language. Of the 12 sequences it replaces, 6 produce spellings that already occur in words in the standard orthography, such as ekshibici, klashoro, ekhavis, Stokholmo, grizhara, rizherbo, praula, posteularo, reuzado, lingvouzo and noumeno. Such words would have to be written differently to preserve their pronunciation.
Sefosas
Sefosas (SenEscepte Fonetika Ortografio Sen Aldonaj Signoj, "Exceptionless Phonetic Orthography Without Additional Signs") is a spelling system used in a dialect of Esperanto that was created for writing science-fiction stories. Its objective is to write every affricate phonetically with two letters. Once this is done, the phonetics of Esperanto comprises 25 distinguishable sounds, while the basic Latin alphabet has 26 letters. It is therefore possible to write without exception according to the principle "one sound one letter, one letter one sound" without diacritic signs:
j → y
ĵ → j
ŭ → w
ĥ → x
ŝ → c
ĉ → tc
ĝ → dj
c → ts
The problem with this system is that the sequences tŝ (replaced by tc), dĵ (replaced by dj) and ts occur in some words even in the standard alphabet, such as gratsono, platŝtonaro and gadĵo. The principle of "one phoneme - one letter" means that the affricates c, ĉ, and ĝ must be considered separate phonemes, and so distinguishable from combinations of plosive and fricative (ts, tŝ, dĵ). This distinction occurs, for example, in Polish, between trzy (three) and czy (whether), but for speakers of many other languages it is a difficult one. The lexicon avoids these combinations in roots, but they can occur in compounds, such as katŝato (taste for cats). Although such combinations are rare and can possibly be avoided by inserting o (katoŝato, etc.), they may nevertheless exist. It is even possible to construct, using compounds, minimal pairs in which the only contrast is such a difference, for example atencata and atentsata (satisfied with attention). Therefore, one must either insist that Esperantists be able, at least in principle, to pronounce and hear the differences between ts and c, between tŝ and ĉ, and between dĵ and ĝ, or moderate the claim that one letter represents one phoneme in Esperanto.
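The collision can be reproduced mechanically. The following Python sketch applies the Sefosas substitutions listed above in a single pass (the function name to_sefosas is hypothetical); a single pass matters because the mappings chain, with ĵ becoming j while j becomes y, and ŝ becoming c while c becomes ts.

```python
# Sefosas substitutions, applied simultaneously via str.translate.
SEFOSAS = str.maketrans({
    "j": "y", "ĵ": "j", "ŭ": "w", "ĥ": "x",
    "ŝ": "c", "ĉ": "tc", "ĝ": "dj", "c": "ts",
})


def to_sefosas(text):
    return text.lower().translate(SEFOSAS)


print(to_sefosas("ĵurnalo"), to_sefosas("ŝanco"))
# The minimal pair from the text collapses into one spelling:
print(to_sefosas("atencata") == to_sefosas("atentsata"))
```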
International
The International system was proposed in America. It remaps letters and sequences as follows:
j → y
ĵ → zz
ŭ → w
ĥ → hh
ŝ → x
ĉ → q
ĝ → j
zz → z'z
hh → h'h
The idea is to bring in the four letters missing from the Esperanto alphabet (q, w, x, y) to reduce the number of diacritics, while exchanging y and j so that the spelling makes more phonetic sense from the standpoint of European languages. The digraphs replace the least frequent letters in the language. The problem with this system is that it violates the principle of "one phoneme - one letter": the basic Latin alphabet has only 26 letters and Esperanto has 28 phonemes, so at least 2 digraphs are unavoidable. Although the sequences zz and hh do not occur in the language in standard spelling, it is advised that, should they ever arise in a compound word, they be written z'z and h'h to show that they are not digraphs. When this rule is followed, no ambiguity remains.
Punctuation
As with most languages, punctuation is not completely standardized, but in Esperanto there is the additional complication of multiple competing national traditions.
Commas are frequently used to introduce subordinate clauses (that is, before ke "that" or the ki- correlatives):
- Mi ne scias, kiel fari tion. (I don't know how to do that.)
The comma is also used for the decimal point, while thousands are separated by non-breaking spaces: 12 345 678,9.
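As a small illustration of this number convention, the Python sketch below formats a number with a decimal comma and non-breaking spaces between groups of three digits. The function name format_number and its decimals parameter are hypothetical; the approach simply rewrites Python's own English-style grouping.

```python
NBSP = "\u00a0"  # non-breaking space


def format_number(value, decimals=1):
    """Format with a decimal comma and non-breaking thousands separators."""
    english = f"{value:,.{decimals}f}"          # e.g. "12,345,678.9"
    # Replace the grouping commas first, then the decimal point.
    return english.replace(",", NBSP).replace(".", ",")


print(format_number(12345678.9))
```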
The question mark (?) and the exclamation mark (!) are used at the end of a clause and may be internal to a sentence. Question words generally come at the beginning of a question, obviating the need for Spanish-style inverted question marks.
Periods may be used to indicate initialisms: k.t.p. or ktp for kaj tiel plu (et cetera), but not abbreviations that retain the grammatical suffixes. Instead, a hyphen optionally replaces the missing letters: D-ro or Dro for Doktoro (Dr). With ordinal numerals, the adjectival a and accusative n may be superscripted: 13a or 13ᵃ (13th). The abbreviation k is used without a period for kaj (and); the ampersand (&) is not found. Roman numerals are also avoided.
The hyphen is also occasionally used to clarify compounds, and to join grammatical suffixes to proper names that haven't been Esperantized or don't have a nominal -o suffix, such as the accusative on Kalocsay-n or Kálmán-on. The proximate particle ĉi used with correlatives, such as ĉi tiu 'this one' and ĉi tie 'here', may be poetically used with nouns and verbs as well (ĉi jaro 'this year', esti ĉi 'to be here'), but if these phrases are then changed to adjectives or adverbs, a hyphen is used: ĉi-jare 'this year', ĉi-landa birdo 'a bird of this land'.[14]
Quotation marks show the greatest variety of any punctuation. The use of Esperanto quotation marks was never stated in Zamenhof's work; it was assumed that a printer would use whatever he had available (usually the national standard of the printer's country). — Dashes, « guillemets » (often »reversed«), "double apostrophes" (also often "reversed"), and more are all found. Since the age of word processing, however, the standard English quotation marks have become most widespread. Quotations may be introduced with either a comma or a colon.
Capitalization
Capitalization is used for the first word of a sentence and for proper names when used as nouns. Names of months, days of the week, ethnicities, languages, and the adjectival forms of proper names are not typically capitalized (anglo "an Englishman", angla "English", usona "US American"), though national norms may override such generalizations. Titles are more variable: both the Romance style of capitalizing only the first word of the title and the English style of capitalizing all lexical words are found.
All capitals or small capitals are used for acronyms and initialisms of proper names, like TEJO, but not common expressions like ktp (etc.). Small capitals are also a common convention for family names, to avoid the confusion of varying national naming conventions: Kalocsay Kálmán, Leslie Cheung Kwok Wing.
Camel case, with or without a hyphen, may occur when a prefix is added to a proper noun: la geZamenhofoj (the Zamenhofs), pra-Esperanto (Proto-Esperanto). It is also used for Russian-style syllabic acronyms, such as the name ReVo for Reta Vortaro ("Internet Dictionary"), which is homonymous with revo (dream). Occasionally mixed capitalization will be used for orthographic puns, such as espERAnto, which stands for the esperanta radikala asocio (Radical Esperanto Association).
Zamenhof contrasted informal ci with formal, and capitalized, Vi as the second-person singular pronouns. However, lower-case vi is now used as the second-person pronoun regardless of number.
Braille, fingerspelling, and Morse code
⠁ a |
⠃ b |
⠉ c |
⠩ ĉ |
⠙ d |
⠑ e |
⠋ f |
⠛ g |
⠻ ĝ |
⠓ h |
⠳ ĥ |
⠊ i |
⠚ j |
⠺ ĵ |
⠅ k |
⠇ l |
⠍ m |
⠝ n |
⠕ o |
⠏ p |
⠗ r |
⠎ s |
⠮ ŝ |
⠞ t |
⠥ u |
⠬ ŭ |
⠧ v |
⠵ z |
⠟ q |
⠾ w |
⠭ x |
⠽ y |
Esperanto versions of braille and Morse code include the six diacritic letters.
An Esperanto braille magazine, Aŭroro, has been published since 1920.
Ĉ | -.-.. |
Ĝ | --.-. |
Ĥ | ---- |
Ĵ | .---. |
Ŝ | ...-. |
Ŭ | ..-- |
In Morse code, a dot is added to C and J to derive Ĉ and Ĵ, a dash–dot is added to G and S to derive Ĝ and Ŝ, a dash is added to U to derive Ŭ, and the four dots of H are changed to four dashes for Ĥ.
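These derivation rules can be written out explicitly. The Python sketch below starts from the standard International Morse codes for the six base letters and applies the rules just described, using "." for a dot and "-" for a dash; the variable names are illustrative.

```python
# International Morse codes of the base letters.
BASE = {"C": "-.-.", "G": "--.", "H": "....", "J": ".---", "S": "...", "U": "..-"}

ESPERANTO_MORSE = {
    "Ĉ": BASE["C"] + ".",         # C plus a dot
    "Ĵ": BASE["J"] + ".",         # J plus a dot
    "Ĝ": BASE["G"] + "-.",        # G plus dash-dot
    "Ŝ": BASE["S"] + "-.",        # S plus dash-dot
    "Ŭ": BASE["U"] + "-",         # U plus a dash
    "Ĥ": "-" * len(BASE["H"]),    # four dots become four dashes
}

for letter, code in ESPERANTO_MORSE.items():
    print(letter, code)
```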
There is a proposed manual alphabet as part of the Signuno project. Signuno itself, as a signed variant of Esperanto rather than a language in its own right, is a manual logographic Esperanto orthography. The Signuno alphabet derives from international norms (that is, from Gestuno, which in turn derives from ASL), but there are sometimes significant differences, most notably that all letters are static and upright, with a straight wrist (for example, the G is simply turned upright compared to ASL). T is as in Irish rather than exactly as in ASL, and F is the okay handshape rather than exactly as in ASL. The handshape for H resembles its uppercase letter (as in French fingerspelling), as does P to a degree (apparently derived by mixing Polish and Russian). Q is a modification of the Irish form, but also a merging of the Signuno handshapes for K and V. The Z appears to be unique to Signuno (it is shaped like an ASL 3, and its three fingers seem vaguely related to the three strokes of the letter Z). The diacritic letters Ĉ, Ĝ, Ĥ, Ĵ, Ŝ, Ŭ are derived from their base letters C, G, H, J, S, U. Ĉ is a modified C (and Chinese), Ĝ is the inverse of G, Ĥ is a modified H, Ĵ is the inverse of J and also a modified Z (as if ZH), Ŝ is the inverse of S (and German). J resembles its uppercase letter and is derived from I, while Ŭ is from U (as W is from V).
Other scripts
The Shavian alphabet, which was designed for English, was modified for use with Esperanto by John Wesley Starling. Though not widely used, at least one booklet has been published with transliterated sample texts. Not all letters are equivalent to their English values, and several ligatures are added for grammatical inflections and for a few grammatical words.
- The vowels necessarily differ from English. Esperanto a e i o u take the letters for English /æ ɛ ɪ ə ɒ/, with more regard to graphic symmetry than phonetic faithfulness in the cases of o and u. C takes the letter for /θ/, the Castilian value of c before e and i, and ĥ that for /ŋ/, the inverse of the letter for /h/.[note 2] The most divergent letters are those for m and n, which are /ʊ uː/ in English, but which are graphically better suited to be distinct letters than English Shavian /m n/.
Unicode
The entire Esperanto alphabet is part of the Latin-3 and Unicode character sets, and is included in WGL4. The code points and HTML entities for the special Esperanto characters in Unicode are:
Glyph | Codepoint | Name | Alternative HTML entities |
---|---|---|---|
Ĉ | U+0108 | Latin capital letter c with circumflex | &Ccirc;, &#264;, &#x108; |
ĉ | U+0109 | Latin small letter c with circumflex | &ccirc;, &#265;, &#x109; |
Ĝ | U+011C | Latin capital letter g with circumflex | &Gcirc;, &#284;, &#x11C; |
ĝ | U+011D | Latin small letter g with circumflex | &gcirc;, &#285;, &#x11D; |
Ĥ | U+0124 | Latin capital letter h with circumflex | &Hcirc;, &#292;, &#x124; |
ĥ | U+0125 | Latin small letter h with circumflex | &hcirc;, &#293;, &#x125; |
Ĵ | U+0134 | Latin capital letter j with circumflex | &Jcirc;, &#308;, &#x134; |
ĵ | U+0135 | Latin small letter j with circumflex | &jcirc;, &#309;, &#x135; |
Ŝ | U+015C | Latin capital letter s with circumflex | &Scirc;, &#348;, &#x15C; |
ŝ | U+015D | Latin small letter s with circumflex | &scirc;, &#349;, &#x15D; |
Ŭ | U+016C | Latin capital letter u with breve | &Ubreve;, &#364;, &#x16C; |
ŭ | U+016D | Latin small letter u with breve | &ubreve;, &#365;, &#x16D; |
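The code points and numeric character references can be derived programmatically rather than memorised. The following Python sketch uses only the standard library (unicodedata and html); the helper name describe is hypothetical.

```python
import html
import unicodedata

SPECIAL_LETTERS = "ĈĉĜĝĤĥĴĵŜŝŬŭ"


def describe(ch):
    """Return (code point label, Unicode name, decimal reference, hex reference)."""
    cp = ord(ch)
    return (f"U+{cp:04X}", unicodedata.name(ch), f"&#{cp};", f"&#x{cp:X};")


for ch in SPECIAL_LETTERS:
    print(ch, *describe(ch), sep=" | ")

# Numeric references decode back to the same letter.
assert html.unescape("&#264;") == html.unescape("&#x108;") == "Ĉ"
```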
Microsoft Windows
Adjusting a keyboard to type Unicode is relatively simple (all Windows variants of the Microsoft Windows NT family, such as 2000 and XP, for example, support Unicode; Windows 9x does not natively support Unicode).
The Canadian Multilingual Standard layout is preinstalled in MS Windows.[15] The US international layout needs to be modified to enable Esperanto letters. This can be done using Microsoft Keyboard Layout Creator or by using a layout provided for this purpose, e.g. EoKlavaro. EoKlavaro gives access also to many other European language characters.
Another, more recent free download that adapts a Windows keyboard for Esperanto letters is Tajpi - Esperanto Keyboard for Windows 2000 / XP / Vista / 7 / 8 by Thomas James.
A simple utility with all the Esperanto keys already installed is Esperanto keyboard layout for Microsoft Windows (QWERTY version), which is available as a free download.
A similar tool, Ek, is also available without charge; its installer is named ek(version#)inst.exe. Ek uses the cx keying function to produce ĉ. It works with most programs, but is not compatible with all of them.
A commercial but still cheap tool is Šibboleth, a program that can produce every Latin character. It enables composition of ĝ etc. using the ^ deadkey (like for French letters), so one does not have to learn new key positions. The ŭ is produced by the combination u followed by #.
Writing and receiving email in Esperanto is not a problem, because all modern email clients and servers accept Unicode as UTF-8 in at least one of 8bit, quoted-printable or base64 Content-Transfer-Encoding types. Esperanto text will normally be transmitted in UTF-8 with a Content-Transfer-Encoding of either 8bit (if the server supports it) or failing that, quoted-printable.
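What these transfer encodings do to Esperanto text can be seen with Python's standard library. The sketch below encodes a sample phrase as UTF-8 and then applies quoted-printable and base64; it illustrates the encodings themselves and is not a description of how any particular mail client builds a message.

```python
import base64
import quopri

text = "eĥoŝanĝoj ĉiuĵaŭde"
raw = text.encode("utf-8")          # 8bit form: raw UTF-8 bytes

qp = quopri.encodestring(raw)       # quoted-printable: non-ASCII bytes become =XX
b64 = base64.b64encode(raw)         # base64: ASCII-only alphabet

print(qp.decode("ascii"))
print(b64.decode("ascii"))

# Both encodings are lossless: decoding restores the original text.
assert quopri.decodestring(qp).decode("utf-8") == text
assert base64.b64decode(b64).decode("utf-8") == text
```

Each accented letter occupies two bytes in UTF-8, so in quoted-printable it appears as two =XX groups.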
A text editor used for Esperanto should support Unicode, as do Editplus (UTF-8), UniRed and Vim.
GNU/Linux
Since 2009 it has been very easy to add key combinations for accented Esperanto letters to one's usual keyboard layout, at least in Gnome and KDE. No download is required. The keyboard layout options can be modified under System Preferences. The options to choose are "Adding Esperanto circumflexes (supersigno)" and the appropriate keyboard layout (Qwerty or Dvorak). A third level shift key is also required: under "Key to choose 3rd level", e.g. LeftWin.[16]
In older systems it may be necessary to activate Unicode by setting the locale to a UTF-8 locale. There is a special eo_XX.UTF-8 locale available at Bertil Wennergren's home page, along with a thorough explanation of how one implements Unicode and the keyboard in GNU/Linux.
If the GNU/Linux system is recent, or kept updated, then the system is probably already working with Esperanto keys. For X11 and KDE, it's only necessary to switch to a keyboard layout that has Latin dead keys (for example, the "US International" keyboard), whenever the user wants to write in Esperanto. Some keyboards with dead keys are:
- In the US International keyboard, the dead circumflex is over the "6" key ("shift-6") and the dead breve is hidden over the "9" key ("altgr-shift-9").
- In the Spanish dead tilde input, ⇧ Shift+[ will produce the caret (^) dead key, which can be combined with w, s, g, h, j and c to type ŵ, ŝ, ĝ, ĥ, ĵ and ĉ, respectively. It can also be combined with any vowel to type â, ê, î, ô, û and ŷ. ⇧ Shift+[ and then Space will produce the caret symbol itself (^).
- In the Brazilian ABNT2 keyboard, the dead circumflex has its own key together with dead tilde ("shift-~"), near the "Enter" key. The dead breve is hidden over the backslash ("altgr-shift-\") key.
- In the Portuguese keyboard, the dead tilde key, near the left shift key, has both the dead circumflex and the dead breve.
- On French and Belgian keyboards, the dead key right of p, which produces French â ê î ô û ŷ when followed by a vowel, will usually also produce ĉ ĝ ĥ ĵ ŝ when followed by the appropriate consonant. The dead breve is usually AltGr+⇧ Shift plus the key that acts as a dead grave with AltGr alone (on Belgian keyboards, AltGr+⇧ Shift+£, which can be on the top or middle row); pressing it before u gives ŭ.
Keys / Layout | US International | Brazilian ABNT2 | Portuguese |
---|---|---|---|
ĉ | shift-6 c | shift-~ c | shift-~ c |
Ĉ | shift-6 shift-c | shift-~ shift-c | shift-~ shift-c |
ĝ | shift-6 g | shift-~ g | shift-~ g |
Ĝ | shift-6 shift-g | shift-~ shift-g | shift-~ shift-g |
ĥ | shift-6 h | shift-~ h | shift-~ h |
Ĥ | shift-6 shift-h | shift-~ shift-h | shift-~ shift-h |
ĵ | shift-6 j | shift-~ j | shift-~ j |
Ĵ | shift-6 shift-j | shift-~ shift-j | shift-~ shift-j |
ŝ | shift-6 s | shift-~ s | shift-~ s |
Ŝ | shift-6 shift-s | shift-~ shift-s | shift-~ shift-s |
ŭ | altgr-shift-9 u | altgr-shift-\ u | altgr-shift-~ u |
Ŭ | altgr-shift-9 shift-u | altgr-shift-\ shift-u | altgr-shift-~ shift+u |
Another option is to use a keyboard layout that supports the Compose key (usually mapped to the right alt or to one of the windows keys). Then, "compose-u u" will combine the character u with the breve, and "compose-shift-6 s" will combine the character s with the circumflex (assuming "shift-6" is the position of the caret).
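Dead-key and Compose sequences both pair a base letter with a mark and yield a single precomposed letter. The same pairing can be reproduced with Unicode combining characters and NFC normalization. This sketch is only an analogy: it does not show how X11 processes its Compose tables, and the function name `compose` is hypothetical.

```python
import unicodedata

COMBINING_CIRCUMFLEX = "\u0302"  # U+0302 COMBINING CIRCUMFLEX ACCENT
COMBINING_BREVE = "\u0306"       # U+0306 COMBINING BREVE


def compose(base: str, mark: str) -> str:
    """Join a base letter and a combining mark into one precomposed letter."""
    return unicodedata.normalize("NFC", base + mark)


for base in "cghjsCGHJS":
    print(compose(base, COMBINING_CIRCUMFLEX), end=" ")
print(compose("u", COMBINING_BREVE), compose("U", COMBINING_BREVE))
```

Normalizing to NFC is also a simple way to make text typed with different input methods compare equal.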
In GNOME, there is a separate keyboard layout for Esperanto, which replaces the characters not used in Esperanto with the non-ASCII Esperanto letters. A separate keyboard layout for Esperanto is available in KDE, too.
If necessary, install and use high-quality fonts that have Esperanto glyphs, such as Microsoft Web core fonts (free for personal use) or DejaVu (the Bitstream Vera glyphs have the Bitstream Vera license and the DejaVu extensions are in the public domain).
There is also an applet available for the gnome-panel called "Character Palette" and one can add the following characters to a new palette for quick placement from their panel menu bar:
- Ĉ ĉ Ĝ ĝ Ĥ ĥ Ĵ ĵ Ŝ ŝ Ŭ ŭ
The Character Palette applet makes for a quick and easy way to add Esperanto characters to a web browser or text document. One need only select their newly created palette and click a letter, and that letter will be on their system clipboard waiting to be pasted into the document.
macOS
On macOS, Esperanto characters can be entered by selecting a keyboard layout from the "Input Sources" pane of "Language & Text" preferences in the "System Preferences" application. The pre-installed ABC Extended keyboard layout can be used to type Esperanto's diacritics. When this layout is active, the accented letters are entered with key sequences that follow a simple mnemonic: the 6 key carries the caret character, which looks like a circumflex, so ⌥ Option+6 places a circumflex over the following character. Similarly, b stands for breve, so ⌥ Option+b places a breve over the next character.
One can also download an Esperanto keyboard layout package that, once installed, functions in the same way as other languages' keyboards. It gives users two different methods of typing. The first, Esperanto, maintains a QWERTY layout but replaces the letters that are not used in Esperanto (q, w, y, and x) with diacritical letters and turns a u into a ŭ if it follows an a or an e. The second, Esperanto-sc, is more familiar to QWERTY users and allows the user to type most Latin-script languages and Esperanto simultaneously. It treats the keys that take diacritics (a, s, e, c, g, h, u, and j) as dead keys if a combining character, usually the semicolon (;), is pressed afterwards. Both methods are also available for the less common Dvorak keyboard.
A table of the input methods:
Char | Name | Esperanto | Esperanto-sc | ABC Extended |
---|---|---|---|---|
Ĉ | C-circumflex | Q | C ; | ⌥ Option+6 ⇧ Shift+c |
ĉ | c-circumflex | q | c ; | ⌥ Option+6 c |
Ĝ | G-circumflex | Y | G ; | ⌥ Option+6 ⇧ Shift+g |
ĝ | g-circumflex | y | g ; | ⌥ Option+6 g |
Ĥ | H-circumflex | ⌥ Option+6 ⇧ Shift+H | H ; | ⌥ Option+6 ⇧ Shift+H |
ĥ | h-circumflex | ⌥ Option+6 h | h ; | ⌥ Option+6 h |
Ĵ | J-circumflex | W | J ; | ⌥ Option+6 ⇧ Shift+j |
ĵ | j-circumflex | w | j ; | ⌥ Option+6 j |
Ŝ | S-circumflex | X | S ; | ⌥ Option+6 ⇧ Shift+s |
ŝ | s-circumflex | x | s ; | ⌥ Option+6 s |
Ŭ | U-breve | after pressing a or e, U will make a Ŭ | U ; | ⌥ Option+b ⇧ Shift+u |
ŭ | u-breve | after pressing a or e, u will make a ŭ | u ; | ⌥ Option+b u |
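The rule described for the Esperanto layout, where a u typed after a or e becomes ŭ, can be sketched as follows. This is an illustration of the rule as stated in the text, not the layout's actual code; `auto_breve` is a hypothetical name.

```python
def auto_breve(typed: str) -> str:
    """Replace u with ŭ whenever the previous output letter is a or e."""
    out = []
    for ch in typed:
        prev = out[-1].lower() if out else ""
        if ch in "uU" and prev in ("a", "e"):
            out.append("ŭ" if ch == "u" else "Ŭ")
        else:
            out.append(ch)
    return "".join(out)


print(auto_breve("hierau"), auto_breve("Europo"), auto_breve("unu"))
```

A rule this simple also converts the rarer words in which a or e is followed by a full vowel u, so a practical layout needs some way to override it.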
Swedish Esperantists using Mac OS X can use the Finnish Extended layout, which comes with the OS. Finnish has the same alphabet and type layout as Swedish; the Finnish Extended layout adds functionality just like ABC Extended, only using other key combinations (the breve appears when one types ⌥ Option+y and the circumflex when one types ⌥ Option+^).
Similarly, British users may use the Irish Extended layout, which differs from the ABC Extended keyboard layout in several ways (preserving the simple option+vowel method of applying acute accents, important for the Irish language, and the £ sign on shift-3 like the UK layout), but uses the same "dead-keys" for modifiers as ABC Extended for Esperanto characters.
In OS X it is also possible to create one's own keyboard layouts, so it is relatively easy to set up more convenient mappings, for example one based on typing an x after the letter.
There is still no integrated solution for typing Esperanto-characters with AZERTY keyboards. Dead-circumflex followed by a consonant may or may not work for ĉ ĝ ĥ ĵ ŝ; and if nothing else avails, ù is a tolerable if imperfect approximation for ŭ.
Locale
An Esperanto locale would use a thin space as the thousands separator and comma as the decimal separator. Time and date format among Esperantists is not standardized, but of course "internationally unambiguous" formats such as 2020-10-11 or 11-okt-2020 are preferred when the date is not spelled out in full ("la 11-a de oktobro 2020").
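The conventions above can be sketched in a few lines. The helper names `format_number` and `format_date_long` are hypothetical, and the list of month names is supplied for the example rather than taken from this article; since date formats are not standardized, the long form shown is only the one quoted above.

```python
from datetime import date

THIN_SPACE = "\u2009"  # U+2009 THIN SPACE, used as the thousands separator

MONTHS = [
    "januaro", "februaro", "marto", "aprilo", "majo", "junio",
    "julio", "aŭgusto", "septembro", "oktobro", "novembro", "decembro",
]


def format_number(value: float, decimals: int = 2) -> str:
    """Format with thin-space thousands groups and a decimal comma."""
    english = f"{value:,.{decimals}f}"  # e.g. 1,234,567.89
    return english.replace(",", THIN_SPACE).replace(".", ",")


def format_date_long(d: date) -> str:
    """Spell the date out in full, as in 'la 11-a de oktobro 2020'."""
    return f"la {d.day}-a de {MONTHS[d.month - 1]} {d.year}"


d = date(2020, 10, 11)
print(format_number(1234567.891))
print(d.isoformat())
print(format_date_long(d))
```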
Spesmilo
Unique to the Esperanto script is the spesmilo (1000 specie) sign, an Sm monogram for a now-obsolete international unit of auxiliary Esperanto currency used by a few British and Swiss banks before World War I. It has been assigned the Unicode value U+20B7, though in ordinary fonts it is often transcribed as Sm, usually italic.
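The code point can be checked against Python's Unicode database, as in the short sketch below. The fallback helper `spesmilo_symbol` and its `font_has_glyph` flag are hypothetical; detecting whether a font actually contains the glyph is outside the scope of the example.

```python
import unicodedata

SPESMILO_SIGN = "\u20b7"  # U+20B7


def spesmilo_symbol(font_has_glyph: bool) -> str:
    """Return the dedicated sign, or the Sm transcription as a fallback."""
    return SPESMILO_SIGN if font_has_glyph else "Sm"


print(f"U+{ord(SPESMILO_SIGN):04X}", unicodedata.name(SPESMILO_SIGN))
print(spesmilo_symbol(True), spesmilo_symbol(False))
```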
See also
- Orthography
- Ĉ, Ĝ, Ĥ, Ĵ, Ŝ, Ŭ
Notes
- A few of these words may be difficult to distinguish from other Esperanto words in noisy conditions, such as gumo – kubo, naturo – maturo – daturo, maŝino – baseno, vulkano – bulgaro, and zinko – ŝinko, and so may not be easily recognizable if the system is not known.
- A phonetic joke in English (English /h/ and /ŋ/ being in complementary distribution) which does not work in Esperanto.
References
- Kalocsay & Waringhien, Plena analiza gramatiko, §17
- Plena analiza gramatiko, §22
- "PMEG". bertilow.com.
- Gaston Waringhien, ed. (2005). Plena Ilustrita Vortaro de Esperanto. Sennacieca Asocio Tutmonda. ISBN 2-9502432-8-2. Retrieved 23 January 2014.
duobla vo aŭ ĝermana vo. Nomo de neesperanta grafemo, kun la formo W, w, (prononcata v aŭ ŭ, depende de la lingvoj) [double V or Germanic V. Name of a non-Esperanto grapheme, with the form W, w, (pronounced v or ŭ [that is, with the sound of English "v" or "w"], depending on the language)]
- http://bertilow.com/pmeg/gramatiko/oa-vortecaj_vortetoj/liternomoj.html
Although this source claims these words are "used by" the World Esperanto Association, it was in fact simply reprinted in the 1995 edition of the Jarlibro (p. 93). - User AV3, Usenet, soc.culture.esperanto, 2008-06-16. Available in Derkeiler.com or in Google Groups.
- "Akademio de Esperanto: Oficialaj Informoj 6 - 2007 01 21". akademio-de-esperanto.org. Archived from the original on 29 March 2013. Retrieved 22 January 2013.
- Wikipedia:Wikipedia Signpost/2012-12-31/Interview
- Chuck Smith. "Unicoding the Esperanto Wikipedia (Part 3 of 4)". Esperanto Language Blog. Retrieved 14 January 2013.
- "lernu!: Community / Forum / Introduction". lernu.net. Archived from the original on 16 January 2009. Retrieved 24 October 2008.
- Plena Analiza Gramatiko, end of section 4: Cê la sângôj okazintaj en la cî-landa vojkodo, cîuj automobilistoj zorge informigû pri la jûsaj instrukcioj.
- http://newsgroups.derkeiler.com/Archive/Soc/soc.culture.esperanto/2008-06/msg00190.html
- http://www.esperanto.de/bb/sefosas/enhavo.htm
- Kalocsay and Waringhien, §54.
- http://support.microsoft.com/kb/258824/en-us Select alternative keyboard layout
- How to type in Esperanto in Linux, Donald Rogers, Esperanto sub la Suda Kruco, p 8-11, Sep 2010
External links
- Computer input
- Amiketo is a software supporting the Esperanto alphabet for Windows, Mac OS, and Linux
- Online Esperanto keyboard
- Esperanto QWERTY keyboard for Windows using spare keys
- Esperanto GKOS keyboard for Android phones/tablets with genuine support (language option in Tools menu)
- Tajpi - Esperanto Keyboard for Windows 2000 / XP / Vista / 7 / 8 – free download
- Unired – Unicode plain text editor for Windows 95/98/NT/2000 (with E-o support)
- eoconv – a tool to convert text between various Esperanto orthographies and character encodings