Latin-1 Supplement (Unicode block)

The Latin-1 Supplement (also called C1 Controls and Latin-1 Supplement) is the second block of the Unicode Standard. It encodes the upper half of ISO 8859-1, bytes 80 to FF, as the code points U+0080 to U+00FF. The block contains 128 characters: the C1 controls (U+0080 to U+009F), which are not graphic characters, Latin-1 punctuation and symbols, 30 pairs of majuscule and minuscule accented Latin letters, and 2 mathematical operators.
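
Because the block mirrors the upper half of ISO 8859-1, each byte value from 80 to FF corresponds to the code point with the same number. The following Python sketch illustrates this with the standard latin-1 codec; the helper name in_latin1_supplement is an illustrative assumption, not part of any standard API.

```python
def in_latin1_supplement(ch: str) -> bool:
    """Return True if the single character ch lies in U+0080..U+00FF."""
    return 0x80 <= ord(ch) <= 0xFF


# Decode every byte of the upper half of ISO 8859-1.
upper_half = bytes(range(0x80, 0x100))
decoded = upper_half.decode("latin-1")

# Each byte value maps to the code point with the same number.
same_numbers = all(ord(ch) == b for ch, b in zip(decoded, upper_half))
```

Here decoded holds the 128 characters of the block in code point order, and same_numbers records whether the byte values and code points agree.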

C1 Controls and Latin-1 Supplement
Range: U+0080..U+00FF (128 code points)
Plane: BMP
Scripts: Latin (64 char.), Common (64 char.)
Major alphabets: French, German, Icelandic, Spanish
Symbol sets: Punctuation, Mathematics, Currency
Assigned: 128 code points; 33 Control or Format
Unused: 0 reserved code points
Source standards: ISO/IEC 8859-1
Unicode version history: 1.0.0: 128 (+128)
Note: [1][2]

The C1 Controls and Latin-1 Supplement block has been included in its present form, with the same character repertoire, since version 1.0 of the Unicode Standard.[3] Its block name in Unicode 1.0 was simply Latin1.[4]

Character table

Code Result Description Acronym
C1 Controls
U+0080 Padding Character PAD
U+0081 High Octet Preset HOP
U+0082 Break Permitted Here BPH
U+0083 No Break Here NBH
U+0084 Index IND
U+0085 Next Line NEL
U+0086 Start of Selected Area SSA
U+0087 End of Selected Area ESA
U+0088 Character (Horizontal) Tabulation Set HTS
U+0089 Character (Horizontal) Tabulation with Justification HTJ
U+008A Line (Vertical) Tabulation Set VTS
U+008B Partial Line Forward (Down) PLD
U+008C Partial Line Backward (Up) PLU
U+008D Reverse Line Feed (Index) RI
U+008E Single-Shift Two SS2
U+008F Single-Shift Three SS3
U+0090 Device Control String DCS
U+0091 Private Use One PU1
U+0092 Private Use Two PU2
U+0093 Set Transmit State STS
U+0094 Cancel Character CCH
U+0095 Message Waiting MW
U+0096 Start of Protected Area SPA
U+0097 End of Protected Area EPA
U+0098 Start of String SOS
U+0099 Single Graphic Character Introducer SGCI
U+009A Single Character Introducer SCI
U+009B Control Sequence Introducer CSI
U+009C String Terminator ST
U+009D Operating System Command OSC
U+009E Private Message PM
U+009F Application Program Command APC
Latin-1 Punctuation and Symbols
U+00A0   Non-breaking space NBSP
U+00A1 ¡ Inverted exclamation mark
U+00A2 ¢ Cent sign
U+00A3 £ Pound sign
U+00A4 ¤ Currency sign
U+00A5 ¥ Yen sign
U+00A6 ¦ Broken bar
U+00A7 § Section sign
U+00A8 ¨ Diaeresis
U+00A9 © Copyright sign
U+00AA ª Feminine ordinal indicator
U+00AB « Left-pointing double-angle quotation mark
U+00AC ¬ Not sign
U+00AD Soft hyphen SHY
U+00AE ® Registered sign
U+00AF ¯ Macron
U+00B0 ° Degree symbol
U+00B1 ± Plus-minus sign
U+00B2 ² Superscript two
U+00B3 ³ Superscript three
U+00B4 ´ Acute accent
U+00B5 µ Micro sign
U+00B6 ¶ Pilcrow sign
U+00B7 · Middle dot
U+00B8 ¸ Cedilla
U+00B9 ¹ Superscript one
U+00BA º Masculine ordinal indicator
U+00BB » Right-pointing double-angle quotation mark
U+00BC ¼ Vulgar fraction one quarter
U+00BD ½ Vulgar fraction one half
U+00BE ¾ Vulgar fraction three quarters
U+00BF ¿ Inverted question mark
Letters
U+00C0 À Latin Capital Letter A with grave
U+00C1 Á Latin Capital Letter A with acute
U+00C2 Â Latin Capital Letter A with circumflex
U+00C3 Ã Latin Capital Letter A with tilde
U+00C4 Ä Latin Capital Letter A with diaeresis
U+00C5 Å Latin Capital Letter A with ring above
U+00C6 Æ Latin Capital Letter AE
U+00C7 Ç Latin Capital Letter C with cedilla
U+00C8 È Latin Capital Letter E with grave
U+00C9 É Latin Capital Letter E with acute
U+00CA Ê Latin Capital Letter E with circumflex
U+00CB Ë Latin Capital Letter E with diaeresis
U+00CC Ì Latin Capital Letter I with grave
U+00CD Í Latin Capital Letter I with acute
U+00CE Î Latin Capital Letter I with circumflex
U+00CF Ï Latin Capital Letter I with diaeresis
U+00D0 Ð Latin Capital Letter Eth
U+00D1 Ñ Latin Capital Letter N with tilde
U+00D2 Ò Latin Capital Letter O with grave
U+00D3 Ó Latin Capital Letter O with acute
U+00D4 Ô Latin Capital Letter O with circumflex
U+00D5 Õ Latin Capital Letter O with tilde
U+00D6 Ö Latin Capital Letter O with diaeresis
Mathematical operator
U+00D7 × Multiplication sign
Letters
U+00D8 Ø Latin Capital Letter O with stroke
U+00D9 Ù Latin Capital Letter U with grave
U+00DA Ú Latin Capital Letter U with acute
U+00DB Û Latin Capital Letter U with circumflex
U+00DC Ü Latin Capital Letter U with diaeresis
U+00DD Ý Latin Capital Letter Y with acute
U+00DE Þ Latin Capital Letter Thorn
U+00DF ß Latin Small Letter sharp S
U+00E0 à Latin Small Letter A with grave
U+00E1 á Latin Small Letter A with acute
U+00E2 â Latin Small Letter A with circumflex
U+00E3 ã Latin Small Letter A with tilde
U+00E4 ä Latin Small Letter A with diaeresis
U+00E5 å Latin Small Letter A with ring above
U+00E6 æ Latin Small Letter AE
U+00E7 ç Latin Small Letter C with cedilla
U+00E8 è Latin Small Letter E with grave
U+00E9 é Latin Small Letter E with acute
U+00EA ê Latin Small Letter E with circumflex
U+00EB ë Latin Small Letter E with diaeresis
U+00EC ì Latin Small Letter I with grave
U+00ED í Latin Small Letter I with acute
U+00EE î Latin Small Letter I with circumflex
U+00EF ï Latin Small Letter I with diaeresis
U+00F0 ð Latin Small Letter Eth
U+00F1 ñ Latin Small Letter N with tilde
U+00F2 ò Latin Small Letter O with grave
U+00F3 ó Latin Small Letter O with acute
U+00F4 ô Latin Small Letter O with circumflex
U+00F5 õ Latin Small Letter O with tilde
U+00F6 ö Latin Small Letter O with diaeresis
Mathematical operator
U+00F7 ÷ Division sign
Letters
U+00F8 ø Latin Small Letter O with stroke
U+00F9 ù Latin Small Letter U with grave
U+00FA ú Latin Small Letter U with acute
U+00FB û Latin Small Letter U with circumflex
U+00FC ü Latin Small Letter U with diaeresis
U+00FD ý Latin Small Letter Y with acute
U+00FE þ Latin Small Letter Thorn
U+00FF ÿ Latin Small Letter Y with diaeresis

Subheadings

The C1 Controls and Latin-1 Supplement block has four subheadings within its character collection: C1 controls, Latin-1 Punctuation and Symbols, Letters, and Mathematical operator(s).[5]

C1 controls

The C1 controls subheading contains 32 supplementary control codes inherited from ISO/IEC 8859-1 and many other 8-bit character standards. The alias names for the C0 and C1 control codes are taken from ISO/IEC 6429:1992.[5]
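
The count of control codes can be checked against the Unicode character database. The sketch below uses Python's standard unicodedata module and relies on the general categories Cc (control) and Cf (format); the variable names are illustrative only.

```python
import unicodedata

# All 128 characters of the block, U+0080..U+00FF.
block = [chr(cp) for cp in range(0x80, 0x100)]

# General category "Cc" marks control codes, "Cf" marks format characters.
controls = [ch for ch in block if unicodedata.category(ch) == "Cc"]
formats = [ch for ch in block if unicodedata.category(ch) == "Cf"]

# Control codes have no formal character name, so a default is supplied.
nel_name = unicodedata.name("\u0085", "<no formal name>")
```

With these categories, controls covers U+0080 to U+009F and formats holds only the soft hyphen (U+00AD), which together account for the 33 "Control or Format" code points listed for the block.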

Latin-1 punctuation and symbols

The Latin-1 Punctuation and Symbols subheading contains 32 common international punctuation characters and symbols: punctuation such as the inverted exclamation and question marks and the middle dot, and symbols such as currency signs, spacing diacritical marks, vulgar fractions, and superscript numbers.[5]

Letters

The Letters subheading contains 30 pairs of majuscule and minuscule accented or novel Latin characters for western European languages, and two extra minuscule characters not commonly used word-initially.[5]
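
The pairing can be illustrated in Python: within this block, each majuscule letter sits exactly 0x20 below its minuscule counterpart, and the two unpaired letters are U+00DF (ß) and U+00FF (ÿ). This is a minimal sketch that relies on Python's built-in case mappings; the variable names are assumptions made for illustration.

```python
# Majuscule letters occupy U+00C0..U+00DE, except U+00D7 (multiplication sign).
pairs = [(chr(cp), chr(cp + 0x20)) for cp in range(0xC0, 0xDF) if cp != 0xD7]

# Every pair round-trips through the built-in case mappings.
round_trips = all(u.lower() == l and l.upper() == u for u, l in pairs)

# The two extra minuscule letters have no majuscule inside the block:
# sharp s uppercases to "SS", and y with diaeresis to U+0178.
sharp_s_upper = "\u00df".upper()
y_diaeresis_upper = "\u00ff".upper()
```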

Mathematical operator

The Mathematical operator subheading is used for the multiplication and division signs.[5]

Number of symbols, letters and control codes

The table below shows the number of letters, symbols, and control codes under each subheading of the C1 Controls and Latin-1 Supplement block.

| Type of subheading | Number of symbols | Range of characters |
| C1 controls | 32 control codes | U+0080 to U+009F |
| Latin-1 punctuation and symbols | 32 punctuation and symbols | U+00A0 to U+00BF |
| Letters | 30 pairs of majuscule and minuscule accented Latin characters | U+00C0 to U+00D6, U+00D8 to U+00F6 and U+00F8 to U+00FF |
| Mathematical operators | The U+00D7 × MULTIPLICATION SIGN and U+00F7 ÷ DIVISION SIGN symbols. | U+00D7 and U+00F7 |
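
The ranges in the table translate directly into a small classifier. The following Python sketch is an illustration only; the function name subheading and the returned labels are assumptions, not identifiers defined by the Unicode Standard.

```python
from collections import Counter


def subheading(cp: int) -> str:
    """Map a code point in U+0080..U+00FF to its subheading in the block."""
    if not 0x80 <= cp <= 0xFF:
        raise ValueError(f"U+{cp:04X} is outside the Latin-1 Supplement block")
    if cp <= 0x9F:
        return "C1 controls"
    if cp <= 0xBF:
        return "Latin-1 punctuation and symbols"
    if cp in (0xD7, 0xF7):
        return "Mathematical operator"
    return "Letters"


# Tally how many code points fall under each subheading.
counts = Counter(subheading(cp) for cp in range(0x80, 0x100))
```

The Letters tally comes to 62, which is the 30 pairs plus the two unpaired minuscule letters, and the four tallies add up to the 128 code points of the block.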

Compact table

C1 Controls and Latin-1 Supplement[1]
Official Unicode Consortium code chart (PDF)
 0 1 2 3 4 5 6 7 8 9 A B C D E F
U+008x  XXX   XXX   BPH   NBH   IND   NEL   SSA   ESA   HTS   HTJ   VTS   PLD   PLU    RI    SS2   SS3 
U+009x  DCS   PU1   PU2   STS   CCH    MW    SPA   EPA   SOS   XXX   SCI   CSI    ST    OSC    PM    APC 
U+00Ax  NBSP  ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬  SHY  ® ¯
U+00Bx ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿
U+00Cx À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï
U+00Dx Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß
U+00Ex à á â ã ä å æ ç è é ê ë ì í î ï
U+00Fx ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ
Notes
1. As of Unicode version 13.0
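
The compact table arranges the block as a grid: the row label is the code point with its last hexadecimal digit replaced by x, and the column is that last digit. A minimal Python sketch of this lookup, with grid_position as a hypothetical helper name:

```python
def grid_position(cp: int) -> tuple[str, str]:
    """Return the (row label, column digit) of a code point in the compact table."""
    row = cp >> 4      # all hexadecimal digits except the last
    column = cp & 0xF  # the last hexadecimal digit
    return f"U+{row:03X}x", f"{column:X}"
```

For example, grid_position(0xE9) locates é in row U+00Ex, column 9.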

Emoji

The Latin-1 Supplement block contains two emoji: U+00A9 and U+00AE.[6][7]

The block has four standardized variants defined to specify emoji-style (U+FE0F VS16) or text presentation (U+FE0E VS15) for the two emoji, both of which default to a text presentation.[8]

Emoji variation sequences
| U+ | 00A9 | 00AE |
| base code point | © | ® |
| base+VS15 (text) | ©︎ | ®︎ |
| base+VS16 (emoji) | ©️ | ®️ |
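
A variation sequence is simply the base character followed by a variation selector. The Python sketch below builds the four sequences from the table; the constant and function names are illustrative assumptions, and how each sequence is rendered depends on the font and platform.

```python
VS15 = "\ufe0e"  # variation selector-15: requests text presentation
VS16 = "\ufe0f"  # variation selector-16: requests emoji presentation

BASES = ["\u00a9", "\u00ae"]  # copyright sign and registered sign


def text_style(base: str) -> str:
    """Append VS15 to request text presentation of the base character."""
    return base + VS15


def emoji_style(base: str) -> str:
    """Append VS16 to request emoji presentation of the base character."""
    return base + VS16


# Two bases times two selectors give the four standardized variants.
sequences = [style(base) for base in BASES for style in (text_style, emoji_style)]

# Code points of each sequence, formatted as four-digit hexadecimal.
code_points = [[f"{ord(ch):04X}" for ch in seq] for seq in sequences]
```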

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Latin-1 Supplement block:

| Version | Final code points[a] | Count | L2 ID | WG2 ID | Document |
| 1.0.0 | U+0080..009F | 32 | X3L2/95-002 | | PDAM No. 3 to ISO/IEC 10646-1 on coding of C1 controls, 1994-11-01 |
| | | | X3L2/95-028 | N1148 | Nine tables of replies to repeated/extended votes, 1995-02-22 |
| | | | | N1203 | Umamaheswaran, V. S.; Ksar, Mike (1995-05-03), "5.3", Unconfirmed minutes of SC2/WG2 Meeting 27, Geneva |
| | | | X3L2/95-061 | | DAM no.3 to ISO/IEC 10646-1 (Coding of C1 controls), 1995-06-01 |
| | | | | N1307 | Table of replies to JTC1 letter ballot on 10646 DAM 3, Coding of C1 Controls, (SC2 N 2666), 1996-01-15 |
| | | | | N1309 | Paterson, Bruce (1996-01-17), Report and Disposition of Comments on DAM 1, UTF 16 and DAM 2, UTF-8, DAM 3, Coding of C1 Controls, and DAM 4, Removal of Annex G: UTF1 |
| | | | | N1312 | Paterson, Bruce (1996-01-17), Draft Final Text of 10646 AMD-3, Coding of C1 Controls |
| | | | L2/99-048 | | Umamaheswaran, V. S. (1999-02-04), C1 controls in the code charts |
| | | | L2/99-054R | | Aliprand, Joan (1999-06-21), "C1 Controls", Approved Minutes from the UTC/L2 meeting in Palo Alto, February 3-5, 1999 |
| | | | | N3046 | Suignard, Michel (2006-02-22), Improving formal definition for control characters |
| | | | | N3103 (pdf, doc) | Umamaheswaran, V. S. (2006-08-25), "M48.33", Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27 |
| | U+00A0..00FF | 96 | | | (to be determined) |
| | | | X3L2/94-077 | N994 | Davis, Mark (1994-03-03), ISO/IEC 10646-1 - Proposed Draft Corrigendum 1 |
| | | | X3L2/94-098 | N1033 (pdf, doc) | Umamaheswaran, V. S.; Ksar, Mike (1994-06-01), "8.1.15", Unconfirmed Minutes of ISO/IEC JTC 1/SC 2/WG 2 Meeting 25, Falez Hotel, Antalya, Turkey, 1994-04-18--22 |
| | | | L2/11-016 | | Moore, Lisa (2011-02-15), "Correct mistakes in property assignments for super and subscripted letters (B.13.4) [U+00AA, U+00BA]", UTC #126 / L2 #223 Minutes |
| | | | L2/11-116 | | Moore, Lisa (2011-05-17), "Consensus 127-C14", UTC #127 / L2 #224 Minutes, Change the general category of U+00AA FEMININE ORDINAL INDICATOR and U+00BA MASCULINE ORDINAL INDICATOR to "Lo" for Unicode 6.1. |
| | | | L2/11-261R2 | | Moore, Lisa (2011-08-16), "Consensus 128-C6", UTC #128 / L2 #225 Minutes, Change the general category from "So" to "Po" ... [U+00A7 and U+00B6] |
| | | | L2/15-050R[b][c] | | Davis, Mark; et al. (2015-01-29), Additional variation selectors for emoji |
  a. Proposed code points and characters names may differ from final code points and names
  b. See also L2/13-207, L2/14-054, L2/14-063, L2/15-051A, L2/15-051B
  c. Refer to the history section of the Miscellaneous Symbols and Pictographs block for additional emoji-related documents

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.
  3. The Unicode Standard Version 1.0, Volume 1. Addison-Wesley Publishing Company, Inc. 1991 [1990]. ISBN 0-201-56788-1.
  4. "3.8: Block-by-Block Charts" (PDF). The Unicode Standard. version 1.0. Unicode Consortium.
  5. "Unicode 6.2 code charts" (PDF). The Unicode Standard. Retrieved 2013-04-01.
  6. "UTR #51: Unicode Emoji". Unicode Consortium. 2020-02-11.
  7. "UCD: Emoji Data for UTR #51". Unicode Consortium. 2020-01-28.
  8. "UTS #51 Emoji Variation Sequences". The Unicode Consortium.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.