    LanguageTag

    This is a memo on RFC 5646, which is part of BCP 47.

    1 The Language Tag

    Language tags are used to help identify languages, whether spoken, written, signed, or otherwise signaled, for the purpose of communication. This includes constructed and artificial languages but excludes languages not intended primarily for human communication, such as programming languages.

    1.1 Syntax

    • A tag is composed of a sequence of one or more subtags.
    • Each subtag is a sequence of alphanumeric characters that narrows the range of languages identified by the tag.
    • Subtags are joined with a hyphen ("-").

    The syntax of the language tag in ABNF [RFC5234] is:

    Language-Tag  = langtag             ; normal language tags
                  / privateuse          ; private use tag
                  / grandfathered       ; grandfathered tags
    
    langtag       = language
                    ["-" script]
                    ["-" region]
                    *("-" variant)
                    *("-" extension)
                    ["-" privateuse]
    
    language      = 2*3ALPHA            ; shortest ISO 639 code
                    ["-" extlang]       ; sometimes followed by
                                        ; extended language subtags
                  / 4ALPHA              ; or reserved for future use
                  / 5*8ALPHA            ; or registered language subtag
    
    extlang       = 3ALPHA              ; selected ISO 639 codes
                    *2("-" 3ALPHA)      ; permanently reserved
    
    script        = 4ALPHA              ; ISO 15924 code
    
    region        = 2ALPHA              ; ISO 3166-1 code
                  / 3DIGIT              ; UN M.49 code
    
    variant       = 5*8alphanum         ; registered variants
                  / (DIGIT 3alphanum)
    
    extension     = singleton 1*("-" (2*8alphanum))
    
                                        ; Single alphanumerics
                                        ; "x" reserved for private use
    singleton     = DIGIT               ; 0 - 9
                  / %x41-57             ; A - W
                  / %x59-5A             ; Y - Z
                  / %x61-77             ; a - w
                  / %x79-7A             ; y - z
    
    privateuse    = "x" 1*("-" (1*8alphanum))
    
    grandfathered = irregular           ; non-redundant tags registered
                  / regular             ; during the RFC 3066 era
    
    irregular     = "en-GB-oed"         ; irregular tags do not match
                  / "i-ami"             ; the 'langtag' production and
                  / "i-bnn"             ; would not otherwise be
                  / "i-default"         ; considered 'well-formed'
                  / "i-enochian"        ; These tags are all valid,
                  / "i-hak"             ; but most are deprecated
                  / "i-klingon"         ; in favor of more modern
                  / "i-lux"             ; subtags or subtag
                  / "i-mingo"           ; combination
                  / "i-navajo"
                  / "i-pwn"
                  / "i-tao"
                  / "i-tay"
                  / "i-tsu"
                  / "sgn-BE-FR"
                  / "sgn-BE-NL"
                  / "sgn-CH-DE"
    
    regular       = "art-lojban"        ; these tags match the 'langtag'
                  / "cel-gaulish"       ; production, but their subtags
                  / "no-bok"            ; are not extended language
                  / "no-nyn"            ; or variant subtags: their meaning
                  / "zh-guoyu"          ; is defined by their registration
                  / "zh-hakka"          ; and all of these are deprecated
                  / "zh-min"            ; in favor of a more modern
                  / "zh-min-nan"        ; subtag or sequence of subtags
                  / "zh-xiang"
    
    alphanum      = (ALPHA / DIGIT)     ; letters and numbers
    

    Figure 1: Language Tag ABNF
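
    As a rough illustration, the 'langtag' production above can be transcribed into a regular expression. This is only a sketch: the names LANGTAG_RE, EXTENSION_RE and parse_langtag are made up for this memo, the 'grandfathered' and standalone 'privateuse' alternatives are deliberately left out, and nothing is checked against the IANA registry, so a match means "well-formed" at best, not "valid".

```python
import re

# Transcription of the 'langtag' production only (hypothetical helper names).
# re.ASCII keeps [a-z] from matching non-ASCII letters under IGNORECASE.
LANGTAG_RE = re.compile(
    r"""
    (?P<language>
        [a-z]{2,3}(?:-[a-z]{3}){0,3}    # 2*3ALPHA, optionally followed by extlang
      | [a-z]{4}                        # reserved for future use
      | [a-z]{5,8}                      # registered language subtag
    )
    (?:-(?P<script>[a-z]{4}))?                               # ISO 15924 code
    (?:-(?P<region>[a-z]{2}|[0-9]{3}))?                      # ISO 3166-1 or UN M.49
    (?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)   # zero or more variants
    (?P<extensions>(?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)*)     # singleton, not "x"
    (?:-(?P<privateuse>x(?:-[a-z0-9]{1,8})+))?               # trailing private use
    """,
    re.IGNORECASE | re.VERBOSE | re.ASCII,
)

# Used to split the matched extension sequence into individual extensions.
EXTENSION_RE = re.compile(
    r"-([0-9a-wy-z](?:-[a-z0-9]{2,8})+)", re.IGNORECASE | re.ASCII
)


def parse_langtag(tag):
    """Return a dict of subtags if `tag` matches 'langtag', else None."""
    m = LANGTAG_RE.fullmatch(tag)
    if m is None:
        return None
    # The language group may carry extended language subtags after the code.
    language, *extlangs = m.group("language").split("-")
    return {
        "language": language,
        "extlang": extlangs,
        "script": m.group("script"),
        "region": m.group("region"),
        "variants": [v for v in m.group("variants").split("-") if v],
        "extensions": EXTENSION_RE.findall(m.group("extensions")),
        "privateuse": m.group("privateuse"),
    }
```

    For example, parse_langtag("zh-Hant-TW") yields language "zh", script "Hant" and region "TW", while the irregular tag "i-klingon" returns None because it does not match 'langtag'.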

    1.1.1 Formatting of Language Tags

    Although language tags and their subtags are case-insensitive, there are formatting conventions (which carry no meaning):

    • ISO 639-1 recommends that language codes be written in lowercase ('mn' Mongolian).
    • ISO 15924 recommends that script codes use lowercase with the initial letter capitalized ('Cyrl' Cyrillic).
    • ISO 3166-1 recommends that country codes be capitalized ('MN' Mongolia).
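
    These conventions can be applied without consulting the registry. RFC 5646 describes the rule as: every subtag is lowercase, except two-letter subtags (uppercase) and four-letter subtags (title case) that are neither at the start of the tag nor after a singleton. The sketch below is one reading of that rule; the function name format_langtag is made up for this memo, and it assumes that everything following a singleton stays lowercase and that the input is already well-formed.

```python
def format_langtag(tag):
    """Apply the conventional casing to a well-formed language tag.

    Assumption: once a single-character subtag (singleton) has been seen,
    all later subtags are kept lowercase.
    """
    formatted = []
    seen_singleton = False
    for index, subtag in enumerate(tag.split("-")):
        if index > 0 and not seen_singleton and len(subtag) == 2:
            formatted.append(subtag.upper())       # region, e.g. 'MN'
        elif index > 0 and not seen_singleton and len(subtag) == 4:
            formatted.append(subtag.capitalize())  # script, e.g. 'Cyrl'
        else:
            formatted.append(subtag.lower())       # everything else
        if len(subtag) == 1:
            seen_singleton = True
    return "-".join(formatted)
```

    With this sketch, format_langtag("MN-cYRL-mn") gives "mn-Cyrl-MN" and format_langtag("EN-ca-X-CA") gives "en-CA-x-ca".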

    1.2 Language Subtag Sources and Interpretation

    The namespace of language tags and their subtags is administered by the Internet Assigned Numbers Authority (IANA) according to the rules in Section 5 of RFC 5646. The Language Subtag Registry maintained by IANA is the source for valid subtags; the other standards referenced in that section of the RFC provide the source material for the registry.

    1.2.1 Primary Language Subtag

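    The primary language subtag is the first subtag in a tag. It cannot be omitted, except in private use tags (which begin with 'x') and in grandfathered tags. It is normally two or three letters long: two-letter subtags come from ISO 639-1, and three-letter subtags come from ISO 639-2, ISO 639-3, or ISO 639-5.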
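
    As a small sketch, the first subtag can be classified by its length alone, following the comments in the 'language' production of the ABNF. The function name primary_subtag_kind and the labels it returns are invented for this memo; no registry lookup is done, so "ISO 639 code" here only means "has the right shape".

```python
def primary_subtag_kind(tag):
    """Classify the first subtag of `tag` by shape only (no registry lookup)."""
    first = tag.split("-", 1)[0].lower()
    if first == "x":
        return "private use"
    if first == "i":
        return "grandfathered (i- prefix)"
    if not (first.isascii() and first.isalpha()):
        return "invalid"
    length = len(first)
    if length in (2, 3):
        return "ISO 639 code"
    if length == 4:
        return "reserved for future use"
    if 5 <= length <= 8:
        return "registered language subtag"
    return "invalid"
```

    For instance, primary_subtag_kind("mn-Cyrl-MN") returns "ISO 639 code" and primary_subtag_kind("x-whatever") returns "private use".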

  • Original article: https://www.cnblogs.com/yangyingchao/p/3794436.html