    LanguageTag

    This is a memo on RFC 5646, which is part of BCP 47.

    1 The Language Tag

    Language tags are used to help identify languages, whether spoken, written, signed, or otherwise signaled, for the purpose of communication. This includes constructed and artificial languages but excludes languages not intended primarily for human communication, such as programming languages.

    1.1 Syntax

    • A tag is composed of a sequence of one or more subtags.
    • Each subtag is a sequence of alphanumeric characters that narrows the range of languages the tag identifies.
    • Subtags are concatenated using "-".
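
    The splitting rule above can be sketched in a few lines of Python. This is only an illustration: the name split_subtags is made up for this memo, and the 1 to 8 character limit is taken from the widest subtag productions in the ABNF (1*8alphanum).

```python
import re

# One subtag: 1 to 8 ASCII letters or digits (the widest range the ABNF allows).
SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")

def split_subtags(tag: str) -> list[str]:
    """Split a tag on '-' and check that every piece looks like a subtag."""
    subtags = tag.split("-")
    for sub in subtags:
        if not SUBTAG.fullmatch(sub):
            raise ValueError(f"bad subtag {sub!r} in {tag!r}")
    return subtags

print(split_subtags("zh-Hant-TW"))
```

    This only checks the shape of each subtag; it says nothing about whether the subtags appear in a legal order.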

    The syntax of the language tag in ABNF [RFC5234] is:

    Language-Tag  = langtag             ; normal language tags
                  / privateuse          ; private use tag
                  / grandfathered       ; grandfathered tags
    
    langtag       = language
                    ["-" script]
                    ["-" region]
                    *("-" variant)
                    *("-" extension)
                    ["-" privateuse]
    
    language      = 2*3ALPHA            ; shortest ISO 639 code
                    ["-" extlang]       ; sometimes followed by
                                        ; extended language subtags
                  / 4ALPHA              ; or reserved for future use
                  / 5*8ALPHA            ; or registered language subtag
    
    extlang       = 3ALPHA              ; selected ISO 639 codes
                    *2("-" 3ALPHA)      ; permanently reserved
    
    script        = 4ALPHA              ; ISO 15924 code
    
    region        = 2ALPHA              ; ISO 3166-1 code
                  / 3DIGIT              ; UN M.49 code
    
    variant       = 5*8alphanum         ; registered variants
                  / (DIGIT 3alphanum)
    
    extension     = singleton 1*("-" (2*8alphanum))
    
                                        ; Single alphanumerics
                                        ; "x" reserved for private use
    singleton     = DIGIT               ; 0 - 9
                  / %x41-57             ; A - W
                  / %x59-5A             ; Y - Z
                  / %x61-77             ; a - w
                  / %x79-7A             ; y - z
    
    privateuse    = "x" 1*("-" (1*8alphanum))
    
    grandfathered = irregular           ; non-redundant tags registered
                  / regular             ; during the RFC 3066 era
    
    irregular     = "en-GB-oed"         ; irregular tags do not match
                  / "i-ami"             ; the 'langtag' production and
                  / "i-bnn"             ; would not otherwise be
                  / "i-default"         ; considered 'well-formed'
                  / "i-enochian"        ; These tags are all valid,
                  / "i-hak"             ; but most are deprecated
                  / "i-klingon"         ; in favor of more modern
                  / "i-lux"             ; subtags or subtag
                  / "i-mingo"           ; combination
                  / "i-navajo"
                  / "i-pwn"
                  / "i-tao"
                  / "i-tay"
                  / "i-tsu"
                  / "sgn-BE-FR"
                  / "sgn-BE-NL"
                  / "sgn-CH-DE"
    
    regular       = "art-lojban"        ; these tags match the 'langtag'
                  / "cel-gaulish"       ; production, but their subtags
                  / "no-bok"            ; are not extended language
                  / "no-nyn"            ; or variant subtags: their meaning
                  / "zh-guoyu"          ; is defined by their registration
                  / "zh-hakka"          ; and all of these are deprecated
                  / "zh-min"            ; in favor of a more modern
                  / "zh-min-nan"        ; subtag or sequence of subtags
                  / "zh-xiang"
    
    alphanum      = (ALPHA / DIGIT)     ; letters and numbers
    

    Figure 1: Language Tag ABNF
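
    As a rough aid for reading the grammar, the 'langtag' production can be transcribed into a regular expression. The sketch below makes several assumptions: the names LANGTAG_RE and parse_langtag are invented here, private-use-only tags ("x-...") and grandfathered tags are deliberately not handled, and the check is for well-formedness only (it does not consult the IANA registry or reject repeated variants or singletons).

```python
import re

# Transcription of the 'langtag' production only.
# re.ASCII keeps the case-insensitive [a-z] from matching non-ASCII letters.
LANGTAG_RE = re.compile(
    r"""
    (?P<language>
        [a-z]{2,3}(?:-[a-z]{3}){0,3}      # ISO 639 code, optionally 1-3 extlangs
      | [a-z]{4}                          # reserved for future use
      | [a-z]{5,8}                        # registered language subtag
    )
    (?:-(?P<script>[a-z]{4}))?            # ISO 15924 code
    (?:-(?P<region>[a-z]{2}|[0-9]{3}))?   # ISO 3166-1 or UN M.49 code
    (?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)
    (?P<extensions>(?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)*)
    (?:-(?P<privateuse>x(?:-[a-z0-9]{1,8})+))?
    """,
    re.IGNORECASE | re.VERBOSE | re.ASCII,
)

def parse_langtag(tag: str):
    """Return the parts of a well-formed 'langtag', or None if it does not match."""
    m = LANGTAG_RE.fullmatch(tag)
    if m is None:
        return None
    return {
        "language": m["language"],
        "script": m["script"],
        "region": m["region"],
        "variants": [v for v in m["variants"].split("-") if v],
        "extensions": m["extensions"].lstrip("-"),
        "privateuse": m["privateuse"],
    }

print(parse_langtag("zh-cmn-Hans-CN"))
print(parse_langtag("de-419-DE"))  # two region subtags, so not well-formed
```

    Using fullmatch lets the regex engine backtrack across the optional parts, so a four-letter subtag is tried as a script before it is rejected, and so on down the production.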

    1.1.1 Formatting of Language Tags

    Language tags are case-insensitive, but the standards that supply the subtags recommend formatting conventions:

    • ISO 639-1 recommends that language codes be written in lowercase ('mn' Mongolian).
    • ISO 15924 recommends that script codes use lowercase with the initial letter capitalized ('Cyrl' Cyrillic).
    • ISO 3166-1 recommends that country codes be capitalized ('MN' Mongolia).
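
    RFC 5646 notes that this formatting can be reproduced without the registry: everything is lowercase, except that two-letter subtags are uppercased and four-letter subtags are titlecased when they are neither the first subtag nor placed after a singleton. A minimal sketch of that rule follows; the name format_tag is made up here, and treating every subtag after the first singleton as lowercase is this memo's reading of "after singletons".

```python
def format_tag(tag: str) -> str:
    """Apply the conventional casing to a language tag (no validation)."""
    out = []
    seen_singleton = False
    for i, sub in enumerate(tag.split("-")):
        if i > 0 and not seen_singleton and len(sub) == 2:
            out.append(sub.upper())        # region-like subtag, e.g. 'MN'
        elif i > 0 and not seen_singleton and len(sub) == 4:
            out.append(sub.capitalize())   # script-like subtag, e.g. 'Cyrl'
        else:
            out.append(sub.lower())
        # Assumption: once a one-character subtag has been seen, all later
        # subtags (extension or private use) stay lowercase.
        if len(sub) == 1:
            seen_singleton = True
    return "-".join(out)

print(format_tag("MN-cYRL-mn"))
```

    The function only changes case; it assumes the input is already a well-formed tag.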

    1.2 Language Subtag Sources and Interpretation

    The namespace of language tags and their subtags is administered by the Internet Assigned Numbers Authority (IANA) according to the rules in Section 5 of RFC 5646. The Language Subtag Registry maintained by IANA is the source for valid subtags: other standards referenced in this section provide the source material for that registry.

    1.2.1 Primary Language Subtag

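    The primary language subtag is the first subtag in a language tag. It cannot be omitted, with two exceptions: a tag that begins with the single-letter subtag "x" is a private use tag, and some grandfathered tags begin with "i". It is normally a two- or three-letter ISO 639 code; per the ABNF, four-letter subtags are reserved for future use and five- to eight-letter subtags are registered language subtags.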
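
    The length rules for the first subtag in the ABNF (2*3ALPHA, 4ALPHA, 5*8ALPHA), together with the "x" and "i" prefixes, can be turned into a small classifier. This is an illustrative sketch only; the function name classify_primary and its return labels are invented for this memo.

```python
def classify_primary(tag: str) -> str:
    """Classify the first subtag of a tag by its length, following the ABNF."""
    first = tag.split("-", 1)[0].lower()
    if first == "x":
        return "private use"       # no primary language subtag at all
    if first == "i":
        return "grandfathered"     # e.g. the i-* tags in the 'irregular' list
    if not (first.isascii() and first.isalpha()):
        raise ValueError(f"primary subtag must be ASCII letters: {first!r}")
    n = len(first)
    if n in (2, 3):
        return "iso639"            # shortest ISO 639 code
    if n == 4:
        return "reserved"          # reserved for future use
    if 5 <= n <= 8:
        return "registered"        # registered language subtag
    raise ValueError(f"primary subtag has invalid length: {first!r}")

print(classify_primary("mn-Cyrl-MN"))
```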

  • Original article: https://www.cnblogs.com/yangyingchao/p/3794436.html