  • Java and .NET character category code mapping

    http://blog.sina.com.cn/s/blog_4ae102b001012u4y.html

    http://technet.microsoft.com/zh-cn/library/system.globalization.unicodecategory(v=vs.96).aspx

    http://docs.oracle.com/javase/6/docs/api/java/lang/Character.html

    http://www.chinaitpower.com/source/jdk142/java/lang/CharacterData.java.html

    public static final byte UNASSIGNED = 0; // unassigned character
    public static final byte UPPERCASE_LETTER = 1; // uppercase letter
    public static final byte LOWERCASE_LETTER = 2; // lowercase letter
    public static final byte TITLECASE_LETTER = 3; // titlecase letter
    public static final byte MODIFIER_LETTER = 4; // modifier letter
    public static final byte OTHER_LETTER = 5; // other letter
    public static final byte NON_SPACING_MARK = 6; // nonspacing mark
    public static final byte ENCLOSING_MARK = 7; // enclosing mark
    public static final byte COMBINING_SPACING_MARK = 8; // spacing combining mark
    public static final byte DECIMAL_DIGIT_NUMBER = 9; // decimal digit
    public static final byte LETTER_NUMBER = 10; // number written as a letter-like character rather than a digit (e.g. the Chinese "〇", the Roman numeral "Ⅳ")
    public static final byte OTHER_NUMBER = 11; // other number (e.g. 3/5 (U+2157), 1/8 (U+215B))
    public static final byte SPACE_SEPARATOR = 12; // space separator
    public static final byte LINE_SEPARATOR = 13; // line separator
    public static final byte PARAGRAPH_SEPARATOR = 14; // paragraph separator
    public static final byte CONTROL = 15; // control character
    public static final byte FORMAT = 16; // format character
    public static final byte PRIVATE_USE = 18; // private-use character (various radicals and symbols, e.g. a folded-corner glyph (U+E801), an upper-dot glyph (U+F70A))
    public static final byte SURROGATE = 19; // surrogate
    public static final byte DASH_PUNCTUATION = 20; // dash or hyphen
    public static final byte START_PUNCTUATION = 21; // opening punctuation (e.g. (, {, [ ...)
    public static final byte END_PUNCTUATION = 22; // closing punctuation (e.g. ), }, ] ...)
    public static final byte CONNECTOR_PUNCTUATION = 23; // connector punctuation
    public static final byte OTHER_PUNCTUATION = 24; // other punctuation: #=24 。=24 、=24
    public static final byte MATH_SYMBOL = 25; // math symbol: ≈=25 ∑=25 √=25
    public static final byte CURRENCY_SYMBOL = 26; // currency symbol (e.g. "$", "¥")
    public static final byte MODIFIER_SYMBOL = 27; // modifier symbol
    public static final byte OTHER_SYMBOL = 28; // other symbol: ┌=28 ┭=28 ╃=28 §=28 ♀=28
    public static final byte INITIAL_QUOTE_PUNCTUATION = 29; // opening quotation mark: ‘=29 “=29
    public static final byte FINAL_QUOTE_PUNCTUATION = 30; // closing quotation mark, e.g. ”
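
    The constants above are the values that java.lang.Character.getType returns. The following is a minimal sketch for checking a few of them on your own JVM; the class name CharTypeDemo and the helper typeOf are made up for illustration, and the result for some characters can differ between JDK releases because each release ships its own Unicode data.

```java
final class CharTypeDemo {
    /** Returns the general-category code that java.lang.Character reports for a char. */
    static int typeOf(char c) {
        return Character.getType(c);
    }

    public static void main(String[] args) {
        // A few characters whose categories are discussed in the constant list above.
        char[] samples = {'A', 'a', '5', ' ', '$', '+', '(', ')', '_', '-', '#'};
        for (char c : samples) {
            System.out.printf("U+%04X '%c' -> %d%n", (int) c, c, typeOf(c));
        }
        // Letter number (Roman numeral four) and other number (fraction one eighth).
        System.out.println(Character.getType(0x2163) == Character.LETTER_NUMBER);
        System.out.println(Character.getType(0x215B) == Character.OTHER_NUMBER);
    }
}
```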

    UnicodeCategory Enumeration

    Defines the Unicode category of a character.

    Namespace: System.Globalization
    Assembly: mscorlib (in mscorlib.dll)

    Syntax


    C#

    [ComVisibleAttribute(true)]
    public enum UnicodeCategory

    Members


    Every member listed here is supported by Silverlight for Windows Phone and by Xbox 360.

    Member name
    Description

    UppercaseLetter
    Indicates that the character is an uppercase letter. Signified by the Unicode designation "Lu" (letter, uppercase).

    LowercaseLetter
    Indicates that the character is a lowercase letter. Signified by the Unicode designation "Ll" (letter, lowercase).

    TitlecaseLetter
    Indicates that the character is a titlecase letter. Signified by the Unicode designation "Lt" (letter, titlecase).

    ModifierLetter
    Indicates that the character is a modifier letter, which is a free-standing spacing character that indicates modifications of a preceding letter. Signified by the Unicode designation "Lm" (letter, modifier).

    OtherLetter
    Indicates that the character is a letter that is not an uppercase letter, a lowercase letter, a titlecase letter, or a modifier letter. Signified by the Unicode designation "Lo" (letter, other).

    NonSpacingMark
    Indicates that the character is a nonspacing character, which indicates modifications of a base character. Signified by the Unicode designation "Mn" (mark, nonspacing).

    SpacingCombiningMark
    Indicates that the character is a spacing character, which indicates modifications of a base character and affects the width of the glyph for that base character. Signified by the Unicode designation "Mc" (mark, spacing combining).

    EnclosingMark
    Indicates that the character is an enclosing mark, which is a nonspacing combining character that surrounds all previous characters up to and including a base character. Signified by the Unicode designation "Me" (mark, enclosing).

    DecimalDigitNumber
    Indicates that the character is a decimal digit, that is, in the range 0 through 9. Signified by the Unicode designation "Nd" (number, decimal digit).

    LetterNumber
    Indicates that the character is a number represented by a letter, instead of a decimal digit, for example, the Roman numeral for five, which is "V". The indicator is signified by the Unicode designation "Nl" (number, letter).

    OtherNumber
    Indicates that the character is a number that is neither a decimal digit nor a letter number, for example, the fraction 1/2. The indicator is signified by the Unicode designation "No" (number, other).

    SpaceSeparator
    Indicates that the character is a space character, which has no glyph but is not a control or format character. Signified by the Unicode designation "Zs" (separator, space).

    LineSeparator
    Indicates that the character is used to separate lines of text. Signified by the Unicode designation "Zl" (separator, line).

    ParagraphSeparator
    Indicates that the character is used to separate paragraphs. Signified by the Unicode designation "Zp" (separator, paragraph).

    Control
    Indicates that the character is a control code, with a Unicode value of U+007F or in the range U+0000 through U+001F or U+0080 through U+009F. Signified by the Unicode designation "Cc" (other, control).

    Format
    Indicates that the character is a format character, which is not normally rendered but affects the layout of text or the operation of text processes. Signified by the Unicode designation "Cf" (other, format).

    Surrogate
    Indicates that the character is a high surrogate or a low surrogate. Surrogate code values are in the range U+D800 through U+DFFF. Signified by the Unicode designation "Cs" (other, surrogate).

    PrivateUse
    Indicates that the character is a private-use character, with a Unicode value in the range U+E000 through U+F8FF. Signified by the Unicode designation "Co" (other, private use).

    ConnectorPunctuation
    Indicates that the character is a connector punctuation mark, which connects two characters. Signified by the Unicode designation "Pc" (punctuation, connector).

    DashPunctuation
    Indicates that the character is a dash or a hyphen. Signified by the Unicode designation "Pd" (punctuation, dash).

    OpenPunctuation
    Indicates that the character is the opening character of one of the paired punctuation marks, such as parentheses, square brackets, and braces. Signified by the Unicode designation "Ps" (punctuation, open).

    ClosePunctuation
    Indicates that the character is the closing character of one of the paired punctuation marks, such as parentheses, square brackets, and braces. Signified by the Unicode designation "Pe" (punctuation, close).

    InitialQuotePunctuation
    Indicates that the character is an opening or initial quotation mark. Signified by the Unicode designation "Pi" (punctuation, initial quote).

    FinalQuotePunctuation
    Indicates that the character is a closing or final quotation mark. Signified by the Unicode designation "Pf" (punctuation, final quote).

    OtherPunctuation
    Indicates that the character is a punctuation mark that is not a connector, a dash, an opening or closing mark, or an initial or final quotation mark. Signified by the Unicode designation "Po" (punctuation, other).

    MathSymbol
    Indicates that the character is a mathematical symbol, such as "+" or "=". Signified by the Unicode designation "Sm" (symbol, math).

    CurrencySymbol
    Indicates that the character is a currency symbol. Signified by the Unicode designation "Sc" (symbol, currency).

    ModifierSymbol
    Indicates that the character is a modifier symbol, which indicates modifications of surrounding characters. For example, the fraction slash indicates that the number to the left is the numerator and the number to the right is the denominator. The indicator is signified by the Unicode designation "Sk" (symbol, modifier).

    OtherSymbol
    Indicates that the character is a symbol that is not a mathematical symbol, a currency symbol or a modifier symbol. Signified by the Unicode designation "So" (symbol, other).

    OtherNotAssigned
    Indicates that the character is not assigned to any Unicode category. Signified by the Unicode designation "Cn" (other, not assigned).
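
    Since the Java constants and the .NET members describe the same Unicode general categories, one way to relate them is a lookup table. The sketch below is only an illustration: the class CategoryMapping and its fields are hypothetical names, and the numeric order of the .NET members (UppercaseLetter = 0 through OtherNotAssigned = 29) is an assumption taken from the .NET enumeration as commonly documented, not something stated on this page, so verify it against your framework version.

```java
final class CategoryMapping {
    // .NET UnicodeCategory member names, indexed by their assumed numeric value (0..29).
    static final String[] DOTNET_NAMES = {
        "UppercaseLetter", "LowercaseLetter", "TitlecaseLetter", "ModifierLetter",
        "OtherLetter", "NonSpacingMark", "SpacingCombiningMark", "EnclosingMark",
        "DecimalDigitNumber", "LetterNumber", "OtherNumber", "SpaceSeparator",
        "LineSeparator", "ParagraphSeparator", "Control", "Format",
        "Surrogate", "PrivateUse", "ConnectorPunctuation", "DashPunctuation",
        "OpenPunctuation", "ClosePunctuation", "InitialQuotePunctuation",
        "FinalQuotePunctuation", "OtherPunctuation", "MathSymbol",
        "CurrencySymbol", "ModifierSymbol", "OtherSymbol", "OtherNotAssigned"
    };

    // Index = value returned by Character.getType (0..30); entry = assumed .NET value.
    // Java does not use the code 17, so that slot holds -1.
    static final int[] JAVA_TO_DOTNET = {
        29, 0, 1, 2, 3, 4, 5, 7, 6, 8, 9, 10, 11, 12, 13, 14,
        15, -1, 17, 16, 19, 20, 21, 18, 24, 25, 26, 27, 28, 22, 23
    };

    /** Maps a Java category code to the assumed .NET UnicodeCategory value. */
    static int toDotNet(int javaType) {
        if (javaType < 0 || javaType >= JAVA_TO_DOTNET.length || JAVA_TO_DOTNET[javaType] < 0) {
            throw new IllegalArgumentException("unknown Java category code: " + javaType);
        }
        return JAVA_TO_DOTNET[javaType];
    }

    /** Returns the .NET member name that corresponds to a character's Java category. */
    static String dotNetName(char c) {
        return DOTNET_NAMES[toDotNet(Character.getType(c))];
    }

    public static void main(String[] args) {
        for (char c : new char[] {'A', '_', '-', '(', '$'}) {
            System.out.println(c + " -> " + dotNetName(c));
        }
    }
}
```

    A table keeps the two numberings side by side, which makes the places where the orders differ (marks, surrogate and private use, punctuation) easy to audit.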

    Remarks


    A member of the UnicodeCategory enumeration is returned by the Char.GetUnicodeCategory and CharUnicodeInfo.GetUnicodeCategory methods. The UnicodeCategory enumeration is also used to support Char methods, such as IsUpper(Char). Such methods determine whether a specified character is a member of a particular Unicode general category. A Unicode general category defines the broad classification of a character, that is, designation as a type of letter, decimal digit, separator, mathematical symbol, punctuation, and so on.

    This enumeration is based on The Unicode Standard, version 5.0. For more information, see the "UCD File Format" and "General Category Values" subtopics at the Unicode Character Database.

    The Unicode Standard defines the following:

    A surrogate pair is a coded character representation for a single abstract character that consists of a sequence of two code units, where the first unit of the pair is a high surrogate and the second is a low surrogate. A high surrogate is a Unicode code point in the range U+D800 through U+DBFF and a low surrogate is a Unicode code point in the range U+DC00 through U+DFFF.

    A combining character sequence is a combination of a base character and one or more combining characters. A surrogate pair represents a base character or a combining character. A combining character is either spacing or nonspacing. A spacing combining character takes up a spacing position by itself when rendered, while a nonspacing combining character does not. Diacritics are an example of nonspacing combining characters.

    A modifier letter is a free-standing spacing character that, like a combining character, indicates modifications of a preceding letter.

    An enclosing mark is a nonspacing combining character that surrounds all previous characters up to and including a base character.

    A format character is a character that is not normally rendered but that affects the layout of text or the operation of text processes.
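
    The definitions above can be observed from the Java side. This is a rough sketch, assuming a JDK with supplementary-character support (Java 5 or later); the class SurrogateDemo and the helper typeOfPair are hypothetical names, and the sample code points were picked only for illustration.

```java
final class SurrogateDemo {
    /** Combines a high and a low surrogate into one code point and returns its category code. */
    static int typeOfPair(char high, char low) {
        if (!Character.isSurrogatePair(high, low)) {
            throw new IllegalArgumentException("not a surrogate pair");
        }
        return Character.getType(Character.toCodePoint(high, low));
    }

    public static void main(String[] args) {
        char high = (char) 0xD801; // high surrogate (range U+D800 through U+DBFF)
        char low = (char) 0xDC00;  // low surrogate (range U+DC00 through U+DFFF)

        // A lone code unit is classified as SURROGATE.
        System.out.println(Character.getType(high) == Character.SURROGATE);
        // The pair encodes U+10400, which is classified by its own category.
        System.out.println(Integer.toHexString(Character.toCodePoint(high, low)));
        System.out.println(typeOfPair(high, low) == Character.UPPERCASE_LETTER);

        // Examples of the other definitions, given as code points.
        System.out.println(Character.getType(0x0301) == Character.NON_SPACING_MARK); // combining acute accent
        System.out.println(Character.getType(0x02B0) == Character.MODIFIER_LETTER);  // modifier letter small h
        System.out.println(Character.getType(0x20DD) == Character.ENCLOSING_MARK);   // combining enclosing circle
        System.out.println(Character.getType(0x200E) == Character.FORMAT);           // left-to-right mark
    }
}
```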

    The Unicode Standard defines several variations to some punctuation marks. For example, a hyphen can be one of several code values that represent a hyphen, such as U+002D (hyphen-minus) or U+00AD (soft hyphen) or U+2010 (hyphen) or U+2011 (nonbreaking hyphen). The same is true for dashes, space characters, and quotation marks.
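
    The hyphen variants listed above do not necessarily share one general category. As a small sketch (HyphenDemo and isDash are hypothetical names), the following checks each code point with Character.getType; on the JDKs I am aware of, the soft hyphen U+00AD is reported as FORMAT rather than DASH_PUNCTUATION, but treat that as something to confirm on your own runtime.

```java
final class HyphenDemo {
    // Hyphen-like code points named in the paragraph above.
    static final int[] HYPHEN_LIKE = {0x002D, 0x00AD, 0x2010, 0x2011};

    /** True if the code point belongs to the "Pd" (punctuation, dash) category. */
    static boolean isDash(int codePoint) {
        return Character.getType(codePoint) == Character.DASH_PUNCTUATION;
    }

    public static void main(String[] args) {
        for (int cp : HYPHEN_LIKE) {
            System.out.printf("U+%04X type=%d dash=%b%n", cp, Character.getType(cp), isDash(cp));
        }
    }
}
```

    Code that needs to treat all of these as "a hyphen" therefore has to list the code points explicitly instead of relying on a single category test.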

    The Unicode Standard also assigns codes to representations of decimal digits that are specific to a given script or language, for example, U+0030 (digit zero) and U+0660 (Arabic-Indic digit zero).
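
    Because script-specific digits share the decimal digit category, they can be converted with the same call. A minimal sketch, assuming non-negative input that fits in an int; DigitDemo and parseDecimal are hypothetical names used only for this example.

```java
final class DigitDemo {
    /** Parses a string of decimal digits from any script into an int. */
    static int parseDecimal(String s) {
        int value = 0;
        for (int i = 0; i < s.length(); i++) {
            // Character.digit returns -1 when the char is not a digit in radix 10.
            int d = Character.digit(s.charAt(i), 10);
            if (d < 0) {
                throw new NumberFormatException("not a decimal digit at index " + i);
            }
            value = value * 10 + d;
        }
        return value;
    }

    public static void main(String[] args) {
        // U+0030 and U+0660 are both reported as DECIMAL_DIGIT_NUMBER.
        System.out.println(Character.getType(0x0030) == Character.DECIMAL_DIGIT_NUMBER);
        System.out.println(Character.getType(0x0660) == Character.DECIMAL_DIGIT_NUMBER);
        // Arabic-Indic digits one, two, three.
        System.out.println(parseDecimal("\u0661\u0662\u0663"));
    }
}
```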

    And so on.
  • Original article: https://www.cnblogs.com/adodo1/p/4327423.html