  • Introduction to using Java's native2ascii

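    native2ascii converts characters in a native (non-Unicode) encoding into Unicode-escaped ASCII text, a common step in internationalization.

    Syntax: native2ascii [options] [inputfile [outputfile]]


    Description: if outputfile is not specified, the result is written to standard output; if inputfile is not specified, input is read from standard input.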

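    Options
    -reverse
    Performs the reverse operation: converts Unicode-escaped text back into characters in the native encoding.

    -encoding encoding_name
    Specifies the character encoding to use for the conversion. If omitted, the default is taken from the system property file.encoding. The tables that follow list the supported encodings; pass the name from the first column as encoding_name.

    -Joption
    Passes option to the Java virtual machine at startup; it is rarely needed. For example, -J-Xms48m sets the JVM's initial memory allocation to 48 MB.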
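
    The forward conversion, together with the role of -encoding, can be sketched in Java. This is a minimal illustration and not the tool's actual source: the class name NativeToAsciiSketch and the method toAscii are hypothetical, and the sketch assumes that every char above 0x7F is written as a backslash-u escape with four lowercase hex digits, as in the examples in this article.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the forward (native -> escaped ASCII) direction.
final class NativeToAsciiSketch {

    // Decode the raw bytes with the given charset (the job of -encoding),
    // then copy ASCII chars through and escape everything else.
    static String toAscii(byte[] nativeBytes, Charset encoding) {
        String text = new String(nativeBytes, encoding);
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c <= 0x7F) {
                out.append(c);
            } else {
                // Assumption: four lowercase hex digits per escaped char.
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The string literal uses Unicode escapes so the source file stays pure ASCII.
        byte[] input = "native2ascii\u6d4b\u8bd5".getBytes(StandardCharsets.UTF_8);
        System.out.println(toAscii(input, StandardCharsets.UTF_8));
    }
}
```

    Passing a different Charset (for example one obtained with Charset.forName) corresponds to choosing a different -encoding value; decoding with the wrong charset yields the wrong escapes.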

    Example 1:
    native2ascii test.txt test_unicode.txt

    Contents of test.txt: native2ascii测试

    Contents of test_unicode.txt: native2ascii\u6d4b\u8bd5

    Example 2:
    native2ascii -reverse test_unicode.txt test_gbk.txt

    Contents of test_gbk.txt: native2ascii测试
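
    The reverse direction used in the second example can be sketched the same way. This is an illustrative sketch under stated assumptions, not the tool's implementation: AsciiToNativeSketch and fromAscii are hypothetical names, and only a backslash followed by u and exactly four hex digits is treated as an escape; anything else is copied unchanged.

```java
// Hypothetical sketch of the reverse (escaped ASCII -> characters) direction.
final class AsciiToNativeSketch {

    // True if the four chars starting at 'start' are all hex digits.
    private static boolean isHex4(String s, int start) {
        for (int j = start; j < start + 4; j++) {
            if (Character.digit(s.charAt(j), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    // Replace each six-char escape (backslash, 'u', four hex digits) with the char it names.
    static String fromAscii(String escaped) {
        StringBuilder out = new StringBuilder(escaped.length());
        int i = 0;
        while (i < escaped.length()) {
            char c = escaped.charAt(i);
            if (c == '\\' && i + 6 <= escaped.length()
                    && escaped.charAt(i + 1) == 'u' && isHex4(escaped, i + 2)) {
                out.append((char) Integer.parseInt(escaped.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints the decoded text; how it renders depends on the console encoding.
        System.out.println(fromAscii("native2ascii\\u6d4b\\u8bd5"));
    }
}
```

    To write the result in a native encoding such as GBK, the decoded string would then be encoded with String.getBytes(Charset), which is the counterpart of the decoding step in the forward direction.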

    Basic Encoding Set (contained in lib/rt.jar)
    Supported by java.nio, java.io and java.lang APIs

    Canonical Name for java.nio API  Canonical Name for java.io and java.lang API  Description
    US-ASCII                         ASCII                                         American Standard Code for Information Interchange
    windows-1250                     Cp1250                                        Windows Eastern European
    windows-1251                     Cp1251                                        Windows Cyrillic
    windows-1252                     Cp1252                                        Windows Latin-1
    windows-1253                     Cp1253                                        Windows Greek
    windows-1254                     Cp1254                                        Windows Turkish
    windows-1257                     Cp1257                                        Windows Baltic
    ISO-8859-1                       ISO8859_1                                     ISO 8859-1, Latin Alphabet No. 1
    ISO-8859-2                       ISO8859_2                                     Latin Alphabet No. 2
    ISO-8859-4                       ISO8859_4                                     Latin Alphabet No. 4
    ISO-8859-5                       ISO8859_5                                     Latin/Cyrillic Alphabet
    ISO-8859-7                       ISO8859_7                                     Latin/Greek Alphabet
    ISO-8859-9                       ISO8859_9                                     Latin Alphabet No. 5
    ISO-8859-13                      ISO8859_13                                    Latin Alphabet No. 7
    ISO-8859-15                      ISO8859_15                                    Latin Alphabet No. 9
    KOI8-R                           KOI8_R                                        KOI8-R, Russian
    UTF-8                            UTF8                                          Eight-bit UCS Transformation Format
    UTF-16                           UTF-16                                        Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark
    UTF-16BE                         UnicodeBigUnmarked                            Sixteen-bit Unicode Transformation Format, big-endian byte order
    UTF-16LE                         UnicodeLittleUnmarked                         Sixteen-bit Unicode Transformation Format, little-endian byte order
    Not available                    UnicodeBig                                    Sixteen-bit Unicode Transformation Format, big-endian byte order, with byte-order mark
    Not available                    UnicodeLittle                                 Sixteen-bit Unicode Transformation Format, little-endian byte order, with byte-order mark
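
    The relationship between the two name columns can be checked from Java code. The sketch below is only an illustration (CharsetNameSketch and nioName are hypothetical names): it relies on java.nio.charset.Charset.forName accepting a charset's aliases as well as its canonical name, and on Charset.name() returning the canonical java.nio name. Which aliases are registered depends on the JDK in use, so any given lookup may throw UnsupportedCharsetException.

```java
import java.nio.charset.Charset;

// Hypothetical sketch: resolve a java.io/java.lang style name to its java.nio canonical name.
final class CharsetNameSketch {

    // Charset.forName accepts aliases; name() reports the canonical java.nio name.
    static String nioName(String anyName) {
        return Charset.forName(anyName).name();
    }

    public static void main(String[] args) {
        String[] ioNames = {"ASCII", "ISO8859_1", "UTF8", "UnicodeBigUnmarked"};
        for (String ioName : ioNames) {
            System.out.println(ioName + " -> " + nioName(ioName));
        }
    }
}
```

    Charset.availableCharsets() returns the charsets installed in the running JVM, which is a quick way to confirm whether a name from these tables is actually usable on a given system.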

    Extended Encoding Set (contained in lib/charsets.jar)
    Supported by java.nio, java.io and java.lang APIs

    Canonical Name for java.nio API  Canonical Name for java.io and java.lang API  Description
    windows-1255                     Cp1255                                        Windows Hebrew
    windows-1256                     Cp1256                                        Windows Arabic
    windows-1258                     Cp1258                                        Windows Vietnamese
    ISO-8859-3                       ISO8859_3                                     Latin Alphabet No. 3
    ISO-8859-6                       ISO8859_6                                     Latin/Arabic Alphabet
    ISO-8859-8                       ISO8859_8                                     Latin/Hebrew Alphabet
    windows-31j                      MS932                                         Windows Japanese
    EUC-JP                           EUC_JP                                        JISX 0201, 0208 and 0212, EUC encoding Japanese
    x-EUC-JP-LINUX                   EUC_JP_LINUX                                  JISX 0201, 0208, EUC encoding Japanese
    Shift_JIS                        SJIS                                          Shift-JIS, Japanese
    ISO-2022-JP                      ISO2022JP                                     JIS X 0201, 0208, in ISO 2022 form, Japanese
    x-mswin-936                      MS936                                         Windows Simplified Chinese
    GB18030                          GB18030                                       Simplified Chinese, PRC standard
    x-EUC-CN                         EUC_CN                                        GB2312, EUC encoding, Simplified Chinese
    GBK                              GBK                                           GBK, Simplified Chinese
    ISCII91                          ISCII91                                       ISCII91 encoding of Indic scripts
    x-windows-949                    MS949                                         Windows Korean
    EUC-KR                           EUC_KR                                        KS C 5601, EUC encoding, Korean
    ISO-2022-KR                      ISO2022KR                                     ISO 2022 KR, Korean
    x-windows-950                    MS950                                         Windows Traditional Chinese
    x-MS950-HKSCS                    MS950_HKSCS                                   Windows Traditional Chinese with Hong Kong extensions
    x-EUC-TW                         EUC_TW                                        CNS11643 (Plane 1-3), EUC encoding, Traditional Chinese
    Big5                             Big5                                          Big5, Traditional Chinese
    Big5-HKSCS                       Big5_HKSCS                                    Big5 with Hong Kong extensions, Traditional Chinese
    TIS-620                          TIS620                                        TIS620, Thai

    Extended Encoding Set (contained in lib/charsets.jar)
    Supported by java.io and java.lang APIs

    Canonical Name    Description
    Big5_Solaris      Big5 with seven additional Hanzi ideograph character mappings for the Solaris zh_TW.BIG5 locale
    Cp037             USA, Canada (Bilingual, French), Netherlands, Portugal, Brazil, Australia
    Cp273             IBM Austria, Germany
    Cp277             IBM Denmark, Norway
    Cp278             IBM Finland, Sweden
    Cp280             IBM Italy
    Cp284             IBM Catalan/Spain, Spanish Latin America
    Cp285             IBM United Kingdom, Ireland
    Cp297             IBM France
    Cp420             IBM Arabic
    Cp424             IBM Hebrew
    Cp437             MS-DOS United States, Australia, New Zealand, South Africa
    Cp500             EBCDIC 500V1
    Cp737             PC Greek
    Cp775             PC Baltic
    Cp838             IBM Thailand extended SBCS
    Cp850             MS-DOS Latin-1
    Cp852             MS-DOS Latin-2
    Cp855             IBM Cyrillic
    Cp856             IBM Hebrew
    Cp857             IBM Turkish
    Cp858             Variant of Cp850 with Euro character
    Cp860             MS-DOS Portuguese
    Cp861             MS-DOS Icelandic
    Cp862             PC Hebrew
    Cp863             MS-DOS Canadian French
    Cp864             PC Arabic
    Cp865             MS-DOS Nordic
    Cp866             MS-DOS Russian
    Cp868             MS-DOS Pakistan
    Cp869             IBM Modern Greek
    Cp870             IBM Multilingual Latin-2
    Cp871             IBM Iceland
    Cp874             IBM Thai
    Cp875             IBM Greek
    Cp918             IBM Pakistan (Urdu)
    Cp921             IBM Latvia, Lithuania (AIX, DOS)
    Cp922             IBM Estonia (AIX, DOS)
    Cp930             Japanese Katakana-Kanji mixed with 4370 UDC, superset of 5026
    Cp933             Korean Mixed with 1880 UDC, superset of 5029
    Cp935             Simplified Chinese Host mixed with 1880 UDC, superset of 5031
    Cp937             Traditional Chinese Host mixed with 6204 UDC, superset of 5033
    Cp939             Japanese Latin Kanji mixed with 4370 UDC, superset of 5035
    Cp942             IBM OS/2 Japanese, superset of Cp932
    Cp942C            Variant of Cp942
    Cp943             IBM OS/2 Japanese, superset of Cp932 and Shift-JIS
    Cp943C            Variant of Cp943
    Cp948             OS/2 Chinese (Taiwan) superset of 938
    Cp949             PC Korean
    Cp949C            Variant of Cp949
    Cp950             PC Chinese (Hong Kong, Taiwan)
    Cp964             AIX Chinese (Taiwan)
    Cp970             AIX Korean
    Cp1006            IBM AIX Pakistan (Urdu)
    Cp1025            IBM Multilingual Cyrillic: Bulgaria, Bosnia, Herzegovina, Macedonia (FYR)
    Cp1026            IBM Latin-5, Turkey
    Cp1046            IBM Arabic - Windows
    Cp1097            IBM Iran (Farsi)/Persian
    Cp1098            IBM Iran (Farsi)/Persian (PC)
    Cp1112            IBM Latvia, Lithuania
    Cp1122            IBM Estonia
    Cp1123            IBM Ukraine
    Cp1124            IBM AIX Ukraine
    Cp1140            Variant of Cp037 with Euro character
    Cp1141            Variant of Cp273 with Euro character
    Cp1142            Variant of Cp277 with Euro character
    Cp1143            Variant of Cp278 with Euro character
    Cp1144            Variant of Cp280 with Euro character
    Cp1145            Variant of Cp284 with Euro character
    Cp1146            Variant of Cp285 with Euro character
    Cp1147            Variant of Cp297 with Euro character
    Cp1148            Variant of Cp500 with Euro character
    Cp1149            Variant of Cp871 with Euro character
    Cp1381            IBM OS/2, DOS People's Republic of China (PRC)
    Cp1383            IBM AIX People's Republic of China (PRC)
    Cp33722           IBM-eucJP - Japanese (superset of 5050)
    ISO2022_CN_CNS    CNS11643 in ISO 2022 CN form, Traditional Chinese (conversion from Unicode only)
    ISO2022_CN_GB     GB2312 in ISO 2022 CN form, Simplified Chinese (conversion from Unicode only)
    JISAutoDetect     Detects and converts from Shift-JIS, EUC-JP, ISO 2022 JP (conversion to Unicode only)
    MS874             Windows Thai
    MacArabic         Macintosh Arabic
    MacCentralEurope  Macintosh Latin-2
    MacCroatian       Macintosh Croatian
    MacCyrillic       Macintosh Cyrillic
    MacDingbat        Macintosh Dingbat
    MacGreek          Macintosh Greek
    MacHebrew         Macintosh Hebrew
    MacIceland        Macintosh Iceland
    MacRoman          Macintosh Roman
    MacRomania        Macintosh Romania
    MacSymbol         Macintosh Symbol
    MacThai           Macintosh Thai
    MacTurkish        Macintosh Turkish
    MacUkraine        Macintosh Ukraine

  • Original article: https://www.cnblogs.com/lechance/p/4373274.html