  • Using the native2ascii conversion tool

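    native2ascii is a conversion tool bundled with the JDK. It converts a file in a native (non-Unicode) encoding into an ASCII file in which every non-ASCII character is written as a Unicode escape (\uXXXX). It is usually found in the JAVA_HOME/bin directory.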

    Why?

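    In Java development, garbled text and files that cannot be recognized or read correctly are common. A typical case is the message resource (properties) files used by validators, which need to be re-encoded with Unicode escapes. The reason is that Java handles text internally as Unicode and reads a properties file from an input stream as ISO-8859-1, while files saved on the local system are often in an encoding such as GBK. Converting the file from the system encoding into a form that Java reads correctly solves the problem.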
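    To illustrate why the escaped form works, here is a minimal Java sketch. It assumes the standard java.util.Properties.load(InputStream) behavior (bytes read as ISO-8859-1, Unicode escapes decoded); the class name, the method name, and the key greeting are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class EscapedPropertiesDemo {
    // Loads properties text the same way a resource file on disk would be read.
    static String loadGreeting(String fileContent) {
        Properties props = new Properties();
        try {
            // load(InputStream) reads the bytes as ISO-8859-1 and decodes Unicode escapes.
            byte[] bytes = fileContent.getBytes(StandardCharsets.ISO_8859_1);
            props.load(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("greeting"); // hypothetical key
    }

    public static void main(String[] args) {
        // The file content is pure ASCII: the two Chinese characters are escaped.
        String escaped = "greeting=\\u4e2d\\u6587";
        String value = loadGreeting(escaped);
        // Two chars are recovered from the twelve ASCII characters of the escapes.
        System.out.println(value.length());
    }
}
```

    The same file saved as raw GBK bytes would not survive this load, which is the problem the tool is meant to avoid.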

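    Syntax:

    native2ascii [options] [inputfile [outputfile]]

    Syntax notes:

    [options]: command switches. Two are available:

      -reverse: converts a file containing Unicode escapes back to the native encoding or to a specified encoding. If no encoding is specified, the platform default encoding is used.

      -encoding encoding_name: names the encoding used for the conversion, where encoding_name is the name of the encoding. It describes the input file in a forward conversion and the output file with -reverse.

    [inputfile [outputfile]]

      inputfile: the full name of the input file.

      outputfile: the name of the output file. If this argument is omitted, the result is written to the console.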
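    The forward conversion described above can be approximated in a few lines of Java. This is only a sketch of the idea, not the tool's actual source: it assumes that every char above 0x7F becomes a four-digit hexadecimal escape and that everything else is copied as-is. The class and method names are hypothetical.

```java
public class Native2AsciiSketch {
    // Replaces every char outside 7-bit ASCII with a backslash, 'u', and four hex digits.
    static String toAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c > 0x7F) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c); // ASCII characters pass through unchanged
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input holds two Chinese characters, written here as Java escapes
        // so that the source file itself stays ASCII.
        System.out.println(toAscii("name=\u4e2d\u6587"));
    }
}
```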

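    Examples:

    1. With no output file specified, the result is printed to the current command prompt window, for example:

       native2ascii old.properties

    2. Convert the given file to Unicode escapes and write the result to new.properties in the current directory:

       native2ascii old.properties new.properties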

               

               

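    3. Convert the Unicode-escaped file new.properties back to old.properties in the native encoding (this direction requires the -reverse switch):

       native2ascii -reverse new.properties old.properties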
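    The reverse direction can be sketched the same way. The code below is an illustration under simple assumptions (an escape is exactly a backslash, the letter u, and four hex digits; anything else is copied through), not the tool's real implementation. The class and method names are hypothetical.

```java
public class ReverseSketch {
    // Turns each escape (backslash, 'u', four hex digits) back into the char it names.
    static String fromAscii(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            if (text.startsWith("\\u", i) && i + 6 <= text.length() && isHex(text, i + 2)) {
                out.append((char) Integer.parseInt(text.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(text.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    // True if the four chars starting at 'start' are all hexadecimal digits.
    static boolean isHex(String s, int start) {
        for (int k = start; k < start + 4; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String restored = fromAscii("name=\\u4e2d\\u6587");
        // Compare against the real characters instead of printing them,
        // so the result does not depend on the console encoding.
        System.out.println(restored.equals("name=\u4e2d\u6587"));
    }
}
```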

               

               

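    Note: the three forms above all use relative paths, so we first have to cd to the files' directory in the command prompt and then run a command such as native2ascii old.properties new.properties, which is tedious.

    A simpler alternative is to let .bat files run the commands.

    4. Create a directory of your choice containing two .bat files, then double-click one to run it:

       change.bat

       D:\Java\bin\native2ascii F:\国际化\old.properties F:\国际化\new.properties

       ReturnChange.bat

       D:\Java\bin\native2ascii -reverse F:\国际化\new.properties F:\国际化\old.properties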
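    Where native2ascii itself is not at hand, the file-to-file conversion that change.bat performs can be approximated in Java. The sketch below is an assumption-laden stand-in, not a replacement for the tool: it reads the whole file with an explicitly given source encoding, escapes every char above 0x7F, and writes ASCII. The class name, method names, and temporary file names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class FileConvertSketch {
    // Reads 'input' in 'sourceEncoding' and writes an ASCII file with Unicode escapes.
    static void convert(Path input, Path output, Charset sourceEncoding) {
        try {
            String text = new String(Files.readAllBytes(input), sourceEncoding);
            StringBuilder escaped = new StringBuilder();
            for (char c : text.toCharArray()) {
                if (c > 0x7F) {
                    escaped.append(String.format("\\u%04x", (int) c));
                } else {
                    escaped.append(c);
                }
            }
            Files.write(output, escaped.toString().getBytes(StandardCharsets.US_ASCII));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Round trip through two temporary files; returns the converted content.
    static String demo() {
        try {
            Path in = Files.createTempFile("old", ".properties");
            Path out = Files.createTempFile("new", ".properties");
            Files.write(in, "title=\u4e2d\u6587".getBytes(StandardCharsets.UTF_8));
            convert(in, out, StandardCharsets.UTF_8);
            return new String(Files.readAllBytes(out), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

    For a file saved on a Chinese Windows system, the third argument to convert would typically be Charset.forName("GBK") instead of UTF-8.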

                

               

    Note: digits and Latin letters (plain ASCII characters) are identical before and after conversion.
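    One way to see which text is left untouched is to ask whether it is already encodable as US-ASCII. A small sketch using the standard CharsetEncoder.canEncode method follows; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class AsciiCheck {
    // True when every character of the text already fits in 7-bit ASCII,
    // which means a conversion to Unicode escapes has nothing to change.
    static boolean isPureAscii(String text) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        System.out.println(isPureAscii("key123=value"));     // digits and letters only
        System.out.println(isPureAscii("key=\u4e2d\u6587")); // contains Chinese characters
    }
}
```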

               

    Further reading:

    Basic Encoding Set (contained in lib/rt.jar)
    Supported by java.nio, java.io and java.lang APIs

    Canonical Name for java.nio API | Canonical Name for java.io and java.lang API | Description
    US-ASCII | ASCII | American Standard Code for Information Interchange
    windows-1250 | Cp1250 | Windows Eastern European
    windows-1251 | Cp1251 | Windows Cyrillic
    windows-1252 | Cp1252 | Windows Latin-1
    windows-1253 | Cp1253 | Windows Greek
    windows-1254 | Cp1254 | Windows Turkish
    windows-1257 | Cp1257 | Windows Baltic
    ISO-8859-1 | ISO8859_1 | ISO 8859-1, Latin Alphabet No. 1
    ISO-8859-2 | ISO8859_2 | Latin Alphabet No. 2
    ISO-8859-4 | ISO8859_4 | Latin Alphabet No. 4
    ISO-8859-5 | ISO8859_5 | Latin/Cyrillic Alphabet
    ISO-8859-7 | ISO8859_7 | Latin/Greek Alphabet
    ISO-8859-9 | ISO8859_9 | Latin Alphabet No. 5
    ISO-8859-13 | ISO8859_13 | Latin Alphabet No. 7
    ISO-8859-15 | ISO8859_15 | Latin Alphabet No. 9
    KOI8-R | KOI8_R | KOI8-R, Russian
    UTF-8 | UTF8 | Eight-bit UCS Transformation Format
    UTF-16 | UTF-16 | Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark
    UTF-16BE | UnicodeBigUnmarked | Sixteen-bit Unicode Transformation Format, big-endian byte order
    UTF-16LE | UnicodeLittleUnmarked | Sixteen-bit Unicode Transformation Format, little-endian byte order
    Not available | UnicodeBig | Sixteen-bit Unicode Transformation Format, big-endian byte order, with byte-order mark
    Not available | UnicodeLittle | Sixteen-bit Unicode Transformation Format, little-endian byte order, with byte-order mark

    Extended Encoding Set (contained in lib/charsets.jar)
    Supported by java.nio, java.io and java.lang APIs

    Canonical Name for java.nio API | Canonical Name for java.io and java.lang API | Description
    windows-1255 | Cp1255 | Windows Hebrew
    windows-1256 | Cp1256 | Windows Arabic
    windows-1258 | Cp1258 | Windows Vietnamese
    ISO-8859-3 | ISO8859_3 | Latin Alphabet No. 3
    ISO-8859-6 | ISO8859_6 | Latin/Arabic Alphabet
    ISO-8859-8 | ISO8859_8 | Latin/Hebrew Alphabet
    windows-31j | MS932 | Windows Japanese
    EUC-JP | EUC_JP | JISX 0201, 0208 and 0212, EUC encoding Japanese
    x-EUC-JP-LINUX | EUC_JP_LINUX | JISX 0201, 0208, EUC encoding Japanese
    Shift_JIS | SJIS | Shift-JIS, Japanese
    ISO-2022-JP | ISO2022JP | JIS X 0201, 0208, in ISO 2022 form, Japanese
    x-mswin-936 | MS936 | Windows Simplified Chinese
    GB18030 | GB18030 | Simplified Chinese, PRC standard
    x-EUC-CN | EUC_CN | GB2312, EUC encoding, Simplified Chinese
    GBK | GBK | GBK, Simplified Chinese
    ISCII91 | ISCII91 | ISCII91 encoding of Indic scripts
    x-windows-949 | MS949 | Windows Korean
    EUC-KR | EUC_KR | KS C 5601, EUC encoding, Korean
    ISO-2022-KR | ISO2022KR | ISO 2022 KR, Korean
    x-windows-950 | MS950 | Windows Traditional Chinese
    x-MS950-HKSCS | MS950_HKSCS | Windows Traditional Chinese with Hong Kong extensions
    x-EUC-TW | EUC_TW | CNS11643 (Plane 1-3), EUC encoding, Traditional Chinese
    Big5 | Big5 | Big5, Traditional Chinese
    Big5-HKSCS | Big5_HKSCS | Big5 with Hong Kong extensions, Traditional Chinese
    TIS-620 | TIS620 | TIS620, Thai

    Extended Encoding Set (contained in lib/charsets.jar)
    Supported by java.io and java.lang APIs

    Canonical Name | Description
    Big5_Solaris | Big5 with seven additional Hanzi ideograph character mappings for the Solaris zh_TW.BIG5 locale
    Cp037 | USA, Canada (Bilingual, French), Netherlands, Portugal, Brazil, Australia
    Cp273 | IBM Austria, Germany
    Cp277 | IBM Denmark, Norway
    Cp278 | IBM Finland, Sweden
    Cp280 | IBM Italy
    Cp284 | IBM Catalan/Spain, Spanish Latin America
    Cp285 | IBM United Kingdom, Ireland
    Cp297 | IBM France
    Cp420 | IBM Arabic
    Cp424 | IBM Hebrew
    Cp437 | MS-DOS United States, Australia, New Zealand, South Africa
    Cp500 | EBCDIC 500V1
    Cp737 | PC Greek
    Cp775 | PC Baltic
    Cp838 | IBM Thailand extended SBCS
    Cp850 | MS-DOS Latin-1
    Cp852 | MS-DOS Latin-2
    Cp855 | IBM Cyrillic
    Cp856 | IBM Hebrew
    Cp857 | IBM Turkish
    Cp858 | Variant of Cp850 with Euro character
    Cp860 | MS-DOS Portuguese
    Cp861 | MS-DOS Icelandic
    Cp862 | PC Hebrew
    Cp863 | MS-DOS Canadian French
    Cp864 | PC Arabic
    Cp865 | MS-DOS Nordic
    Cp866 | MS-DOS Russian
    Cp868 | MS-DOS Pakistan
    Cp869 | IBM Modern Greek
    Cp870 | IBM Multilingual Latin-2
    Cp871 | IBM Iceland
    Cp874 | IBM Thai
    Cp875 | IBM Greek
    Cp918 | IBM Pakistan (Urdu)
    Cp921 | IBM Latvia, Lithuania (AIX, DOS)
    Cp922 | IBM Estonia (AIX, DOS)
    Cp930 | Japanese Katakana-Kanji mixed with 4370 UDC, superset of 5026
    Cp933 | Korean Mixed with 1880 UDC, superset of 5029
    Cp935 | Simplified Chinese Host mixed with 1880 UDC, superset of 5031
    Cp937 | Traditional Chinese Host mixed with 6204 UDC, superset of 5033
    Cp939 | Japanese Latin Kanji mixed with 4370 UDC, superset of 5035
    Cp942 | IBM OS/2 Japanese, superset of Cp932
    Cp942C | Variant of Cp942
    Cp943 | IBM OS/2 Japanese, superset of Cp932 and Shift-JIS
    Cp943C | Variant of Cp943
    Cp948 | OS/2 Chinese (Taiwan) superset of 938
    Cp949 | PC Korean
    Cp949C | Variant of Cp949
    Cp950 | PC Chinese (Hong Kong, Taiwan)
    Cp964 | AIX Chinese (Taiwan)
    Cp970 | AIX Korean
    Cp1006 | IBM AIX Pakistan (Urdu)
    Cp1025 | IBM Multilingual Cyrillic: Bulgaria, Bosnia, Herzegovina, Macedonia (FYR)
    Cp1026 | IBM Latin-5, Turkey
    Cp1046 | IBM Arabic - Windows
    Cp1097 | IBM Iran (Farsi)/Persian
    Cp1098 | IBM Iran (Farsi)/Persian (PC)
    Cp1112 | IBM Latvia, Lithuania
    Cp1122 | IBM Estonia
    Cp1123 | IBM Ukraine
    Cp1124 | IBM AIX Ukraine
    Cp1140 | Variant of Cp037 with Euro character
    Cp1141 | Variant of Cp273 with Euro character
    Cp1142 | Variant of Cp277 with Euro character
    Cp1143 | Variant of Cp278 with Euro character
    Cp1144 | Variant of Cp280 with Euro character
    Cp1145 | Variant of Cp284 with Euro character
    Cp1146 | Variant of Cp285 with Euro character
    Cp1147 | Variant of Cp297 with Euro character
    Cp1148 | Variant of Cp500 with Euro character
    Cp1149 | Variant of Cp871 with Euro character
    Cp1381 | IBM OS/2, DOS People's Republic of China (PRC)
    Cp1383 | IBM AIX People's Republic of China (PRC)
    Cp33722 | IBM-eucJP - Japanese (superset of 5050)
    ISO2022_CN_CNS | CNS11643 in ISO 2022 CN form, Traditional Chinese (conversion from Unicode only)
    ISO2022_CN_GB | GB2312 in ISO 2022 CN form, Simplified Chinese (conversion from Unicode only)
    JISAutoDetect | Detects and converts from Shift-JIS, EUC-JP, ISO 2022 JP (conversion to Unicode only)
    MS874 | Windows Thai
    MacArabic | Macintosh Arabic
    MacCentralEurope | Macintosh Latin-2
    MacCroatian | Macintosh Croatian
    MacCyrillic | Macintosh Cyrillic
    MacDingbat | Macintosh Dingbat
    MacGreek | Macintosh Greek
    MacHebrew | Macintosh Hebrew
    MacIceland | Macintosh Iceland
    MacRoman | Macintosh Roman
    MacRomania | Macintosh Romania
    MacSymbol | Macintosh Symbol
    MacThai | Macintosh Thai
    MacTurkish | Macintosh Turkish
    MacUkraine | Macintosh Ukraine
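    The relationship between the two naming columns in the basic encoding set can be checked from code. The sketch below assumes only the standard java.nio.charset.Charset API and sticks to three names from the basic set, since the extended encodings may not be present in every runtime. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;

public class CharsetNames {
    // Resolves any accepted name (canonical or alias) to the java.nio canonical name.
    static String canonical(String name) {
        return Charset.forName(name).name();
    }

    public static void main(String[] args) {
        // java.io / java.lang style names taken from the basic encoding set.
        String[] legacyNames = {"ASCII", "UTF8", "ISO8859_1"};
        for (String legacy : legacyNames) {
            System.out.println(legacy + " -> " + canonical(legacy));
        }
    }
}
```

    Charset.forName throws UnsupportedCharsetException for a name the runtime does not know, so a lookup like this is also a quick way to test whether an encoding_name is usable before passing it to -encoding.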

    References:
    1. http://www.cr173.com/html/26685_1.html
    2. http://blog.csdn.net/love_xsq/article/details/41911681

               

            

  • Original article: https://www.cnblogs.com/Ant-soldier/p/5991310.html