  • Garbled-text (mojibake) problems with Tomcat deployed on CentOS 6.8

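    Web requests often produce all sorts of garbled text for no obvious reason. As I understand it, the first step is to set up a charset filter. The code is as follows:

    import java.io.IOException;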

    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import java.io.ByteArrayOutputStream;
    import java.io.OutputStreamWriter;
    
    public class Charset implements Filter {
    
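        @Override // called when the filter is destroyed
        public void destroy() {
            // TODO Auto-generated method stub
    
        }
    
        @Override // my impression is that chain forwards the request onward
        public void doFilter(ServletRequest request, ServletResponse response,
                FilterChain chain) throws IOException, ServletException {
            // Set the character encoding to fix the garbled text
            request.setCharacterEncoding("utf-8");
            response.setCharacterEncoding("utf-8");
            // The original read "chatset=utf-8", which is not a valid content type;
            // text/html is assumed here, so map this filter to HTML/JSP responses only
            response.setContentType("text/html;charset=utf-8");
            // Let the target run, i.e. pass the request through
            chain.doFilter(request, response);
        }
    
        @Override // called when the filter is initialized
        public void init(FilterConfig arg0) throws ServletException {
            // TODO Auto-generated method stub
        }
    
        // Reports the encoding an OutputStreamWriter picks when none is given,
        // which is the JVM's default charset
        private static String getDefaultCharSet() {
            OutputStreamWriter writer = new OutputStreamWriter(new ByteArrayOutputStream());
            String enc = writer.getEncoding();
            return enc;
        }
    }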
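
    To see why every layer has to agree on UTF-8, here is a minimal sketch of what goes wrong when it does not. It is not part of the filter; the class and method names (MojibakeDemo, misread, repair) are made up for illustration. It decodes UTF-8 bytes as ISO-8859-1, the kind of mismatch a misconfigured server can produce, and then undoes the damage, which only works because ISO-8859-1 maps every byte to exactly one char.

    ```java
    import java.nio.charset.StandardCharsets;

    class MojibakeDemo {
        // Decode UTF-8 bytes with the wrong charset, as a misconfigured layer would.
        static String misread(String text) {
            byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
            return new String(utf8, StandardCharsets.ISO_8859_1);
        }

        // Recover the text: turn the chars back into the original bytes,
        // then decode those bytes with the charset they were written in.
        static String repair(String garbled) {
            byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
            return new String(raw, StandardCharsets.UTF_8);
        }

        public static void main(String[] args) {
            // Two Chinese characters, written as escapes so the source file stays ASCII
            String original = "\u6d4b\u8bd5";
            String garbled = misread(original);
            System.out.println(garbled.length());                 // 6: one char per UTF-8 byte
            System.out.println(repair(garbled).equals(original)); // true
        }
    }
    ```

    The repair step is a last resort. If any layer had replaced bytes with '?' along the way, the original text would be gone for good, which is why setting the encoding up front is the better fix.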

    Then, in server.xml under Tomcat's conf directory, add URIEncoding="UTF-8" to the Connector:

     <Connector port="8080" protocol="HTTP/1.1"
                   connectionTimeout="20000"
                   redirectPort="8443"   URIEncoding="UTF-8"   />
    

    Finally, remember to put pageEncoding="utf-8" at the top of every JSP:

    <%@ page language="java" import="java.util.*" pageEncoding="utf-8"%>
    

    With that in place, Java web development should be basically free of garbled text.

    Digging a little deeper: the String class has a method String.getBytes(). Its JDK source looks like this:

        @Deprecated
        public void getBytes(int srcBegin, int srcEnd, byte dst[], int dstBegin) {
            if (srcBegin < 0) {
                throw new StringIndexOutOfBoundsException(srcBegin);
            }
            if (srcEnd > value.length) {
                throw new StringIndexOutOfBoundsException(srcEnd);
            }
            if (srcBegin > srcEnd) {
                throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
            }
            int j = dstBegin;
            int n = srcEnd;
            int i = srcBegin;
            char[] val = value;   /* avoid getfield opcode */
    
            while (i < n) {
                dst[j++] = (byte)val[i++];
            }
        }
    
        /**
         * Encodes this {@code String} into a sequence of bytes using the named
         * charset, storing the result into a new byte array.
         *
         * <p> The behavior of this method when this string cannot be encoded in
         * the given charset is unspecified.  The {@link
         * java.nio.charset.CharsetEncoder} class should be used when more control
         * over the encoding process is required.
         *
         * @param  charsetName
         *         The name of a supported {@linkplain java.nio.charset.Charset
         *         charset}
         *
         * @return  The resultant byte array
         *
         * @throws  UnsupportedEncodingException
         *          If the named charset is not supported
         *
         * @since  JDK1.1
         */
        public byte[] getBytes(String charsetName)
                throws UnsupportedEncodingException {
            if (charsetName == null) throw new NullPointerException();
            return StringCoding.encode(charsetName, value, 0, value.length);
        }
    
        /**
         * Encodes this {@code String} into a sequence of bytes using the given
         * {@linkplain java.nio.charset.Charset charset}, storing the result into a
         * new byte array.
         *
         * <p> This method always replaces malformed-input and unmappable-character
         * sequences with this charset's default replacement byte array.  The
         * {@link java.nio.charset.CharsetEncoder} class should be used when more
         * control over the encoding process is required.
         *
         * @param  charset
         *         The {@linkplain java.nio.charset.Charset} to be used to encode
         *         the {@code String}
         *
         * @return  The resultant byte array
         *
         * @since  1.6
         */
        public byte[] getBytes(Charset charset) {
            if (charset == null) throw new NullPointerException();
            return StringCoding.encode(charset, value, 0, value.length);
        }
    
        /**
         * Encodes this {@code String} into a sequence of bytes using the
         * platform's default charset, storing the result into a new byte array.
         *
         * <p> The behavior of this method when this string cannot be encoded in
         * the default charset is unspecified.  The {@link
         * java.nio.charset.CharsetEncoder} class should be used when more control
         * over the encoding process is required.
         *
         * @return  The resultant byte array
         *
         * @since      JDK1.1
         */
        public byte[] getBytes() {
            return StringCoding.encode(value, 0, value.length);
        }

    If you call String.getBytes() without an argument, it uses the platform's default charset, that is, the JVM's default. You can check that setting as follows:

       System.out.println("Default Charset=" + java.nio.charset.Charset.defaultCharset());  
       System.out.println("file.encoding=" + System.getProperty("file.encoding"));  
       System.out.println("Default Charset in Use=" + getDefaultCharSet());  
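
    A small sketch of why relying on the default is risky: the same string turns into a different number of bytes depending on the charset. GetBytesDemo and its helper are illustrative names, and the GBK line assumes the running JDK ships that charset (hence the isSupported check).

    ```java
    import java.nio.charset.Charset;
    import java.nio.charset.StandardCharsets;

    class GetBytesDemo {
        // Byte length of s when encoded with the named charset.
        static int lengthIn(String s, String charsetName) {
            return s.getBytes(Charset.forName(charsetName)).length;
        }

        public static void main(String[] args) {
            // Two Chinese characters, written as escapes so the source file stays ASCII
            String text = "\u6d4b\u8bd5";

            // Depends on the machine: this is the call that changes between servers
            System.out.println("default (" + Charset.defaultCharset() + "): "
                    + text.getBytes().length);

            // Explicit charsets give the same answer everywhere
            System.out.println("UTF-8: " + text.getBytes(StandardCharsets.UTF_8).length); // UTF-8: 6
            if (Charset.isSupported("GBK")) {
                System.out.println("GBK: " + lengthIn(text, "GBK")); // GBK: 4
            }
        }
    }
    ```

    Passing the charset explicitly, as in getBytes(StandardCharsets.UTF_8), takes the server's locale out of the picture.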

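    It is worth mentioning that when you upload files to the server with an SFTP tool and have not specified a filename encoding for the SFTP session, the tool sends Chinese filenames in your local system's default encoding (GBK on most Chinese-language Windows installations). If the server's default encoding is not GBK, the uploaded filenames show up garbled on the server.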
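
    The effect can be reproduced in a few lines. The sketch below is my own illustration, not something an SFTP tool does: FileNameDecoder and decode are hypothetical names, and it assumes the JDK provides the GBK charset. It encodes a filename as GBK, shows what a UTF-8 server makes of those bytes, and then tries a simple heuristic: decode strictly as UTF-8 and fall back to another charset only if the bytes are not valid UTF-8.

    ```java
    import java.nio.ByteBuffer;
    import java.nio.charset.CharacterCodingException;
    import java.nio.charset.Charset;
    import java.nio.charset.CodingErrorAction;
    import java.nio.charset.StandardCharsets;

    class FileNameDecoder {
        // Strict UTF-8 first; if the bytes are not valid UTF-8, use the fallback charset.
        static String decode(byte[] raw, Charset fallback) {
            try {
                return StandardCharsets.UTF_8.newDecoder()
                        .onMalformedInput(CodingErrorAction.REPORT)
                        .onUnmappableCharacter(CodingErrorAction.REPORT)
                        .decode(ByteBuffer.wrap(raw))
                        .toString();
            } catch (CharacterCodingException e) {
                return new String(raw, fallback);
            }
        }

        public static void main(String[] args) {
            // A Chinese filename, written as escapes so the source file stays ASCII
            String name = "\u6d4b\u8bd5.mp3";
            Charset gbk = Charset.forName("GBK"); // assumption: GBK is available in this JDK

            byte[] sent = name.getBytes(gbk);     // what a GBK client puts on the wire
            String seenByServer = new String(sent, StandardCharsets.UTF_8);

            System.out.println(seenByServer.contains("\uFFFD")); // true: invalid bytes were replaced
            System.out.println(decode(sent, gbk).equals(name));   // true
        }
    }
    ```

    This is only a heuristic: some GBK byte sequences happen to be valid UTF-8 as well, so configuring the SFTP client to send UTF-8 names remains the more reliable fix.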

    On CentOS, the command to check the system encoding is  echo $LANG ; you can also run the  locale  command to see it.

    In Tomcat 7, consider a GET request for a file with a Chinese name, such as uploads/%E6%B5%8B%E8%AF%95.mp3

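    The browser itself URL-encodes the Chinese name, spaces and so on into the percent-encoded form (%E8 and the like) used for transmitting web addresses. When the servlet receives the address, it first URL-decodes it back into raw bytes, and then, I assumed, turns those bytes into the Chinese name according to request.setCharacterEncoding("utf-8"). But if that were the whole story, what would be the point of setting Tomcat's  URIEncoding="UTF-8" ?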
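
    A likely answer to that question: request.setCharacterEncoding() only governs how the request body is decoded, while URIEncoding tells Tomcat which charset to use when turning the percent-decoded bytes of the URI into characters. In Tomcat 7 that defaults to ISO-8859-1. The sketch below imitates both outcomes with the JDK's URLEncoder and URLDecoder. These classes implement form encoding (a space becomes +), so they only approximate what a browser does to a path, and UriEncodingDemo and its helpers are illustrative names, not Tomcat APIs.

    ```java
    import java.io.UnsupportedEncodingException;
    import java.net.URLDecoder;
    import java.net.URLEncoder;

    class UriEncodingDemo {
        // Percent-encode s using its UTF-8 bytes, roughly what the browser sends.
        static String encode(String s) {
            try {
                return URLEncoder.encode(s, "UTF-8");
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException(e); // UTF-8 is always available
            }
        }

        // Percent-decode s and interpret the resulting bytes in the given charset.
        static String decode(String s, String charsetName) {
            try {
                return URLDecoder.decode(s, charsetName);
            } catch (UnsupportedEncodingException e) {
                throw new IllegalArgumentException(e);
            }
        }

        public static void main(String[] args) {
            // The filename from the URL above, written as escapes to keep the source ASCII
            String name = "\u6d4b\u8bd5.mp3";
            String encoded = encode(name);
            System.out.println(encoded);                                  // %E6%B5%8B%E8%AF%95.mp3

            // Comparable to URIEncoding="UTF-8": the name comes back intact
            System.out.println(decode(encoded, "UTF-8").equals(name));    // true

            // Comparable to the Tomcat 7 default: each byte becomes its own char
            System.out.println(decode(encoded, "ISO-8859-1").length());   // 10
        }
    }
    ```

    With the ISO-8859-1 decoding the path no longer matches the real filename on disk, which would explain why a request for a Chinese-named file fails without the URIEncoding setting even though the filter is in place.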

    So character encoding really is a curious thing. This post will be updated later.

  • Original post: https://www.cnblogs.com/sundaymorning/p/7353325.html