Java regex extraction and replacement example

package regex;

import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

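/**
 * Note: Matcher is the main working class for regular expressions. It holds the
 * key methods for extraction and replacement; Pattern is not the main class, it
 * only represents the compiled expression.
 * replaceAll replaces every match. replaceFirst can be used for recursive,
 * one-match-at-a-time replacement.
 * 
 * @author gaoyibo
 * 
 */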
public class RegexAPITest {

    public static final int TITLE_LEN = 50;
    public static final int DESC_LEN = 80;
    // \u4e00-\u9fa5 covers Chinese characters, \uFF00-\uFFFF covers full-width forms; the rest are assorted punctuation characters.
    public static String regexStr = "\\{([\u4e00-\u9fa5A-Za-z0-9\\'\",;.:?!、《》<>‘“，；。：？！＜＞：\\s\uFF00-\uFFFF]*)\\}";

    public static enum IdeaContentType {
        TITLE, DESC1, DESC2, ACCESS_URL, SHOW_URL
    }

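    /**
     * Requirement: ad search, audit-log search and industry review in the admin
     * console need to replace the default keyword. The rules are:
     * 1. If the length after replacement exceeds 50 for a title, or 80 for desc1/desc2,
     *    do not replace; keep the default keyword and display it in green.
     * 2. Otherwise, replace with the specified keyword and display it in red.
     * 
     * @author gaoyibo
     * @param source
     *            the title, desc1 or desc2 in which to replace
     * @param key
     *            the specified keyword
     * @param type
     *            the content type, which selects the length limit
     * @return the formatted string
     */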
    public static String replaceDefaultKey(String source, String key,
            IdeaContentType type) {
        String result = "";
        if (source == null || source.length() <= 0)
            return result;
        Matcher matcher = Pattern.compile(regexStr).matcher(source);
        if (!matcher.find()) {
            return source;
        }
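        // Quote the key so that any '$' or '\' in it is treated literally by replaceAll.
        String safeKey = Matcher.quoteReplacement(key);
        String replaceFormatKey = "<font color='red'>" + safeKey + "</font>";
        // First check the length after a plain replacement of every placeholder. If it is too long,
        // do not substitute the formatted key for the default keywords in source; otherwise substitute it.
        result = Pattern.compile(regexStr).matcher(source).replaceAll(safeKey);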
        switch (type) {
        case TITLE:
            // Too long: keep the default keywords, but still format them.
            if (result.length() > TITLE_LEN) {
                return doReplace(source);
            }
            // Within the limit: replace.
            return matcher.replaceAll(replaceFormatKey);
        case DESC1:
            if (result.length() > DESC_LEN) {
                return doReplace(source);
            }
            return matcher.replaceAll(replaceFormatKey);
        case DESC2:
            if (result.length() > DESC_LEN) {
                return doReplace(source);
            }
            return matcher.replaceAll(replaceFormatKey);
        default:
            return source;
        }

    }

    /**
     * Recursive method: each call formats the first matched default keyword.
     * 
     * @author gaoyibo
     * @param source
     *            the string containing {default keyword} placeholders
     * @return the string with every default keyword wrapped in a green font tag
     */
    public static String doReplace(String source) {
        Matcher matcher = Pattern.compile(regexStr).matcher(source);
        if (matcher.find()) {
            // Extract the matched text, braces included.
            String keytmp = matcher.group();
            String defaultFormatKey = "<font color='green'>"
                    + keytmp.substring(1, keytmp.length() - 1) + "</font>";

            // Replace the first match, then recurse on the result until
            // every default keyword has been formatted.
            return doReplace(matcher.replaceFirst(defaultFormatKey));
        }
        return source;
    }

    public static void main(String[] args) throws IOException {
        // The string below contains several default-keyword placeholders, each with different content.
        String testStr = "asd體:{重1:}23：{：}saA{d中國ks1}asdadsa{DK2}asda{dkＡＳＤ１２３3s}sad2ｒｔｙ34";
        System.out.println(replaceDefaultKey(testStr, "kkkkkk",
                IdeaContentType.TITLE));

    }

}
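
A note on passing a caller-supplied keyword to Matcher.replaceAll: the replacement string is not taken literally, because '$' introduces a group reference and '\' is an escape character. The sketch below shows one way to keep the keyword literal with Matcher.quoteReplacement. The class name QuoteReplacementSketch, the helper replaceLiteral and the simplified placeholder pattern are assumptions made for illustration; they are not part of the original class.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteReplacementSketch {

    // Simplified placeholder pattern (an assumption for this sketch):
    // any run of non-brace characters enclosed in { }.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([^{}]*)\\}");

    /** Replaces every {placeholder} with the key, treating the key as literal text. */
    public static String replaceLiteral(String source, String key) {
        return PLACEHOLDER.matcher(source).replaceAll(Matcher.quoteReplacement(key));
    }

    public static void main(String[] args) {
        String source = "Buy {default} today";
        // Quoted: the '$' in the key is kept as an ordinary character.
        System.out.println(replaceLiteral(source, "a$b"));
        try {
            // Unquoted: "$b" is parsed as a group reference and rejected.
            PLACEHOLDER.matcher(source).replaceAll("a$b");
        } catch (RuntimeException e) {
            System.out.println("unquoted key failed: " + e.getClass().getSimpleName());
        }
    }
}
```

If the keyword can never contain '$' or '\', quoting changes nothing, so it is cheap to apply unconditionally.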

It turns out that appendReplacement combined with appendTail achieves the same effect as the recursive method above, as in the following example:

    public static String doReplace2(String source) {
        Matcher matcher = Pattern.compile(regexStr).matcher(source);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            String keytmp = matcher.group();
            String defaultFormatKey = "<font color='green'>"
                    + keytmp.substring(1, keytmp.length() - 1) + "</font>";
            matcher.appendReplacement(sb, defaultFormatKey);
        }
        matcher.appendTail(sb);
        return sb.toString();

    }
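
On Java 9 or later, the same find/append loop can probably be written more compactly with the Matcher.replaceAll overload that takes a function. The sketch below is an alternative under that assumption, not the author's method. The class name FunctionalReplaceSketch, the method formatDefaults and the simplified placeholder pattern are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FunctionalReplaceSketch {

    // Simplified placeholder pattern (an assumption for this sketch):
    // group 1 captures the text between the braces.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([^{}]*)\\}");

    /** Wraps the content of every {placeholder} in a green font tag. */
    public static String formatDefaults(String source) {
        // Requires Java 9+. The function's result is still parsed as a
        // replacement string, so it is quoted to keep '$' and '\' literal.
        return PLACEHOLDER.matcher(source).replaceAll(
                m -> Matcher.quoteReplacement(
                        "<font color='green'>" + m.group(1) + "</font>"));
    }

    public static void main(String[] args) {
        System.out.println(formatDefaults("asd{DK2}asda{abc}sad"));
    }
}
```

Using group 1 avoids the substring call that strips the braces from the full match.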

------------------------------------------------------

Requirement: provide a method that replaces the line breaks in a string.

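On Linux and Unix the line break is "\n", whereas on Windows it is not a bare "\n" but "\r\n". As a result, out.write("\n") produces only a black box, because Windows does not treat it as a line break. When text is typed directly in Notepad, Windows inserts "\r\n" automatically, so what is read back from the text file is also "\r\n" and displays correctly.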
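
The difference between the two conventions can be smoothed over by normalizing line breaks before further processing. The sketch below is one possible approach, assuming Java 8 or later, where the regex construct \R matches any line break and treats "\r\n" as a single unit. The class name LineBreakSketch and its two methods are hypothetical names chosen for illustration.

```java
public class LineBreakSketch {

    /** Converts every line break matched by \R (including "\r\n") to "\n". */
    public static String toUnix(String text) {
        return text.replaceAll("\\R", "\n");
    }

    /** Converts every line break matched by \R to the Windows form "\r\n". */
    public static String toWindows(String text) {
        return text.replaceAll("\\R", "\r\n");
    }

    public static void main(String[] args) {
        String mixed = "line1\r\nline2\nline3";
        // Count the characters to make the invisible difference observable.
        System.out.println("unix length:    " + toUnix(mixed).length());
        System.out.println("windows length: " + toWindows(mixed).length());
        // The platform's own separator is available without hard-coding it.
        System.out.println("platform separator length: " + System.lineSeparator().length());
    }
}
```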

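If the regex is written as ^([^\n]*(\n)*)*$, the entire content is matched and highlighted. For example:
abc
abc
abc
If the regex is written as [\n-\r], only the line breaks between the content are matched. Note that this character class is a range from U+000A to U+000D, so besides "\n" and "\r" it also matches vertical tab and form feed.

abc

The highlighting on a line break is not visible.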

Replacement method:

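    public static String replaceIllegalCharacter(String source) {
        if (source == null)
            return source;
        // The class is a range from \n (U+000A) to \r (U+000D), so it also
        // matches vertical tab (U+000B) and form feed (U+000C).
        String reg = "[\n-\r]";
        Pattern p = Pattern.compile(reg);
        Matcher m = p.matcher(source);
        // Delete every matched character.
        return m.replaceAll("");
    }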
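
To make the scope of the character class concrete, the sketch below compares the range [\n-\r] with the two-character class [\r\n] on a string that also contains a vertical tab and a form feed. The class name NewlineClassSketch and both method names are assumptions made for illustration.

```java
public class NewlineClassSketch {

    /** Removes everything in the range U+000A..U+000D (LF, VT, FF, CR). */
    public static String stripRange(String s) {
        return s.replaceAll("[\n-\r]", "");
    }

    /** Removes only carriage returns and line feeds. */
    public static String stripCrLf(String s) {
        return s.replaceAll("[\r\n]", "");
    }

    public static void main(String[] args) {
        String vt = String.valueOf((char) 0x0B); // vertical tab
        String ff = String.valueOf((char) 0x0C); // form feed
        String input = "a\r\nb" + vt + "c" + ff + "d\n";
        // Lengths are printed because the removed characters are invisible.
        System.out.println("range class length: " + stripRange(input).length());
        System.out.println("CR/LF class length: " + stripCrLf(input).length());
    }
}
```

Which class is appropriate depends on whether vertical tab and form feed should also count as illegal characters in the input.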

-------------------------------------------------------

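Note:

Regex processing is often done on the client side, but some cases need to be handled on the back end. For example, when an Excel file is uploaded and its rows are read into records, the validation and cleansing of each row may have to happen on the back end.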

Appendix:

http://www.java3z.com/cwbwebhome/article/article8/Regex/Java.Regex.Tutorial.html


Original article: https://www.cnblogs.com/highriver/p/2050346.html