  • Regular Expressions

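    Disclaimer: regular expressions can simplify code enormously, but anyone who cannot read them is left with nothing to do but curse. I have been there myself, so I felt I had to work through regular expressions properly. (More regex checks will be added over time.)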

    I will not dwell on the basic concepts. Let's go straight to examples and explain through them.

    I. Regex Basics

    Demo1:

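    Java handles backslashes differently from other languages. In other languages, \\ means "I want to insert a plain (literal) backslash into the regular expression; please give it no special meaning." In Java, \\ means "I am inserting a regular-expression backslash, so the character that follows has special meaning." For example, to match a single digit, the regex string is \\d. To insert a literal backslash, you write \\\\. Newlines, tabs, and the like need only a single backslash (\n, \t):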

    package com.rah;
    /***
     * 
     * @author team
     *
     */
    public class Demo1 {
    
        public static void main(String[] args) {
            /***
             * A literal backslash is written "\\" in a Java string
             * and "\\\\" in a regex string literal:
             */
            System.out.println("\\".matches("\\\\"));
            /***
             * "-?\\d+" matches an optional minus sign followed by one or more digits
             */
            System.out.println("-1234".matches("-?\\d+"));
            System.out.println("5678".matches("-?\\d+"));
            System.out.println("+911".matches("-?\\d+"));
            /***
             * "(-|\\+)?\\d+" matches an optional leading - or + (+ is a regex
             * metacharacter, so it is escaped as \\+), followed by one or more digits
             */
            System.out.println("+911".matches("(-|\\+)?\\d+"));
        }
    
    }

    Output:

    true
    true
    true
    false
    true
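
    The escaping rule from Demo1 is easier to trust once you look at the string the regex engine actually receives. Below is a minimal sketch, not part of the original demos: the class name EscapeLayers and the helper containsLiteral are made up for illustration. It also uses Pattern.quote, which spares you the manual escaping when a whole piece of text should be matched literally.

    ```java
    import java.util.regex.Pattern;

    // Hypothetical helper class for illustration; not part of the original demos.
    class EscapeLayers {
        // The Java literal "\\d" is a two-character string: a backslash and 'd'.
        static final String DIGIT = "\\d";

        // Pattern.quote makes every character of 'literal' match itself,
        // so metacharacters such as + lose their special meaning.
        static boolean containsLiteral(String input, String literal) {
            return Pattern.compile(Pattern.quote(literal)).matcher(input).find();
        }

        public static void main(String[] args) {
            System.out.println(DIGIT.length());                   // 2
            System.out.println(DIGIT);                            // \d
            System.out.println("7".matches(DIGIT));               // true
            System.out.println("\\\\".length());                  // 2 (two backslash characters)
            System.out.println(containsLiteral("1+1=2", "1+1"));  // true
        }
    }
    ```

    The compiler consumes one level of backslashes and the regex engine consumes the next, which is why a single literal backslash needs four of them in source code.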

    Demo2
    String.split is a very useful regex tool: it "splits a string at the places where the regular expression matches".

    package com.rah;
    
    import java.util.Arrays;
    
    /***
     * 
     * @author team
     * 
     */
    public class Demo2 {
        public static String knights = "Then, when you have found the shrubbery, you must "
                + "cut down the mightiest tree in the forest..."
                + "with... a herring!";
    
        public static void split(String regex) {
            System.out.println(Arrays.toString(knights.split(regex)));
        }
    
        public static void main(String[] args) {
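            /***
             * Split on single spaces
             */
            split(" ");
            /***
             * \W (written "\\W" in Java) matches a non-word character; lowercase \w
             * matches a word character. Splitting on \W+ strips the punctuation.
             */
            split("\\W+");
            /***
             * The letter n followed by one or more non-word characters
             */
            split("n\\W+");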
        }
    }

    Output:

    [Then,, when, you, have, found, the, shrubbery,, you, must, cut, down, the, mightiest, tree, in, the, forest...with..., a, herring!]
    [Then, when, you, have, found, the, shrubbery, you, must, cut, down, the, mightiest, tree, in, the, forest, with, a, herring]
    [The, whe, you have found the shrubbery, you must cut dow, the mightiest tree i, the forest...with... a herring!]
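
    Demo2 only uses the one-argument split. String also has a split(String regex, int limit) overload, and the sketch below shows what the limit changes. The class name SplitLimit and the sample strings are my own assumptions, chosen only for illustration.

    ```java
    // Hypothetical example class; the inputs are made up for illustration.
    class SplitLimit {
        // Split "key=value" at the first '=' only, so the value may itself contain '='.
        static String[] keyValue(String line) {
            return line.split("=", 2);
        }

        public static void main(String[] args) {
            String[] kv = keyValue("name=Tom=Jerry");
            System.out.println(kv.length);  // 2
            System.out.println(kv[1]);      // Tom=Jerry

            // With no limit (same as limit 0), trailing empty strings are dropped.
            System.out.println("a,b,,".split(",").length);      // 2
            // A negative limit keeps them.
            System.out.println("a,b,,".split(",", -1).length);  // 4
        }
    }
    ```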

    Demo3

    String.replaceFirst() and String.replaceAll() also take a regular expression as the thing to match.

    package com.rah;
    /***
     * 
     * @author team
     * 
     */
    public class Demo3 {
        public static String  sqlOne = "select * from students";
        public static String  sqlTwo = "seelct count(*) from students";
        public static void main(String[] args) {
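            /***
             * The first four outputs show that [...] matches a single character from
             * the set: the first position where any listed character occurs is the
             * match, and the search stops there. Missing this small detail cost me
             * a lot of development time Q_Q
             */
            System.out.println(sqlOne.replaceFirst("s", "count(*)"));
            System.out.println(sqlOne.replaceFirst("[s]", "count(*)"));
            /***
             * Matches the two-character sequence se
             */
            System.out.println(sqlOne.replaceFirst("se", "count(*)"));
            /***
             * [se] matches the leading s on its own, so the e after it is left in place
             */
            System.out.println(sqlOne.replaceFirst("[se]", "count(*)"));
            
            System.out.println(sqlOne.replaceFirst("[*]", "count(*)"));
            System.out.println(sqlTwo.replaceFirst("count\\(\\*\\)", "*"));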
        }
    }

    Output:

    count(*)elect * from students
    count(*)elect * from students
    count(*)lect * from students
    count(*)elect * from students
    select count(*) from students
    seelct * from students
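
    The bracket behaviour that tripped me up in Demo3 can be put another way: [se] means "s or e", the same as the alternation s|e, while se means "s then e". A small sketch follows; the class ClassVersusSequence and its helper first are hypothetical names used only here.

    ```java
    // Hypothetical example class contrasting a character class with a sequence.
    class ClassVersusSequence {
        // Replace the first match of 'regex' with '#' so the matched span is visible.
        static String first(String input, String regex) {
            return input.replaceFirst(regex, "#");
        }

        public static void main(String[] args) {
            String sql = "select * from students";
            System.out.println(first(sql, "[se]"));    // #elect * from students
            System.out.println(first(sql, "s|e"));     // #elect * from students
            System.out.println(first(sql, "se"));      // #lect * from students
            // In "test" the e comes before the s, so the class matches the e first.
            System.out.println(first("test", "[se]")); // t#st
        }
    }
    ```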

    Demo4:

    Check that a sentence starts with an uppercase letter and ends with a period.

    package com.rah;
    /***
     * 
     * @author team
     *
     */
    public class Demo4 {
        public static boolean matches(String text) {
            /***
             * \p{javaUpperCase} matches an uppercase letter (Character.isUpperCase);
             * see the JDK documentation for Pattern if this is unfamiliar
             */
            return text.matches("\\p{javaUpperCase}.*\\.");
        }
        public static void main(String[] args) {
            System.out.println(matches("This is correct."));
            System.out.println(matches("bad sentence 1."));
            System.out.println(matches("Bad sentence 2"));
            System.out.println(matches("This is also correct..."));
        }
    }

    Output:

    true
    false
    false
    true
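
    One thing Demo4 relies on without saying so: String.matches succeeds only if the whole string matches the pattern. Matcher.find, by contrast, looks for a match anywhere in the input. The sketch below reuses the Demo4 pattern; the class WholeVersusPartial and its two method names are assumptions of mine, and compiling the Pattern once is simply a common habit, not something the original code does.

    ```java
    import java.util.regex.Pattern;

    // Hypothetical example class; reuses the pattern from Demo4.
    class WholeVersusPartial {
        // Compiled once and reused, instead of recompiling on every call.
        private static final Pattern SENTENCE = Pattern.compile("\\p{javaUpperCase}.*\\.");

        // True only when the entire text is one sentence.
        static boolean isSentence(String text) {
            return SENTENCE.matcher(text).matches();
        }

        // True when a sentence occurs somewhere inside the text.
        static boolean containsSentence(String text) {
            return SENTENCE.matcher(text).find();
        }

        public static void main(String[] args) {
            String text = "say: Hello there. ok";
            System.out.println(isSentence(text));        // false
            System.out.println(containsSentence(text));  // true
            System.out.println(isSentence("Hello there.")); // true
        }
    }
    ```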

    Demo5:

    package com.rah;
    
    import java.util.Arrays;
    
    /***
     * 
     * @author team
     * 
     */
    public class Demo5 {
        public static String knights = "Then, when you have found the shrubbery, you must "
                + "cut down the mightiest tree in the forest..."
                + "with... a herring!";
    
        public static void split(String regex) {
            System.out.println(Arrays.toString(knights.split(regex)));
        }
    
        public static void main(String[] args) {
            /***
             * Split wherever "the" or "you" occurs
             */
            split("the|you");
        }
    }

    Output:

    [Then, when ,  have found ,  shrubbery, ,  must cut down ,  mightiest tree in ,  forest...with... a herring!]
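
    Note that the|you in Demo5 matches those letters wherever they occur, including inside longer words. If only whole words should act as separators, the word-boundary anchor \b can be added. This is a sketch under my own assumptions: the class WholeWordSplit and the sample sentence are invented for illustration.

    ```java
    import java.util.Arrays;
    import java.util.List;

    // Hypothetical example class; the sample sentence is made up.
    class WholeWordSplit {
        static List<String> split(String input, String regex) {
            return Arrays.asList(input.split(regex));
        }

        public static void main(String[] args) {
            String text = "other youth the end";
            // Plain alternation also cuts inside "other" and "youth".
            System.out.println(split(text, "the|you"));          // [o, r , th ,  end]
            // With \b on both sides only the standalone word "the" is a separator.
            System.out.println(split(text, "\\b(the|you)\\b"));  // [other youth ,  end]
        }
    }
    ```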

    Demo6

    package com.rah;
    
    /***
     * 
     * @author team
     * 
     */
    public class Demo6 {
        public static String knights = "Then, when you have found the shrubbery, you must "
                + "cut down the mightiest tree in the forest..."
                + "with... a herring!";
    
        /*
         * The embedded flag for case-insensitive matching is (?i). It has four forms:
         *  1. (?i) 
         *  2. (?-i) 
         *  3. (?i:X) 
         *  4. (?-i:X) 
         *  Forms without - turn the flag on; forms with - turn it off.
         */
        public static void main(String[] args) {
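            /***
             * Ignore case for the whole of "Book"
             */
            System.out.println("Book".matches("(?i)Book"));
            /***
             * Ignore case for b only; ook is still matched case-sensitively.
             * The next statement does the same thing more concisely.
             */
            System.out.println("Book".matches("(?i)b(?-i)ook"));
            /***
             * Ignore case for b only; ook is still matched case-sensitively
             */
            System.out.println("Book".matches("(?i:b)ook"));
    
            /***
             * A bare flag such as (?i) or (?-i) applies from where it appears onward,
             * so here the later (?i) wins for ook; (?-i:X) applies only to X.
             */
            System.out.println("bOOk".matches("b(?-i)(?i)ook"));
            System.out.println("aBook".matches("a(?-i:B)ook"));
            
            /***
             * [ou] matches any single o or u, so every o and every u is removed.
             * Without the brackets, only the two-character sequence ou is removed.
             */
            System.out.println("ouahoahuah".replaceAll("[ou]", ""));
            System.out.println("ouahoahuah".replaceAll("ou", ""));
            /***
             * Remove the vowels aeiou, ignoring case
             */
            System.out.println(knights.replaceAll("(?i)[aeiou]", ""));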
        }
    }

    Output:

    true
    true
    true
    true
    true
    ahahah
    ahoahuah
    Thn, whn y hv fnd th shrbbry, y mst ct dwn th mghtst tr n th frst...wth...  hrrng!
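
    The embedded (?i) in Demo6 has a counterpart in the API: Pattern.compile accepts a flags argument, and Pattern.CASE_INSENSITIVE turns on the same behaviour for the whole pattern. A minimal sketch follows, with the class name FlagForms and the helper stripVowels being my own illustrative names.

    ```java
    import java.util.regex.Pattern;

    // Hypothetical example class comparing the embedded flag with the compile-time flag.
    class FlagForms {
        static final Pattern INLINE = Pattern.compile("(?i)book");
        static final Pattern CONSTANT = Pattern.compile("book", Pattern.CASE_INSENSITIVE);

        // Same effect as replaceAll("(?i)[aeiou]", "") in Demo6.
        static String stripVowels(String s) {
            return Pattern.compile("[aeiou]", Pattern.CASE_INSENSITIVE).matcher(s).replaceAll("");
        }

        public static void main(String[] args) {
            System.out.println(INLINE.matcher("BOOK").matches());    // true
            System.out.println(CONSTANT.matcher("BOOK").matches());  // true
            System.out.println(stripVowels("Evening Owl"));          // vnng wl
        }
    }
    ```

    The embedded form is handy when the regex is passed as a plain String (as with String.matches); the constant is arguably easier to spot when you build the Pattern yourself.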

    II. Creating Regular Expressions

    For the syntax, refer to the Pattern class in the java.util.regex package.

    Demo7:

    package com.rah;
    
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class Demo7 {
        public static void main(String[] args) {
            if(args.length<2){
                System.out.println("Usage:\njava TestRegularExpression " + 
                                 "characterSequence regularExpression");
                System.exit(0);
            }
            System.out.println("Input: \"" + args[0] + "\"");
            // The loop starts at args[0], so the input is also tried as a pattern against itself
            for(String arg : args){
                System.out.println("Regular expression: \"" +arg +"\"");
                Pattern p = Pattern.compile(arg);
                Matcher m = p.matcher(args[0]);
                while(m.find()){
                    System.out.println("Match \"" + m.group() + "\" at position " + m.start() + "-" + (m.end()-1));
                }
            }
        }
    }

    Arguments passed in:

    abcabcabcdefabc abc+ (abc)+ (abc){2,}

    Output:

    Input: "abcabcabcdefabc"
    Regular expression: "abcabcabcdefabc"
    Match "abcabcabcdefabc" at position 0-14
    Regular expression: "abc+"
    Match "abc" at position 0-2
    Match "abc" at position 3-5
    Match "abc" at position 6-8
    Match "abc" at position 12-14
    Regular expression: "(abc)+"
    Match "abcabcabc" at position 0-8
    Match "abc" at position 12-14
    Regular expression: "(abc){2,}"
    Match "abcabcabc" at position 0-8
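
    The difference between abc+ and (abc)+ in the Demo7 output comes down to what the quantifier binds to: without parentheses + applies only to the c, with parentheses it applies to the whole group. The input in Demo7 never has a repeated c, so here is a small sketch with one that does. The class QuantifierScope, its helper allMatches and the input string are assumptions made for illustration.

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Hypothetical example class; collects every match of a regex in an input.
    class QuantifierScope {
        static List<String> allMatches(String regex, String input) {
            List<String> found = new ArrayList<>();
            Matcher m = Pattern.compile(regex).matcher(input);
            while (m.find()) {
                found.add(m.group());
            }
            return found;
        }

        public static void main(String[] args) {
            String input = "abccc abcabc";
            System.out.println(allMatches("abc+", input));      // [abccc, abc, abc]
            System.out.println(allMatches("(abc)+", input));    // [abc, abcabc]
            System.out.println(allMatches("(abc){2,}", input)); // [abcabc]
        }
    }
    ```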

    Demo8:

    package com.rah;
    
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    /***
     * 
     * @author team
     *
     */
    public class Demo8 {
        public static void main(String[] args) {
            Matcher m = Pattern.compile("\\w+").matcher("Evening is full of the linnet's wings");
            while(m.find())
                System.out.print(m.group() + " ");
            System.out.println();
            int i = 0;
            /***
             * find(int start) resets the matcher and searches from character index start;
             * note how that shapes the output
             */
            while(m.find(i)) {
                System.out.print(m.group() + " ");
                i++;
            }    
        }
    }

    Output:

    Evening is full of the linnet s wings 
    Evening vening ening ning ing ng g is is s full full ull ll l of of f the the he e linnet linnet innet nnet net et t s s wings wings ings ngs gs s 
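
    The second line of the Demo8 output looks odd because every call to find(int) restarts the search at the given index. Once such a call has matched, a plain find() carries on from the end of that match. The sketch below shows both behaviours; the class FindFromIndex and its methods are hypothetical names of mine.

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Hypothetical example class showing find(int) followed by find().
    class FindFromIndex {
        private static final Pattern WORD = Pattern.compile("\\w+");

        // The first word found at or after index 'start', or null if there is none.
        static String wordAt(String input, int start) {
            Matcher m = WORD.matcher(input);
            return m.find(start) ? m.group() : null;
        }

        // Every word from index 'start' on: one find(int), then plain find() calls.
        static List<String> wordsFrom(String input, int start) {
            List<String> words = new ArrayList<>();
            Matcher m = WORD.matcher(input);
            boolean found = m.find(start);
            while (found) {
                words.add(m.group());
                found = m.find();
            }
            return words;
        }

        public static void main(String[] args) {
            String text = "Evening is full";
            System.out.println(wordAt(text, 3));    // ning
            System.out.println(wordAt(text, 7));    // is
            System.out.println(wordsFrom(text, 3)); // [ning, is, full]
        }
    }
    ```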

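    Demo9 (Groups)

    Groups are parts of a regular expression set off by parentheses, and a group can be referred to by its number. Group 0 is the entire expression, group 1 is the group enclosed by the first pair of parentheses, and so on.

    In a(b(c))d, group 0 is abcd, group 1 is bc, and group 2 is c.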

    package com.rah;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    /***
     * 
     * @author team
     *
     */
    public class Demo9 {
        static public final String POEM = 
                "Twas brillig, and the slithy toves\n" +
                "Did gyre and gimble in the wabe.\n" +
                "All mimsy were raths outgrabe.\n" +
                "And the mome raths outgrabe.\n\n" +
                "Beware the Jabberwock, my son,\n" +
                "The jaws that bite, the claws that catch.\n" +
                "Beware the Jubjub bird, and shun\n" +
                "The frumious Bandersnatch.";
        public static void main(String[] args) {
            /***
             * Capture the last three words of each line, with $ marking the end of the line.
             * Normally $ matches only at the end of the whole input sequence, so we must
             * explicitly tell the regex to pay attention to line terminators in the input.
             * The pattern flag (?m) does exactly that.
             */
            Matcher m = Pattern.compile("(?m)(\\S+)\\s+((\\S+)\\s+(\\S+))$").matcher(POEM);
            while(m.find()) {
                for (int i = 0; i <= m.groupCount(); i++) {
                    System.out.print("[" + m.group(i) + "]" + " ");
                }
                System.out.println();
            }
        }
    
    }

    Output:

    [the slithy toves] [the] [slithy toves] [slithy] [toves] 
    [in the wabe.] [in] [the wabe.] [the] [wabe.] 
    [were raths outgrabe.] [were] [raths outgrabe.] [raths] [outgrabe.] 
    [mome raths outgrabe.] [mome] [raths outgrabe.] [raths] [outgrabe.] 
    [Jabberwock, my son,] [Jabberwock,] [my son,] [my] [son,] 
    [claws that catch.] [claws] [that catch.] [that] [catch.] 
    [bird, and shun] [bird,] [and shun] [and] [shun] 
    [The frumious Bandersnatch.] [The] [frumious Bandersnatch.] [frumious] [Bandersnatch.] 
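
    The numbering rule stated for a(b(c))d at the start of Demo9 can be checked directly: groups are numbered by the position of their opening parenthesis, counting from left to right. A short sketch, where the class GroupNumbers and the helper groups are names I made up:

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Hypothetical example class; lists group 0..groupCount() for a full match.
    class GroupNumbers {
        static List<String> groups(String regex, String input) {
            List<String> out = new ArrayList<>();
            Matcher m = Pattern.compile(regex).matcher(input);
            if (m.matches()) {
                // groupCount() does not include group 0, hence <=.
                for (int i = 0; i <= m.groupCount(); i++) {
                    out.add(m.group(i));
                }
            }
            return out;
        }

        public static void main(String[] args) {
            System.out.println(groups("a(b(c))d", "abcd"));  // [abcd, bc, c]
            System.out.println(groups("(a)(b(c))", "abc"));  // [abc, a, bc, c]
        }
    }
    ```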

    Demo10

    Using ?:, ?!, ?s, ?i, ?x, ?m, ?u, ?d, and so on
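
    This demo has not been written yet. As a rough sketch only, and not the author's planned example, here is what four of the listed constructs do according to the Pattern documentation: (?:X) groups without capturing, (?!X) is a negative lookahead, (?s) lets . match line terminators, and (?x) permits whitespace and # comments inside the pattern. The class ConstructSketch and all sample strings are assumptions for illustration.

    ```java
    import java.util.regex.Pattern;

    // Hypothetical example class; not the author's Demo10.
    class ConstructSketch {
        // Number of capturing groups in a pattern (group 0 is not counted).
        static int captureCount(String regex) {
            return Pattern.compile(regex).matcher("").groupCount();
        }

        // True when 'regex' matches somewhere inside 'input'.
        static boolean found(String regex, String input) {
            return Pattern.compile(regex).matcher(input).find();
        }

        public static void main(String[] args) {
            System.out.println(captureCount("(ab)+"));    // 1
            System.out.println(captureCount("(?:ab)+"));  // 0

            // q not followed by u
            System.out.println(found("q(?!u)", "Iraq"));  // true
            System.out.println(found("q(?!u)", "quit"));  // false

            // . skips line terminators unless (?s) is set
            System.out.println("a\nb".matches("a.b"));      // false
            System.out.println("a\nb".matches("(?s)a.b"));  // true

            // (?x): whitespace and the trailing comment are ignored
            System.out.println("42".matches("(?x) \\d+  # digits"));  // true
        }
    }
    ```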

    To be continued...

  • Original article: https://www.cnblogs.com/rah123/p/3997531.html