  • Regular Expressions (Java)

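    Concept:

    A regular expression (commonly abbreviated in code as regex, regexp, or RE) is a computer science concept: a pattern that describes a set of strings.

    Regular expressions are typically used to search for and replace text that matches a given pattern (rule).

    Uses:

    They usually appear in conditional statements to check whether a string has a given format (matching), and in string search and replace operations.

    A regular expression is a string containing characters with special meaning; these special characters are called the metacharacters of the regular expression.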
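
    The three uses named above (format check, search, replace) can be sketched as follows. This is only an illustration: the class name, the method names, and the "three digits, hyphen, four digits" format are made up for the example, and the \\d and \\s escapes it relies on are covered further down in this article.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexUses {
    // Hypothetical format check: three digits, a hyphen, four digits.
    static boolean looksLikeCode(String s) {
        return s.matches("\\d{3}-\\d{4}");
    }

    // Returns the first run of digits in the text, or null if there is none.
    static String firstNumber(String text) {
        Matcher m = Pattern.compile("\\d+").matcher(text);
        return m.find() ? m.group() : null;
    }

    // Replaces every run of whitespace with a single space.
    static String squeezeSpaces(String text) {
        return text.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(looksLikeCode("555-1234"));       // true
        System.out.println(firstNumber("order 42 of 100"));  // 42
        System.out.println(squeezeSpaces("a   b \t c"));     // a b c
    }
}
```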

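    Classes involved

    java.lang.String

    java.util.regex.Pattern: the compiled pattern

    java.util.regex.Matcher: the result of matching a pattern against an input

    Example: "." stands for any single character, so "abc" is matched by "...".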

    public class RegExp {
        public static void main(String[] args){
            // A first look at regular expressions
            System.out.println("abc".matches("..."));
        }
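    }

    "\d" matches any digit 0-9. In a Java string literal the backslash is itself an escape character, so it must be doubled and the pattern is written "\\d".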

    public class RegExp {
        public static void main(String[] args){
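            // A first look at regular expressions
            p("abc".matches("..."));// true: three arbitrary characters
            // "\\d" matches a single digit
            p("d1234w".replaceAll("\\d", "-"));// replaces each digit; prints d----w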
        }
        public static void p(Object o){
            System.out.println(o);
        }
    }

    Introducing the classes:

    Pattern

    Definition:

    A compiled representation of a regular expression.

    A regular expression, specified as a string, must first be compiled into an instance of this class. The resulting pattern can then be used to create a Matcher object that can match arbitrary character sequences against the regular expression. All of the state involved in performing a match resides in the matcher, so many matchers can share the same pattern.

    A typical invocation sequence is thus

     Pattern p = Pattern.compile("a*b");
     Matcher m = p.matcher("aaaaab");
     boolean b = m.matches();

    A matches method is defined by this class as a convenience for when a regular expression is used just once. This method compiles an expression and matches an input sequence against it in a single invocation. The statement

     boolean b = Pattern.matches("a*b", "aaaaab");

    is equivalent to the three statements above, though for repeated matches it is less efficient since it does not allow the compiled pattern to be reused.

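    The three-statement form, repeated here, is the more efficient one when a pattern is used repeatedly, and Pattern and Matcher also offer many more methods than String does.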

    Pattern p = Pattern.compile("a*b");
     Matcher m = p.matcher("aaaaab");
     boolean b = m.matches();
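
    To make the efficiency point concrete, here is a minimal sketch of reusing one compiled Pattern across several inputs. The class name, the constant, and the sample inputs are invented for illustration; only Pattern.compile, matcher and matches come from the API quoted above.

```java
import java.util.regex.Pattern;

class PatternReuse {
    // Compiled once and shared; Pattern instances are immutable and thread-safe.
    private static final Pattern A_STAR_B = Pattern.compile("a*b");

    // Counts the inputs that match the whole pattern, creating one Matcher per input.
    static int countMatches(String[] inputs) {
        int count = 0;
        for (String input : inputs) {
            if (A_STAR_B.matcher(input).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] inputs = {"aaaaab", "b", "ab", "abc", "ba"};
        System.out.println(countMatches(inputs)); // 3
    }
}
```

    Calling Pattern.matches("a*b", input) inside the loop would give the same count but would recompile the expression on every iteration.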

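    [a-z] stands for one letter in the range a-z.

    [] denotes a character class (a set or range of characters).

    Quantifiers

    ? --- zero or one time

    * --- zero or more times

    + --- one or more times

    {n} --- exactly n times

    {n,} --- at least n times

    {n,m} --- between n and m times (inclusive)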
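
    The quantifiers listed above can be checked one by one with String.matches. The sketch below uses a small helper, check, which is a name chosen for this example only.

```java
class Quantifiers {
    // Thin wrapper so each line below reads as "input, pattern".
    static boolean check(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(check("", "a?"));         // true:  zero occurrences
        System.out.println(check("aa", "a?"));       // false: more than one
        System.out.println(check("", "a*"));         // true:  zero occurrences
        System.out.println(check("aaaa", "a*"));     // true:  many occurrences
        System.out.println(check("", "a+"));         // false: needs at least one
        System.out.println(check("aaa", "a+"));      // true
        System.out.println(check("aaa", "a{3}"));    // true:  exactly three
        System.out.println(check("aa", "a{3}"));     // false
        System.out.println(check("aaaaa", "a{3,}")); // true:  at least three
        System.out.println(check("aa", "a{2,3}"));   // true:  two to three
        System.out.println(check("aaaa", "a{2,3}")); // false: four is too many
    }
}
```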

    // Character classes and ranges

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class RegExp {
        public static void main(String[] args){
            
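            // Character classes and ranges
            p("a".matches("[abc]"));// true
            p("a".matches("[^abc]"));// false: [^abc] accepts anything except a, b, c
            p("A".matches("[a-zA-Z]"));// true: any letter
            p("A".matches("[a-z]|[A-Z]"));// true: a-z or A-Z, again any letter
            p("A".matches("[a-z[A-Z]]"));// true: union, same meaning
            p("R".matches("[A-Z&&[REG]]"));// true: intersection, in A-Z and also one of R, E, G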
            
        }
        public static void p(Object o){
            System.out.println(o);
        }
    }

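    // Predefined character classes

    "\\".matches("\\\\") --- matching one literal backslash takes four backslashes in the pattern literal. The Java compiler first turns "\\\\" into the two-character string \\, and the regex engine then reads \\ as an escaped, literal backslash. A literal with one or three backslashes leaves the closing quote escaped and does not compile; a literal with two ("\\") compiles but hands the regex engine a lone backslash, which throws PatternSyntaxException.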
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class RegExp {
        public static void main(String[] args){
        
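            // Getting to know \s, \w and \d
            p(" \n\r\t".matches("\\s{4}"));// true: four whitespace characters
            p(" ".matches("\\S"));// false: a space is whitespace
            p("a_8".matches("\\w{3}"));// true
            p("abc888&^%".matches("[a-z]{1,3}\\d+[&^#%]+"));// true
            p("\\".matches("\\\\"));// true: one literal backslash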
            
        }
        public static void p(Object o){
            System.out.println(o);
        }
    }
    Predefined character classes
    . Any character (may or may not match line terminators)
    \d A digit: [0-9]
    \D A non-digit: [^0-9]
    \h A horizontal whitespace character: [ \t\xA0\u1680\u180e\u2000-\u200a\u202f\u205f\u3000]
    \H A non-horizontal whitespace character: [^\h]
    \s A whitespace character: [ \t\n\x0B\f\r]
    \S A non-whitespace character: [^\s]
    \v A vertical whitespace character: [\n\x0B\f\r\x85\u2028\u2029]
    \V A non-vertical whitespace character: [^\v]
    \w A word character: [a-zA-Z_0-9]
    \W A non-word character: [^\w]
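
    The two layers of escaping (Java string literal first, regex second) are easy to confuse, so here is a small sketch that makes them visible. The class and method names are assumptions for the example; Pattern.quote is a standard alternative that builds the escaped pattern for you.

```java
import java.util.regex.Pattern;

class BackslashDemo {
    // True when the input is exactly one backslash character.
    static boolean isSingleBackslash(String s) {
        // The four-backslash literal is a two-character string, which the
        // regex engine reads as one escaped (literal) backslash.
        return s.matches("\\\\");
    }

    // Same check, letting Pattern.quote build the escaped pattern.
    static boolean isSingleBackslashQuoted(String s) {
        return s.matches(Pattern.quote("\\"));
    }

    public static void main(String[] args) {
        System.out.println("\\d".length());                 // 2: a backslash and the letter d
        System.out.println("\\\\".length());                // 2: two backslashes
        System.out.println(isSingleBackslash("\\"));        // true
        System.out.println(isSingleBackslashQuoted("\\"));  // true
        System.out.println(" \n\r\t".matches("\\s{4}"));    // true
        System.out.println("a_8".matches("\\w{3}"));        // true
        System.out.println("a-8".matches("\\w{3}"));        // false: '-' is not a word character
        System.out.println("x".matches("\\D"));             // true: 'x' is a non-digit
    }
}
```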

    find()

    Attempts to find the next subsequence of the input sequence that matches the pattern.

    reset()

    Resetting a matcher discards all of its explicit state information and sets its append position to zero.

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class RegExp {
        public static void main(String[] args){
            
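            // matches, find, lookingAt
            Pattern p = Pattern.compile("\\d{3,5}");
            String s = "123-45623-789-00";
            Matcher m = p.matcher(s);
            p(m.matches());// false: the whole input is not a single run of 3 to 5 digits
            m.reset();// matches() and find() share the matcher's state; reset() so find() starts from the beginning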
            p(m.find());
            p(m.start()+"-"+ m.end());
            p(m.find());
            p(m.start()+"-"+ m.end());
            p(m.find());
            p(m.start()+"-"+ m.end());
            p(m.lookingAt());
            p(m.lookingAt());
            p(m.lookingAt());
            p(m.lookingAt());
            
            
        }
        public static void p(Object o){
            System.out.println(o);
        }
    }
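
    In practice find() is usually called in a loop until it returns false. The sketch below, with an invented class and helper name, collects every match of the same pattern and input together with its start and end offsets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAll {
    // Collects every match as "start-end:text".
    static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.start() + "-" + m.end() + ":" + m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d{3,5}", "123-45623-789-00"));
        // [0-3:123, 4-9:45623, 10-13:789]
    }
}
```

    The trailing "00" is not reported because it is shorter than the three digits the pattern requires.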

    Find and replace

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class RegExp {
        public static void main(String[] args){
            
            // Replacement: see the description of appendReplacement() in the API documentation
            Pattern p = Pattern.compile("java",Pattern.CASE_INSENSITIVE);
            Matcher m = p.matcher("java Java Java I love Java  u hate JAVA sfarwwfr");
            // p(m.replaceAll("JAVA"));// would replace every match with JAVA
            StringBuffer buf = new StringBuffer();
            int i = 0;
            while(m.find()){  // step through the matches
                i++;
                if (i%2 == 0) { // even-numbered matches become "java", odd-numbered ones "JAVA"
                    m.appendReplacement(buf, "java");
                } else {
                    m.appendReplacement(buf, "JAVA");
                }
            }
            m.appendTail(buf);// after the appendReplacement() calls, append the unmatched tail
            p(buf);// JAVA java JAVA I love java  u hate JAVA sfarwwfr
        }
        public static void p(Object o){
            System.out.println(o);
        }
    }

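    Groups

    Matcher.group() --- Returns the input subsequence matched by the previous match.

    Capturing groups are numbered by counting their opening parentheses from left to right. In the expression ((A)(B(C))) there are four such groups:

    1 ((A)(B(C)))
    2 (A)
    3 (B(C))
    4 (C)

    Parentheses in the pattern create these groups, and group(n) returns the text captured by group n, e.g. group(1), group(2).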
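
    The numbering rule can be verified directly by matching ((A)(B(C))) against "ABC" and printing each group. The class and method names in this sketch are made up for the example; group 0 always denotes the entire match.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumbers {
    // Returns group 0 (the whole match) followed by each numbered group,
    // or an empty array when the input does not match.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            out[i] = m.group(i);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(groups("((A)(B(C)))", "ABC")));
        // [ABC, ABC, A, BC, C]
    }
}
```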

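    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RegExp {
        public static void main(String[] args){
            // Groups: group 1 is the digit alternative, group 2 the letter alternative
            Pattern p = Pattern.compile("(\\d{3,5})|([a-z]{2})");
            String s = "123aa-34345bb-234cc-00";
            Matcher m = p.matcher(s);
            while (m.find()) {
                p(m.group(2));// null whenever the digit alternative matched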
            }
        }
        public static void p(Object o){
            System.out.println(o);
        }
    }

    Summary of a few key points:

  • Original article: https://www.cnblogs.com/limingxian537423/p/6995025.html