// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
public static final String POEM =
    "Twas brilling, and the slithy toves\n" +
    "Did gyre and gimble in the wabe.\n" +
    "All mimsy were the borogoves,\n" +
    "And the mome rathsoutgrable.\n" +
    "Beware the Jabberwork, my son,\n" +
    "The jaws that bite, the claws that catch.\n" +
    "Beware the Jubjub bird, and shun\n" +
    "The frumious Bandersnatch.";

public static void main(String[] args) {
    // (?m) makes $ match at the end of every line, so the pattern
    // captures the last three words of each line of the poem.
    Matcher m = Pattern.compile("(?m)(\\S+)\\s+((\\S+)\\s+(\\S+))$")
            .matcher(POEM);
    while (m.find()) {
        // Group 0 is the whole match; groups 1..groupCount() are the captures.
        for (int j = 0; j <= m.groupCount(); j++) {
            System.out.print("[" + m.group(j) + "]");
        }
        System.out.println();
    }
}
output:
[the slithy toves][the][slithy toves][slithy][toves]
[in the wabe.][in][the wabe.][the][wabe.]
[were the borogoves,][were][the borogoves,][the][borogoves,]
[the mome rathsoutgrable.][the][mome rathsoutgrable.][mome][rathsoutgrable.]
[Jabberwork, my son,][Jabberwork,][my son,][my][son,]
[claws that catch.][claws][that catch.][that][catch.]
[bird, and shun][bird,][and shun][and][shun]
[The frumious Bandersnatch.][The][frumious Bandersnatch.][frumious][Bandersnatch.]
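Analysis:
m.groupCount(): the number of capturing groups in the matcher's pattern, not counting group 0.
m.group(j): the text captured by group j. group(0) is the text matched by the whole expression.
(?m): multiline mode, in which ^ and $ match at the start and end of every line rather than only at the start and end of the input.
\S: a non-whitespace character
\s: a whitespace character, equivalent to [ \t\n\x0B\f\r]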
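To make the notes on groups and (?m) concrete, here is a minimal sketch. The class name GroupDemo, its helper methods, and the sample text are illustrative assumptions, not part of the original example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample text are assumptions for illustration.
class GroupDemo {
    // Returns group(0)..group(groupCount()) of the first match, or an empty list.
    static List<String> groupsOfFirstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        List<String> groups = new ArrayList<>();
        if (m.find()) {
            for (int j = 0; j <= m.groupCount(); j++) {
                groups.add(m.group(j));
            }
        }
        return groups;
    }

    // Counts how many times the regex is found in the text.
    static int countMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "one two\nthree four";
        // Two capturing groups: group 0 is the whole match, then groups 1 and 2.
        System.out.println(groupsOfFirstMatch("(?m)(\\S+)\\s+(\\S+)$", text)); // [one two, one, two]
        // Without (?m), $ only matches at the end of the input.
        System.out.println(countMatches("\\S+$", text));     // 1
        // With (?m), $ also matches at the end of each line.
        System.out.println(countMatches("(?m)\\S+$", text)); // 2
    }
}
```

The list returned for the first pattern has groupCount() + 1 entries, because group 0 is not included in the count.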
2. Matcher.find() vs .lookingAt() vs .matches()
package com.westward;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Demo31 {
    public static String input =
        "As long as there is injustice, whenever a\n" +
        "Targathian baby cries out.wherever a distress\n" +
        "signal sounds among the stars ... We'll be there.\n" +
        "This fine ship, and this fine crew ...\n" +
        "Never give up! Never surrender!";

    // Prints the regex once, just before the first message reported for it.
    private static class Display {
        private boolean regexPrinted = false;
        private String regex;

        Display(String regex) {
            this.regex = regex;
        }

        void display(String message) {
            if (!regexPrinted) {
                System.out.println(regex);
                regexPrinted = true;
            }
            System.out.println(message);
        }
    }

    // Tries find(), lookingAt() and matches() with the same regex on one line.
    static void examine(String s, String regex) {
        Display d = new Display(regex);
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(s);
        while (m.find()) {
            d.display("find() '" + m.group() + "' start= " + m.start() + " end= " + m.end());
        }
        if (m.lookingAt()) {
            d.display("lookingAt() '" + m.group() + "' start= " + m.start() + " end= " + m.end());
        }
        if (m.matches()) {
            d.display("matches() '" + m.group() + "' start= " + m.start() + " end= " + m.end());
        }
    }

    public static void main(String[] args) {
        // Examine the input one line at a time.
        for (String in : input.split("\n")) {
            System.out.println("input :" + in);
            for (String regex : new String[]{"\\w*ere\\w*", "\\w*ever", "T\\w+", "Never.*?!"}) {
                examine(in, regex);
            }
        }
    }
}
output:
input :As long as there is injustice, whenever a
\w*ere\w*
find() 'there' start= 11 end= 16
\w*ever
find() 'whenever' start= 31 end= 39
input :Targathian baby cries out.wherever a distress
\w*ere\w*
find() 'wherever' start= 26 end= 34
\w*ever
find() 'wherever' start= 26 end= 34
T\w+
find() 'Targathian' start= 0 end= 10
lookingAt() 'Targathian' start= 0 end= 10
input :signal sounds among the stars ... We'll be there.
\w*ere\w*
find() 'there' start= 43 end= 48
input :This fine ship, and this fine crew ...
T\w+
find() 'This' start= 0 end= 4
lookingAt() 'This' start= 0 end= 4
input :Never give up! Never surrender!
\w*ever
find() 'Never' start= 0 end= 5
find() 'Never' start= 15 end= 20
lookingAt() 'Never' start= 0 end= 5
Never.*?!
find() 'Never give up!' start= 0 end= 14
find() 'Never surrender!' start= 15 end= 31
lookingAt() 'Never give up!' start= 0 end= 14
matches() 'Never give up! Never surrender!' start= 0 end= 31
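Summary:
Matcher.find(): looks for the next match anywhere in the input string.
Matcher.lookingAt(): the match must begin at the start of the input, but does not have to cover all of it.
Matcher.matches(): the entire input string must match. String.matches() ends up calling this method internally.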
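The difference between the three methods can be checked side by side with a small sketch. The class name MatchKindsDemo and the describe helper are hypothetical, and the input is borrowed from the last line of the example above.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original example.
class MatchKindsDemo {
    // Reports whether the regex succeeds under each of the three matching methods.
    // A fresh Matcher is used for each call so the results do not depend on each other.
    static String describe(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        boolean found = p.matcher(input).find();        // anywhere in the input
        boolean prefix = p.matcher(input).lookingAt();  // anchored at the start only
        boolean whole = p.matcher(input).matches();     // must cover the whole input
        return "find=" + found + " lookingAt=" + prefix + " matches=" + whole;
    }

    public static void main(String[] args) {
        String input = "Never surrender!";
        System.out.println(describe("surrender", input)); // find=true lookingAt=false matches=false
        System.out.println(describe("Never", input));     // find=true lookingAt=true matches=false
        System.out.println(describe("Never.*", input));   // find=true lookingAt=true matches=true
        // String.matches() behaves like Matcher.matches(): the whole string must match.
        System.out.println(input.matches("Never"));       // false
        System.out.println(input.matches("Never.*"));     // true
    }
}
```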
3. Pattern flags (static constants of the Pattern class)
// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
public static void main(String[] args) {
    // Match "java" at the start of any line, ignoring case.
    Pattern p = Pattern.compile("^java", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
    Matcher m = p.matcher(
        "java has regex\nJava has regex\n" +
        "JAVA has pretty good regular expressions\n" +
        "Regular expressions are in Java");
    while (m.find()) {
        System.out.println(m.group(0));
        // System.out.println(m.group()); // the same
    }
}
output:
java
Java
JAVA
Summary: several Pattern flags can be combined with the bitwise OR operator |.
Pattern.CASE_INSENSITIVE (?i): case-insensitive matching
Pattern.MULTILINE (?m): multiline mode
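As a rough illustration of the summary, the sketch below compares flags passed to Pattern.compile with the equivalent inline form (?im). The class name FlagsDemo and the findAll helper are hypothetical; the text is the one used in the example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original example.
class FlagsDemo {
    static final String TEXT =
        "java has regex\nJava has regex\n" +
        "JAVA has pretty good regular expressions\n" +
        "Regular expressions are in Java";

    // Collects every match of the pattern in the text.
    static List<String> findAll(Pattern p, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Flags combined with | in the compile call.
        Pattern withFlags = Pattern.compile("^java", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        // The same two flags written inline at the start of the regex.
        Pattern inline = Pattern.compile("(?im)^java");
        // No flags: case-sensitive, and ^ matches only at the start of the input.
        Pattern plain = Pattern.compile("^java");

        System.out.println(findAll(withFlags, TEXT)); // [java, Java, JAVA]
        System.out.println(findAll(inline, TEXT));    // [java, Java, JAVA]
        System.out.println(findAll(plain, TEXT));     // [java]
    }
}
```

The trailing "Java" on the last line is never reported, because ^ requires the match to begin at the start of a line.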