Matcher.find()

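In Java's java.util.regex package, Matcher.find() scans the input for the next subsequence that matches the pattern. It returns true if a match was found and false otherwise; it does not create a separate match object. After a successful call, the Matcher itself exposes the position and text of the match through start(), end(), and group().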
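As a minimal sketch of the point above, the following class shows that find() reports success as a boolean and that the match accessors are only valid after a successful call. The class and method names (FindReturnsBoolean, firstMatch, groupBeforeFindThrows) are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindReturnsBoolean {
    // Returns the first match of regex in text, or null when find() reports no match.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    // Returns true if group() is rejected before any match has been attempted.
    static boolean groupBeforeFindThrows() {
        Matcher m = Pattern.compile("a").matcher("a");
        try {
            m.group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("\\d+", "order 42 shipped")); // 42
        System.out.println(firstMatch("\\d+", "no digits here"));   // null
        System.out.println(groupBeforeFindThrows());                // true
    }
}
```

Checking the boolean before calling group(), start(), or end() avoids the IllegalStateException shown in the second method.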

1. Searching for matches

Matcher.find() lets you search a string for the text you are interested in. Suppose you have the following string and want to find the word "hello" in it:

String text = "hello world! This is a test.";
Pattern pattern = Pattern.compile("\\bhello\\b");
Matcher matcher = pattern.matcher(text);

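In this example we first define a string, then compile a regular expression that uses \b (written as "\\b" in a Java string literal) to match word boundaries. Finally, we create a Matcher object to run the search.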

Now we can call Matcher.find() to search for matches and print them:

while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + matcher.end());
    System.out.println("Match: " + matcher.group());
}

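A while loop is used because the same string may contain several matches; each call to find() resumes where the previous match ended. The output is: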

Match found at index 0 to 5
Match: hello

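In this example only one match is found. matcher.start() returns the index of the first matched character, matcher.end() returns the index just past the last matched character, and matcher.group() returns the matched text itself.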
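Because end() is exclusive, the pair (start(), end()) can be passed directly to String.substring. The sketch below illustrates this with a hypothetical helper, spans, that records each match as "start-end:text"; the input string is also chosen just for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchSpans {
    // Collects every match as "start-end:text", where end is exclusive.
    static List<String> spans(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            // substring(start, end) yields the same text as m.group()
            String viaSubstring = text.substring(m.start(), m.end());
            out.add(m.start() + "-" + m.end() + ":" + viaSubstring);
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [0-5:hello, 13-18:hello]
        System.out.println(spans("\\bhello\\b", "hello world, hello again"));
    }
}
```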

2. Capturing groups

Parenthesized groups allow more advanced uses of Matcher.find(), such as extracting every matched word into a list. For example:

String text = "hello world, how are you?";
Pattern pattern = Pattern.compile("(\\w+)");
Matcher matcher = pattern.matcher(text);
List<String> matches = new ArrayList<>();

while (matcher.find()) {
    matches.add(matcher.group(1));
}

System.out.println(matches);

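Here the regular expression is "(\\w+)". \\w+ matches one or more word characters (letters, digits, or underscores), and the parentheses capture that text as group 1. Because the group spans the whole pattern, group(1) returns the same text as group(). The loop calls Matcher.find() repeatedly and adds each match to the list, so the code prints [hello, world, how, are, you].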
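Groups become more useful when a pattern has more than one of them. The following sketch, which is not part of the original example, parses "key=number" pairs with two groups; the class name GroupDemo, the method parsePairs, and the input format are assumptions made for illustration, and the values are assumed to fit in an int.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Extracts key=number pairs; group(1) is the key, group(2) the digits.
    static Map<String, Integer> parsePairs(String text) {
        Map<String, Integer> result = new LinkedHashMap<>();
        Matcher m = Pattern.compile("(\\w+)=(\\d+)").matcher(text);
        while (m.find()) {
            // Assumes the digits fit in an int; parseInt throws otherwise.
            result.put(m.group(1), Integer.parseInt(m.group(2)));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {width=80, height=24}
        System.out.println(parsePairs("width=80, height=24"));
    }
}
```

Groups are numbered by the position of their opening parenthesis, and group(0) is always the entire match.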

3. Case sensitivity

By default, matching is case-sensitive, so a search for "hello" matches only "hello", not "Hello" or "HELLO". For a case-insensitive match, add the "(?i)" flag to the pattern. For example:

String text = "HELLO world! This is a test.";
Pattern pattern = Pattern.compile("(?i)\\bhello\\b");
Matcher matcher = pattern.matcher(text);

while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + matcher.end());
    System.out.println("Match: " + matcher.group());
}

Here the "(?i)" flag at the start of the pattern turns on case-insensitive matching. The output is:

Match found at index 0 to 5
Match: HELLO
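The same effect can be had by passing the Pattern.CASE_INSENSITIVE constant to Pattern.compile instead of embedding "(?i)". The sketch below compares the three variants on a sample string chosen for this illustration; CaseInsensitiveDemo and countMatches are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaseInsensitiveDemo {
    // Counts how many times the pattern is found in the text.
    static int countMatches(Pattern p, String text) {
        Matcher m = p.matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "Hello, HELLO, hello";
        Pattern sensitive = Pattern.compile("\\bhello\\b");
        Pattern inlineFlag = Pattern.compile("(?i)\\bhello\\b");
        Pattern constantFlag = Pattern.compile("\\bhello\\b", Pattern.CASE_INSENSITIVE);

        System.out.println(countMatches(sensitive, text));    // 1
        System.out.println(countMatches(inlineFlag, text));   // 3
        System.out.println(countMatches(constantFlag, text)); // 3
    }
}
```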

4. Multiline mode

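Matcher.find() always searches the entire input, including text after line terminators. What multiline mode changes is the meaning of the anchors: by default, ^ and $ match only at the beginning and end of the entire input.

You can enable multiline mode with the "(?m)" flag or the Pattern.MULTILINE constant. In multiline mode, ^ and $ also match just after and just before each line terminator, so they anchor to individual lines. For example: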

String text = "hello world\nhow are you\ntoday?";
Pattern pattern = Pattern.compile("^h", Pattern.MULTILINE);
Matcher matcher = pattern.matcher(text);

while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + matcher.end());
    System.out.println("Match: " + matcher.group());
}

Here the regular expression is "^h", which matches an "h" at the beginning of a line. Passing Pattern.MULTILINE enables multiline mode. The output is:

Match found at index 0 to 1
Match: h
Match found at index 12 to 13
Match: h
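To see what the flag actually changes, it helps to run the same input with and without it. This is a small sketch with hypothetical names (MultilineDemo, starts) that returns the start index of every match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultilineDemo {
    // Returns the start index of every match of p in text.
    static List<Integer> starts(Pattern p, String text) {
        List<Integer> out = new ArrayList<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            out.add(m.start());
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "hello world\nhow are you\ntoday?";

        // Without MULTILINE, ^ matches only at the start of the input.
        System.out.println(starts(Pattern.compile("^h"), text));                    // [0]
        // With MULTILINE, ^ also matches after each line terminator.
        System.out.println(starts(Pattern.compile("^h", Pattern.MULTILINE), text)); // [0, 12]
        // An unanchored pattern finds text on later lines even without the flag.
        System.out.println(starts(Pattern.compile("you"), text));                   // [20]
    }
}
```

The third call shows that find() itself is not limited to the first line; only the anchors behave differently.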

5. Greedy and lazy quantifiers

Quantifiers such as + are greedy by default: they match as many characters as possible. For example, applying "a+" to the string "aaab" consumes the whole run of a's in a single match:

String text = "aaab";
Pattern pattern = Pattern.compile("a+");
Matcher matcher = pattern.matcher(text);

while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + matcher.end());
    System.out.println("Match: " + matcher.group());
}

The output is:

Match found at index 0 to 3
Match: aaa

The trailing "b" is never reported because "a+" cannot match it.

Appending "?" to a quantifier makes it lazy, so it matches as few characters as possible. For example:

String text = "aaab";
Pattern pattern = Pattern.compile("a+?");
Matcher matcher = pattern.matcher(text);

while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + matcher.end());
    System.out.println("Match: " + matcher.group());
}

The output is:

Match found at index 0 to 1
Match: a
Match found at index 1 to 2
Match: a
Match found at index 2 to 3
Match: a

Each match now contains a single "a", and the "b" is still never matched.
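The difference between greedy and lazy quantifiers matters most when something follows the quantifier in the pattern. A common illustration, sketched here with a made-up input and the hypothetical names GreedyLazyDemo and all, is matching angle-bracket tags.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyLazyDemo {
    // Returns the text of every match of regex in text.
    static List<String> all(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "<b>bold</b> and <i>italic</i>";

        // Greedy: .+ runs to the last '>' in the input, giving one long match.
        System.out.println(all("<.+>", text));  // [<b>bold</b> and <i>italic</i>]
        // Lazy: .+? stops at the first '>' it can, giving one match per tag.
        System.out.println(all("<.+?>", text)); // [<b>, </b>, <i>, </i>]
    }
}
```

This is only meant to show quantifier behavior; a regular expression like this is not a general way to parse markup.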

Summary

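Matcher.find() is a useful tool for searching strings and extracting the parts that match a pattern. Capturing groups, the case-insensitive flag, and multiline mode control what is matched, while greedy and lazy quantifiers control how much text each match consumes.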

Code example:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherExample {
    public static void main(String[] args) {
        // Search for a single word
        String text = "hello world! This is a test.";
        Pattern pattern = Pattern.compile("\\bhello\\b");
        Matcher matcher = pattern.matcher(text);

        System.out.println("Searching for: " + pattern.pattern());

        while (matcher.find()) {
            System.out.println("Match found at index " + matcher.start() + " to " + matcher.end());
            System.out.println("Match: " + matcher.group());
        }

        // Using groups
        String text2 = "hello world, how are you?";
        Pattern pattern2 = Pattern.compile("(\\w+)");
        Matcher matcher2 = pattern2.matcher(text2);
        List<String> matches = new ArrayList<>();

        System.out.println("\nSearching for: " + pattern2.pattern());

        while (matcher2.find()) {
            matches.add(matcher2.group(1));
        }

        System.out.println("Matches: " + matches);

        // Case insensitive search
        String text3 = "HELLO world! This is a test.";
        Pattern pattern3 = Pattern.compile("(?i)\\bhello\\b");
        Matcher matcher3 = pattern3.matcher(text3);

        System.out.println("\nSearching for: " + pattern3.pattern());

        while (matcher3.find()) {
            System.out.println("Match found at index " + matcher3.start() + " to " + matcher3.end());
            System.out.println("Match: " + matcher3.group());
        }

        // Multi-line search
        String text4 = "hello world\nhow are you\ntoday?";
        Pattern pattern4 = Pattern.compile("^h", Pattern.MULTILINE);
        Matcher matcher4 = pattern4.matcher(text4);

        System.out.println("\nSearching for: " + pattern4.pattern());

        while (matcher4.find()) {
            System.out.println("Match found at index " + matcher4.start() + " to " + matcher4.end());
            System.out.println("Match: " + matcher4.group());
        }

        // Greedy and lazy matching
        String text5 = "aaab";

        // Greedy match
        Pattern pattern5a = Pattern.compile("a+");
        Matcher matcher5a = pattern5a.matcher(text5);

        System.out.println("\nSearching for: " + pattern5a.pattern());

        while (matcher5a.find()) {
            System.out.println("Match found at index " + matcher5a.start() + " to " + matcher5a.end());
            System.out.println("Match: " + matcher5a.group());
        }

        // Lazy match
        Pattern pattern5b = Pattern.compile("a+?");
        Matcher matcher5b = pattern5b.matcher(text5);

        System.out.println("\nSearching for: " + pattern5b.pattern());

        while (matcher5b.find()) {
            System.out.println("Match found at index " + matcher5b.start() + " to " + matcher5b.end());
            System.out.println("Match: " + matcher5b.group());
        }
    }
}

Original article by RSGPN. If you repost it, please credit the source: https://www.506064.com/n/334694.html