Java – Regular Expressions – 2

Regular Expressions in Java

We have already covered the basics of regular expressions in Part 1, and I recommend you review it first. In this part, we examine some more useful methods and write code to demonstrate them.

Furthermore, a video is provided at the end of the page.

 

We will cover the following topics:

  1. start and end Methods
  2. matches and lookingAt Methods
  3. replaceFirst and replaceAll Methods
  4. appendReplacement and appendTail Methods
  5. PatternSyntaxException Class Methods
  6. Video Explanation

 

 

1. start and end Methods

You use the start() method to find the starting index of a match in a given text. So if we have:

text = “The cup is on the floor”

And we find that cup matches a regular expression, then the start() method would return 4, because that is the starting index of cup in the text. The end() method would return 7. Note that end() returns the index just after the last matched character: the letter p is at index 6, so end() gives 6 + 1 = 7.
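To check these indices yourself, here is a minimal sketch. The class name StartEndDemo and the helper method span are made up for this illustration. It also shows that text.substring(start, end) gives back exactly the matched text, which is why end() points one position past the last matched character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StartEndDemo {

	// Hypothetical helper: returns {start, end} of the first match,
	// or null if the regex is not found in the text
	static int[] span(String regex, String text) {
		Matcher mt = Pattern.compile(regex).matcher(text);
		if (mt.find()) {
			return new int[] { mt.start(), mt.end() };
		}
		return null;
	}

	public static void main(String[] args) {
		String text = "The cup is on the floor";
		int[] s = span("cup", text);

		System.out.println("start: " + s[0]); // start: 4
		System.out.println("end: " + s[1]);   // end: 7
		// substring takes an exclusive end index, just like end()
		System.out.println(text.substring(s[0], s[1])); // cup
	}
}
```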

Let’s write a program that counts the number of times the word cup appears in a string.

 

//Program to count number of times
//cup appears in a string
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexDemo2 {

	public static void main(String[] args) {
		String text = "This is blue cup. I need red cup";
		String regex = "\\bcup\\b";
		
		Pattern pt = Pattern.compile(regex);
		Matcher mt = pt.matcher(text);
		
		int count = 0;
		while(mt.find()) {
			count = count +1;
			System.out.println("Start index: " + mt.start());
			System.out.println("End index: " + mt.end());
			System.out.println("------------------------");
		}		
		System.out.println("Total count: " + count);
	}
}

Listing 1.0: Count how many times cup appears in a string

 

I suggest you run the code yourself so that you get used to it. Also try changing things a bit to see how it works. For example, you can change cup to plate.

Meanwhile, the output of the program is given below.

 

Start index: 13
End index: 16
------------------------
Start index: 29
End index: 32
------------------------
Total count: 2

 

Also note that the program uses word boundaries \b. This ensures that cup is matched only when it stands alone as a word and not when it is part of a longer word. So if the text contains the word hiccup, the cup inside it would not match. You will find it interesting to try it!
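If you want to see the effect of the word boundary, the sketch below counts matches with and without \b on a sentence that contains hiccup. The class name WordBoundaryDemo, the helper countMatches and the sample sentence are assumptions made for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordBoundaryDemo {

	// Hypothetical helper: counts how many times the regex is found in the text
	static int countMatches(String regex, String text) {
		Matcher mt = Pattern.compile(regex).matcher(text);
		int count = 0;
		while (mt.find()) {
			count++;
		}
		return count;
	}

	public static void main(String[] args) {
		String text = "I had a hiccup and dropped the cup";

		// With word boundaries, only the standalone word cup is counted
		System.out.println("With boundaries: " + countMatches("\\bcup\\b", text)); // 1
		// Without them, the cup inside hiccup is counted as well
		System.out.println("Without boundaries: " + countMatches("cup", text)); // 2
	}
}
```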

 

 

2. matches and lookingAt Methods

The matches() method, as you already know, matches an input text against a pattern. However, while matches() tries to match the entire text, lookingAt() only checks whether the pattern matches at the beginning of the text, that is, a prefix of it.

Let’s take an example.

//Program to demonstrate difference
//between matches() and lookingAt()
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexDemo2 {

	public static void main(String[] args) {
		String text = "big bigger bigest";
		String regex = "big";
		
		Pattern pt = Pattern.compile(regex);
		Matcher mt = pt.matcher(text);
		
		System.out.println("matches: " + mt.matches());
		System.out.println("lookingAt: " + mt.lookingAt());
	}
}

Listing 1.1: Difference between matches() and lookingAt()

 

If you run the code in Listing 1.1, you will have the result below:

 

matches: false
lookingAt: true

 

Notice that matches() gives false. This is because matches() tries to match the entire string, and the text contains more than just ‘big’. For lookingAt(), however, the result is true. This is because the pattern ‘big’ corresponds to the prefix of the text. That is, the first part of the text is ‘big’, so it matches.
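To see the other side of each method, the sketch below uses a pattern that covers the whole text (so matches() succeeds) and a pattern that occurs in the text but not at the beginning (so lookingAt() fails). The class name PrefixMatchDemo and the two helper methods are made up for this illustration.

```java
import java.util.regex.Pattern;

public class PrefixMatchDemo {

	// Hypothetical helper: true only if the regex matches the entire text
	static boolean fullMatch(String regex, String text) {
		return Pattern.compile(regex).matcher(text).matches();
	}

	// Hypothetical helper: true if the regex matches at the start of the text
	static boolean prefixMatch(String regex, String text) {
		return Pattern.compile(regex).matcher(text).lookingAt();
	}

	public static void main(String[] args) {
		String text = "big bigger bigest";

		// big.* covers the whole text, so matches() succeeds
		System.out.println(fullMatch("big.*", text));    // true
		// bigger is in the text, but not at the beginning
		System.out.println(prefixMatch("bigger", text)); // false
		// big is at the beginning
		System.out.println(prefixMatch("big", text));    // true
	}
}
```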

 

 

3. replaceFirst and replaceAll Methods

You use these methods to replace the text that matches a given regular expression. While replaceFirst() replaces only the first match, replaceAll() replaces all matches.

Let’s take an example

 

//Program to demonstrate 
//replaceFirst() and replaceAll()
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexDemo2 {

	public static void main(String[] args) {
		String text = "big bigger bigest";
		String regex = "big";
		
		Pattern pt = Pattern.compile(regex);
		Matcher mt = pt.matcher(text);
		
		System.out.println(mt.replaceFirst("small"));
		System.out.println(mt.replaceAll("small"));
	}
}

Listing 1.2: Program to illustrate replaceFirst() and replaceAll() Methods

 

If you run the code, then you will have the result below:

small bigger bigest
small smallger smallest

 

Notice that in the first line of the output, only the first match is replaced. But in the second line, all the matches for ‘big’ are replaced with ‘small’, including the ones inside bigger and bigest.
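If you want replaceAll() to change only the standalone word and leave the longer words alone, one option is to combine it with the word boundary \b from the first section. The sketch below assumes that is what you want. The class name ReplaceWordDemo and the helper replaceWord are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceWordDemo {

	// Hypothetical helper: replaces every match of the regex in the text
	static String replaceWord(String regex, String text, String replacement) {
		Matcher mt = Pattern.compile(regex).matcher(text);
		return mt.replaceAll(replacement);
	}

	public static void main(String[] args) {
		String text = "big bigger bigest";

		// Only the standalone word big is replaced
		System.out.println(replaceWord("\\bbig\\b", text, "small")); // small bigger bigest
	}
}
```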

 

 

4. appendReplacement and appendTail Methods

The appendReplacement() method does the following:

  • It first reads characters from the input text, starting from the append position, and appends them to the given string buffer. It stops after it has read the last character preceding the most recent match (the character at index start() - 1)
  • It then appends the given replacement string to the specified string buffer
  • Finally, it sets the append position of the matcher to the index of the last matched character + 1 (that is, to end())

 

The appendTail() method does the following:

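It reads characters from the input text, starting from the append position, and appends them to the given string buffer. You call it after one or more calls to appendReplacement() to copy the remainder of the input text. In other words, it is the final step of an append-and-replace loop.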
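The two methods are normally used together in a find() loop. Here is a minimal sketch of that loop, assuming we want to replace the word cup with plate. The class name AppendDemo, the helper replaceWithLoop and the sample text are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AppendDemo {

	// Hypothetical helper: replaces every match using the append-and-replace loop
	static String replaceWithLoop(String regex, String text, String replacement) {
		Matcher mt = Pattern.compile(regex).matcher(text);
		StringBuffer sb = new StringBuffer();

		while (mt.find()) {
			// Copies the text between the append position and this match,
			// then the replacement, and moves the append position to end()
			mt.appendReplacement(sb, replacement);
		}
		// Copies whatever is left after the last match
		mt.appendTail(sb);
		return sb.toString();
	}

	public static void main(String[] args) {
		String text = "This is blue cup. I need red cup now";
		System.out.println(replaceWithLoop("\\bcup\\b", text, "plate"));
		// This is blue plate. I need red plate now
	}
}
```

With a fixed replacement like this, the result is the same as replaceAll(). The loop becomes useful when you want to decide the replacement for each match separately inside the while block.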

 

 

5. PatternSyntaxException Class Methods

This is an unchecked exception that indicates a syntax error in a regular expression pattern. The following methods are provided in this class.

 

Methods and brief descriptions:

  1. getDescription(): Retrieves the description of the error. Returns a string.
  2. getIndex(): Retrieves the approximate index of the error in the pattern. Returns an integer, or -1 if the index is not known.
  3. getPattern(): Retrieves the erroneous regular expression pattern. Returns a string.
  4. getMessage(): Returns a multi-line string that contains the description of the syntax error and its index, the erroneous regular expression, and a visual indication of the error index within the regular expression pattern.
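Here is a minimal sketch of how you might catch this exception and use its methods. The class name SyntaxErrorDemo, the helper describe and the broken pattern "(cup" (a group that is never closed) are made up for this illustration. The exact wording of the description can differ between Java versions, so no output is shown here.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class SyntaxErrorDemo {

	// Hypothetical helper: tries to compile the regex and reports any syntax error
	static String describe(String regex) {
		try {
			Pattern.compile(regex);
			return "valid";
		} catch (PatternSyntaxException e) {
			return "invalid: " + e.getDescription()
					+ " (index " + e.getIndex() + ")"
					+ " in pattern " + e.getPattern();
		}
	}

	public static void main(String[] args) {
		System.out.println(describe("cup"));  // a correct pattern
		System.out.println(describe("(cup")); // the opening bracket is never closed

		// getMessage() combines everything into one multi-line string
		try {
			Pattern.compile("(cup");
		} catch (PatternSyntaxException e) {
			System.out.println(e.getMessage());
		}
	}
}
```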

 

6. Video Explanation