|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Regular Expressions
Время создания: 05.09.2017 00:45
Текстовые метки: knowledge
Раздел: Java - Tutorial - Data Types
Запись: xintrea/mytetra_db_mcold/master/base/15045615206xoa250p8o/text.html на raw.githubusercontent.com
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Java - Regular Expressions Advertisements Java provides the java.util.regex package for pattern matching with regular expressions. Java regular expressions are very similar to the Perl programming language and very easy to learn. A regular expression is a special sequence of characters that helps you match or find other strings or sets of strings, using a specialized syntax held in a pattern. They can be used to search, edit, or manipulate text and data. The java.util.regex package primarily consists of the following three classes −
Capturing Groups Capturing groups are a way to treat multiple characters as a single unit. They are created by placing the characters to be grouped inside a set of parentheses. For example, the regular expression (dog) creates a single group containing the letters "d", "o", and "g". Capturing groups are numbered by counting their opening parentheses from the left to the right. In the expression ((A)(B(C))), for example, there are four such groups −
To find out how many groups are present in the expression, call the groupCount method on a matcher object. The groupCount method returns an int showing the number of capturing groups present in the matcher's pattern. There is also a special group, group 0, which always represents the entire expression. This group is not included in the total reported by groupCount. Example Following example illustrates how to find a digit string from the given alphanumeric string − import java.util.regex.Matcher; import java.util.regex.Pattern; public class RegexMatches { public static void main( String args[] ) { // String to be scanned to find the pattern. String line = "This order was placed for QT3000! OK?"; String pattern = "(.*)(\\d+)(.*)"; // Create a Pattern object Pattern r = Pattern.compile(pattern); // Now create matcher object. Matcher m = r.matcher(line); if (m.find( )) { System.out.println("Found value: " + m.group(0) ); System.out.println("Found value: " + m.group(1) ); System.out.println("Found value: " + m.group(2) ); }else { System.out.println("NO MATCH"); } } }
This will produce the following result − Output Found value: This order was placed for QT3000! OK?
Found value: This order was placed for QT300
Found value: 0
Regular Expression Syntax Here is the table listing down all the regular expression metacharacter syntax available in Java −
Methods of the Matcher Class Here is a list of useful instance methods − Index Methods Index methods provide useful index values that show precisely where the match was found in the input string −
Study Methods Study methods review the input string and return a Boolean indicating whether or not the pattern is found −
Replacement Methods Replacement methods are useful methods for replacing text in an input string −
The start and end Methods Following is the example that counts the number of times the word "cat" appears in the input string − Example import java.util.regex.Matcher; import java.util.regex.Pattern; public class RegexMatches { private static final String REGEX = "\\bcat\\b"; private static final String INPUT = "cat cat cat cattie cat"; public static void main( String args[] ) { Pattern p = Pattern.compile(REGEX); Matcher m = p.matcher(INPUT); // get a matcher object int count = 0; while(m.find()) { count++; System.out.println("Match number "+count); System.out.println("start(): "+m.start()); System.out.println("end(): "+m.end()); } } }
This will produce the following result − Output Match number 1
start(): 0
end(): 3
Match number 2
start(): 4
end(): 7
Match number 3
start(): 8
end(): 11
Match number 4
start(): 19
end(): 22
You can see that this example uses word boundaries to ensure that the letters "c" "a" "t" are not merely a substring in a longer word. It also gives some useful information about where in the input string the match has occurred. The start method returns the start index of the subsequence captured by the given group during the previous match operation, and the end returns the index of the last character matched, plus one. The matches and lookingAt Methods The matches and lookingAt methods both attempt to match an input sequence against a pattern. The difference, however, is that matches requires the entire input sequence to be matched, while lookingAt does not. Both methods always start at the beginning of the input string. Here is the example explaining the functionality − Example import java.util.regex.Matcher; import java.util.regex.Pattern; public class RegexMatches { private static final String REGEX = "foo"; private static final String INPUT = "fooooooooooooooooo"; private static Pattern pattern; private static Matcher matcher; public static void main( String args[] ) { pattern = Pattern.compile(REGEX); matcher = pattern.matcher(INPUT); System.out.println("Current REGEX is: "+REGEX); System.out.println("Current INPUT is: "+INPUT); System.out.println("lookingAt(): "+matcher.lookingAt()); System.out.println("matches(): "+matcher.matches()); } }
This will produce the following result − Output Current REGEX is: foo
Current INPUT is: fooooooooooooooooo
lookingAt(): true
matches(): false
The replaceFirst and replaceAll Methods The replaceFirst and replaceAll methods replace the text that matches a given regular expression. As their names indicate, replaceFirst replaces the first occurrence, and replaceAll replaces all occurrences. Here is the example explaining the functionality − Example import java.util.regex.Matcher; import java.util.regex.Pattern; public class RegexMatches { private static String REGEX = "dog"; private static String INPUT = "The dog says meow. " + "All dogs say meow."; private static String REPLACE = "cat"; public static void main(String[] args) { Pattern p = Pattern.compile(REGEX);
// get a matcher object Matcher m = p.matcher(INPUT); INPUT = m.replaceAll(REPLACE); System.out.println(INPUT); } }
This will produce the following result − Output The cat says meow. All cats say meow.
The appendReplacement and appendTail Methods The Matcher class also provides appendReplacement and appendTail methods for text replacement. Here is the example explaining the functionality − Example import java.util.regex.Matcher; import java.util.regex.Pattern; public class RegexMatches { private static String REGEX = "a*b"; private static String INPUT = "aabfooaabfooabfoob"; private static String REPLACE = "-"; public static void main(String[] args) { Pattern p = Pattern.compile(REGEX);
// get a matcher object Matcher m = p.matcher(INPUT); StringBuffer sb = new StringBuffer(); while(m.find()) { m.appendReplacement(sb, REPLACE); } m.appendTail(sb); System.out.println(sb.toString()); } }
This will produce the following result − Output -foo-foo-foo-
PatternSyntaxException Class Methods A PatternSyntaxException is an unchecked exception that indicates a syntax error in a regular expression pattern. The PatternSyntaxException class provides the following methods to help you determine what went wrong −
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Так же в этом разделе:
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|