Confused about Matcher group in Java regex


I have the following line:

    String typename = "abc:xxxxx;";

I need to fetch the word abc.

I wrote the following code snippet:

    Pattern pattern4 = Pattern.compile("(.*):");
    Matcher matcher = pattern4.matcher(typename);
    String namestr = "";
    if (matcher.find()) {
        namestr = matcher.group(1);
    }

If I use group(0) I get abc:, but if I use group(1) I get abc. I want to know:

  1. What do 0 and 1 mean? It would be better if someone could explain this with examples.

  2. The regex pattern contains a : in it, so why does the group(1) result omit it? Does group 1 return only the text matched inside the parentheses?

  3. So, if I put in two more parentheses, such as \\s*(\d*)(.*): then will there be two groups? Will group(1) return the (\d*) part and group(2) return the (.*) part?

The code snippet is given only for the purpose of clearing up my confusion. It is not the code I am dealing with. What the code above does can be done with String.split() in an easier way.

Capturing and grouping

A capturing group (pattern) creates a group that has the capturing property.

A related one that you might often see (and use) is (?:pattern), which creates a group without the capturing property, hence the name non-capturing group.
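As a minimal sketch of the difference (the class name NonCapturingDemo and the sample pattern are made up for illustration), only the capturing group is counted and numbered:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonCapturingDemo {
    // (?:abc) groups without capturing, so (def) is the only numbered group.
    static final Pattern P = Pattern.compile("(?:abc)(def)");

    // Number of capturing groups declared in the pattern.
    static int groupCount() {
        return P.matcher("").groupCount();
    }

    // Text recorded for group 1, or null if the pattern is not found.
    static String firstGroup(String input) {
        Matcher m = P.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(groupCount());
        System.out.println(firstGroup("abcdef"));
    }
}
```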

A group is usually used when you need to repeat a sequence of patterns, e.g. (\.\w+)+, or to specify where alternation should take effect, e.g. ^(0*1|1*0)$ (^, then 0*1 or 1*0, then $) versus ^0*1|1*0$ (^0*1 or 1*0$).
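The effect of the parentheses on alternation can be checked with a small sketch. The class and method names here are hypothetical, and find() is used so that the anchors in the patterns do all the work:

```java
import java.util.regex.Pattern;

class AlternationDemo {
    // Alternation is confined to the group: both anchors apply to either branch.
    static final Pattern GROUPED = Pattern.compile("^(0*1|1*0)$");
    // Alternation splits the whole pattern: "^0*1" OR "1*0$".
    static final Pattern UNGROUPED = Pattern.compile("^0*1|1*0$");
    // The group lets + repeat the whole sequence "\.\w+".
    static final Pattern REPEATED = Pattern.compile("(\\.\\w+)+");

    static boolean grouped(String s) { return GROUPED.matcher(s).find(); }
    static boolean ungrouped(String s) { return UNGROUPED.matcher(s).find(); }
    static boolean repeated(String s) { return REPEATED.matcher(s).matches(); }

    public static void main(String[] args) {
        // "001xyz" starts with 0*1 but does not end there.
        System.out.println(grouped("001xyz"));
        System.out.println(ungrouped("001xyz"));
        System.out.println(repeated(".foo.bar"));
    }
}
```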

A capturing group, apart from grouping, also records the text matched by the pattern inside the capturing group (pattern). Using your example, (.*):, .* matches abc and : matches :, and since .* is inside the capturing group (.*), the text abc is recorded for capturing group 1.
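A short sketch of the example from the question, showing what each group holds. The class name CaptureDemo and the helper method are assumptions added for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureDemo {
    // Returns {group(0), group(1)} for the first match of "(.*):",
    // or null if the pattern is not found.
    static String[] groups(String input) {
        Matcher m = Pattern.compile("(.*):").matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(0), m.group(1) };
    }

    public static void main(String[] args) {
        String[] g = groups("abc:xxxxx;");
        System.out.println("group(0) = " + g[0]); // whole match, colon included
        System.out.println("group(1) = " + g[1]); // only what (.*) matched
    }
}
```

The colon is part of the overall match, so it shows up in group(0), but it sits outside the parentheses, so it is not part of group(1).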

Group number

The whole pattern is defined to be group number 0.

Any capturing group in the pattern is indexed starting from 1. The indices are defined by the order of the opening parentheses of the capturing groups. As an example, here are all 5 capturing groups in the pattern below:

(group)(?:non-capturing-group)(g(?:ro|u)p( (nested)inside)(another)group)(?=assertion)
|     |                       |          | |      |      ||       |     |
1-----1                       |          | 4------4      |5-------5     |
                              |          3---------------3              |
                              2-----------------------------------------2

The group numbers are used in the back-reference \n in the pattern and $n in the replacement string.
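A minimal sketch of both uses, with made-up helper names and sample patterns: \1 inside the pattern refers to whatever group 1 matched, and $1 or $2 in the replacement string inserts the captured text.

```java
class BackReferenceDemo {
    // "(\w+) \1" matches a word followed by a space and the same word again;
    // "$1" replaces the pair with a single copy.
    static String collapseRepeats(String s) {
        return s.replaceAll("(\\w+) \\1", "$1");
    }

    // Swaps the text on either side of a colon using $2 and $1.
    static String swapAroundColon(String s) {
        return s.replaceAll("(\\w+):(\\w+)", "$2:$1");
    }

    public static void main(String[] args) {
        System.out.println(collapseRepeats("hello hello world"));
        System.out.println(swapAroundColon("abc:xyz"));
    }
}
```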

In other regex flavors (PCRE, Perl), they can also be used in sub-routine calls.

You can access the text matched by a certain group with Matcher.group(int group). The group numbers can be identified with the rule stated above.
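To check the numbering rule, here is a sketch that runs the five-group pattern from above against an input string constructed to match it. The input string and the class name are assumptions made for the demonstration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumberDemo {
    static final Pattern P = Pattern.compile(
        "(group)(?:non-capturing-group)(g(?:ro|u)p( (nested)inside)(another)group)(?=assertion)");

    // Returns group(0) through group(groupCount()), or null if there is no match.
    static String[] allGroups(String input) {
        Matcher m = P.matcher(input);
        if (!m.find()) {
            return null;
        }
        String[] out = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            out[i] = m.group(i);
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "groupnon-capturing-groupgrop nestedinsideanothergroupassertion";
        String[] g = allGroups(input);
        for (int i = 0; i < g.length; i++) {
            System.out.println("group(" + i + ") = [" + g[i] + "]");
        }
    }
}
```

The non-capturing groups and the lookahead get no number, and the text checked by the lookahead is not part of group(0) because lookaheads do not consume input.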

In some regex flavors (PCRE, Perl), there is a branch reset feature which allows you to use the same number for capturing groups in different branches of an alternation.

Group name

From Java 7, you can define a named capturing group (?<name>pattern), and you can access the matched content with Matcher.group(String name). The regex is longer, but the code is more meaningful, since it indicates what you are trying to match or extract with the regex.
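Applied to the pattern from the question, a named group might look like the sketch below. The group name typeName and the class name are chosen here for illustration only:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo {
    // Same pattern as (.*): but the group is given the name "typeName".
    static final Pattern P = Pattern.compile("(?<typeName>.*):");

    // Returns the text captured by the named group, or null if there is no match.
    static String typeName(String input) {
        Matcher m = P.matcher(input);
        return m.find() ? m.group("typeName") : null;
    }

    public static void main(String[] args) {
        System.out.println(typeName("abc:xxxxx;"));
    }
}
```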

The group names are used in the back-reference \k<name> in the pattern and ${name} in the replacement string.
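A small sketch of both forms, using a hypothetical group name word and a made-up helper:

```java
class NamedBackReferenceDemo {
    // \k<word> must match the same text that the group named "word" captured;
    // ${word} inserts that text into the replacement.
    static String collapseRepeats(String s) {
        return s.replaceAll("(?<word>\\w+) \\k<word>", "${word}");
    }

    public static void main(String[] args) {
        System.out.println(collapseRepeats("hello hello world"));
    }
}
```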

Named capturing groups are still numbered with the same numbering scheme, so they can also be accessed via Matcher.group(int group).

Internally, Java's implementation just maps from the name to the group number. Therefore, you cannot use the same name for 2 different capturing groups.
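Both points can be checked with a short sketch. The pattern is a variation of the one in question 3, and the group names and class name are assumptions for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class NamedGroupIndexDemo {
    // Group 1 is unnamed, group 2 is named "name"; both are numbered as usual.
    static final Pattern P = Pattern.compile("(\\d*)(?<name>.*):");

    // Returns {group(1), group(2), group("name")}, or null if there is no match.
    static String[] groups(String input) {
        Matcher m = P.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group("name") };
    }

    // Compiling a pattern that reuses a group name should fail.
    static boolean duplicateNameRejected() {
        try {
            Pattern.compile("(?<x>a)(?<x>b)");
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] g = groups("123abc:xxxxx;");
        System.out.println(g[0] + " / " + g[1] + " / " + g[2]);
        System.out.println(duplicateNameRejected());
    }
}
```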

