Confused about Matcher group in Java regex
I have the following line:
typename="abc:xxxxx;";
I need to fetch the word abc from it, so I wrote the following code snippet:
Pattern pattern4 = Pattern.compile("(.*):");
Matcher matcher = pattern4.matcher(typename);
String namestr = "";
if (matcher.find()) {
    namestr = matcher.group(1);
}
If I use group(0), I get
abc:
and if I use group(1), I get
abc
I want to know what the 0 and 1 mean. It would be better if someone could explain this with examples. The regex pattern contains a : in it, so why does the group(1) result omit it? Does group 1 detect the words inside the parentheses? And if I add two more pairs of parentheses, such as
\\s*(\d*)(.*)
are there then two groups? Would group(1) return the (\d*) part and group(2) return the (.*) part?
The code snippet is given only to clear up my confusion. It is not the code I am actually dealing with. The code above could be done with String.split() in an easier way.
Capturing and grouping
A capturing group (pattern) creates a group that has the capturing property.
A related one that you might often see (and use) is (?:pattern), which creates a group without the capturing property, hence the name non-capturing group.
A group is usually used when you need to repeat a sequence of patterns, e.g. (\.\w+)+, or to specify where alternation should take effect, e.g. ^(0*1|1*0)$ (that is, ^, then 0*1 or 1*0, then $) versus ^0*1|1*0$ (that is, ^0*1 or 1*0$).
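To make the difference in alternation scope concrete, here is a minimal sketch using the two patterns above. The class name, the helper method and the sample input 0012 are only illustrative assumptions.

```java
import java.util.regex.Pattern;

class AlternationScope {
    // With the group, ^ and $ apply to both alternatives.
    static final Pattern GROUPED = Pattern.compile("^(0*1|1*0)$");
    // Without the group, the two alternatives are "^0*1" and "1*0$".
    static final Pattern UNGROUPED = Pattern.compile("^0*1|1*0$");
    // A group that repeats a sequence: one or more ".word" parts.
    static final Pattern REPEATED = Pattern.compile("(\\.\\w+)+");

    // True if the pattern matches somewhere in the input.
    static boolean found(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "0012";
        System.out.println(found(GROUPED, input));   // false: the whole input must be 0*1 or 1*0
        System.out.println(found(UNGROUPED, input)); // true: "^0*1" matches the prefix 001
        System.out.println(REPEATED.matcher(".util.regex").matches()); // true
    }
}
```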
A capturing group, apart from grouping, also records the text matched by the pattern inside the capturing group (pattern). Using your example, (.*):, the .* matches abc and the : matches :, and since .* is inside the capturing group (.*), the text abc is recorded for capturing group 1.
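The behaviour described above can be checked with a small sketch built on the pattern and input from the question. The class and method names are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupZeroVsOne {
    // Returns the text of the requested group, or null if the pattern is not found.
    static String extract(String input, int group) {
        Matcher matcher = Pattern.compile("(.*):").matcher(input);
        return matcher.find() ? matcher.group(group) : null;
    }

    public static void main(String[] args) {
        String typename = "abc:xxxxx;";
        System.out.println(extract(typename, 0)); // abc:  (everything the pattern matched)
        System.out.println(extract(typename, 1)); // abc   (only what (.*) matched)
    }
}
```

The : is part of the overall match, so it shows up in group 0, but it sits outside the parentheses, so it is not part of group 1.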
Group number
The whole pattern is defined to be group number 0.
Any capturing group in the pattern starts indexing from 1. The indices are defined by the order of the opening parentheses of the capturing groups. As an example, here are all 5 capturing groups in the pattern below:
(group)(?:non-capturing-group)(g(?:ro|u)p( (nested)inside)(another)group)(?=assertion)
|     |                       |          | |      |      ||       |     |
1-----1                       |          | 4------4      |5-------5     |
                              |          3---------------3              |
                              2-----------------------------------------2
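As a rough check of the numbering rule, the sketch below runs the pattern above against an input string that I made up so that it matches, and collects every group. The class name, the input and the helper method are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumbering {
    static final Pattern PATTERN = Pattern.compile(
        "(group)(?:non-capturing-group)(g(?:ro|u)p( (nested)inside)(another)group)(?=assertion)");

    // Hypothetical input, chosen only so that the pattern matches.
    static final String INPUT =
        "groupnon-capturing-groupgrop nestedinsideanothergroupassertion";

    // Returns group 0 through group groupCount() for the first match.
    static String[] groups() {
        Matcher m = PATTERN.matcher(INPUT);
        if (!m.find()) {
            throw new IllegalStateException("no match");
        }
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            result[i] = m.group(i);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] g = groups();
        for (int i = 0; i < g.length; i++) {
            System.out.println(i + ": [" + g[i] + "]");
        }
    }
}
```

groupCount() reports 5 here: the non-capturing group and the look-ahead do not get a number, and group 0 is not included in the count.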
The group numbers are used in the back-reference \n in the pattern and $n in the replacement string.
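A short sketch of both uses follows. The patterns, the class name and the method names are examples of my own, not something from the question.

```java
import java.util.regex.Pattern;

class NumberedBackReference {
    // \1 in the pattern must match the same text that group 1 captured.
    static boolean hasDoubledWord(String s) {
        return Pattern.compile("\\b(\\w+) \\1\\b").matcher(s).find();
    }

    // $1 and $2 in the replacement string refer to the captured groups.
    static String swap(String s) {
        return s.replaceAll("(\\w+):(\\w+)", "$2:$1");
    }

    public static void main(String[] args) {
        System.out.println(hasDoubledWord("it is is fine")); // true
        System.out.println(swap("abc:xyz"));                 // xyz:abc
    }
}
```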
In other regex flavors (PCRE, Perl), they can also be used in sub-routine calls.
You can access the text matched by a certain group with Matcher.group(int group). The group numbers can be identified with the rule stated above.
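Applied to the second pattern from the question, this might look like the sketch below. The inputs and the class and method names are assumptions. Note that the question's \\s*(\d*)(.*) has to be written with every backslash doubled inside a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoGroups {
    // Returns { group 1, group 2 }, or null if the whole input does not match.
    static String[] split(String input) {
        Matcher m = Pattern.compile("\\s*(\\d*)(.*)").matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = split("  123abc");
        System.out.println(parts[0]); // 123  (the (\d*) part)
        System.out.println(parts[1]); // abc  (the (.*) part)
    }
}
```

So yes, there are two groups. Since \d* may match nothing, group(1) can be an empty string when the input has no leading digits.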
In some regex flavors (PCRE, Perl), there is a branch reset feature which allows you to use the same number for capturing groups in different branches of alternation.
Group name
From Java 7, you can define a named capturing group (?<name>pattern), and you can access the content matched with Matcher.group(String name). The regex is longer, but the code is more meaningful, since it indicates what you are trying to match or extract with the regex.
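For instance, the pattern from the question could be rewritten with a named group along these lines. The group name prefix and the class and method names are hypothetical choices.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroup {
    // Returns the text captured by the group named "prefix", or null if not found.
    static String prefixOf(String input) {
        Matcher m = Pattern.compile("(?<prefix>.*):").matcher(input);
        return m.find() ? m.group("prefix") : null;
    }

    public static void main(String[] args) {
        System.out.println(prefixOf("abc:xxxxx;")); // abc
    }
}
```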
The group names are used in the back-reference \k<name> in the pattern and ${name} in the replacement string.
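A minimal sketch of both forms, with group names (word, left, right) and patterns that are made up for the example:

```java
import java.util.regex.Pattern;

class NamedBackReference {
    // \k<word> must match the same text that the group named "word" captured.
    static boolean hasDoubledWord(String s) {
        return Pattern.compile("\\b(?<word>\\w+) \\k<word>\\b").matcher(s).find();
    }

    // ${left} and ${right} in the replacement refer to the named groups.
    static String swap(String s) {
        return s.replaceAll("(?<left>\\w+):(?<right>\\w+)", "${right}:${left}");
    }

    public static void main(String[] args) {
        System.out.println(hasDoubledWord("it is is fine")); // true
        System.out.println(swap("abc:xyz"));                 // xyz:abc
    }
}
```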
Named capturing groups are still numbered with the same numbering scheme, so they can also be accessed via Matcher.group(int group).
Internally, Java's implementation just maps from the name to the group number. Therefore, you cannot use the same name for 2 different capturing groups.
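Both points can be observed with a small sketch. The group names and helper methods are assumptions. In the Java versions I am aware of, compiling a pattern that reuses a group name fails with a PatternSyntaxException.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class NamedAndNumbered {
    // True if the named group and group 1 return the same text.
    static boolean sameByNameAndNumber(String input) {
        Matcher m = Pattern.compile("(?<prefix>.*):").matcher(input);
        return m.find() && m.group("prefix").equals(m.group(1));
    }

    // True if the regex compiles, false if Pattern.compile rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameByNameAndNumber("abc:xxxxx;"));
        System.out.println(compiles("(?<a>x)|(?<b>y)")); // distinct names
        System.out.println(compiles("(?<a>x)|(?<a>y)")); // same name used twice
    }
}
```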