php - Non-greedy regex doesn't work

January 15, 2013

regex:

preg_match('/<td[^<^>]*>(.*?)<\/td><td[^<^>]*>'.preg_quote('<input type=\'text\' name=\'nazwisko\'>', '/').'<\/td>/ui', $form_string, $matches);

input:

<form action='http://freebot.pl/post.php' name='implebot.plshow' method='post' onsubmit='return sprawdzformularz(this)'>         <table><tr><td align=right>          <input type='hidden' name='uid' value='60431'>         email :</td><td><input type='text' name='email'></td></tr>     <tr><td align=right>imię :</td><td><input type='text' name='imie'></td></tr><tr><td align=right>nazwisko :</td><td><input type='text' name='nazwisko'></td></tr><tr><td align=right>#opcja1 :</td><td><input type='text' name='pole_1' value='war.1'></td></tr><input type='hidden' name='pole_2' value='war.2'><tr><td align=right>#opcja3 :</td><td><select name='pole_3'><option></option><option value='s1'>s1</option><option value='s2'>s2</option><option value='s3'>s3</option><option value='s4'>s4</option><option value='s5'>s5</option></select><tr><td align=right>#opcja4 :</td><td><select name='pole_4'><option></option><option value='a'>a</option><option value='b'>b</option><option value='c'>c</option><option value='d'>d</option><option value='e'>e</option><option value='f'>f</option><option value='g'>g</option></select><tr><td align=right>#opcja5 :</td><td><input type='text' name='pole_5' value='war.5'></td></tr></table><input type='hidden' name='zrodlo' value='formularz1'>zgadzam się z <input type='checkbox' name='pp' checked><a href='http://' >polityką prywatności</a><br><input type='submit' value='wyślij'></form>

$matches[1]:

<input type='hidden' name='uid' value='60431'>email :</td><td><input type='text' name='email'></td></tr><tr><td align=right>imię :</td><td><input type='text' name='imie'></td></tr><tr><td align=right>nazwisko :

instead of:

nazwisko :

I thought the (.*?) in <td[^<^>]*>(.*?)<\/td> should give me the expected nazwisko :

What am I doing wrong?

I don't see a reason to use ungreedy quantifiers in this pattern. Try this instead:

preg_match('~<td[^>]*>([^<]*)</td><td[^>]*>'
          .preg_quote("<input type='text' name='nazwisko'>")
          .'</td>~i', $form_string, $matches);
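To see why the change helps, here is a minimal sketch that runs both patterns against a shortened excerpt of the form (the excerpt and the variable names $lazy and $strict are assumptions for illustration, not part of the original code). A lazy quantifier only keeps the capture as short as possible from wherever the match starts; it does not move the starting point, so the engine begins at the first <td and lets (.*?) grow across cells until the rest of the pattern fits. The negated class [^<]* cannot cross a tag, so that first attempt fails and the engine moves on to a later <td.

```php
<?php
// Shortened, hypothetical excerpt of the form from the question (two rows only).
$form_string = "<tr><td align=right>email :</td><td><input type='text' name='email'></td></tr>"
             . "<tr><td align=right>nazwisko :</td><td><input type='text' name='nazwisko'></td></tr>";

// Escape the literal input tag; '~' is passed because it is the delimiter used below.
$input = preg_quote("<input type='text' name='nazwisko'>", '~');

// Lazy version, as in the question: the match starts at the first <td
// and (.*?) expands until the nazwisko input finally follows.
preg_match('~<td[^<^>]*>(.*?)</td><td[^<^>]*>' . $input . '</td>~i', $form_string, $lazy);

// Negated character class: the capture cannot contain '<', so it stays inside one cell.
preg_match('~<td[^>]*>([^<]*)</td><td[^>]*>' . $input . '</td>~i', $form_string, $strict);

echo $lazy[1], "\n";   // starts with "email :" and runs across several tags
echo $strict[1], "\n"; // nazwisko :
```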

If the td tags can contain HTML content, you can replace ([^<]*) with ((?>[^<]+|<+(?!/td>))*)

explanation:

(?>             # atomic group
    [^<]+       # characters except <, one or more times
   |            # or
    <+(?!/td>)  # <, one or more times, not followed by /td> (negative lookahead)
)*              # close the atomic group, repeated zero or more times

In other words, this part matches characters that are not <, or a < that is not followed by /td>, each one or more times, with the whole group repeated zero or more times. It's a little longer than (.*?) but far more efficient.

The reason is that, with the ungreedy pattern, the regex engine must test each character, one by one, to see whether it is followed by </td>. With this pattern, the regex engine only performs that test when the character is <.
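The atomic-group variant can be tried on a cell that contains markup. This is a sketch under assumptions: the $row string, with a <b> tag inside the label cell, is a hypothetical example and does not come from the original form.

```php
<?php
// Hypothetical rows: the label cell of the second row contains a <b> tag.
$row = "<tr><td align=right>email :</td><td><input type='text' name='email'></td></tr>"
     . "<tr><td align=right><b>nazwisko</b> :</td><td><input type='text' name='nazwisko'></td></tr>";

// Escape the literal input tag; '~' is passed because it is the delimiter used below.
$input = preg_quote("<input type='text' name='nazwisko'>", '~');

// The capture accepts any run of non-'<' characters, or a '<' that does not open '</td>'.
$pattern = '~<td[^>]*>((?>[^<]+|<+(?!/td>))*)</td><td[^>]*>' . $input . '</td>~i';

preg_match($pattern, $row, $m);

echo $m[1], "\n"; // <b>nazwisko</b> :
```

The capture still stops at the first </td>, so it cannot spill into a neighbouring cell the way (.*?) did.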

I use an atomic group (?>...) instead of a non-capturing group (?:...) whenever possible; it's good practice. You can find more information here.
