C# - .NET Regex performance issues
I have output in a RichTextBox (it could be a lot or a little; it's search results) and I want to apply custom color coding to it. I decided to use a regex, and while it works, it seems pretty slow (~20 seconds) for 300 results.
The output is always in the same format:
attribute1=value1 attribute2=(value2) attribute3="string value 3" attribute4=
And so on. So I have four cases: stuff=stuff, stuff=(stuff), stuff="string of stuff", and stuff=
The following regex works fine (it matches what it should), but it is slow:
(\S+)=("(?:[^"]|(?<open>")|(?<-open>"))+(?(open)(?!))")|(\S+)=(\((?:[^()]|(?<open>\()|(?<-open>\)))+(?(open)(?!))\))|(\S+)=(\S+)|(\S+)=\s
Do you guys see anything in particular that's slowing it down? As I'm sure you can tell, the first section matches quotes, the second section matches parentheses, etc.
Update: Just kidding, it doesn't return what I want... This:
attribute1=value1 attribute2=(value2) attribute3="string value 3" attribute4= attribute5="another string"
returns this:
5: attribute1
6: value1
3: attribute2
4: (value2)
1: attribute3
2: "string value 3" attribute4= attribute5="another string"
It looks like the quote matched all the way across to the second string, instead of considering them separately.
Description
Your regex does a lot of backtracking, so I wrote a different regex for your question. Consider the following PowerShell example of a universal regex.
(?:\s|^)([^=]*)(?:=?["(]?([^)"]*?)[")]?)?(?=\s[^=\s]*=|$)
Example
$Matches = @()
$String = 'attribute1=value1 attribute2=(value2) attribute3="string value 3" attribute4= attribute8=value8 attribut5=(value5) attribute6="string value 6" attribute7='
$Regex = '(?:\s|^)([^=]*)(?:=?["(]?([^)"]*?)[")]?)?(?=\s[^=\s]*=|$)'

Write-Host start
Write-Host $String
Write-Host
Write-Host found
([regex]"(?i)$Regex").Matches($String) | foreach {
    Write-Host "key @ $($_.Groups[1].Index) = '$($_.Groups[1].Value)'`t= value @ $($_.Groups[2].Index) = '$($_.Groups[2].Value)'"
} # next match
Yields
start
attribute1=value1 attribute2=(value2) attribute3="string value 3" attribute4= attribute8=value8 attribut5=(value5) attribute6="string value 6" attribute7=

found
key @ 0 = 'attribute1' = value @ 11 = 'value1'
key @ 18 = 'attribute2' = value @ 30 = 'value2'
key @ 38 = 'attribute3' = value @ 50 = 'string value 3'
key @ 66 = 'attribute4' = value @ 77 = ''
key @ 78 = 'attribute8' = value @ 89 = 'value8'
key @ 96 = 'attribut5' = value @ 107 = 'value5'
key @ 115 = 'attribute6' = value @ 127 = 'string value 6'
key @ 143 = 'attribute7' = value @ 154 = ''
Summary
(?:\s|^)
non-capture group to ensure we're at the start of the string or of a substring
([^=]*)
capture all non-equal-sign characters up to the first equal sign
(?:
start non-capture block
=?
consume an equal sign if it exists
["(]?
consume a quote or an open round bracket if one exists
([^)"]*?)
capture all non-close-round-bracket and non-quote characters until
[")]?
consume a quote or a close round bracket if one exists
)?
close the non-capture block, and make that part not required
(?=
start zero-width assertion block to ensure we don't travel into the next key/value set
\s[^=\s]*=
the block must have either a space followed by non-space, non-equal-sign characters and then an equal sign
|
or
$
the end of the string, to ensure we can capture the last key/value set substring in the string
)
close the zero-width assertion block
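Since the question is about C#, here is a minimal sketch of how the pattern above might be used through System.Text.RegularExpressions. The class name KeyValueScanner and the method Scan are hypothetical names chosen for illustration, and RegexOptions.Compiled is an assumption added here, not part of the original answer. Whether it helps for your input is something to measure rather than take for granted.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class KeyValueScanner
{
    // Same pattern as in the answer, written as a verbatim string
    // (a doubled "" stands for one literal quote character).
    // RegexOptions.Compiled is an assumption, useful only if the regex is reused.
    private static readonly Regex Pattern = new Regex(
        @"(?:\s|^)([^=]*)(?:=?[""(]?([^)""]*?)["")]?)?(?=\s[^=\s]*=|$)",
        RegexOptions.Compiled);

    // Returns the key/value pairs in the order they appear in the input.
    public static List<KeyValuePair<string, string>> Scan(string input)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (Match m in Pattern.Matches(input))
        {
            // Group 1 is the key, group 2 is the value without quotes or brackets.
            pairs.Add(new KeyValuePair<string, string>(
                m.Groups[1].Value, m.Groups[2].Value));
        }
        return pairs;
    }

    public static void Main()
    {
        string input =
            "attribute1=value1 attribute2=(value2) " +
            "attribute3=\"string value 3\" attribute4= attribute5=\"another string\"";

        foreach (var pair in Scan(input))
        {
            Console.WriteLine($"{pair.Key} = '{pair.Value}'");
        }
    }
}
```

For the color coding itself, each Group also exposes Index and Length, which could be used to select the matching span in the RichTextBox instead of collecting the values.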