regex - How to capture this optional multiline string?
How can I capture an optional group that spans multiple lines?
Green group -> the optional group
Red line -> start of a new segment (the same pattern repeats)
My pattern:
(\t{2}<idx:entry name="dic">\r\n)(\t{4}<idx:orth>)(.+\r\n)(\t{4}<idx:infl>[^</idx:infl>]+)? 
Any idea how to capture an optional group that doesn't have a fixed length?
Try this:
\s*<idx:entry name="dic">\s*<idx:orth>[^<]*\s*(<idx:infl>\s*.*\s*</idx:infl>)
Whitespace between tags is ignored in XML, so you shouldn't have to specify the exact number of tabs and line breaks in the regex. Use \s to signify whitespace (this includes spaces, tabs, and line breaks).
Everything between parentheses () is captured, and you can access the group using \1 or $1, depending on the regex engine.
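To make the whitespace and capture-group points concrete, here is a minimal Python sketch. It is not the exact pattern above: it assumes the inflection block may span several lines, so it wraps the block in an optional non-capturing group `(?:...)?` and uses `re.DOTALL` with a lazy `.*?` (in most engines `.` does not match line breaks by default). The sample text and the `parse_entries` helper are hypothetical and only modeled on the tags in the question.

```python
import re

# Hypothetical sample modeled on the tags in the question; the real file may differ.
SAMPLE = (
    '\t\t<idx:entry name="dic">\r\n'
    '\t\t\t\t<idx:orth>run\r\n'
    '\t\t\t\t<idx:infl>\r\n'
    '\t\t\t\t\t<idx:iform value="ran"/>\r\n'
    '\t\t\t\t\t<idx:iform value="running"/>\r\n'
    '\t\t\t\t</idx:infl>\r\n'
    '\t\t\t\t</idx:orth>\r\n'
    '\t\t</idx:entry>\r\n'
    '\t\t<idx:entry name="dic">\r\n'
    '\t\t\t\t<idx:orth>cat\r\n'
    '\t\t\t\t</idx:orth>\r\n'
    '\t\t</idx:entry>\r\n'
)

ENTRY = re.compile(
    r'<idx:entry name="dic">\s*'          # \s* absorbs any tabs and line breaks
    r'<idx:orth>([^<]*)'                  # group 1: headword text up to the next tag
    r'(?:<idx:infl>(.*?)</idx:infl>)?',   # group 2: optional block, may span lines
    re.DOTALL,                            # let "." match line breaks as well
)


def parse_entries(text):
    """Return (headword, inflection_block_or_None) for each entry found."""
    results = []
    for match in ENTRY.finditer(text):
        headword = match.group(1).strip()
        infl = match.group(2)  # None when the optional group did not take part
        results.append((headword, infl.strip() if infl is not None else None))
    return results


if __name__ == "__main__":
    for headword, infl in parse_entries(SAMPLE):
        print(headword, "->", repr(infl))
```

In Python the groups are read with `match.group(1)` and `match.group(2)`; an optional group that did not take part in the match comes back as `None`, which is how the entry without an inflection block is told apart from one with it.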
However, when parsing XML it's a better idea to use a proper DOM parser and XPath.
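As a rough illustration of the parser route, here is a sketch using Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset. It rests on several assumptions: the `idx` prefix must be declared somewhere in the real file for any XML parser to accept it, the namespace URI below is a placeholder, and the nesting of `idx:infl` inside `idx:orth` plus the `idx:iform` elements are guesses at the structure. `read_entries` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption); use whatever the real file declares.
IDX_NS = "urn:example:idx"
NS = {"idx": IDX_NS}

# Hypothetical sample document; the real structure may differ.
SAMPLE = f"""<root xmlns:idx="{IDX_NS}">
  <idx:entry name="dic">
    <idx:orth>run
      <idx:infl>
        <idx:iform value="ran"/>
        <idx:iform value="running"/>
      </idx:infl>
    </idx:orth>
  </idx:entry>
  <idx:entry name="dic">
    <idx:orth>cat</idx:orth>
  </idx:entry>
</root>"""


def read_entries(xml_text):
    """Return (headword, [inflected forms]) for every dic entry."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall(".//idx:entry[@name='dic']", NS):
        orth = entry.find("idx:orth", NS)
        if orth is None:
            continue  # skip entries without a headword element
        headword = (orth.text or "").strip()
        # An entry with no inflection block simply yields an empty list.
        forms = [f.get("value") for f in orth.findall("idx:infl/idx:iform", NS)]
        entries.append((headword, forms))
    return entries


if __name__ == "__main__":
    for headword, forms in read_entries(SAMPLE):
        print(headword, forms)
```

With a parser the optional block needs no special handling: a missing `idx:infl` element just produces no results, and tabs or line breaks between tags no longer matter.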