regex - How to capture this optional multiline string?
How can I capture an optional group? (I mean one that consumes multiple lines.)
Green group -> the optional group
Red line -> start of a new segment (the same patterns repeat)
My pattern:
(\t{2}<idx:entry name="dic">\r\n)(\t{4}<idx:orth>)(.+\r\n)(\t{4}<idx:infl>[^</idx:infl>]+)?
Any idea how to capture an optional group that doesn't have a fixed length?
Try this:
\s*<idx:entry name="dic">\s*<idx:orth>[^<]*\s*(<idx:infl>\s*.*\s*</idx:infl>)
Whitespace between tags is ignored in XML, so you shouldn't have to specify the exact number of tabs and line breaks in the regex. Use \s to signify whitespace (this includes spaces, tabs, and line breaks).
Everything in between parentheses () is captured, and you can access the group using \1 or $1, depending on the regex engine.
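For illustration, here is a rough Python sketch of the same idea. It is a variant of the pattern above, not the exact expression: it adds a trailing ? so the <idx:infl> group really is optional, and uses [\s\S]*? so the group can span several lines without needing a dot-all flag. The sample text (including the idx:iform element and the helper name find_entries) is made up for the example.

```python
import re

# Hypothetical sample modelled on the question; the second entry has no <idx:infl> block.
SAMPLE = (
    '\t\t<idx:entry name="dic">\r\n'
    '\t\t\t\t<idx:orth>run\r\n'
    '\t\t\t\t<idx:infl>\r\n'
    '\t\t\t\t\t<idx:iform value="ran"/>\r\n'
    '\t\t\t\t</idx:infl>\r\n'
    '\t\t<idx:entry name="dic">\r\n'
    '\t\t\t\t<idx:orth>cat\r\n'
)

ENTRY = re.compile(
    r'<idx:entry name="dic">\s*'          # \s* absorbs any tabs and line breaks
    r'<idx:orth>([^<]*)'                  # group 1: headword text up to the next tag
    r'(<idx:infl>[\s\S]*?</idx:infl>)?'   # group 2: optional block, may span lines
)

def find_entries(text):
    """Return (headword, infl_block_or_None) for every entry found in text."""
    return [(m.group(1).strip(), m.group(2)) for m in ENTRY.finditer(text)]
```

When the optional block is missing, group 2 comes back as None, so the second tuple from find_entries(SAMPLE) is ("cat", None).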
However, when parsing XML it's a better idea to use a proper DOM parser or XPath.
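A minimal sketch of the parser route, using Python's standard xml.etree.ElementTree and its limited XPath support. It assumes the file is well-formed XML and that the idx prefix is bound to a namespace; the URI urn:example:idx, the wrapping root element, the idx:iform children, and the function name read_entries are all placeholders invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; use whatever URI the real file binds to "idx".
NS = {"idx": "urn:example:idx"}

# Hypothetical well-formed document; the second entry has no <idx:infl>.
DOC = """<root xmlns:idx="urn:example:idx">
  <idx:entry name="dic">
    <idx:orth>run
      <idx:infl>
        <idx:iform value="ran"/>
      </idx:infl>
    </idx:orth>
  </idx:entry>
  <idx:entry name="dic">
    <idx:orth>cat</idx:orth>
  </idx:entry>
</root>"""

def read_entries(xml_text):
    """Return (headword, [inflected forms]) for every dic entry."""
    root = ET.fromstring(xml_text)
    result = []
    for entry in root.findall("idx:entry[@name='dic']", NS):
        orth = entry.find("idx:orth", NS)
        headword = (orth.text or "").strip()
        infl = orth.find("idx:infl", NS)  # None when the optional block is absent
        forms = [] if infl is None else [
            f.get("value") for f in infl.findall("idx:iform", NS)
        ]
        result.append((headword, forms))
    return result
```

Here the optional block is just a lookup that may return None, so tabs, line breaks, and variable length stop mattering.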