Why doesn't ^ $ in .NET multiline regular expressions match CRLF 0D0A?
I have a .NET application that uses the .NET Regex features to match the text strings in an EPL label. I used to use the following: ^[a-z0-9,]+"(.+)"$ and it would match every line (it captures the text in between the EPL code). The EPL has since changed, and at the end of every EPL line there is now a line break, \x0d\x0a.
So I changed the pattern to [((\r\n)|(\x0d\x0a))a-z0-9,]+"(.+)", but now it only picks up "keep out of reach of children" and doesn't recognise the rest.
How can I match the text between the EPL code?
This is the raw EPL I'm trying to match:
n 0d0a
a230,1,0,2,1,1,n,"keep out of reach of children"0d0a
a133,26,0,4,1,1,n," furosemide tablets 40 mg"0d0a
a133,51,0,4,1,1,n," 1 in morning"0d0a
a133,76,0,4,1,1,n,""0d0a
a133,101,0,4,1,1,n,""0d0a
a133,126,0,4,1,1,n,""0d0a
a133,151,0,4,1,1,n,""0d0a
a133,176,0,4,1,1,n,"19/04/13 28 tablet(s)"0d0a
a133,201,0,4,1,1,n,"elizabeth m smith"0d0a
lo133,232,550,40d0a
a133,242,0,2,1,1,n,"any medical centre,blue road"0d0a
a133,260,0,2,1,1,n,"dn54 5tz,tel:01424 503901"0d0a
p1
I think you're looking for the RegexOptions.Multiline option. As in:
Regex myex = new Regex("^[a-z0-9,]+\".+?\"$", RegexOptions.Multiline);
Actually, the regular expression should be:
"^[a-z0-9,]+\".*\"\r?$"
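In Multiline mode, $ looks for the newline character, \n, but the file contains \r\n. The regex finds the ending quote, sees the $, and then looks for a newline. Because the file has Windows line endings (\r\n), a \r sits between the quote and the \n, so the match fails. The modified regex skips over that character if it's there.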
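The behaviour described above can be checked with a small sketch. It assumes two of the EPL lines from the question joined by \r\n; the class and method names (CrlfAnchorDemo, CountMatches) are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CrlfAnchorDemo
{
    // Counts how many matches the pattern finds in Multiline mode.
    public static int CountMatches(string pattern, string text)
    {
        return Regex.Matches(text, pattern, RegexOptions.Multiline).Count;
    }

    public static void Main()
    {
        // Two EPL lines from the question, each ending in a Windows line break.
        string text = "a133,51,0,4,1,1,n,\" 1 in morning\"\r\n" +
                      "a133,201,0,4,1,1,n,\"elizabeth m smith\"\r\n";

        // $ only matches immediately before \n, so the \r after the
        // closing quote prevents a match on either line.
        Console.WriteLine(CountMatches("^[a-z0-9,]+\".*\"$", text));     // prints 0

        // An optional \r before $ lets both lines match.
        Console.WriteLine(CountMatches("^[a-z0-9,]+\".*\"\r?$", text));  // prints 2
    }
}
```

The same pattern still works on text with plain \n line endings, because the \r is optional.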
If you want to eliminate those characters from the results, make a capture group and read the group's value rather than the whole match:
"^([a-z0-9,]+\".*\")\r?$"
Or, you can filter them out by calling Trim on each result:
MatchCollection matches = myex.Matches(text);
foreach (Match m in matches)
{
    string s = m.Value.Trim(); // removes the trailing \r
}
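Since the question asks for the text between the quotes rather than the whole line, one possible variation is to put the capture group around the quoted text only and read Groups[1]. This is a sketch, not the answer's exact code: the EplLabelText class, the ExtractQuoted method, and the shortened sample input are assumptions made for illustration. It uses .* so that empty labels ("") also match.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class EplLabelText
{
    // Returns the text between the quotes on each EPL line that has a quoted string.
    public static List<string> ExtractQuoted(string epl)
    {
        // Group 1 holds only the quoted text; \r? absorbs the CR of a CRLF ending.
        var regex = new Regex("^[a-z0-9,]+\"(.*)\"\r?$", RegexOptions.Multiline);
        var result = new List<string>();
        foreach (Match m in regex.Matches(epl))
        {
            result.Add(m.Groups[1].Value);
        }
        return result;
    }

    public static void Main()
    {
        // A shortened version of the EPL from the question, with CRLF line endings.
        string epl = "n\r\n" +
                     "a230,1,0,2,1,1,n,\"keep out of reach of children\"\r\n" +
                     "a133,76,0,4,1,1,n,\"\"\r\n" +
                     "a133,201,0,4,1,1,n,\"elizabeth m smith\"\r\n" +
                     "p1\r\n";

        // Lines without a quoted string (n, p1) are skipped.
        foreach (string s in ExtractQuoted(epl))
        {
            Console.WriteLine("[" + s + "]");
        }
    }
}
```

Because the \r sits outside the group, no Trim call is needed on the captured value.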