Using Perl regex to find and extract matches over multiple lines


I have a text file of several hundred terms in the following format:

[term]
id: id1
name: name1
xref: type1:aab
xref: type2:cdc

[term]
id: id2
name: name2
xref: type1:aba
xref: type3:fee

I need to extract the terms that have an xref of type1 and write them to a new file in the same format. I was planning to use a regular expression like this:

/\[term\](.*)type1(.*)[^\[term\]]/g 

to find the corresponding terms, but I don't know how to search with a regex over multiple lines. Should I read the original text file as a single string, or rather line by line? Any help would be appreciated.
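To illustrate the "read it as a string" option: the pattern in the question has two problems. Without the /s modifier, `.` does not match a newline, and `[^\[term\]]` is a character class that matches one single character other than `[`, `t`, `e`, `r`, `m`, or `]`, not "anything except the text [term]". A minimal sketch of an alternative, assuming each record starts with a `[term]` line, is to split the string in front of every `[term]` header and keep the records that contain the wanted xref. The helper name `select_terms` and the third sample record are made up for this illustration.

```perl
use strict;
use warnings;

# Return the [term] records in $text that contain an xref of the given type.
# select_terms is a hypothetical helper name, not something from the original post.
sub select_terms {
    my ($text, $type) = @_;

    # Split in front of every line that starts with "[term]". The lookahead
    # consumes nothing, so each header stays attached to its own record.
    # /m makes ^ match at the start of every line, not only at the start of $text.
    my @records = grep { /\S/ } split /^(?=[ \t]*\[term\])/m, $text;

    # Keep the records that have at least one "xref: <type>:" line.
    return grep { /^\s*xref:\s*\Q$type\E:/m } @records;
}

# Sample data; the id3 record is an assumption added to show a non-match.
my $text = <<'END';
[term]
id: id1
name: name1
xref: type1:aab
xref: type2:cdc

[term]
id: id2
name: name2
xref: type1:aba
xref: type3:fee

[term]
id: id3
name: name3
xref: type2:abc
END

my @hits = select_terms($text, 'type1');
print @hits;
```

For a real file, the whole content can be read into `$text` by localizing `$/` to `undef` before reading from the filehandle. That is fine for a few hundred terms; for very large files, reading block by block avoids holding everything in memory.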

A different approach is to use the $/ variable to split the input into blocks at blank lines, then split each block on the newline character and run the regular expression on each line. When one of the lines matches, print the whole block and read the next one. An example one-liner:

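perl -ne '
    # An empty $/ enables paragraph mode: each read returns one block.
    BEGIN { $/ = q|| }
    my @lines = split /\n/;
    for my $line ( @lines ) {
        if ( $line =~ m/xref:\s*type1/ ) {
            # $_ still holds the whole block, so print it once and stop.
            printf qq|%s|, $_;
            last;
        }
    }
' infile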

Assuming an input file like:

[term]
id: id1
name: name1
xref: type1:aab
xref: type2:cdc

[term]
id: id2
name: name1
xref: type6:aba
xref: type3:fee

[term]
id: id2
name: name1
xref: type1:aba
xref: type3:fee

[term]
id: id2
name: name1
xref: type4:aba
xref: type3:fee

[term]
id: id2
name: name1
xref: type1:aba
xref: type3:fee

It yields:

[term]
id: id1
name: name1
xref: type1:aab
xref: type2:cdc

[term]
id: id2
name: name1
xref: type1:aba
xref: type3:fee

[term]
id: id2
name: name1
xref: type1:aba
xref: type3:fee

As you can see, only the blocks that contain an xref: type1 line are printed.
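The same paragraph-mode idea can also be written as a small script, which makes it easier to write the result to a new file. The following is only a sketch under a few assumptions: blocks are separated by blank lines, the sub name `filter_blocks` is invented for this example, and the pattern requires a colon after the type so that `type1` does not also match something like `type10`. The demo runs on in-memory filehandles so it is self-contained.

```perl
use strict;
use warnings;

# Copy every blank-line-separated block that has an "xref: <type>:" line
# from $in to $out and return how many blocks were copied.
# filter_blocks is a hypothetical name used only for this illustration.
sub filter_blocks {
    my ($in, $out, $type) = @_;
    local $/ = '';    # paragraph mode: one read returns one whole block
    my $kept = 0;
    while ( my $block = <$in> ) {
        # /m lets ^ match at the start of each line inside the block.
        next unless $block =~ /^\s*xref:\s*\Q$type\E:/m;
        print {$out} $block;
        $kept++;
    }
    return $kept;
}

# Demo on in-memory filehandles, using three of the blocks shown above.
my $input = join "\n\n",
    "[term]\nid: id1\nname: name1\nxref: type1:aab\nxref: type2:cdc",
    "[term]\nid: id2\nname: name1\nxref: type6:aba\nxref: type3:fee",
    "[term]\nid: id2\nname: name1\nxref: type1:aba\nxref: type3:fee";
$input .= "\n";

my $output = '';
open my $in,  '<', \$input  or die "cannot read input: $!";
open my $out, '>', \$output or die "cannot write output: $!";
my $kept = filter_blocks($in, $out, 'type1');
close $out;

print $output;
```

To work on real files, open the source file for reading and the new file for writing and pass those two handles instead of the in-memory ones.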

