Perl - Pushing unique elements read from a file into an array using a regex -


Here is my file:

    heaven
    heavenly
    heavenns
    abc
    heavenns
    heavennly

According to my code, heavenns and heavennly should be pushed to @myarr, but each should be in the array only one time. How do I do that?

    my $regx = "heavenn\+";
    my $tmp = $regx;
    $tmp =~ s/[\\]//g;
    $regx = $tmp;
    print("\nnow regex:", $regx);
    my $file = "myfilename.txt";
    my @myarr;
    open my $fh, "<", $file;
    while ( my $line = <$fh> ) {
        if ($line =~ /$regx/) {
            print $line;
            push(@myarr, $line);
        }
    }
    print("\nmylist:", @myarr); # printing heavenns 2 times, and heavennly

This is Perl, so there's more than one way to do it (TMTOWTDI). Here's one of them:

    #!/usr/bin/env perl
    use strict;
    use warnings;

    my $regex = "heavenn+";
    my $rx = qr/$regex/;
    print "regex: $regex\n";

    my $file = "myfilename.txt";
    my %list;
    my @myarr;
    # $! holds the system error message when open fails
    open my $fh, "<", $file or die "failed to open $file: $!";

    while ( my $line = <$fh> )
    {
        if ($line =~ $rx)
        {
            print $line;
            $list{$line}++;    # hash keys are unique, so duplicates collapse
        }
    }

    push @myarr, sort keys %list;

    print "mylist: @myarr\n";

Sample output:

    regex: heavenn+
    heavenns
    heavenns
    heavennly
    mylist: heavennly
     heavenns

The sort isn't necessary (but it presents the data in a sane order). You could instead add items to the array only when the count in $list{$line} is 0. You could chomp the input lines to remove the newline. Etc.
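As a rough sketch of those two alternatives (chomp each line, and push only when the count is still 0), something like the following should work. The helper name unique_matching_lines and the in-memory sample lines are assumptions made for illustration; they are not part of the original answer, which reads from myfilename.txt.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): returns the matching
# lines in first-seen order, without trailing newlines or duplicates.
sub unique_matching_lines {
    my ($rx, @lines) = @_;
    my %seen;
    my @unique;
    for my $line (@lines) {
        chomp $line;                    # remove the trailing newline
        next unless $line =~ $rx;       # skip lines that do not match
        # $seen{$line}++ is 0 (false) only the first time a line is seen
        push @unique, $line unless $seen{$line}++;
    }
    return @unique;
}

# Sample data mirroring the file in the question, one word per line.
my @lines = map { "$_\n" } qw(heaven heavenly heavenns abc heavenns heavennly);
my @myarr = unique_matching_lines(qr/heavenn+/, @lines);
print "mylist: @myarr\n";    # prints "mylist: heavenns heavennly"
```

This keeps the words in the order they first appear in the input, so no sort is needed.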


What if I want to push only particular words? For example, if the file is: 1. "heavenns hello" 2. "heavenns hi", "3.heavennly good". How do I print only 'heavenns' and 'heavennly'?

Then you have to arrange to capture the word only. That means refining the regex. Assuming you want heavenn at the start of the word and don't mind which alphabetic characters come after that, then:

    #!/usr/bin/env perl
    use strict;
    use warnings;

    my $regex = '\b(heavenn[a-zA-Z]*)\b';  # single quotes necessary!
    my $rx = qr/$regex/;
    print "regex: $regex\n";

    my $file = "myfilename.txt";
    my %list;
    my @myarr;
    # $! holds the system error message when open fails
    open my $fh, "<", $file or die "failed to open $file: $!";

    while ( my $line = <$fh> )
    {
        if ($line =~ $rx)
        {
            print $line;
            $list{$1}++;    # $1 is the captured word, not the whole line
        }
    }

    push @myarr, sort keys %list;

    print "mylist: @myarr\n";

Data file:

    1. "heavenns hello"
    2. "heavenns hi",
    "3.heavennly good". d
    heaven
    heavenly
    heavenns
    abc
    heavenns
    heavennly

Output:

    regex: \b(heavenn[a-zA-Z]*)\b
    1. "heavenns hello"
    2. "heavenns hi",
    "3.heavennly good". d
    heavenns
    heavenns
    heavennly
    mylist: heavennly heavenns

Note that the names in the list no longer include newlines.
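On the "single quotes necessary" comment in the script: in a double-quoted Perl string, \b is the backspace character, so the word-boundary assertion never reaches the regex engine. A small sketch of the difference follows; the helper name first_word and the sample text are made up for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns the first captured word, or undef if no match.
sub first_word {
    my ($pattern, $text) = @_;
    return $text =~ /$pattern/ ? $1 : undef;
}

my $single = '\b(heavenn[a-zA-Z]*)\b';   # backslash-b reaches the regex engine
my $double = "\b(heavenn[a-zA-Z]*)\b";   # \b is turned into a backspace (chr 8)

my $text = '2. "heavenns hi",';

for my $pair ([single => $single], [double => $double]) {
    my ($name, $pattern) = @$pair;
    my $word = first_word($pattern, $text);
    print "$name-quoted: ", (defined $word ? "matched '$word'" : "no match"), "\n";
}
```

The double-quoted pattern only matches text that contains literal backspace characters around the word, which ordinary input files will not have.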


After chat

This version takes the regex from the command line. The script invocation is:

perl script.pl -p 'regex' [file ...] 

It will read from standard input if no file is specified on the command line (which is better than having a fixed input file name, by a large margin). It looks for multiple occurrences of the specified regex on each line, and the regex can be preceded or followed (or both) by 'word characters' as specified by \w.

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Getopt::Std;

    my %opts;
    getopts('p:', \%opts) or die "usage: $0 [-p 'regex']\n";

    my $regex_base = 'heavenn';
    #$regex_base = $ARGV[0] if defined $ARGV[0];
    $regex_base = $opts{p} if defined $opts{p};

    # Allow word characters on either side of the user's pattern
    my $regex = '\b(\w*' . ${regex_base} . '\w*)\b';
    my $rx = qr/$regex/;
    print "regex: $regex (compiled form: $rx)\n";

    my %list;
    my @myarr;

    while (my $line = <>)
    {
        # /g in scalar context resumes after the previous match on this line
        while ($line =~ m/$rx/g)
        {
            print $line;
            $list{$1}++;
            #$line =~ s///;
        }
    }

    push @myarr, sort keys %list;

    print "matched words: @myarr\n";

Given the input file:

    1. "heavenns hello"
    2. "heavenns hi",
    "3.heavennly good". d
    heavennsy! heavennnly output equally heavennnnly input!
    unheavenly host.  heavens! heaves yacht!
    heaven
    heavens
    heavenly
    heavenns
    abc
    heavenns
    heavennly

You can get outputs such as:

    $ perl script.pl -p 'e\w*?ly' myfilename.txt
    regex: \b(\w*e\w*?ly\w*)\b (compiled form: (?^:\b(\w*e\w*?ly\w*)\b))
    "3.heavennly good". d
    heavennsy! heavennnly output equally heavennnnly input!
    heavennsy! heavennnly output equally heavennnnly input!
    heavennsy! heavennnly output equally heavennnnly input!
    unheavenly host.  heavens! heaves yacht!
    heavenly
    heavennly
    matched words: equally heavenly heavennly heavennnly heavennnnly unheavenly
    $ perl script.pl myfilename.txt
    regex: \b(\w*heavenn\w*)\b (compiled form: (?^:\b(\w*heavenn\w*)\b))
    1. "heavenns hello"
    2. "heavenns hi",
    "3.heavennly good". d
    heavennsy! heavennnly output equally heavennnnly input!
    heavennsy! heavennnly output equally heavennnnly input!
    heavennsy! heavennnly output equally heavennnnly input!
    heavenns
    heavenns
    heavennly
    matched words: heavennly heavennnly heavennnnly heavenns heavennsy
    $
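A line with several matches is printed once per match because the print sits inside the inner while loop. If that repetition is unwanted, one possible variation (a sketch, not what the script above does) is to use m//g in list context, which returns every capture on the line in one go. The helper name matched_words and the in-memory sample lines are assumptions for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns the sorted, distinct words that contain $base.
sub matched_words {
    my ($base, @lines) = @_;
    my $rx = qr/\b(\w*${base}\w*)\b/;
    my %list;
    for my $line (@lines) {
        # List context with /g: all captured words on this line at once
        my @words = $line =~ /$rx/g;
        $list{$_}++ for @words;
    }
    my @sorted = sort keys %list;
    return @sorted;
}

# A few lines taken from the sample input file shown earlier.
my @lines = (
    "heavennsy! heavennnly output equally heavennnnly input!\n",
    "unheavenly host.  heavens! heaves yacht!\n",
    "heavenns\n",
);

my @myarr = matched_words('heavenn', @lines);
print "matched words: @myarr\n";
# prints "matched words: heavennnly heavennnnly heavenns heavennsy"
```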
