Remove contents between script and style tags in Objective-C


Alright, I'm working on a web crawler that can take webpages and convert them to passages of text. To remove the tags themselves, I found this on Stack Overflow:

    - (NSString *)stripTags:(NSString *)str
    {
        NSMutableString *ms = [NSMutableString stringWithCapacity:[str length]];

        NSScanner *scanner = [NSScanner scannerWithString:str];
        [scanner setCharactersToBeSkipped:nil];
        NSString *s = nil;
        while (![scanner isAtEnd])
        {
            [scanner scanUpToString:@"<" intoString:&s];
            if (s != nil)
                [ms appendString:s];
            [scanner scanUpToString:@">" intoString:NULL];
            if (![scanner isAtEnd])
                [scanner setScanLocation:[scanner scanLocation]+1];
            s = nil;
        }

        return ms;
    }

It works; however, it removes only the tags themselves, not the contents between the script and style tags (which is right in general, since removing the contents between all tags would result in an empty string).

Is there a way I can have the script and style tags truncated along with their contents?

Thanks a lot in advance.

Edit:

I have tried changing the code to:

    - (NSString *)stripTags:(NSString *)str
    {
        NSMutableString *ms = [NSMutableString stringWithCapacity:[str length]];

        NSScanner *scanner = [NSScanner scannerWithString:str];
        [scanner setCharactersToBeSkipped:nil];
        NSString *s = nil;
        while (![scanner isAtEnd])
        {
            [scanner scanUpToString:@"<script" intoString:&s];
            if (s != nil)
                [ms appendString:s];
            [scanner scanUpToString:@"script>" intoString:NULL];
            if (![scanner isAtEnd])
                [scanner setScanLocation:[scanner scanLocation]+1];
            [scanner scanUpToString:@"<" intoString:&s];
            if (s != nil)
                [ms appendString:s];
            [scanner scanUpToString:@">" intoString:NULL];
            if (![scanner isAtEnd])
                [scanner setScanLocation:[scanner scanLocation]+1];
            s = nil;
        }

        return ms;
    }

But the scripts and CSS are still being included.

You can edit the scanner code so that it checks each tag. If the tag is one you want to remove, you can scan to its closing tag and discard that string. If not, you can store/append the string.


Read to the tag start (<), then read the tag so you can check what it is. Then read to the tag close and either drop or save it.


To start (typed inline and not tested in any way):

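    NSString *t = nil; // holds the tag text, declared next to s
    while (![scanner isAtEnd]) {
        // Copy the text that precedes the next tag.
        [scanner scanUpToString:@"<" intoString:&s];
        if (s != nil)
            [ms appendString:s];
        // Read the tag: t runs from "<" up to, but not including, ">".
        [scanner scanUpToString:@">" intoString:&t];
        if ([t hasPrefix:@"<tagtoignore"]) {
            // Discard everything up to and including the closing tag's name.
            [scanner scanUpToString:@"</tagtoignore" intoString:NULL];
            [scanner scanUpToString:@">" intoString:NULL];
        }
        // Step over the ">" that ends the current tag.
        if (![scanner isAtEnd])
            [scanner setScanLocation:[scanner scanLocation]+1];
        s = nil;
        t = nil;
    }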
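
Putting those steps together, a fuller version might look like the sketch below. It is only an illustration and has not been run against real pages: the names `TagName` and `StripTagsAndScripts` are made up for this example, and it assumes that every script or style element is closed by a matching `</script>` or `</style>` tag and that no attribute value contains a `>` character. It relies on `NSScanner` being case-insensitive by default, so `</STYLE>` is found as well.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: given the text of a tag from "<" up to (not including) ">",
// return the lowercased element name, e.g. "<SCRIPT type=..." gives "script".
// Returns nil if the fragment is not a tag, and "" for closing tags such as "</p".
static NSString *TagName(NSString *tag) {
    if (tag.length < 2 || ![tag hasPrefix:@"<"]) return nil;
    NSCharacterSet *stop = [NSCharacterSet characterSetWithCharactersInString:@" \t\r\n/"];
    NSString *rest = [tag substringFromIndex:1];
    NSRange r = [rest rangeOfCharacterFromSet:stop];
    NSString *name = (r.location == NSNotFound) ? rest : [rest substringToIndex:r.location];
    return name.lowercaseString;
}

// Hypothetical function: strips all tags, and additionally drops everything
// between <script>...</script> and <style>...</style>.
static NSString *StripTagsAndScripts(NSString *html) {
    NSMutableString *result = [NSMutableString stringWithCapacity:html.length];
    NSScanner *scanner = [NSScanner scannerWithString:html];
    [scanner setCharactersToBeSkipped:nil];
    NSSet *dropped = [NSSet setWithObjects:@"script", @"style", nil];

    while (![scanner isAtEnd]) {
        NSString *text = nil;
        NSString *tag = nil;

        // Keep the text that precedes the next tag.
        if ([scanner scanUpToString:@"<" intoString:&text]) {
            [result appendString:text];
        }

        // Read the tag itself so its name can be checked.
        [scanner scanUpToString:@">" intoString:&tag];
        NSString *name = TagName(tag);
        if (name != nil && [dropped containsObject:name]) {
            // Discard the element's contents and the closing tag's name.
            NSString *closing = [NSString stringWithFormat:@"</%@", name];
            [scanner scanUpToString:closing intoString:NULL];
            [scanner scanUpToString:@">" intoString:NULL];
        }

        // Step over the ">" that ends the current tag.
        if (![scanner isAtEnd]) {
            [scanner setScanLocation:[scanner scanLocation] + 1];
        }
    }
    return result;
}
```

Calling `StripTagsAndScripts(@"<p>Hello</p><script>var a = '<b>';</script> world")` should then give back only the visible text. Comparing the tag name exactly, instead of checking a prefix, keeps an element such as `<styled-box>` from being mistaken for `<style>`.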
