XPath contains() to target a specific type of link path
I am having a lot of difficulty constructing an XPath query that returns the kinds of URLs I need. The XPath query below works in most cases; however, I have been trying to tweak it so that it returns only a URL whose actual page name contains 'about', and not URLs where 'about' is found only in a directory name.
Current output (bad):
    https://www.domain.com/about/account.asp

Desired output:
    https://www.domain.com/about/about.asp

XPath:
    (//a[contains(@href,'about')]/@href)[1]

Note: because I am using the PHP XPath engine, I can only use an XPath 1.0 solution.
I appreciate any suggestions!
Many thanks in advance!

XPath 1.0's string manipulation capabilities are very limited, so this can only be done based on some assumptions.
E.g., if all URLs end in .asp, search for /about.asp, or more generally for /about. (including the dot). A dirty hack would be to cut off everything starting at the first ?, take only the last few characters (to allow for suffixes of different lengths, like .xhtml or .pl), and search in there:
    [
      contains(
        substring(substring-before(., '?'), string-length(substring-before(., '?')) - 10),
        'about'
      ) or (
        not(contains(., '?')) and
        contains(substring(., string-length(.) - 10), 'about')
      )
    ]

This should still be extended to handle hashes (#) in place of ? to catch all cases, and there is still plenty it can fail at.
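To see where the tail-window hack breaks down, here is a minimal sketch that emulates the predicate's logic in Python. The function name `tail_window_matches` and the extra URLs beyond the two from the question are made up for illustration; the offset of 10 is the one used in the XPath expression.

```python
def tail_window_matches(href: str, needle: str = "about", offset: int = 10) -> bool:
    """Emulate the XPath 1.0 predicate: search only the tail of the URL."""
    # Drop everything from the first '?' on. Without a '?', the XPath
    # predicate falls back to the whole string, which split() also gives.
    base = href.split("?", 1)[0]
    # substring(s, string-length(s) - offset) keeps the last offset + 1
    # characters (or the whole string if it is shorter than that).
    tail = base[max(len(base) - (offset + 1), 0):]
    return needle in tail


# Hypothetical sample URLs; only the first two come from the question.
samples = [
    "https://www.domain.com/about/about.asp",            # wanted
    "https://www.domain.com/about/account.asp",          # not wanted
    "https://www.domain.com/about/a.asp",                # short page name
    "https://www.domain.com/info/about-our-company.asp", # long page name
]

for url in samples:
    print(url, tail_window_matches(url))
```

The two URLs from the question are classified as intended, but a short page name lets the directory name slip back into the window (a false positive), and a long page name pushes 'about' out of it (a false negative).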
I highly recommend using a regular expression in PHP instead, which is more robust and convenient. Or use an external XPath 2.0/XQuery processor like Saxon, BaseX, ...
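The regular-expression route could look roughly like the sketch below. It is written in Python only so that it is self-contained; the pattern itself should carry over to PHP's `preg_match`, with `parse_url` playing the role of `urlsplit`. The names `PAGE_NAME_ABOUT` and `page_name_contains_about` and the sample list are assumptions for illustration, and the hrefs are assumed to have already been collected with a plain `//a/@href` query.

```python
import re
from urllib.parse import urlsplit

# 'about' must appear after the last '/' of the path, i.e. in the page name.
# The (?:^|/) alternative also covers relative links such as "about.asp".
PAGE_NAME_ABOUT = re.compile(r"(?:^|/)[^/]*about[^/]*$")


def page_name_contains_about(href: str) -> bool:
    """Return True if the last path segment of href contains 'about'."""
    # urlsplit separates the query string and fragment from the path,
    # so '?' and '#' need no special handling in the pattern.
    path = urlsplit(href).path
    return PAGE_NAME_ABOUT.search(path) is not None


# Hypothetical hrefs, as they might come back from //a/@href.
hrefs = [
    "https://www.domain.com/about/account.asp",
    "https://www.domain.com/about/a.asp",
    "https://www.domain.com/about/about.asp?x=1#top",
]

# Equivalent of the [1] in the original query: first matching link, if any.
first_match = next((h for h in hrefs if page_name_contains_about(h)), None)
print(first_match)
```

Parsing the URL first and matching only against its path keeps the pattern short and avoids being fooled by 'about' appearing in a query string or fragment.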