XPath contains() to target a specific type of link path
I am having a lot of difficulty constructing an XPath query that returns the kinds of URLs I need. The XPath query below works in most cases; however, I have been trying to tweak it so that it returns a URL whose actual page name contains 'about', and not URLs where 'about' is only found in a directory name.
Current output (bad):
https://www.domain.com/about/account.asp
Desired output:
https://www.domain.com/about/about.asp
XPath:
(//a[contains(@href,'about')]/@href)[1]
Note: because I am using PHP's XPath engine, I can only use an XPath 1.0 solution.
I would appreciate any suggestions!
Many thanks in advance!
XPath 1.0's string manipulation capabilities are very limited, so this can only be done based on some assumptions. E.g., if the URLs end in '.asp', search for '/about.asp', or more generally for '/about.'. A dirty hack would be to cut off everything starting at the first '?', take the last few characters (to allow suffixes of different lengths such as '.xhtml' or '.pl'), and search in there:
[ contains( substring(substring-before(., '?'), string-length(substring-before(., '?')) - 10), 'about' ) or ( not(contains(., '?')) and contains(substring(., string-length(.) - 10), 'about') ) ]
This would still have to be extended to handle hashes ('#') in place of '?' to catch all cases, and even then there are enough cases left where it will fail.
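As a rough illustration of the simpler '/about.' assumption, here is a minimal PHP sketch using DOMDocument and DOMXPath. The sample markup and variable names are made up for the example, and it only works if every page name is followed by a dot and an extension.

```php
<?php
// Hypothetical sample markup; the real page will look different.
$html = <<<HTML
<html><body>
<a href="https://www.domain.com/about/account.asp">Account</a>
<a href="https://www.domain.com/about/about.asp">About</a>
</body></html>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// XPath 1.0 only: require "/about." so that a directory named "about"
// (which is followed by "/" rather than ".") does not match on its own.
$nodes = $xpath->query("(//a[contains(@href, '/about.')]/@href)[1]");
$first = $nodes->length > 0 ? $nodes->item(0)->nodeValue : null;

echo $first, PHP_EOL; // prints https://www.domain.com/about/about.asp
```

Note that this still misses page names such as 'aboutus.asp' or 'about-us.asp', which is where the substring hack above or a regular expression comes in.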
I highly recommend using a regular expression in PHP instead, which is more robust and convenient. Alternatively, use an external XPath 2.0/XQuery processor such as Saxon, BaseX, ...
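A minimal sketch of the regular expression route, assuming the href values have already been collected (for example with a plain //a/@href query). The function name pageNameContainsAbout and the sample URLs are hypothetical, and matching case-insensitively is an assumption of this example.

```php
<?php
/**
 * Returns true when the last path segment of $url (the page name)
 * contains "about". Query strings and fragments are ignored because
 * parse_url() separates them from the path.
 */
function pageNameContainsAbout(string $url): bool
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return false; // no path component, or an unparsable URL
    }
    // A slash, then "about" somewhere in the final segment (no further slashes).
    return preg_match('~/[^/]*about[^/]*$~i', $path) === 1;
}

// Hypothetical sample data.
$hrefs = [
    'https://www.domain.com/about/account.asp',
    'https://www.domain.com/about/about.asp?lang=en#top',
    'https://www.domain.com/company/aboutus.xhtml',
];

$matches = array_values(array_filter($hrefs, 'pageNameContainsAbout'));
print_r($matches);
```

With this sample data, the first URL is filtered out because 'about' only appears in a directory name, while the other two are kept.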