XPath contains() to target a specific type of link path


I am having a lot of difficulty constructing an XPath query that returns the kinds of URLs I need. The XPath query below works in most cases; however, I have been trying to tweak it so that it returns only a URL whose actual page name contains 'about', not URLs where 'about' is only found in a directory name.

current output (bad):

https://www.domain.com/about/account.asp 

desired output:

https://www.domain.com/about/about.asp 

XPath:

 (//a[contains(@href,'about')]/@href)[1] 

Note: because I am using PHP's XPath engine, I can only use an XPath 1.0 solution.

I appreciate any suggestions!

Many thanks in advance!

XPath 1.0's string manipulation capabilities are very limited, so this can only be done based on some assumptions.

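E.g., if all URLs end in .asp, you could search for /about.asp, or more generally for /about. (with the trailing dot). A dirty hack would be to cut off everything starting at the first ?, take only the last few characters (to allow suffixes of different lengths like .xhtml or .pl), and search in there: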

 [
   contains(
     substring(substring-before(., '?'), string-length(substring-before(., '?')) - 10),
     'about'
   ) or (
     not(contains(., '?')) and
     contains(substring(., string-length(.) - 10), 'about')
   )
 ]

This should still be extended to handle hashes (#) in place of ? to catch all cases, and even then there are enough URLs it will fail on.

I highly recommend using a regular expression from PHP instead, which is more robust and convenient. Or use an external XPath 2.0/XQuery processor like Saxon, BaseX, ...
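To illustrate the "do the string work in the host language" idea, here is a minimal sketch of the logic. It is written in Python with only the standard library, not PHP, purely to show the steps: collect every href, strip the query string and fragment, and test only the last path segment. The names HrefCollector, page_name_contains and first_matching_href are made up for this example, and the case-insensitive match is an assumption (XPath's contains() is case-sensitive).

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlsplit


class HrefCollector(HTMLParser):
    """Collect the href of every <a> element, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def page_name_contains(url, word):
    """Return True if the last path segment (the page name) contains `word`."""
    path = urlsplit(url).path        # leaves out the ?query and #fragment
    page = path.rsplit("/", 1)[-1]   # text after the last slash
    # Assumption: match case-insensitively; drop the flag for strict matching.
    return re.search(re.escape(word), page, re.IGNORECASE) is not None


def first_matching_href(html, word="about"):
    """Return the first link whose page name contains `word`, or None."""
    collector = HrefCollector()
    collector.feed(html)
    return next(
        (href for href in collector.hrefs if page_name_contains(href, word)),
        None,
    )
```

With the two URLs from the question, page_name_contains rejects https://www.domain.com/about/account.asp and accepts https://www.domain.com/about/about.asp, including when a ?query or #fragment is appended. In PHP the same steps would presumably be: select all //a/@href values with the XPath engine, then filter them in PHP code.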

