BeautifulSoup Python nested text
I wanted to obtain the text "some text" nested within tags like this:
<tr> <td>cme globex</td> <td colspan="4"> some text <a target="_blank" href="http://...">view rollover dates</a> </td> </tr>
I did .findAll('tr') first, some_tr.findAll('td', colspan=4) second, and some_td.find(text=True) last. Is there a more efficient way to do this? Is there a way to keep traversing through the tags and find the text?
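For reference, the three steps described above can be written out as a small runnable sketch. It assumes bs4 with the built-in "html.parser" backend, uses the bs4 spelling find_all and the string= argument, and collects the results in a list named found, which is a name made up for this illustration.

```python
from bs4 import BeautifulSoup

html = """<tr> <td>cme globex</td> <td colspan="4"> some text <a target="_blank" href="http://...">view rollover dates</a> </td> </tr>"""

# Assumption: the standard-library parser; other parsers may treat a bare <tr> differently.
soup = BeautifulSoup(html, "html.parser")

found = []  # hypothetical name, only for this sketch
for some_tr in soup.find_all("tr"):                      # step 1: every row
    for some_td in some_tr.find_all("td", colspan="4"):  # step 2: cells with colspan="4"
        first_text = some_td.find(string=True)           # step 3: first text node in the cell
        if first_text is not None:
            found.append(first_text.strip())

print(found)
```

The nested loops make the cost of the approach visible: each level is a separate search, which is what the question is trying to avoid.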
You can use XPath expressions with lxml:
import lxml.html

html = """<tr> <td>cme globex</td> <td colspan="4"> some text <a target="_blank" href="http://...">view rollover dates</a> </td> </tr>"""

tree = lxml.html.fromstring(html)
# Select the text nodes that are direct children of the <td colspan="4"> cell
print(tree.xpath('//tr/td[@colspan="4"]/text()'))
Not sure if that's what you're after...
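One thing to watch with the XPath approach: text() returns every direct text node of the cell, so the whitespace after the link comes back as its own entry. A minimal sketch of filtering that out, assuming the same sample row as in the question; the names raw and cleaned are made up for this illustration.

```python
import lxml.html

html = """<tr> <td>cme globex</td> <td colspan="4"> some text <a target="_blank" href="http://...">view rollover dates</a> </td> </tr>"""

tree = lxml.html.fromstring(html)

# All direct text nodes of the cell, including whitespace-only ones
raw = tree.xpath('//tr/td[@colspan="4"]/text()')

# Keep only the nodes that still contain something after stripping
cleaned = [t.strip() for t in raw if t.strip()]

print(cleaned)
```

If the cell can hold several pieces of text around the link, this keeps all of them in document order rather than just the first.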
Another way might be to find the anchor links with the text "view rollover dates" and take the preceding element...
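from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "html.parser")
# string= is the current name of the older text= argument
for a in soup.find_all('a', string='view rollover dates'):
    # The node parsed immediately before the link, here the cell's leading text
    print(a.previous_element)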