ruby - How do I parse and scrape the meta tags of a URL with Nokogiri?
I'm using Nokogiri to pull the <h1> and <title> tags, but I'm having trouble getting these:
<meta name="description" content="i design , develop websites , applications.">
<meta name="keywords" content="web designer,web developer">
I have this code:
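require 'nokogiri'
require 'open-uri'

url = 'https://en.wikipedia.org/wiki/emma_watson'
page = Nokogiri::HTML(URI.open(url))
puts page.css('title')[0].text
puts page.css('h1')[0].text
puts page.css('description') # prints nothing: this selector looks for a <description> element
# puts the meta description
# puts the meta keywords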
I looked in the docs and didn't find anything. Would I use a regex for this?
Thanks.
Here's how I'd go about it:
require 'nokogiri'

doc = Nokogiri::HTML(<<EOT)
<meta name="description" content="i design , develop websites , applications.">
<meta name="keywords" content="web designer,web developer">
EOT

# Look up each <meta> tag by its name attribute and read its content attribute
contents = %w[description keywords].map { |name| doc.at("meta[name='#{name}']")['content'] }
contents # => ["i design , develop websites , applications.", "web designer,web developer"]
Or:
# One query with a CSS selector group matches both tags at once
contents = doc.search("meta[name='description'], meta[name='keywords']").map { |n| n['content'] }
contents # => ["i design , develop websites , applications.", "web designer,web developer"]