ruby - Nokogiri can't get og:image on some sites




I'm using Nokogiri to parse HTML and get the og:image value:

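require 'open-uri'
require 'kconv'
require 'nokogiri'

def get_og_image(url)
  # Kernel#open with a URL needs open-uri; on Ruby 3.0+ use URI.open instead
  html = open(url, "r:binary").read
  doc = Nokogiri::HTML(html.toutf8, nil, 'utf-8')
  # present? comes from ActiveSupport (Rails)
  if doc.css("meta[property='og:image']").present?
    img_path = doc.css("meta[property='og:image']").first.attributes["content"].value
  end
  img_path
end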

Now I get:

> get_og_image "http://techcrunch.com/2014/08/05/the-hug-a-water-bottle-sensor-and-app-helps-you-stay-hydrated/"
=> "http://tctechcrunch2011.files.wordpress.com/2014/08/the-hug_office.jpg?w=680"
> get_og_image "http://www.yahoo.co.jp/"
=> nil

However, yahoo.co.jp does have an og:image value:

<meta property="og:image" content="http://k.yimg.jp/images/top/ogp/fb_y_1500px.png">

How can I get the right og:image with Nokogiri?

The problem was the response HTML of "http://www.yahoo.co.jp/": it changes depending on the User-Agent.

I set a dummy User-Agent when accessing the URL, and now Nokogiri can get the og:image.

ruby nokogiri
