Scraping this URL, R XML and getting siblings

Hi: I want to scrape the table "Federal electoral districts – Representation Order of 2003", specifically the subtable "Ontario". The URL is here: http://www.elections.ca/content.aspx?section=res&dir=cir/list&document=index&lang=e#list

I've tried this code, which gets me close but not quite there.

    doc <- htmlParse('http://www.elections.ca/content.aspx?section=res&dir=cir/list&document=index&lang=e#list', useInternalNodes=TRUE)
    doc2 <- getNodeSet(doc, "//table/caption[text()='Ontario']")

I know I can use readHTMLTable and then find the particular table, but I want to know how to select the sibling nodes of the caption node whose text equals "Ontario". Thanks.

You can use following-sibling in XPath:

    library(XML)
    appURL <- 'http://www.elections.ca/content.aspx?section=res&dir=cir/list&document=index&lang=e#list'
    doc <- htmlParse(appURL, encoding = "UTF-8")
    tableNode <- doc["//*[@id='list']/following-sibling::table/caption[text()='Ontario']/.."][[1]]
    myTable <- readHTMLTable(tableNode)
    > head(myTable)
       Code          Federal Electoral Districts Population 2006
    1 35001                       Ajax–Pickering         117,183
    2 35002        Algoma–Manitoulin–Kapuskasing          77,961
    3 35003 Ancaster–Dundas–Flamborough–Westdale         111,844
    4 35004                               Barrie         128,430
    5 35005                    Beaches–East York         104,831
    6 35006                 Bramalea–Gore–Malton         152,698

So let's break down the XPath. The heading "Federal electoral districts – Representation Order of 2003" has id="list". IDs in HTML are meant to be unique, so we can filter on this.

    //*[@id='list']               find the node with id equal to "list"
    /following-sibling::table     of the sibling nodes that follow it, keep the tables
    /caption[text()='Ontario']    select the captions of those tables whose text equals "Ontario"
    /..                           go up one node, from the caption back to its table

This gives the required table nodes as a list. Only one node satisfies the above requirements. That node can then be processed by readHTMLTable.
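To see what each step of that XPath matches without hitting the live site, here is a minimal sketch that runs it piece by piece against a tiny inline page. The HTML string, the province captions and the row values are made up for illustration and are an assumption, not the real elections.ca markup; only htmlParse, getNodeSet, xmlName and readHTMLTable come from the XML package.

```r
library(XML)

# Hypothetical miniature page (assumption: NOT the real elections.ca markup).
html <- '<html><body>
<h2 id="list">Districts</h2>
<table><caption>Quebec</caption>
<tr><th>Code</th><th>Name</th></tr>
<tr><td>24001</td><td>Abitibi</td></tr>
</table>
<table><caption>Ontario</caption>
<tr><th>Code</th><th>Name</th></tr>
<tr><td>35001</td><td>Ajax</td></tr>
<tr><td>35002</td><td>Algoma</td></tr>
</table>
</body></html>'

# asText = TRUE tells htmlParse the argument is HTML content, not a file or URL.
doc <- htmlParse(html, asText = TRUE)

# Build the XPath one step at a time.
base  <- "//*[@id='list']"
step1 <- getNodeSet(doc, base)                        # the heading node
step2 <- getNodeSet(doc, paste0(base, "/following-sibling::table"))
step3 <- getNodeSet(doc, paste0(base,
           "/following-sibling::table/caption[text()='Ontario']"))
step4 <- getNodeSet(doc, paste0(base,
           "/following-sibling::table/caption[text()='Ontario']/.."))

# step4 holds the parent <table> of the matching caption.
tableNode <- step4[[1]]
print(xmlName(tableNode))

# Convert that single table node to a data frame.
ontario <- readHTMLTable(tableNode, header = TRUE)
print(ontario)
```

If you really do want the nodes next to the caption rather than the whole table, ending the expression in /caption[text()='Ontario']/following-sibling::* instead of /.. should return the elements that follow the caption inside the same table.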

xml r xpath web-scraping
