How to scrape an XML page with hidden code in R?




I'm trying to scrape open information (title, author, journal) from the ScienceDirect website. The page source is a problem because this information is hidden, and that has caused problems in my code. The function I've created produces a data.frame with only one row, and when I try to export this information to an Excel sheet, the result is one long row.

Necessary packages

library(bitops)
library(RCurl)
library(XML)
library(RJSONIO)
library(devtools)
library(RSelenium)

Accessing the search page remotely

checkForServer()
startServer()
# opening Firefox
firefox_con <- remoteDriver(remoteServerAddr = "localhost", port = 4444, browserName = "firefox")
firefox_con$open() # allow the window to open

Setting the info to search

url <- "http://www.sciencedirect.com"
firefox_con$navigate("http://www.sciencedirect.com")
busca <- firefox_con$findElement(using = "css selector", value = "#qs_all")
keyword <- busca$sendKeysToElement(list("key word", key = "enter"))

Passing the page source to R

pagina <- xmlRoot(htmlParse(unlist(firefox_con$getPageSource())))

Function used to scrape the page source

scraper_science <- function(x) {
  doc <- htmlParse(url, encoding = "utf-8")
  tit <- xpathApply(x, "//a[@id]", xmlValue, "id")
  class <- xpathApply(x, "//li[@class]", xmlValue, "class")
  inf.art <- class[seq(59, 227, 7)]
  dat <- data.frame(title = tit, inf = inf.art)
}
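One likely cause of the single-row result is that xpathApply() returns a list, and data.frame() treats each element of a list as a separate column. The sketch below reproduces that behaviour in base R with made-up stand-in values (the titles and author strings are hypothetical, not taken from ScienceDirect) and shows that flattening the lists with unlist() gives one row per article. It assumes both lists have the same length.

```r
# Stand-ins for what xpathApply() returns: lists with one element per node.
# These values are invented for illustration only.
tit     <- list("Title A", "Title B", "Title C")
inf.art <- list("Author A, Journal A", "Author B, Journal B", "Author C, Journal C")

# Passing lists straight to data.frame() turns every list element into
# its own column, so everything ends up in a single wide row.
wide <- data.frame(title = tit, inf = inf.art)
dim(wide)

# Flattening each list to a character vector first gives one row per article.
long <- data.frame(title = unlist(tit),
                   inf   = unlist(inf.art),
                   stringsAsFactors = FALSE)
dim(long)
```

In the scraper itself, the same effect could probably be had by wrapping tit and inf.art in unlist(), or by calling xpathSApply() in place of xpathApply(), since it simplifies its result to a vector when it can.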

xml r web
