web scraping - How can I get output data from this webpage using Python?
I'm trying to obtain the geographic distance between two addresses using this website: http://www.freemaptools.com/how-far-is-it-between.htm
I want to be able to go to the page, enter two addresses, click "Show", and then extract the "Distance as the Crow Flies" and "Distance by Land Transport" values and save them to a dictionary.
Is there a way to get the output data (the distance) from the webpage? I am not familiar with HTML, so I am not sure where the output is. I have the input data; my code is below for reference.
Webpage source code (which I am not able to decipher):
    <tr>
      <td align="right">from <input name="pointa" type="text" value="" size="22" onkeypress="autocompletea(this.value, event)" /></td>
      <td><div align="center">to</div></td>
      <td><input name="pointb" type="text" value="" size="22" onkeypress="autocompleteb(this.value, event)"/></td>
      <td><p role="button" tabindex="0" class="fmtbutton"
             onkeypress="findaandb(document.forms['inp']['pointa'].value,document.forms['inp']['pointb'].value);"
             onclick="findaandb(document.forms['inp']['pointa'].value,document.forms['inp']['pointb'].value);"> show </p>
          <label></label></td>
    </tr>

My code:
    import re
    from mechanize import Browser

    text = """ web input"""

    browser = Browser()
    browser.open("http://www.freemaptools.com/how-far-is-it-between.htm")
    browser.select_form(nr=0)  # first form on the page
    browser['pointa'] = 'san diego, usa'
    browser['pointb'] = 'san francisco, usa'
    response = browser.submit()
    content = response.read()
    result = re.findall(r'dist', content)
    print(result[5])

Thanks for the help.
This page makes heavy use of JavaScript, which mechanize does not handle the way a browser would.
If you inspect the source, you can see that it uses several APIs and calculations. For instance, using the main API and the crow-flies calculation, you can get the distance this way:
    import math

    import requests
    from bs4 import BeautifulSoup  # maintained successor of the old BeautifulSoup package


    def distance_on_unit_sphere(lat1, long1, lat2, long2):
        "src: http://www.johndcook.com/python_longitude_latitude.html"
        degrees_to_radians = math.pi/180.0
        # Convert latitude to polar angle (colatitude) in radians
        phi1 = (90.0 - lat1)*degrees_to_radians
        phi2 = (90.0 - lat2)*degrees_to_radians
        theta1 = long1*degrees_to_radians
        theta2 = long2*degrees_to_radians
        cos = (math.sin(phi1)*math.sin(phi2)*math.cos(theta1 - theta2) +
               math.cos(phi1)*math.cos(phi2))
        arc = math.acos(cos)
        return arc * 3960  # miles; to get the distance in kilometers, multiply by 6373 instead


    def main():
        r = requests.get('http://www.freemaptools.com/ajax/getaandb.php?a=sydney_australia&b=melbourne_australia&c=1317')
        xml = BeautifulSoup(r.text, 'html.parser')
        markers = xml.markers.find_all('marker')
        lat1 = float(markers[0]['lat'])
        lng1 = float(markers[0]['lng'])
        lat2 = float(markers[1]['lat'])
        lng2 = float(markers[1]['lng'])
        print(distance_on_unit_sphere(lat1, lng1, lat2, lng2))


    if __name__ == '__main__':
        main()

python web-scraping html-parsing web-crawler
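Since the question asks for the result to be saved in a dictionary, here is a minimal offline sketch of that last step. It makes several assumptions: `SAMPLE_XML` is a hypothetical stand-in for the endpoint's response (its structure is only inferred from the parsing code in the answer), the coordinates are approximate values for Sydney and Melbourne, and the helper names `parse_markers`, `haversine_km`, and `crow_flies` are made up for illustration. It also swaps in the haversine formula rather than the spherical law of cosines used in the answer, because haversine tends to be better behaved numerically for short distances. The "land transport" distance is not covered here, as it would need a routing service.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical sample of the XML the endpoint is assumed to return.
# Coordinates are approximate values for Sydney and Melbourne.
SAMPLE_XML = """<markers>
  <marker lat="-33.8688" lng="151.2093"/>
  <marker lat="-37.8136" lng="144.9631"/>
</markers>"""

EARTH_RADIUS_KM = 6371.0  # mean Earth radius
KM_PER_MILE = 1.609344


def parse_markers(xml_text):
    """Return a list of (lat, lng) float pairs, one per <marker> element."""
    root = ET.fromstring(xml_text)
    return [(float(m.get("lat")), float(m.get("lng")))
            for m in root.iter("marker")]


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2 +
         math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def crow_flies(xml_text):
    """Build a dictionary with the straight-line distance in km and miles."""
    (lat1, lng1), (lat2, lng2) = parse_markers(xml_text)[:2]
    km = haversine_km(lat1, lng1, lat2, lng2)
    return {"km": km, "miles": km / KM_PER_MILE}


result = crow_flies(SAMPLE_XML)
print(result)
```

To use it against the live page, you would pass `r.text` from the `requests.get` call to `crow_flies` in place of `SAMPLE_XML`, provided the response really has the assumed structure.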