Web scraping 101

Post by pythonmeanic » Wed Jul 20, 2016 6:44 pm

Hi.
Please help: I run the following but get blank results for the stock price!

Thanks so much for any help.
I am using Python 3.

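Code:
import re
import urllib.request

url = "https://www.google.com/finance?q=NYSE%3AAAPL%3BNYSE%3AGOOG"
# Capture the text inside the price span (the id is specific to this page).
linkregex = re.compile('<span id="ref_22144_l">(.+?)</span>')
m = urllib.request.urlopen(url)
msg = m.read()
# read() returns bytes, so convert to str before matching.
links = re.findall(linkregex, str(msg))
print(links)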
Last edited by micseydel on Wed Jul 20, 2016 7:12 pm, edited 1 time in total.
Reason: Code tags. Initial post lock.

Re: Web scraping 101

Post by Ofnuts » Thu Jul 21, 2016 12:51 pm

Have you at least checked that the string you are searching for exists in the HTML you received? This is likely much less a Python issue than a matter of understanding how these pages are built. Very often the content comes from an AJAX call (check whether the content still shows up in your browser with JavaScript disabled), which is just a subsidiary HTTP request that returns a small amount of data, usually in a form that is easy to parse, such as XML or JSON. In that case your Python code can forget about the original page and issue the same small HTTP request to get just the data. With some luck, this request is a documented API.
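
To make that check concrete, here is a minimal sketch, not a definitive fix. It assumes the body decodes as UTF-8 and that a browser-like User-Agent header is acceptable to the site. The helper names (fetch_text, find_prices, parse_quote), both sample strings, and the JSON shape with "symbol" and "price" keys are all made up for illustration; a real endpoint will have its own format.

```python
import json
import re
import urllib.request

# Pattern from the original post; the id "ref_22144_l" is specific to that page.
PRICE_SPAN = re.compile(r'<span id="ref_22144_l">(.+?)</span>')


def fetch_text(url, timeout=10):
    """Download a URL and decode the body to str (assumes UTF-8)."""
    # Assumption: a browser-like User-Agent avoids rejection of urllib's default.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def find_prices(html):
    """Return every price-span match; an empty list means the markup is absent."""
    return PRICE_SPAN.findall(html)


def parse_quote(json_text):
    """Parse a HYPOTHETICAL payload like {"symbol": "AAPL", "price": "12.34"}."""
    data = json.loads(json_text)
    return data["symbol"], float(data["price"])


# Made-up samples: one page with the span in the HTML, one filled in by JavaScript.
static_page = '<html><body><span id="ref_22144_l">12.34</span></body></html>'
script_page = '<html><body><div id="quote"></div><script src="app.js"></script></body></html>'

print(find_prices(static_page))   # the span is present, so the regex matches
print(find_prices(script_page))   # no span in the raw HTML, so the list is empty
print(parse_quote('{"symbol": "AAPL", "price": "12.34"}'))
```

To try it on the real page, call html = fetch_text(url) and look at find_prices(html). If the list is empty, the span is not in the HTML you received, and the browser's network tab is the place to look for the smaller request that carries the data.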