Trouble with urllib.request?

Postby CaptainStecve » Thu Oct 31, 2013 11:27 pm

Sorry to bother you guys, but I'm having some problems with urllib.request. I wrote this code from a video tutorial on YouTube, but it keeps giving me an error. I'm coding in Python 3. Any help?

Code:
import urllib.request
import re

urls = ["http://google.com","http://bbc.co.uk","http://youtube.com"]
i=0
regex = '<title>(.+?)</title>'
pattern = re.compile(regex)

while i < len(urls):
   htmlfile = urllib.request.urlopen(urls[1])
   htmltext = htmlfile.read()
   titles = re.findall(pattern,htmltext)
   print(titles)
   i+=1

input('')
CaptainStecve
 
Posts: 5
Joined: Wed Oct 30, 2013 11:09 pm

Re: Trouble with urllib.request?

Postby ochichinyezaboombwa » Thu Oct 31, 2013 11:37 pm

This isn't your first post, but it seems you haven't read this yet: why?
ochichinyezaboombwa
 
Posts: 200
Joined: Tue Jun 04, 2013 7:53 pm

Re: Trouble with urllib.request?

Postby ochichinyezaboombwa » Thu Oct 31, 2013 11:40 pm

Also: you loop over three URLs but fetch the same one (urls[1]) all three times. You probably meant urls[i].
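
A rough sketch of how the loop could look, in case it helps. It iterates over the list directly instead of keeping a counter, and it decodes the response before matching, because in Python 3 read() returns bytes and a str regex applied to bytes raises a TypeError (quite possibly the error you are seeing, though you didn't post the traceback). The names find_titles and print_titles are made up for illustration, and UTF-8 is an assumption about the pages' encoding.

```python
import re
import urllib.request

# Same pattern as in the original post, compiled once.
# IGNORECASE and DOTALL are additions so <TITLE> and multi-line titles match.
TITLE_PATTERN = re.compile(r'<title>(.+?)</title>', re.IGNORECASE | re.DOTALL)


def find_titles(raw_bytes, encoding='utf-8'):
    """Decode the raw response body, then apply the str pattern to it.

    The encoding is an assumption; undecodable bytes are replaced
    rather than raising an error.
    """
    text = raw_bytes.decode(encoding, errors='replace')
    return TITLE_PATTERN.findall(text)


def print_titles(urls):
    """Fetch every URL in turn and print whatever titles were found."""
    for url in urls:  # each URL is used exactly once, no index needed
        with urllib.request.urlopen(url) as response:
            print(url, find_titles(response.read()))
```

Calling print_titles(["http://google.com", "http://bbc.co.uk", "http://youtube.com"]) would then fetch all three pages rather than the second one three times.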

Also: stop using regex to parse HTML, today and for good. Do yourself a big favor and learn either BeautifulSoup or lxml (or both).
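
To give an idea of what parser-based extraction looks like without installing anything, here is a minimal sketch using the standard library's html.parser. Note that this is a swapped-in stand-in, not BeautifulSoup or lxml, which do the same job with far less code. TitleParser and extract_title are hypothetical names, and the input is assumed to be already decoded to str.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element it sees."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TITLE> arrives as 'title'.
        if tag == 'title' and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == 'title' and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        return ''.join(self._parts).strip()


def extract_title(html_text):
    """Return the page title, or an empty string if there is none."""
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    return parser.title
```

Unlike the regex, this does not care about the case of the tag or about line breaks inside the title.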
ochichinyezaboombwa
 
Posts: 200
Joined: Tue Jun 04, 2013 7:53 pm

