Re: ** Python Support Thread **
Elements of "tds" were not strings; they were bs4 tags (specifically, class 'bs4.element.Tag'). Maybe someone else can pick up the slack and explain why you got the error; it seems like there is some sort of lazy evaluation in .string that interacts badly with print.
Anyway, try this and it should work. Code:
import urllib2 |
Re: ** Python Support Thread **
HolidayintheSun,
Your code worked for me running Python 2.7 on Windows. Code:
for row in soup('table', {'class' : 'spad'})[0].tbody('tr'):
After you import BeautifulSoup and urllib2, run the first line. Code:
>>> soup = BeautifulSoup(urllib2.urlopen('http://www.timeanddate.com/worldclock/astronomy.html?n=78').read())
Code:
>>> soup
Code:
<!DOCTYPE html> |
Re: ** Python Support Thread **
thanks for the very helpful replies. comments below:
Quote:
Code:
for row in soup('table', {'class' : 'spad'})[0].tbody('tr'):
Quote:
Quote:
Code:
for row in soup('table', {'class' : 'spad'})[0].tbody('tr'): |
Re: ** Python Support Thread **
Hmm, one really common beginner's mistake is to mess up Python's treatment of whitespace (indentation is part of the syntax).
Code:
for row in soup('table', {'class' : 'spad'})[0].tbody('tr'): |
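To make the whitespace point concrete, here is a small sketch that is not from the thread. It asks Python to compile two made-up snippets that differ only in indentation; the snippet contents and the helper name `compiles` are assumptions for illustration.

```python
# Hypothetical snippets for illustration; only the indentation differs.
good = "for row in [1, 2, 3]:\n    print(row)\n"
bad = "for row in [1, 2, 3]:\nprint(row)\n"  # loop body is not indented


def compiles(source):
    """Return True if Python accepts the source, False on a syntax or indentation error."""
    try:
        compile(source, "<pasted>", "exec")
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


print(compiles(good))
print(compiles(bad))
```

A loop header ending in ":" must be followed by an indented body, which is easy to lose when pasting a single line into a console.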
Re: ** Python Support Thread **
hmm i tried that and i still get the same result. maybe it's cause i'm using IDLE? i'm so confused now.
|
Re: ** Python Support Thread **
are you c/p'ing directly into the console? Or into a file in the editor?
|
Re: ** Python Support Thread **
i was trying to c/p directly into console.
also a piece of information i left out, but might be helpful: if i type: Code:
import urllib2
-------------------------------------------------
i just tried to copy paste into the editor and then "run module." when i do this, i still get an error message Code:
Traceback (most recent call last):
Code:
import urllib2
Code:
import urllib2 |
Re: ** Python Support Thread **
Hey guys,
I'm trying to get my program to read in text from a .txt file (it does a bunch of stuff with the text afterward but that's not the issue). Here's what I have: Code:
"UnboundLocalError: local variable "text" referenced before assignment" Thoughts on how to fix this? |
Re: ** Python Support Thread **
Your functions need to return a value. Python doesn't have implicit returns (a function without a return statement returns None), so you have to return the value explicitly and assign it to a name in the caller:
Code:
def read_file():
Code:
myVariable = read_file()
Code:
def main(): |
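Since the code in the posts above is cut off, here is a minimal sketch of the return-value idea. It is not the original poster's program: the `path` parameters, the temporary file, and the `upper()` call are assumptions added so the example runs on its own.

```python
import os
import tempfile


def read_file(path):
    """Read the whole file and hand the text back to the caller."""
    with open(path, "r") as handle:
        text = handle.read()
    return text  # without this line the caller would receive None


def main(path):
    # 'text' here is a new local name bound to read_file's return value.
    text = read_file(path)
    return text.upper()


# Create a throwaway file so the sketch does not depend on an existing text.txt.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    handle.write("hello from a file")

result = main(demo_path)
os.remove(demo_path)
print(result)
```

The UnboundLocalError in the question is what you typically see when a function uses a name like text that was only ever assigned inside a different function.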
Re: ** Python Support Thread **
Holiday, you should make a new *.py file, copy and paste into that file, and then run (F5) that file.
|
Re: ** Python Support Thread **
Quote:
Honestly I think it's totally okay if your files are global. So, like... Code:
fi = open('text.txt','r') |
Finding Multiple Maxes from a List
Hey guys,
So I've written a program that prompts the user for a letter and then records the number of times each letter is input in a list. So list[1] is the number of times A is input, list[2] for B, etc. I want the last part of my program to tell me what the most common letter was (or letters if several letters are input the same number of times). This is the part I'm struggling with. I thought of creating a list like: Code:
|
Re: ** Python Support Thread **
Quote:
Juk :) |
Re: ** Python Support Thread **
Code:
most_common = [0] * 26
num_max=3, max_occurrences=6, most_common={1,4,26} on exit would tell you that there are 3 letters with the max count, the max count is 6 occurrences and the most common letters were A, D and Z. It might not be quite syntactically correct as I don't use Python, but hopefully you should get the idea :)
An alternative method would be to use a list/vector of tuples (letter/frequency) and then sort them in descending order, letting you see the most frequent at the start of the sorted list.
Juk :) |
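The idea described above can be sketched in runnable Python roughly as follows. This is one reading of the approach, not Juk's actual code (which is cut off): the function name is made up, and it assumes the 1-based layout from the question, where counts[1] is the tally for A.

```python
def most_common_letters(counts):
    """counts[1] is the tally for A, counts[2] for B, ... counts[26] for Z.

    Returns (max_occurrences, letters), where letters holds every letter
    whose tally equals the maximum.
    """
    max_occurrences = 0
    most_common = []
    for index in range(1, 27):
        if counts[index] > max_occurrences:
            # New maximum: forget the previous leaders.
            max_occurrences = counts[index]
            most_common = [index]
        elif counts[index] == max_occurrences and max_occurrences > 0:
            # Tie with the current maximum.
            most_common.append(index)
    letters = [chr(index + 64) for index in most_common]
    return max_occurrences, letters


counts = [0] * 27  # index 0 is unused
counts[1] = 6      # A
counts[4] = 6      # D
counts[26] = 6     # Z
counts[2] = 3      # B
print(most_common_letters(counts))
```

With these sample tallies it reports a maximum of 6 shared by A, D and Z, matching the example in the post. The number of tied letters is just the length of the returned list.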
Re: ** Python Support Thread **
Are you required to use a list because this is a homework assignment? If it's a self-study project, maybe you should look into using a dictionary.
|
Re: ** Python Support Thread **
Yeah, required to use list.
|
Re: ** Python Support Thread **
So actually, I figured out how to do this. Thing is, I want the program to read in text until it reads "!", at which point it stops. I'm using a list to keep track of these but I'm not sure how to get the program to prompt the user for more text if it reads through all of the characters the user inputs and doesn't hit "!".
Here's my code so far: Code:
|
Re: ** Python Support Thread **
Quote:
|
Re: ** Python Support Thread **
Code:
charlist = [0 for i in xrange(27)]
Okay, now let's study this program. First we set up a character list to hold the tallies. I would index it from 0, but since you wanted it set up so that a=1, that is fine too. We knew we would have to keep asking the user for text until we knew to stop, so we started with a while loop. Next we took raw input. We did not use input(...) because its result is not guaranteed to be a string, which could hurt us later (for example, "for character in inputtext" may not work because inputtext might not be iterable).
Now we looked at each character in the input text. This was the right approach and you did a good job. Had you counted the number of "a"s in inputtext, then the number of "b"s, etc., you would have scanned the whole text once per letter (26 passes) instead of once. So, for each character we stored the ordinal number in a variable "av", so we would not have to look it up each time. We then checked if it was in the range [65,90], which would make it a capital letter from A-Z. If it was, we ticked up our charlist. We did the same in checking for lowercase.
Finally, we knew the ordinal number of "!" would be ord("!") == 33, so if that were true, we stopped looking at letters immediately. (So if you type xyzz!aaaaaaa, "a" won't be most common.) After the inner loop, we checked whether "av" was still 33. This is not that great of a practice, but for something like this IMO it is fine. If and only if we saw a "!" (if character == "!"), the control structure would break out at the "elif av==33: break" part, and then it would immediately break out again at "if av==33: break".
Now we come to reporting the result. The num_max is simply going to be the highest number in our tally (namely, charlist). The number of times the max occurs is going to be charlist.count(maxi), which just counts how many times maxi was seen.
The last one is tricky, so let's look at it in two steps.
First, we want a list of indices that represent the letters that are most common. For example, if A, B, and D are most common, we want a list of [1,2,4]. The appropriate code for this is "filter(lambda x: charlist[x]==maxi, range(27))". What that does is take the list [0,1,2,...,26] and keep only the elements x for which charlist[x] == maxi, namely the ones that hit the maximum.
Secondly, we have this list (e.g. nicelist = [1,2,4]) and we want to get to ['A','B','D']. The correct code is going to be map(lambda x: chr(x+64), nicelist). What that does is go one by one down nicelist and change every element x to chr(x+64). Since each element can only be a number from 1 to 26, it will change everything to one of chr(65) = 'A', chr(66) = 'B', etc. up to chr(90) = 'Z'.
Putting it together, we get this chunky line "map(lambda x: chr(x+64), filter(lambda x: charlist[x]==maxi, range(27)))". If map, filter, and lambda are new to you, you can iterate through the list and use the same ideas. Code:
common = [] |
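Because the code in this post is cut off, here is a hedged reconstruction of the walkthrough as a sketch, not the poster's exact program. It assumes the input arrives as an iterable of strings (standing in for repeated raw_input calls so it can run without a keyboard), and it uses an explicit flag instead of re-checking av after the inner loop. The function names are made up.

```python
def tally_until_bang(lines):
    """Tally letters from successive lines of input until a '!' is seen.

    'lines' is any iterable of strings; it stands in for repeated
    raw_input()/input() calls.
    """
    charlist = [0] * 27                # index 0 unused, A=1 ... Z=26
    for inputtext in lines:
        saw_bang = False
        for character in inputtext:
            av = ord(character)
            if 65 <= av <= 90:         # 'A'..'Z'
                charlist[av - 64] += 1
            elif 97 <= av <= 122:      # 'a'..'z'
                charlist[av - 96] += 1
            elif av == 33:             # '!' ends the whole run
                saw_bang = True
                break
        if saw_bang:
            break
    return charlist


def most_common(charlist):
    """Return the letters whose tally equals the maximum, using a plain loop."""
    maxi = max(charlist)
    common = []
    for x in range(1, 27):
        if charlist[x] == maxi:
            common.append(chr(x + 64))
    return common


tally = tally_until_bang(["xyzz", "Az!aaaaaaa"])
print(most_common(tally))
```

Everything after the "!" is ignored, so the trailing run of "a" characters in the sample does not count. Note that if no letters were entered at all, every tally equals the maximum of 0 and all 26 letters come back, so a real program would probably want to special-case that.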
Re: ** Python Support Thread **
im trying to log into forums using python and getting stuck. I heard mechanize might work but i suck too much at programming to figure it out.
i am writing a program to go to my subscriptions threads for this and other forums and open all the new threads in new windows. I have it all working except the log in part. so with one click per forum i get all the new threads opened up |
Re: ** Python Support Thread **
could somebody walk me through a web scrape with BeautifulSoup?
i am trying to scrape the regular season table from here http://www.basketball-reference.com/.../2011/gamelog/. so i have Code:
from bs4 import BeautifulSoup
By inspecting the table on the website I can see that it is named "sortable stats_table" and that the data is nested under Quote:
Quote:
Code:
<td align="left">MIA</td>
Quote:
Code:
<td align="right">240</td> |
Re: ** Python Support Thread **
i recommend using a debugger (or print statements) to understand how beautifulsoup models the html. once you've got that, extracting the parts you need will be a piece of cake.
|
Re: ** Python Support Thread **
Code:
for row in soup('table')[0].tbody('tr'):
So that code returns all of the table but with the HTML tags included. How do I strip out just the data? |
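In BeautifulSoup itself, calling get_text() on a tag returns its text without the markup. As a dependency-free illustration of the same idea, here is a sketch using the standard library's html.parser instead of bs4 (a swapped-in technique, not the poster's code). It is written for Python 3; in Python 2.7 the class lives in the HTMLParser module. The class name and the sample HTML, built from the two cells quoted earlier, are assumptions.

```python
from html.parser import HTMLParser


class CellText(HTMLParser):
    """Collect the text inside each <td>, grouped by <tr>."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.rows = []
        self.in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])          # start a new row
        elif tag == "td" and self.rows:
            self.in_cell = True
            self.rows[-1].append("")      # start a new cell in the current row

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.rows[-1][-1] += data     # keep only text found inside a cell


sample = (
    '<table><tbody>'
    '<tr><td align="left">MIA</td><td align="right">240</td></tr>'
    '</tbody></table>'
)
parser = CellText()
parser.feed(sample)
parser.close()
print(parser.rows)
```

Each row comes back as a list of plain strings, which is usually the shape you want before writing the data out or converting numbers.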
Re: ** Python Support Thread **
I'm sure there is a mind boggling complex regex that will filter those out for you. To the google!
Elapsed time 32.5 seconds: http://www.pagecolumn.com/tool/all_about_html_tags.htm |
Re: ** Python Support Thread **
why would you take a guy working on the right path toward a solution and drop him in the dark forest of regex?
|