Two Plus Two Poker Forums

-   -   ** Python Support Thread ** (https://forumserver.twoplustwo.com/48/computer-technical-help/python-support-thread-1007515/)

Alex Wice 07-26-2012 06:29 AM

Re: ** Python Support Thread **
 
Elements of "tds" were not strings; they were bs4 tags (specifically: class 'bs4.element.Tag' ). Maybe someone else can pick up the slack and explain why you got the error, it seems like there is some sort of lazy evaluation in .string that interacts badly with print.

Anyways, try this and it should work.

Code:

import urllib2
from bs4 import BeautifulSoup

soup = BeautifulSoup(urllib2.urlopen('http://www.timeanddate.com/worldclock/astronomy.html?n=78').read())

for row in soup('table', {'class' : 'spad'})[0].tbody('tr'):
  tds = row('td')
  tds_trunc = tds[0].string, tds[1].string
  print tds_trunc


LA_Price 07-26-2012 07:15 AM

Re: ** Python Support Thread **
 
HolidayintheSun,

Your code worked for me running python 2.7 on windows.

Code:

for row in soup('table', {'class' : 'spad'})[0].tbody('tr'):
...    tds = row('td')
...    print  tds[0].string, tds[1].string
...   
Jul 26, 2012 5:32 AM
Jul 27, 2012 5:34 AM
Jul 28, 2012 5:35 AM
Jul 29, 2012 5:37 AM
Jul 30, 2012 5:39 AM
Jul 31, 2012 5:40 AM
Aug 1, 2012 5:42 AM

I'd start running through the code line by line in an interpreter. For instance
after you import BeautifulSoup and urllib2 run the first line

Code:

>>> soup= BeautifulSoup(urllib2.urlopen('http://www.timeanddate.com/worldclock/astronomy.html?n=78').read())
Then just type

Code:

>>>soup
and you should get output like this

Code:

<!DOCTYPE html>

<!--

scripts and programs that download content transparent to the user are not allowed without permission

-->
<html lang="en">
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
<title>Sunrise and Sunset for Ireland – Dublin – coming days</title>


HolidayInTheSun 07-26-2012 12:20 PM

Re: ** Python Support Thread **
 
thanks for the very helpful replies. comments below:

Quote:

Originally Posted by Alex Wice (Post 33950385)
Elements of "tds" were not strings; they were bs4 tags (specifically: class 'bs4.element.Tag' ). Maybe someone else can pick up the slack and explain why you got the error, it seems like there is some sort of lazy evaluation in .string that interacts badly with print.

Anyways, try this and it should work.

Code:

import urllib2
from bs4 import BeautifulSoup

soup = BeautifulSoup(urllib2.urlopen('http://www.timeanddate.com/worldclock/astronomy.html?n=78').read())

for row in soup('table', {'class' : 'spad'})[0].tbody('tr'):
  tds = row('td')
  tds_trunc = tds[0].string, tds[1].string
  print tds_trunc


yes, that worked. i get:
Code:

for row in soup('table', {'class' : 'spad'})[0].tbody('tr'):
  tds = row('td')
  tds_trunc = tds[0].string, tds[1].string
  print tds_trunc

 
(u'Jul 26, 2012', u'5:32 AM')
(u'Jul 27, 2012', u'5:34 AM')
(u'Jul 28, 2012', u'5:35 AM')
(u'Jul 29, 2012', u'5:37 AM')
(u'Jul 30, 2012', u'5:39 AM')
(u'Jul 31, 2012', u'5:40 AM')
(u'Aug 1, 2012', u'5:42 AM')

however, this only works if i enter the code in multiple steps. i first tried to copy paste the entire thing (from import bs4 to print tds_trunc) and when i did that, nothing happened. that is, nothing was printed, and the next line returned to >>>. only when i entered it piece by piece (1st import bs4, then import urllib2, then define soup, then run the for loop) did it work.

Quote:

Originally Posted by LA_Price (Post 33950646)
HolidayintheSun,

Your code worked for me running python 2.7 on windows.

i'm using python 2.7.3 for windows 7 and it's still not working. i don't understand how that's possible? it's not calling anything from the hard drive, shouldn't the program function identically for both of us?


Quote:

I'd start running through the code line by line in an interpreter. For instance
after you import BeautifulSoup and urllib2 run the first line

Code:

>>> soup= BeautifulSoup(urllib2.urlopen('http://www.timeanddate.com/worldclock/astronomy.html?n=78').read())
Then just type

Code:

>>>soup
and you should get output like this

Code:

<!DOCTYPE html>

<!--

scripts and programs that download content transparent to the user are not allowed without permission

-->
<html lang="en">
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
<title>Sunrise and Sunset for Ireland – Dublin – coming days</title>


yeah that's what i tried doing last night. if i define soup, and then just type >>>soup, it prints out the html text which begins with the same text that you identify above. so it has to be something about
Code:

for row in soup('table', {'class' : 'spad'})[0].tbody('tr'):
  tds = row('td')
  print tds[0].string, tds[1].string
  # will print date and sunrise

cause up till then i think i'm okay, the program is doing what it should.

LA_Price 07-26-2012 06:08 PM

Re: ** Python Support Thread **
 
Hmm, one really common beginner's mistake is to mess up Python's treatment of whitespace. Every line in the body of the loop has to be indented by the same amount:

Code:

for row in soup('table', {'class' : 'spad'})[0].tbody('tr'):
(should be 4 spaces here)tds = row('td')
(that's 4 hits of spacebar)tds_trunc = tds[0].string, tds[1].string
(should be 4 spaces here)print tds_trunc
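
To see concretely what Python does with a mis-indented loop body, here is a small sketch that feeds two versions of a loop to the built-in compile() and reports whether each one parses. The two snippets and the helper name parses are made up for illustration, and it is written for Python 3 (the same check should behave the same way on 2.7).

```python
GOOD = (
    "for n in range(3):\n"
    "    total = n * 2\n"
    "    print(total)\n"
)

# Same loop, but the last body line is indented by only 2 spaces.
BAD = (
    "for n in range(3):\n"
    "    total = n * 2\n"
    "  print(total)\n"
)

def parses(source):
    """Return True if the source compiles, False on an indentation problem."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except IndentationError:
        return False

print(parses(GOOD))
print(parses(BAD))
```

Any consistent width works (the code earlier in the thread uses 2 spaces); what Python rejects is a body whose lines do not line up with each other.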


HolidayInTheSun 07-26-2012 10:20 PM

Re: ** Python Support Thread **
 
hmm i tried that and i still get the same result. maybe it's cause i'm using IDLE? i'm so confused now.

Xhad 07-27-2012 02:09 AM

Re: ** Python Support Thread **
 
are you c/p'ing directly into the console? Or into a file in the editor?

HolidayInTheSun 07-27-2012 02:34 AM

Re: ** Python Support Thread **
 
i was trying to c/p directly into console.

also a piece of information i left out, but might be helpful:

if i type:

Code:

import urllib2
from bs4 import BeautifulSoup

all in one step, then beautiful soup does not import properly. i have to do them one by one in order for bs to actually import. i also tried indenting the second line by 4 spaces and it did not help.
-------------------------------------------------
i just tried to copy paste into the editor and then "run module." when i do this, i still get an error message
Code:

Traceback (most recent call last):
  File "C:\Python27\Scripts\sunrise.py", line 8, in <module>
    print tds[0].string, tds[1].string
  File "C:\Python27\lib\idlelib\rpc.py", line 595, in __call__
    value = self.sockio.remotecall(self.oid, self.name, args, kwargs)
  File "C:\Python27\lib\idlelib\rpc.py", line 210, in remotecall
    seq = self.asynccall(oid, methodname, args, kwargs)
  File "C:\Python27\lib\idlelib\rpc.py", line 225, in asynccall
    self.putmessage((seq, request))
  File "C:\Python27\lib\idlelib\rpc.py", line 324, in putmessage
    s = pickle.dumps(message)
  File "C:\Python27\lib\copy_reg.py", line 74, in _reduce_ex
    getstate = self.__getstate__
RuntimeError: maximum recursion depth exceeded

when i try to run
Code:

import urllib2
from bs4 import BeautifulSoup

soup = BeautifulSoup(urllib2.urlopen('http://www.timeanddate.com/worldclock/astronomy.html?n=78').read())

for row in soup('table', {'class' : 'spad'})[0].tbody('tr'):
  tds = row('td')
  print tds[0].string, tds[1].string
  # will print date and sunrise

but when i run
Code:

import urllib2
from bs4 import BeautifulSoup

soup = BeautifulSoup(urllib2.urlopen('http://www.timeanddate.com/worldclock/astronomy.html?n=78').read())

for row in soup('table', {'class' : 'spad'})[0].tbody('tr'):
  tds = row('td')
  tds_trunc = tds[0].string, tds[1].string
  print tds_trunc

this works just fine and prints the date along with the sunrise time.
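
The traceback above ends in pickle.dumps inside idlelib\rpc.py, which suggests that IDLE is trying to pickle the object handed to print rather than plain text. According to the Beautiful Soup documentation, .string returns a NavigableString (a string subclass tied to the parse tree) rather than a built-in string, and the documentation recommends calling unicode() on it before using it outside the soup. A minimal sketch of that idea follows. TreeString is a made-up stand-in so the example runs without bs4, and it uses str() because it is written for Python 3 (on 2.7 the equivalent call would be unicode()).

```python
class TreeString(str):
    """Hypothetical stand-in for a string subclass that points back at its tree."""
    def __new__(cls, text, parent=None):
        obj = super().__new__(cls, text)
        obj.parent = parent  # extra reference a plain string would not carry
        return obj

def plain(value):
    """Copy the characters into a built-in str, dropping any extra attributes."""
    return str(value)

cell = TreeString("5:32 AM", parent=["pretend", "parse", "tree"])
text = plain(cell)

print(type(cell).__name__)
print(type(text).__name__)
print(text)
```

In the loop from this post that would amount to printing unicode(tds[0].string) and unicode(tds[1].string). Building the tuple first, as in the working version, probably helps for a similar reason: print then receives the tuple's text rather than the NavigableString itself.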

Mariogs37 07-27-2012 02:59 PM

Re: ** Python Support Thread **
 
Hey guys,

I'm trying to get my program to read in text from a .txt file (it does a bunch of stuff with the text afterward but that's not the issue). Here's what I have:

Code:


def read_file():
    f = open(text.txt, "r")
    text = f.read()


def start():


    space_count = 0

    speed("fastest")

    color("red")

    width = input("How wide would you like each character to be?: ")

    home_turtle(0,0)
   



def main():


    start()

    read_file()

    for character in text.txt():

        assess_letter()

        forward_turtle()

   
main()

And here's the error I get:

"UnboundLocalError: local variable "text" referenced before assignment"

Thoughts on how to fix this?

daveT 07-27-2012 04:55 PM

Re: ** Python Support Thread **
 
Your function has to return a value. Python doesn't have implicit returns, so you have to explicitly return the text to make it available to the code that calls the function:

Code:

def read_file():
    f = open("text.txt", "r")  # the filename has to be a quoted string
    text = f.read()
    return text

then you can call the value of the function like so:

Code:

myVariable = read_file()

print myVariable

So with that knowledge, you can do this:

Code:

def main():


    start()

    myText = read_file()

    for character in myText:

        assess_letter()

        forward_turtle()

The above isn't tested, but it should more or less work. You might want to comment out the assess_letter() and forward_turtle() calls and just put in print myText to a) see what the output looks like and b) prevent THAT error as well.
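
Putting the pieces above together, here is a small self-contained sketch of a function that returns the file's contents to its caller. The path parameter and the temporary demo file are assumptions added so the example runs anywhere; the original program reads text.txt.

```python
import os
import tempfile

def read_file(path):
    """Read the whole file and hand the text back to the caller."""
    with open(path, "r") as f:
        return f.read()

# Create a throwaway file so the sketch does not depend on text.txt existing.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("HELLO")

my_text = read_file(demo_path)
characters = [character for character in my_text]
os.remove(demo_path)

print(characters)
```

The name text inside read_file never leaks out; only the returned value does, and the caller decides what to call it.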

Alex Wice 07-27-2012 08:58 PM

Re: ** Python Support Thread **
 
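Holiday, you should make a new *.py file, copy and paste the code into that file, and then run that file (F5). The IDLE shell expects one statement at a time, which would explain why pasting the whole script at the >>> prompt did nothing.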

Alex Wice 07-27-2012 09:13 PM

Re: ** Python Support Thread **
 
Quote:

Originally Posted by Mariogs37 (Post 33973424)
Hey guys,

I'm trying to get my program to read in text from a .txt file (it does a bunch of stuff with the text afterward but that's not the issue). Here's what I have:

Code:


def read_file():
    f = open(text.txt, "r")
    text = f.read()


def start():


    space_count = 0

    speed("fastest")

    color("red")

    width = input("How wide would you like each character to be?: ")

    home_turtle(0,0)
   



def main():


    start()

    read_file()

    for character in text.txt():

        assess_letter()

        forward_turtle()

   
main()

And here's the error I get:

"UnboundLocalError: local variable "text" referenced before assignment"

Thoughts on how to fix this?

Yes, text is a local variable to read_file. So when you said "text.txt()", this doesn't have any meaning at all....

Honestly I think it's totally okay if your files are global. So, like...

Code:

fi = open('text.txt','r')

def start():
    pass  # stuff goes here

def main():
    start()
    for cx in fi.read():  # read() gives characters; looping over fi itself gives lines
        assess_letter()
        forward_turtle()


Mariogs37 08-06-2012 07:56 PM

Finding Multiple Maxes from a List
 
Hey guys,

So I've written a program that prompts the user for a letter and then records the number of times each letter is input in a list. So list[1] is the number of times A is input, list[2] for B, etc.

I want the last part of my program to tell me what the most common letter was (or letters if several letters are input the same number of times). This is the part I'm struggling with.

I thought of creating a list like:

Code:


most_common = [0] * 27 #I'm not using the first entry spot

max_occurrences = 0

for letter in range(1, 27):
      if list[letter] == max_occurrences:
            most_common[letter] = list[letter]
      elif list[letter] > max_occurrences:
            most_common = [0] * 27
            most_common[letter] = list[letter]

#Then I'll have some code that looks at each element in my most_common list and prints letters for each spot in the list that isn't 0 (these should be all the numbers that are maxes).


jukofyork 08-06-2012 09:05 PM

Re: ** Python Support Thread **
 
Quote:

Should be in the python support thread, sorry about that
I merged it in for you.

Juk :)

jukofyork 08-06-2012 09:15 PM

Re: ** Python Support Thread **
 
Code:

most_common = [0] * 26

max_occurrences = -1
num_max = 0

for letter in range(1, 27):
      if list[letter] > max_occurrences:
            max_occurrences = list[letter]
            most_common[0] = letter
            num_max = 1
      elif list[letter] == max_occurrences:
            most_common[num_max] = letter
            num_max = num_max + 1

At the end of this code you should have a list (vector?) of indexes with the first num_max filled, eg:

num_max=3, max_occurrences=6, most_common={1,4,26}

on exit would tell you that there are 3 letters with the max count, the max count is 6 occurrences and the most common letters were A, D and Z.

It might not be quite syntactically correct as I don't use Python, but hopefully you should get the idea :)

An alternative method would be to use a list/vector of tuples (letter/frequency) and then sort them in descending order letting you see the most frequent at the start of the sorted list.
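
The sorting alternative could look roughly like this. The counts list is invented sample data laid out the way the original post describes (index 1 for A through 26 for Z, index 0 unused), and the variable names are illustrative.

```python
# Made-up tallies: A, D and Z were each typed 6 times, C twice.
counts = [0] * 27
counts[1] = 6
counts[3] = 2
counts[4] = 6
counts[26] = 6

# Pair each letter with its count, then sort by count, highest first.
pairs = [(chr(64 + i), counts[i]) for i in range(1, 27)]
pairs.sort(key=lambda pair: pair[1], reverse=True)

# Every pair that ties with the first one is a most common letter.
top_count = pairs[0][1]
most_common = [letter for letter, count in pairs if count == top_count]

print(top_count)
print(most_common)
```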

Juk :)

daveT 08-06-2012 10:26 PM

Re: ** Python Support Thread **
 
Are you required to use list because this is a homework assignment? If it's a self-study project, maybe you should look into using a dictionary.
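
For the self-study case, a dictionary version might look like the sketch below. The sample string and the names are made up for illustration, and it is not meant as a replacement for a list-based assignment.

```python
text = "ABRACADABRA"  # sample input, not from the original post

# Map each letter to how many times it has been seen so far.
counts = {}
for character in text:
    counts[character] = counts.get(character, 0) + 1

highest = max(counts.values())
most_common = sorted(letter for letter in counts if counts[letter] == highest)

print(highest)
print(most_common)
```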

Mariogs37 08-06-2012 11:47 PM

Re: ** Python Support Thread **
 
Yeah, required to use list.

Mariogs37 08-07-2012 04:25 PM

Re: ** Python Support Thread **
 
So actually, I figured out how to do this. Thing is, I want the program to read in text until it reads "!", at which point it stops. I'm using a list to keep track of these but I'm not sure how to get the program to prompt the user for more text if it reads through all of the characters the user inputs and doesn't hit "!".

Here's my code so far:

Code:


list = [0] * 27

text = input("Please input some text: ")

for character in text:

list = [0] * 27

text = input("Please input some text: ")

for character in text:
    if character != "!":
        if ord[character]-65 < 0 or ord[character] > 25:
          list[26] += 1
        else:
          list[ord[character]-65] += 1


theOnlyMoment 08-07-2012 04:58 PM

Re: ** Python Support Thread **
 
Quote:

Originally Posted by Mariogs37 (Post 34164192)
So actually, I figured out how to do this. Thing is, I want the program to read in text until it reads "!", at which point it stops. I'm using a list to keep track of these but I'm not sure how to get the program to prompt the user for more text if it reads through all of the characters the user inputs and doesn't hit "!".

Here's my code so far:

Code:


list = [0] * 27

text = input("Please input some text: ")

for character in text:

list = [0] * 27

text = input("Please input some text: ")

for character in text:
    if character != "!":
        if ord[character]-65 < 0 or ord[character] > 25:
          list[26] += 1
        else:
          list[ord[character]-65] += 1


Sorry, not entirely sure what you're trying to do, but for one thing, if you want the program to keep asking the user for input then you'll need a while loop.

Alex Wice 08-10-2012 06:48 AM

Re: ** Python Support Thread **
 
Code:

charlist = [0 for i in xrange(27)]
av = None  # set up front so an empty first line cannot cause a NameError below
while True:
    inputtext = raw_input("Please input some text: ")
    for character in inputtext:
        av = ord(character)
        if av >= 65 and av <= 90:
            charlist[ord(character)-64] += 1
        elif av >= 97 and av <= 122:
            charlist[ord(character)-96] += 1
        elif av==33:
            break
    if av==33: break

maxi = max(charlist)
print 'num_max =',maxi
print 'max_occur =',charlist.count(maxi)
print 'most_common =',map(lambda x: chr(x+64), filter(lambda x: charlist[x]==maxi, range(27)))

I don't know all the terms of your assignment, so I just did this the way that I would actually code it.

Okay, now let's study this program.

First we set up a character list. I would index it from 0, but since you wanted A to sit at index 1, that is fine too.

We knew we would have to keep asking the user for text until we knew to stop, so we started with a while loop. Next we took raw input. We did not use input(...) because in Python 2 it evaluates whatever is typed as a Python expression, so the result is not guaranteed to be a string, which could hurt us later (for example, "for character in inputtext" may not work because inputtext is not an iterable).

Now we looked at each character in the input text. This was the right approach and you did a good job. Had you counted the number of "a"s in inputtext, followed by the number of "b"s, etc., you would have made 26 passes over the text instead of one.

So, for each character we stored the ordinal number in a variable "av", so we would not have to look it up each time. We then checked if it was in the range [65,90], which would make it a capital letter from A-Z. If it was, we ticked up our charlist. We did the same in checking for lowercase. Finally, we knew the ordinal number of "!" would be ord("!") == 33, so if that were true, we stopped looking at letters immediately. (So if you type xyzz!aaaaaaa, "a" won't be most common.)

Finally, we checked if the residue on "av" was still 33. This is not that great of a practice but for something like this IMO it is fine. If and only if we saw a "!" (if character == "!"), the control structure would break out at the "elif av==33: break" part, and then it would immediately break out again at "if av==33: break".
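
If relying on the leftover value of av feels fragile, one alternative is an explicit flag. The sketch below is a variation, not the code from this post: it wraps the loop in a hypothetical function tally that takes the input lines as a list instead of calling raw_input, so it can be run and checked without a keyboard.

```python
def tally(lines):
    """Count letters (A=1 ... Z=26) across lines, stopping at the first '!'."""
    charlist = [0] * 27
    done = False
    for line in lines:
        for character in line:
            av = ord(character)
            if 65 <= av <= 90:        # uppercase A-Z
                charlist[av - 64] += 1
            elif 97 <= av <= 122:     # lowercase a-z
                charlist[av - 96] += 1
            elif character == "!":
                done = True
                break                 # leaves the inner loop only
        if done:
            break                     # leaves the outer loop as well
    return charlist

result = tally(["xyzz", "", "ab!aaaaaaa"])
print(result[1], result[2], result[26])
```

The empty second line in the sample is harmless here because done is always defined, and the letters after the "!" are never counted.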

Now we come to reporting the result. The num_max is simply going to be the highest number in our tally (namely, charlist). The number of times the max occurs is going to be charlist.count(maxi) -- this just counts how many times maxi was seen. The last one is tricky, so let's look at it in two steps:

First, we want a list of indices that represent the letters that are most common. For example if A, B, and D are most common, we want a list of [1,2,4]. The appropriate code for this is "filter(lambda x: charlist[x]==maxi, range(27))". What that does is, it takes a list [0,1,2,...,26], and it only keeps the elements of the list x for which charlist[x] == maxi -- namely, that it was a maximum.

Secondly, we have this list (eg. nicelist = [1,2,4]) and we want to get to ['A','B','D']. The correct code is going to be map(lambda x: chr(x+64), nicelist). What that does is, it goes one by one down nicelist and it changes every element x to chr(x+64). Since each element can only be a number from 1 to 26, it will change everything to one of chr(65) = 'A', chr(66) = 'B', etc. up to chr(90) = 'Z'.

Putting it together, we get this chunky line "map(lambda x: chr(x+64), filter(lambda x: charlist[x]==maxi, range(27)))".


If map, filter and lambda are new to you, you can loop over the list explicitly and use the same ideas.

Code:

common = []
for i in xrange(len(charlist)):
    if charlist[i]==maxi: common.append(chr(64+i))
print common
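
The same selection can also be written as a list comprehension. In this sketch charlist is filled with made-up counts so the snippet runs on its own, and the range starts at 1 to skip the unused slot 0.

```python
charlist = [0] * 27
charlist[1] = 3   # A
charlist[2] = 3   # B
charlist[4] = 3   # D
charlist[5] = 1   # E

maxi = max(charlist)
common = [chr(64 + i) for i in range(1, len(charlist)) if charlist[i] == maxi]
print(common)
```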

I hope that helps. If you tell me more about the assignment, maybe I can help you more simply.

fluorescenthippo 10-20-2012 07:01 PM

Re: ** Python Support Thread **
 
im trying to log into forums using python and getting stuck. I heard mechanize might work but i suck too much at programming to figure it out.

i am writing a program to go to my subscriptions threads for this and other forums and open all the new threads in new windows. I have it all working except the log in part. so with one click per forum i get all the new threads opened up

Lavon Affair 02-19-2013 10:58 PM

Re: ** Python Support Thread **
 
could somebody walk me through a web scrape with BeautifulSoup?

i am trying to scrape the regular season table from here http://www.basketball-reference.com/.../2011/gamelog/.

so i have
Code:

from bs4 import BeautifulSoup
import urllib2
     
url = 'http://www.basketball-reference.com/teams/BOS/2011/gamelog/'
soup = BeautifulSoup(urllib2.urlopen(url).read())

I am having trouble understanding how to import the data I want from the table. I have some idea of what I need to do from looking at BeautifulSoup tutorials and examples but don't really grasp everything.

By inspecting the table on the website I can see that it is named "sortable stats_table" and that the data is nested under
Quote:

<tbody>
then
Quote:

<tr class data-row="0">
the data for the opponent is
Code:

<td align="left">MIA</td>
the result of the game
Quote:

<td align="center">W</td>
and the rest of the data is like this
Code:

<td align="right">240</td>
First off, since I know it is the first table, can I just use ('table')[0] or do I have to use its name? Then I am really confused about pulling the data out of the table. Do I have to differentiate between the data in <td align="left">, "right" and "center" cells, or can I just grab all of the data without doing that?

tyler_cracker 02-19-2013 11:23 PM

Re: ** Python Support Thread **
 
i recommend using a debugger (or print statements) to understand how beautifulsoup models the html. once you've got that, extracting the parts you need will be a piece of cake.

Lavon Affair 02-20-2013 12:13 AM

Re: ** Python Support Thread **
 
Code:

for row in soup('table')[0].tbody('tr'):
    tds = row('td')
   
    print tds


So that code returns all of the table but with the HTML tags included. How do I strip out just the data?

kerowo 02-20-2013 08:12 PM

Re: ** Python Support Thread **
 
I'm sure there is a mind-bogglingly complex regex that will filter those out for you. To the google!

Elapsed time 32.5 seconds: http://www.pagecolumn.com/tool/all_about_html_tags.htm

tyler_cracker 02-20-2013 11:55 PM

Re: ** Python Support Thread **
 
why would you take a guy working on the right path toward a solution and drop him in the dark forest of regex?
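
Staying with a parser rather than a regex, the sketch below pulls only the text out of each table cell. To keep it runnable without extra installs it uses the standard library's html.parser instead of BeautifulSoup (so it targets Python 3, not the urllib2 code above), and the HTML string is a made-up fragment modeled on the cells quoted earlier. With bs4, calling get_text() on each td is the analogous step.

```python
from html.parser import HTMLParser

class CellText(HTMLParser):
    """Collect the text of every <td>, grouped by the <tr> it belongs to."""
    def __init__(self):
        super().__init__()
        self.rows = []
        self.in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])          # start a new row
        elif tag == "td" and self.rows:
            self.in_cell = True
            self.rows[-1].append("")      # start a new, empty cell

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.rows[-1][-1] += data.strip()

SAMPLE = (
    '<table><tbody><tr>'
    '<td align="left">MIA</td><td align="center">W</td><td align="right">240</td>'
    '</tr></tbody></table>'
)

parser = CellText()
parser.feed(SAMPLE)
print(parser.rows)
```

Nothing here needs to distinguish the align values; every td is treated the same way.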

boothisman.gif

