feedback on this python code - Computer Technical Help

06-19-2012 , 09:13 PM

unluckyboy

banned

Join Date: May 2012 Posts: 163

i'd like feedback on this python code i just did. i'd like to keep improving as a programmer.

i have data for 1300 football games. for each game i have the spread, and what the final score was compared to the spread. it is set up as a list of tuples now: all_line_differences=[(game1_spread, game1_result_compared_to_spread), (game2_spread, game2_result_compared_to_spread), etc.]

i want to take this information and make a printout with this data, one line per spread:

spread, quantity_games, games_with_result_compared_to_spread==0 / total_games_with_this_spread

spread2, quantity_games, games_with_result_compared_to_spread==0 / total_games_with_this_spread
etc. for each spread

THIS IS MY CODE ----------- PLEASE GIVE SUGGESTIONS ON HOW TO MAKE NICER OR MORE ELEGANT

spread_quantity = dict()
spread_tied = dict()

for item in all_line_differences:
    spread_quantity[abs(item[0])] = 0
    spread_tied[abs(item[0])] = 0

for item in all_line_differences:
    spread_quantity[abs(item[0])] += 1
    if item[1] == 0:
        spread_tied[abs(item[0])] += 1

for key, value in sorted(spread_quantity.iteritems()):
    print key, " ", float(value), " ", spread_tied[key]/float(value)

06-19-2012 , 09:14 PM

unluckyboy

banned

Join Date: May 2012 Posts: 163

i am guessing there is more elegant way to structure this task

06-20-2012 , 01:51 AM

RoundTower

ɹǝʍoʇpunoɹ

Join Date: Feb 2005 Posts: 14,562

does this work as you want it to? If so I do not think there is much point in improving it. It is concise and relatively clear. You could make it more idiomatic, but that would achieve little.

06-20-2012 , 02:18 AM

RoundTower

ɹǝʍoʇpunoɹ

Join Date: Feb 2005 Posts: 14,562

That might come across the wrong way, because I think writing idiomatic code is important in general, that you shouldn't always just write things the only way you know how. But I think you would understand that more if you wrote something longer and worse-written and realised the problems with it as you tried to extend it. This code simply isn't ugly enough.

having said that, here are some (arguable) improvements:

Code:

spread_quantity =dict()
spread_tied = dict()

you could use instead of dict(), defaultdict(int), and not have to initialize to zero (it's easier for you to read the docs than read me explain how defaultdict works).
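A minimal sketch of the defaultdict suggestion, for anyone reading along. The sample data is made up, and the snippet is written for Python 3 (the original post uses Python 2 print statements and iteritems()), so treat it as an illustration, not a drop-in replacement.

```python
from collections import defaultdict

# Made-up sample data: (spread, result_compared_to_spread) pairs.
all_line_differences = [(-3, 0), (3, 7), (1, -2), (-1, 0), (3, 0)]

# defaultdict(int) creates a missing key with int() == 0 on first access,
# so the separate "set everything to zero" loop is no longer needed.
spread_quantity = defaultdict(int)
spread_tied = defaultdict(int)

for item in all_line_differences:
    spread_quantity[abs(item[0])] += 1
    if item[1] == 0:
        spread_tied[abs(item[0])] += 1

for key, value in sorted(spread_quantity.items()):
    print(key, value, spread_tied[key] / float(value))
```

One thing to be aware of: merely reading spread_tied[key] for a spread that never pushed adds that key with a value of 0, which is harmless here.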

Code:

for item in all_line_differences:
    spread_quantity[abs(item[0])] += 1
    if item[1] == 0:
        spread_tied[abs(item[0])] += 1

this might be made clearer by implicitly unpacking the tuples, as you already know how to do given the next fragment

Code:

for spread, diff in all_line_differences:
    spread_quantity[abs(spread)] += 1
    if diff == 0:
        spread_tied[abs(spread)] += 1

In fact the most concise way to solve the problem is probably to use the appropriate function in the itertools module.
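The post does not name the itertools function; itertools.groupby seems the likely candidate, so the sketch below assumes that. The sample data and the helper name abs_spread are made up, and the snippet is written for Python 3.

```python
from itertools import groupby

# Made-up sample data: (spread, result_compared_to_spread) pairs.
all_line_differences = [(-3, 0), (3, 7), (1, -2), (-1, 0), (3, 0)]


def abs_spread(game):
    """Grouping key: the absolute value of the spread (hypothetical helper)."""
    return abs(game[0])


rows = []
# groupby only merges adjacent items, so the data is sorted by the same key first.
for spread, games in groupby(sorted(all_line_differences, key=abs_spread), key=abs_spread):
    diffs = [diff for _, diff in games]
    rows.append((spread, len(diffs), diffs.count(0) / float(len(diffs))))

for row in rows:
    print(*row)
```

Whether this is clearer than the two-dictionary loop is arguable; it mainly removes the bookkeeping of keeping two dicts in sync.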

06-21-2012 , 12:16 AM

daveT

S.A.G.E. Master

Join Date: Jun 2005 Posts: 23,955

I just do a full select clause in the raw SQL and then something like this will unwrap the tuples:

Code:

a = ('''select query''')

for i in a():
	print(i[0], i[1], i[2]....)

(replace the a() with whatever it is that allows you to print out the raw list of tuples)

Just about every SQL language has the ability to add, subtract, divide, do abs(), etc, so you would only have to do some select query like:

select game, result, up/down, abs(xyz)
from table1, table2
where table1.game = table2.game

or something along those lines. How complicated that query gets really depends on your schema and what it is you want to do. (Yes, I know the above query won't work at all)
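As a rough illustration of pushing the counting into SQL, here is a sketch using Python's built-in sqlite3 module with an in-memory database. The names nfl_games and team1_line are borrowed from a query posted later in this thread; the column result_vs_spread and the sample rows are assumptions, since the real schema is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()

# Hypothetical two-column schema; the real table will have more columns.
c.execute("create table nfl_games (team1_line real, result_vs_spread real)")
c.executemany(
    "insert into nfl_games values (?, ?)",
    [(-3, 0), (3, 7), (1, -2), (-1, 0), (3, 0)],
)

# One row per absolute spread: number of games and fraction that pushed.
query = """
    select abs(team1_line) as spread,
           count(*) as games,
           sum(case when result_vs_spread = 0 then 1 else 0 end) * 1.0 / count(*) as push_rate
    from nfl_games
    group by abs(team1_line)
    order by spread
"""
rows = c.execute(query).fetchall()

for spread, games, push_rate in rows:
    print(spread, games, push_rate)
```

With the grouping done in the query, the Python side is reduced to unpacking and printing tuples.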

06-21-2012 , 12:27 PM

unluckyboy

banned

Join Date: May 2012 Posts: 163

Quote:

Originally Posted by RoundTower

Code:

spread_quantity =dict()
spread_tied = dict()

you could use instead of dict(), defaultdict(int), and not have to initialize to zero (it's easier for you to read the docs than read me explain how defaultdict works).

Code:

for item in all_line_differences:
    spread_quantity[abs(item[0])] += 1
    if item[1] == 0:
        spread_tied[abs(item[0])] += 1

this might be made clearer by implicitly unpacking the tuples, as you already know how to do given the next fragment

Code:

for spread, diff in all_line_differences:
    spread_quantity[abs(spread)] += 1
    if diff == 0:
        spread_tied[abs(spread)] += 1

In fact the most concise way to solve the problem is probably to use the appropriate function in the itertools module.

thanks for the response. i definitely know what you mean about learning the need for cleaner programming naturally. on my scraping program when i had to change it from scraping just 1 day to scraping 3 years it was apparent how disastrously ugly my program was. i got it to work with many ugly fixes though next time around i will try to write the code using various functions and maybe try to do it with object oriented programming and a class

just worked through the defaultdict that is pretty cool. gonna work through your tuple suggestion now

06-21-2012 , 12:45 PM

unluckyboy

banned

Join Date: May 2012 Posts: 163

okay figured out the tuple thing and also using the SQL query to get the required data from the first step. thanks for ideas.

06-21-2012 , 12:46 PM

unluckyboy

banned

Join Date: May 2012 Posts: 163

i have a math question related to this code. so as you can see what i am doing is finding the percentage of times a game pushes for each different spread. i get this nice printout...

1.0 147.0 0.0272108843537
1.5 25.0 0.0
2.0 18.0 0.0
2.5 67.0 0.0
3.0 264.0 0.0833333333333

etc... how do i find out the margin of error on the third column?
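Nobody in the thread answers this, so here is one common approach, offered as a sketch: treat each game as an independent push/no-push trial and compute a 95% interval for the proportion. The normal-approximation margin is the textbook formula z * sqrt(p * (1 - p) / n); it collapses to zero when no pushes were observed, so the Wilson score interval is shown alongside it. The push counts (4, 0, 22) are back-calculated from the printout above (for example 0.0833333 * 264 = 22), the function names are made up, and the independence assumption is mine.

```python
from math import sqrt

Z_95 = 1.96  # z-score for roughly 95% two-sided confidence


def normal_margin(pushes, games, z=Z_95):
    """Half-width of the normal-approximation interval for a proportion."""
    p = pushes / float(games)
    return z * sqrt(p * (1 - p) / games)


def wilson_interval(pushes, games, z=Z_95):
    """Wilson score interval; better behaved for small samples and p near 0."""
    p = pushes / float(games)
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half, center + half


# (spread, pushes, games) taken from three rows of the printout.
for spread, pushes, games in [(1.0, 4, 147), (1.5, 0, 25), (3.0, 22, 264)]:
    low, high = wilson_interval(pushes, games)
    print(spread, pushes / float(games), normal_margin(pushes, games), low, high)
```

For the 3.0 row the normal-approximation margin comes out at roughly 0.033, i.e. about 8.3% plus or minus 3.3 percentage points. For the 1.5 row the observed rate is 0 from only 25 games, and the Wilson upper bound of about 13% shows how little that sample rules out.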

06-21-2012 , 01:13 PM

unluckyboy

banned

Join Date: May 2012 Posts: 163

while i try to figure out the margin of error on the last program, which i am confused on for now, i am starting my next task: finding out the correct spread for the 1st quarter based on a 3 point spread for the whole game. look, applied learning:

def get_games():
    return c.execute("select team1_line, team1_q1, team2_q2 from nfl_games where abs(team1_line)=3").fetchall()

for one, two, three in get_games():
    print one, two, three

06-21-2012 , 04:55 PM

#10

Colombo

Pooh-Bah

Join Date: Feb 2005 Posts: 4,748

use code tags

06-22-2012 , 12:03 AM

#11

daveT

S.A.G.E. Master

Join Date: Jun 2005 Posts: 23,955

I'll show you a program I use quite often that is similar to what you are doing, and try to explain the logic of the decisions. Of course, there is room for disagreement, but this may be helpful (of course, if I'm wrong anywhere, please feel free to correct me):

Code:

import postgresql
from numpy import *
from pylab import *
import matplotlib.pyplot as plt

db = postgresql.open(user = '#')

a = db.prepare('''select to_char(invoicedate, ' MM YYYY') as monthly, sum(qty) 
from..... ;''')

def createLines(x):
    xax, yax = [], []
    for i in x:
        xax.append(i[0])
        yax.append(i[1])
    return xax, yax 

mon, tot = createLines(a())
plot(tot)

xticks(arange(len(mon)), mon, rotation = 90)
grid()

plt.title('Pieces Sold')
show()

The above program simply creates a line graph showing how much an item sells per month. This is not the complete program, but it shows what's important here.

The thing with your code is that you are allowing your functions to leak all over the place, which will wreak havoc now that you are attempting to extend it. The reason I say use SQL to create the list of tuples you want is for a few reasons:

- SQL is built for data management. What would happen if you ran, say, two queries, and one list is shorter than the other? This would be interesting to program, but SQL does a perfect job of mashing and matching data. If you have SQL tuples with no information, that missing information issue will be confined to SQL, and won't create mis-matches in your program when you start combining and sorting lists. SQL also has the immensely helpful "order by" command which will automatically sort the tuples you need. If you have uneven lists and then you attempt to sort the lists in python followed by zipping them together without confirming the correct data matches, you are likely going to have a major headache and some wildly incorrect answers.

- I only want to deal with well-defined data. In this case, I have a list that looks like:

Code:

[('01 2012', Decimal(1000)), ('02 2012', Decimal(1000)), .......]

I know that the list is ordered, and I know the exact format of the list I am always dealing with. With this information, I can create various lists and use them in generalized functions. Suppose, I wanted to do this:

Code:

a = db.prepare('''select to_char(invoicedate, ' MM YYYY') as monthly, sum(qty) 
from..... ;''')

b = db.prepare('''select to_char(invoicedate, ' MM YYYY') as monthly, sum(qty) 
from..... ;''')

c = db.prepare('''select to_char(invoicedate, ' MM YYYY') as monthly, sum(qty) 
from..... ;''')

so all I have to do is:

Code:

def createLines(x):
    xax, yax = [], []
    for i in x:
        xax.append(i[0])
        yax.append(i[1])
    return xax, yax 

mon, tot = createLines(a())
mon2, tot2 = createLines(b())
mon3, tot3 = createLines(c())

And by doing nothing more than changing the SQL commands, I can now plot regional sales history.

Since I know the exact format that I am dealing with, I can easily create the following:

Code:

a = db.prepare('''select to_char(invoicedate, ' MM YYYY') as monthly, sum(qty) 
from..... ;''')

def createLines(x):
    xax, yax = [], []
    for i in x:
        xax.append(i[0])
        yax.append(i[1])
    return xax, yax

def compare_stuff(x):
    pass

def do_this(x):
    pass

mon, tot = createLines(a())
mon2, tot2 = compare_stuff(a())
mon2, tot2 = do_this(a())

And if I want to create a text table:

Code:

for i in a():
     print(i[0], '\t', i[1])

It's easy to extend the code to do all sorts of neat things, from graphing, to printing, to i/o, simply by making sure that the features of SQL are contained in the SQL part, and the list contained in the body of the program, and then keeping the functionality contained in the functions. All I do is pass lists around to each function. Keeping track of the format of one data-type in your program will be much easier, and if you have an error, you'll be able to tell where it came from. A SQL error will be much different than a list or function-call error.
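A small sketch of the "pass one well-defined list around" idea, with no database involved. The rows literal stands in for whatever the query returns and its values are made up; create_lines and print_table are hypothetical names, and zip(*rows) is swapped in as a shorter way to do what createLines does with its loop.

```python
from decimal import Decimal

# Stand-in for the query result: an ordered list of (label, value) tuples.
rows = [
    ('01 2012', Decimal(1000)),
    ('02 2012', Decimal(1200)),
    ('03 2012', Decimal(900)),
]


def create_lines(rows):
    """Split (label, value) rows into two parallel lists.

    zip(*rows) transposes the rows; note it raises ValueError on an empty list.
    """
    labels, values = zip(*rows)
    return list(labels), list(values)


def print_table(rows):
    """Print the same rows as a tab-separated text table."""
    for label, value in rows:
        print(label, value, sep='\t')


mon, tot = create_lines(rows)
print_table(rows)
```

Both functions rely only on the rows being two-item tuples, so swapping in a different query changes the output without touching either function.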

06-23-2012 , 03:24 PM

#12

DMACM

banned

Join Date: Apr 2005 Posts: 3,915

Wasn't sure if you accounted for a spread in the wrong direction.

For example, the Bills are favored to win over the Giants by 5.

Giants win by 5.

Should be off by 10 not 0 as abs might indicate.
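The example above can be checked with a few lines. This is only a sketch under an assumed convention: the favorite's line is a negative number (favored by 5 is -5) and the margin is the favorite's points minus the underdog's. How the original data was actually computed is not shown in the thread, and result_vs_spread is a made-up name.

```python
def result_vs_spread(favorite_line, favorite_margin):
    """Signed result against the spread, from the favorite's point of view.

    favorite_line: negative when favored, e.g. -5 means favored by 5.
    favorite_margin: favorite's points minus underdog's points.
    Returns 0 on a push.
    """
    return favorite_margin + favorite_line


# Bills favored by 5 (line -5); Giants win by 5, so the Bills' margin is -5.
signed = result_vs_spread(-5, -5)

# Dropping the signs before comparing hides the miss entirely.
naive = abs(-5) - abs(-5)

print(signed, naive)
```

In the original code abs() is applied only to the spread used as a dictionary key, so whether this matters depends on how the second element of each tuple was produced.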
