Two Plus Two Publishing LLC

Old 06-14-2012, 02:05 PM   #526
banned
 
Join Date: May 2012
Posts: 163
Re: ** Python Support Thread **

lots of time to do work today. goal is to make a program that goes through oddsportal.com or sbr and gets the pinny line for each NFL game over the last few years, plus the scores
unluckyboy is offline   Reply With Quote
Old 06-14-2012, 02:08 PM   #527
banned
 
Join Date: May 2012
Posts: 163
Re: ** Python Support Thread **

Quote:
Originally Posted by TheIrishThug View Post
I've never used beautiful soup or elementtree, but the answer is not regex.
how do you know if you don't have experience with either of those?

lol okay i have made a program successfully parsing HTML with regex. although admittedly that was because i was too n00bie to understand those libraries. i will take your advice though and go about it as you say...
unluckyboy is offline   Reply With Quote
Old 06-14-2012, 02:38 PM   #528
Pooh-Bah
 
TheIrishThug's Avatar
 
Join Date: Jan 2005
Location: Belligerent and numerous
Posts: 5,213
Re: ** Python Support Thread **

The point is that regex is designed to work with a specific class of text (regular languages), and HTML, with its arbitrarily nested tags, is not in that class. As stated by other answers in that thread, regex can be used when you are trying to parse a very specific subset of HTML. However, if you are trying to do something more complex or do not have control of the input format, using a framework specifically designed for parsing HTML will make your life easier.

As for an actual recommendation, I've seen beautiful soup used and talked about frequently and never heard of elementtree. So based on that, I suspect it would be easier to get help with any potential issues if you use beautiful soup.
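
To illustrate the point about input you don't control, here is a minimal sketch. It uses the standard library's html.parser as a stand-in so it runs without installing anything (Beautiful Soup would be shorter); the markup, the "line" class name and the LineCells helper are all made up for the example, and it is written for Python 3.

```python
import re
from html.parser import HTMLParser

# Hypothetical markup: the same kind of cell written two different ways.
HTML = """
<table>
  <tr><td class="line">-3.5</td></tr>
  <tr><td data-x="1" class='line'>
      +7
  </td></tr>
</table>
"""

# Naive regex: only matches the one exact spelling it was written for.
regex_hits = re.findall(r'<td class="line">(.*?)</td>', HTML)


class LineCells(HTMLParser):
    """Collect the text of every <td> whose class attribute is 'line'."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td" and dict(attrs).get("class") == "line":
            self.in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current cell.
        if self.in_cell:
            self.cells[-1] += data


parser = LineCells()
parser.feed(HTML)
parser_hits = [cell.strip() for cell in parser.cells]

print(regex_hits)   # ['-3.5']
print(parser_hits)  # ['-3.5', '+7']
```

The regex misses the second cell because of the extra attribute, the single quotes and the line breaks. The parser doesn't care how the tag was spelled.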
TheIrishThug is offline   Reply With Quote
Old 06-14-2012, 05:19 PM   #529
banned
 
Join Date: May 2012
Posts: 163
Re: ** Python Support Thread **

hey sorry if i ask too many questions. i have no formal instruction in programming and am learning this as i go along with no deep understanding of what is happening. after 2 hours of work i figured out how to get BeautifulSoup version 3 on my computer by downloading it, extracting the .tar.gz file and then running the setup file that comes with it, which builds a file BeautifulSoup which i can cut and paste into the python/lib directory

however, i am trying to get BeautifulSoup version 4 on here and when i extract and run the setup file it doesn't build a BeautifulSoup file which i can use. here is the documentation which i don't understand. i don't know where to run these commands or what they do or what pip is. i downloaded easy_install but still can't get this to work.

Quote:
If you’re using a recent version of Debian or Ubuntu Linux, you can install Beautiful Soup with the system package manager:

$ apt-get install python-beautifulsoup4

Beautiful Soup 4 is published through PyPi, so if you can’t install it with the system packager, you can install it with easy_install or pip. The package name is beautifulsoup4, and the same package works on Python 2 and Python 3.

$ easy_install beautifulsoup4

$ pip install beautifulsoup4

(The BeautifulSoup package is probably not what you want. That’s the previous major release, Beautiful Soup 3. Lots of software uses BS3, so it’s still available, but if you’re writing new code you should install beautifulsoup4.)

If you don’t have easy_install or pip installed, you can download the Beautiful Soup 4 source tarball and install it with setup.py.

$ python setup.py install

If all else fails, the license for Beautiful Soup allows you to package the entire library with your application. You can download the tarball, copy its bs4 directory into your application’s codebase, and use Beautiful Soup without installing it at all.

I use Python 2.7 and Python 3.2 to develop Beautiful Soup, but it should work with other recent versions.
unluckyboy is offline   Reply With Quote
Old 06-14-2012, 08:08 PM   #530
Pooh-Bah
 
Join Date: Jul 2005
Posts: 4,001
Re: ** Python Support Thread **

Quote:
Originally Posted by unluckyboy View Post
hey sorry if i ask too many questions. i have no formal instruction in programming and am learning this as i go along with no deep understanding of what is happening. after 2 hours of work i figured out how to get BeautifulSoup version 3 on my computer by downloading it, extracting the .tar.gz file and then running the setup file that comes with it, which builds a file BeautifulSoup which i can cut and paste into the python/lib directory
Your process should roughly go like this:

Code:
/users/home/unlucky> tar xzf mypackage.tgz
/users/home/unlucky> cd mypackage
/users/home/unlucky> sudo python setup.py install
/users/home/unlucky> python
>>> import mypackage
>>> print("yay it worked")
>>> exit()
/users/home/unlucky>
You should never have to manually copy the files anywhere.

Alternatively install pip: http://www.saltycrane.com/blog/2010/...ll-pip-ubuntu/

and then from the command line:
Code:
/users/home/unlucky> sudo pip install beautifulsoup4
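
A quick way to check whether an install actually worked, without copying files around, is to ask Python whether it can find the module. This is only a sketch: is_installed is a made-up helper name, it needs Python 3.4 or later, and it assumes Beautiful Soup 4 imports as bs4 and version 3 as BeautifulSoup.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `module_name` can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# bs4 is the Beautiful Soup 4 package; BeautifulSoup is the old version 3 module.
for name in ("bs4", "BeautifulSoup"):
    print(name, "->", "found" if is_installed(name) else "not found")
```

If bs4 shows up as "not found" right after an install, the usual suspect is that pip or setup.py ran under a different Python than the one you are starting.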
Neko is offline   Reply With Quote
Old 06-14-2012, 09:42 PM   #531
Carpal 'Tunnel
 
Jbrochu's Avatar
 
Join Date: Jan 2005
Posts: 6,105
Re: ** Python Support Thread **

Thanks Neko. I'm going to need to install some packages soon and that info will be helpful.
Jbrochu is offline   Reply With Quote
Old 06-14-2012, 10:02 PM   #532
Carpal 'Tunnel
 
tyler_cracker's Avatar
 
Join Date: Apr 2005
Location: Shallow End OTKP
Posts: 13,918
Re: ** Python Support Thread **

i haven't used beautiful soup but people at work like it.

we mostly use pyquery at work. it has been adequate for my (simple) needs.
tyler_cracker is offline   Reply With Quote
Old 06-17-2012, 12:44 PM   #533
banned
 
Join Date: May 2012
Posts: 163
Re: ** Python Support Thread **

beautifulsoup is so much better than elementtree, with elementtree can't even access a tag's siblings so gay
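
For what it's worth, siblings can be reached in ElementTree with a little extra work, because elements don't store a link to their parent. A minimal sketch, assuming a small made-up document (the tag names and the next_sibling helper are hypothetical, not part of the library):

```python
import xml.etree.ElementTree as ET

# Hypothetical document, just for illustration.
root = ET.fromstring(
    "<row><team>NE</team><line>-3.5</line><score>27</score></row>"
)

# ElementTree elements don't keep a reference to their parent,
# so build a child -> parent lookup once.
parent_of = {child: parent for parent in root.iter() for child in parent}


def next_sibling(elem):
    """Return the element after `elem` under the same parent, or None."""
    parent = parent_of.get(elem)
    if parent is None:  # the root has no parent, hence no siblings
        return None
    siblings = list(parent)
    idx = siblings.index(elem)
    return siblings[idx + 1] if idx + 1 < len(siblings) else None


line = root.find("line")
print(next_sibling(line).tag)  # score
```

Beautiful Soup does this bookkeeping for you, which is probably why it feels nicer for scraping.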
unluckyboy is offline   Reply With Quote
Old 06-17-2012, 12:50 PM   #534
Carpal 'Tunnel
 
tyler_cracker's Avatar
 
Join Date: Apr 2005
Location: Shallow End OTKP
Posts: 13,918
Re: ** Python Support Thread **

preventing incest is gay? these kids today...
tyler_cracker is offline   Reply With Quote
Old 06-17-2012, 01:11 PM   #535
banned
 
Join Date: May 2012
Posts: 163
Re: ** Python Support Thread **

how does this work? when i tried to set up lxml myself i couldn't. how does this program work?

http://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml
unluckyboy is offline   Reply With Quote
Old 06-18-2012, 12:21 PM   #536
veteran
 
Colombo's Avatar
 
Join Date: Feb 2005
Posts: 3,354
Re: ** Python Support Thread **

I wrote a program in python which runs math-intensive simulations. I usually run 7,500 simulations when I run this code.

The problem is it runs slow and I want to improve it. I spent a good amount of time setting up a threading process, where I had 50-75 simulations running at once. However I discovered that this didn't speed up my program at all. It ran just as fast using a loop where I did 1 simulation per loop.

Is this because the CPU only has a limited number of processors (1?) capable of running math calculations, so the threads just end up waiting on each other (effectively mimicking a loop where 1 simulation happens per iteration)?
Colombo is online now   Reply With Quote
Old 06-18-2012, 04:18 PM   #537
veteran
 
Colombo's Avatar
 
Join Date: Feb 2005
Posts: 3,354
Re: ** Python Support Thread **

Quote:
Originally Posted by Colombo View Post
I wrote a program in python which runs math-intensive simulations. I usually run 7,500 simulations when I run this code.

The problem is it runs slow and I want to improve it. I spent a good amount of time setting up a threading process, where I had 50-75 simulations running at once. However I discovered that this didn't speed up my program at all. It ran just as fast using a loop where I did 1 simulation per loop.

Is this because the CPU only has a limited number of processors (1?) capable of running math calculations, so the threads just end up waiting on each other (effectively mimicking a loop where 1 simulation happens per iteration)?
This answers my question I believe: http://stackoverflow.com/questions/3...and-performace
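
A rough way to see this on your own machine, assuming a standard CPython build: time the same CPU-bound work once in a plain loop and once spread over threads. The simulate function below is a made-up stand-in for a real simulation, and the step count is arbitrary.

```python
import threading
import time


def simulate(n_steps=200000):
    """Stand-in for one CPU-bound simulation (hypothetical workload)."""
    total = 0
    for i in range(n_steps):
        total += i * i % 7
    return total


def run_serial(n_sims):
    """Run the simulations one after another in a plain loop."""
    return [simulate() for _ in range(n_sims)]


def run_threaded(n_sims):
    """Run one thread per simulation and collect the results."""
    results = [None] * n_sims

    def worker(slot):
        results[slot] = simulate()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_sims)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


for label, runner in (("serial", run_serial), ("threaded", run_threaded)):
    start = time.perf_counter()
    runner(8)
    print(label, round(time.perf_counter() - start, 3), "s")
```

If the two timings come out about the same, the threads are taking turns rather than running side by side, which matches what was observed with 50-75 threads.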
Colombo is online now   Reply With Quote
Old 06-18-2012, 09:57 PM   #538
Pooh-Bah
 
Join Date: Jul 2005
Posts: 4,001
Re: ** Python Support Thread **

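yeah...definitely the GIL (global interpreter lock). Only one thread can execute Python bytecode at a time, so threads don't speed up CPU-bound work.

Here are some options for speeding up your calculations: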


- If your simulations aren't dependent on each other, just run multiple python interpreters.

- Use numpy, especially if you're working with matrices

- Use the multiprocessing module instead of threading. This will allow you to run multiple simulations in parallel.

- Try scipy.weave

- Rewrite any hotspots in your code using Cython. You can very easily get some big gains (like orders of magnitude better in some cases)

- Write performance critical parts in C and make them accessible via a dll and then use ctypes to call them.

- Write a full-fledged C extension
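
Of the options above, the multiprocessing one is probably the smallest change if the simulations are independent. A minimal sketch, assuming each simulation can be written as a function of a seed; simulate and run_parallel are made-up names and the random walk is just a placeholder workload:

```python
import multiprocessing
import random


def simulate(seed, n_steps=10000):
    """Stand-in for one independent simulation (hypothetical): a random walk."""
    rng = random.Random(seed)  # per-simulation generator, no shared state
    position = 0
    for _ in range(n_steps):
        position += rng.choice((-1, 1))
    return position


def run_parallel(n_sims, workers=None):
    """Spread the simulations over worker processes.

    workers=None lets Pool pick a default based on the CPU count.
    """
    with multiprocessing.Pool(processes=workers) as pool:
        return pool.map(simulate, range(n_sims))


if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing this file.
    results = run_parallel(200)  # kept small here; the real job was 7,500 runs
    print(len(results), "simulations, mean final position",
          sum(results) / len(results))
```

Each worker is a separate interpreter with its own GIL, so the runs really do happen in parallel. The cost is that arguments and results get pickled between processes, so it pays off most when each simulation does a lot of work per call.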
Neko is offline   Reply With Quote
Old 06-19-2012, 12:50 AM   #539
ɹǝʍoʇpunoɹ
 
RoundTower's Avatar
 
Join Date: Feb 2005
Location: soah made my profile
Posts: 13,926
Re: ** Python Support Thread **

before you try any of this, the first step imo is to identify the bottlenecks. There may be 20 different parts of your code that could be sped up tenfold, but only one or two that will actually give a significant increase in your overall running time.

Start with the profile or cProfile modules in the standard library.
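
A minimal sketch of what that looks like with cProfile. The three functions are made up to stand in for a program with one hot spot and one cheap step:

```python
import cProfile
import io
import pstats


def slow_part():
    """Deliberately expensive step (hypothetical hot spot)."""
    return sum(i * i for i in range(200000))


def fast_part():
    """Cheap step that is not worth optimizing."""
    return sum(range(100))


def run_simulation():
    return slow_part() + fast_part()


# Profile one call and collect the statistics.
profiler = cProfile.Profile()
profiler.enable()
run_simulation()
profiler.disable()

# Sort by cumulative time and show only the top few entries.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

For a whole script, the same report is available from the command line with python -m cProfile -s cumulative myscript.py (myscript.py being whatever your file is called).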
RoundTower is offline   Reply With Quote
Old 06-19-2012, 12:59 AM   #540
Pooh-Bah
 
Join Date: Jul 2005
Posts: 4,001
Re: ** Python Support Thread **

yeah that's a good point to make. Profiling (use cProfile rather than profile) your program to find out where it's spending most of its time is the obvious first step.
Neko is offline   Reply With Quote

All times are GMT -4.