Open Side Menu Go to the Top

07-25-2012 , 06:22 PM
I hate to agree with shoe - but HH does seem like something that a document database would be good for.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD **
$25m Guaranteed WPM on CoinPoker
Join the action now
Daily Rewards • Splash Pots • CoinRaces
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD **
07-25-2012 , 06:23 PM
Quote:
Originally Posted by daveT
I also question, since noSQL doesn't use SQL as the query language, why you'd want to deal with noSQL for smaller projects outside of academic interest. I mean, noSQL of course makes perfect sense for Twitter or Facebook, but to do data comparisons, you are forced to create your own queries anyways, and most of those queries will revolve around join, natural join, and querying sub-tables and temporary tables, which are the very foundations of SQL. I'd rather have all of this built-in, and I'm pretty sure HEM and PT would rather have this built-in as well.
If you store the HH as a document you'll rarely need to join. All the information for a hand is in the same document.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 06:42 PM
Quote:
Originally Posted by jjshabado
I hate to agree with shoe - but HH does seem like something that a document database would be good for.
Why so much hate haha? Are you still mad because I don't use jquery and I tried to give you (+anyone who read) multiple viable alternatives if they don't mind losing jquery plugins/UI?
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 07:09 PM
F-it just ordered "7 databases in 7 weeks" to get an overview. Really liked "7 languages in 7 weeks" and it seems it can't really hurt to get a decent high level overview of different things ever.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 07:11 PM
Quote:
Originally Posted by clowntable
F-it just ordered "7 databases in 7 weeks" to get an overview. Really liked "7 languages in 7 weeks" and it seems it can't really hurt to get a decent high level overview of different things ever.
Remember a few months ago when we were talking about DBs. I remember linking this:

http://kkovacs.eu/cassandra-vs-mongo...chdb-vs-redis/

It's still highly relevant today for a super basic overview.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 07:12 PM
Quote:
Originally Posted by Shoe Lace
It seems silly to try and normalize a poker hand. I don't know what the HEM or PT schemas look like but I'm guessing there's like 10 tables and most queries are a short novel of joins/etc..

It's funny how you guys with no experience using any k/v or document db say it's worse but you haven't even used one. How can you make a conclusion that something is inferior or not correct when you don't know anything about it?
These sentences are wonderful. "I have no idea what the rdbms version looks like....but you guys who have no idea what _i'm_ talking about are wrong"

I'm genuinely curious how you would set this up. You could probably replace the storage engine in FPDB in an hour or two. Replacing Database.store* and you could store the hand histories using the current parser i'd guess.

Quote:
Originally Posted by MrWooster
Isnt this pretty much what the FPDB does?
Yep, we have an rdbms schema that works on sqlite, mysql and postgres.

Quote:
Originally Posted by MrWooster
Btw, I have written a JSON parser for the FPDB hand parser. You pass it a HH file, and it returns a JSON object of the hand histories. It still needs a bit of work (about 90% there), but happy to share it.
I'm curious about this.

We've been meaning to provide an interface for storing and fetching the actual hand histories for a while. Being able to separate the hh viewer/replayer from the rest of the code might be useful.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 07:16 PM
Just initiated my first Facebook ad campaign. Anyone had any experience with FB ads? I have set it up to be very specifically targeted and the advert points to the Facebook page of my site. Basically doing a minimum spend to see what happens and try and asses the worth based on the results.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 07:25 PM
Most people I've talked to seem to think FB adds are a total waste. Probably depends a ton on what you advertise though.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 07:26 PM
Quote:
Originally Posted by sorrow

I'm curious about this.

We've been meaning to provide an interface for storing and fetching the actual hand histories for a while. Being able to separate the hh viewer/replayer from the rest of the code might be useful.
I had to make a couple of small changes to HandHistoryConverter.py to get the stdin and stdout handlers to work, and I took ChazDazzle's version of the codebase (https://github.com/ChazDazzle/fpdb-chaz) as he has made some nice fixes to the hand parser.

The actual JSON formatting is very simple. The script just calls the PokerStars hand converter, telling it to read the hand data from stdin, then outputs the result into a JSON object and prints it to stdout.

Horribly hacky and untested, but this is the code:

Code:
import sys
import os
import os.path
import xml.dom.minidom

import time
import datetime


import Hand
import Configuration
from PokerStarsToFpdb import *
import simplejson as json


log = logging.getLogger("root")
Configuration.set_logfile("fpdb-log.txt")
config = Configuration.Config()
con = PokerStars(config, '-', '-', 0)

# Create a list to hold the data object
hands = []

# Loop through and process hands
for hand in con.getProcessedHands():
    h = {
            'sitename' : hand.sitename,
            'gametype' : hand.gametype,
            'handText' : hand.handText,
            'handid' : hand.handid,
            'in_path' : hand.in_path,
            'cancelled' : hand.cancelled,
            'tablename' : hand.tablename,
            'hero' : hand.hero,
            'maxseats' : hand.maxseats,
            'counted_seats' : hand.counted_seats,
            'buttonpos' : hand.buttonpos,
            'runItTimes' : hand.runItTimes,
            'uncalledbets' : hand.uncalledbets,
            'tourNo' : hand.tourNo,
            'tourneyId' : hand.tourneyId,
            'tourneyTypeId' : hand.tourneyTypeId,
            'buyin' : hand.buyin,
            'buyinCurrency' : hand.buyinCurrency,
            'buyInChips' : hand.buyInChips,
            'fee' : hand.fee,
            'level' : hand.level,
            'mixed' : hand.mixed,
            'speed' : hand.speed,
            'isSng' : hand.isSng,
            'isRebuy' : hand.isRebuy,
            'rebuyCost' : hand.rebuyCost,
            'isAddOn' : hand.isAddOn,
            'addOnCost' : hand.addOnCost,
            'isKO' : hand.isKO,
            'koBounty' : hand.koBounty,
            'isMatrix' : hand.isMatrix,
            'isShootout' : hand.isShootout,
            'isZoom' : hand.isZoom,
            'added' : hand.added,
            'addedCurrency' : hand.addedCurrency,
            'tourneyComment' : hand.tourneyComment,
            'seating' : hand.seating,
            'players' : hand.players,
            'posted' : hand.posted,
            'tourneysPlayersIds' : hand.tourneysPlayersIds,
            'bets' : hand.bets,
            'lastBet' : hand.lastBet,
            'streets' : hand.streets,
            'actions' : hand.actions,
            'board' : hand.board,
            'holecards' : hand.holecards,
            'discards' : hand.discards,
            'showdownStrings' : hand.showdownStrings,
            'stacks' : hand.stacks,
            'collected' : hand.collected,
            'collectees' : hand.collectees,
            'folded' : dict.fromkeys(hand.folded),
            'dealt' : dict.fromkeys(hand.dealt),
            'shown' : dict.fromkeys(hand.shown),
            'mucked' : dict.fromkeys(hand.mucked),
            'totalpot' : hand.totalpot,
            'totalcollected' : hand.totalcollected,
            'rake' : hand.rake,
            'sym' : hand.sym

        }
    hands.append(h)


print json.dumps(hands, use_decimal=True)
I run it by calling

Code:
 cat hand.txt | fpdb_json.py
assuming fpdb.json is in the main pyfpdb repository
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 07:28 PM
Quote:
Originally Posted by clowntable
Most people I've talked to seem to think FB adds are a total waste. Probably depends a ton on what you advertise though.
Ye, fairly sure this is going to be the case. I really like the way you can target a really specific audience tho, I dont know of any other platforms that give you that level of granularity.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 07:28 PM
Quote:
Originally Posted by sorrow
These sentences are wonderful. "I have no idea what the rdbms version looks like....but you guys who have no idea what _i'm_ talking about are wrong"

I'm genuinely curious how you would set this up. You could probably replace the storage engine in FPDB in an hour or two. Replacing Database.store* and you could store the hand histories using the current parser i'd guess.
I worked with mysql on and off for 5 years and also have spent the last year or so researching my head off on a few nosql dbs.

It's true that I don't know what the schemas look like for HEM or PT exactly. I haven't played poker in a few years and I don't have postgres installed anymore because I've formatted since I stopped playing online poker.

Why would I know about the exact schemas used on products I don't even use or ever installed (HEM)?

I already told you how I would setup a db to store hand histories while still having an ability to query specific hands. If for some reason I ran into a problem I would consider alternative solutions.

If you were hoping for a perfectly structured json file as a response then I'm afraid I can't help you here. That would require dedicating a ton of time. Trying to come up with one on the spot would result in a disaster.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 08:38 PM
Quote:
Originally Posted by Shoe Lace
I worked with mysql on and off for 5 years and also have spent the last year or so researching my head off on a few nosql dbs.

It's true that I don't know what the schemas look like for HEM or PT exactly. I haven't played poker in a few years and I don't have postgres installed anymore because I've formatted since I stopped playing online poker.

Why would I know about the exact schemas used on products I don't even use or ever installed (HEM)?

I already told you how I would setup a db to store hand histories while still having an ability to query specific hands. If for some reason I ran into a problem I would consider alternative solutions.

If you were hoping for a perfectly structured json file as a response then I'm afraid I can't help you here. That would require dedicating a ton of time. Trying to come up with one on the spot would result in a disaster.
Defensive aren't we.

I understand some basic theory on nosql and haven't had any real interest in learning it so far. As I said previously i'm genuinely curious how someone familiar with document based storage would actually structure this.

I "lead" FPDB and think SQL is a ****ing awful language propped up by 40 years of theory, 25 years of experience and a lack of solid alternatives. That said, my experience is that it performs very well with structured hh data. DB normalisation also seems to pretty closely match OO design, hence the prevalence of ORMs appearing in the last few years.

It also sucks in many ways. It is inflexible leaving us with flag days for modifying the schema, and some of the queries in use to provide tables are (as you said) dictionaries full of joins.

FPDB currently supports 3 databases, and with some work could be converted to use a nosql backend too. I have no interest in doing this work myself, but and i'm prepared to assist people and accept patches to make it happen.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 08:50 PM
Quote:
Originally Posted by sorrow
Defensive aren't we.

I understand some basic theory on nosql and haven't had any real interest in learning it so far. As I said previously i'm genuinely curious how someone familiar with document based storage would actually structure this.

I "lead" FPDB and think SQL is a ****ing awful language propped up by 40 years of theory, 25 years of experience and a lack of solid alternatives. That said, my experience is that it performs very well with structured hh data. DB normalisation also seems to pretty closely match OO design, hence the prevalence of ORMs appearing in the last few years.

It also sucks in many ways. It is inflexible leaving us with flag days for modifying the schema, and some of the queries in use to provide tables are (as you said) dictionaries full of joins.

FPDB currently supports 3 databases, and with some work could be converted to use a nosql backend too. I have no interest in doing this work myself, but and i'm prepared to assist people and accept patches to make it happen.
Not being defensive. Just telling it how it is.

I'm not trying to be a jerk either about not providing a document based solution on how to store a poker hand. There's so many things you need to account for. It's something you would definitely need to sit down and evaluate for a considerable amount of time.

The thing is though, I don't even see that as something the db would be doing. That would be the job of the hand parser. The hand parser would take the hand from any supported network and transform it into a json structure. Then really it's as simple as directly doing a "save" into the db with that json object. That would probably get you 95% of the way.

You might end up refactoring it into 2 collections or buckets (basically an SQL style relation) for each HH depending on what nosql db you use if you feel it will give you more flexibility.

http://www.json.org has an ok cheat sheet to look though on the right side to see what the format is capable of doing.

Once you have the hands in there, you would use whatever tools the document db gives you to query it. These might be ad-hoc queries that you can run, or map/reduce jobs that you can write, etc.. Running standard filters like "show me hands where I have AA and I'm on the button" would be super ultra trivial in both complexity and performance.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 08:51 PM
I was curious about the db structure of FPDB but got scared off by the 3200 line Database.py file.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 09:10 PM
Here's a simple example on how you might query something with mongodb (a document db):

Code:
db.hands.find({'position': 6})
In that example I just queried our database to pull out hands where the selected player is on the button in a 6 max game.

It's basically the same as doing a WHERE position = 6 in SQL if you had one massive denormalized table. The model works with a document db though because there's no restrictive columns or schema.

In mongo's case it's smart enough to understand that you might have similar fields across a collection (mongo's way of saying "table" I guess) without having an enforced schema so it will let you apply indexes to fields.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 09:10 PM
Quote:
Originally Posted by Shoe Lace
The thing is though, I don't even see that as something the db would be doing. That would be the job of the hand parser. The hand parser would take the hand from any supported network and transform it into a json structure. Then really it's as simple as directly doing a "save" into the db with that json object. That would probably get you 95% of the way.
The FPDB parser converts a hand into one of 3 classes of Hand object, Flop, Stud, Draw all inheriting from the base hand class.

It then generates stats based on the contents of that object in DerivedStats before building a series of dicts representing lines in the various DB tables.

In my opinion the parsing engine of FPDB is pretty damn good. We support at least 19 different hh formats, plus Flop, Stud, Draw variants within. I dont know of another project that can claim that.
Quote:
Originally Posted by jjshabado
I was curious about the db structure of FPDB but got scared off by the 3200 line Database.py file.
SQL.py is the better file for looking at the schema. That file is 7k lines, but most of the stuff in there is replicated 3 times to support the different query formats of SQLite, Postgres and MySQL.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 09:13 PM
Quote:
Originally Posted by sorrow
That file is 7k lines, but most of the stuff in there is replicated 3 times to support the different query formats of SQLite, Postgres and MySQL.
whoa, that doesn't sound good. why is stuff replicated and not factored out, and why is one file so long?
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 09:20 PM
just took a look at it, i see it's raw SQL. Just curious, why didn't you use an ORM to take care of building the vendor specific SQL for you? Are you taking advantage of special features that would prevent a general solution like that from working?
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 09:20 PM
Quote:
Originally Posted by sorrow
It then generates stats based on the contents of that object in DerivedStats before building a series of dicts representing lines in the various DB tables.
I know very little about Python. A dict is what, just a series of keys/values? You can probably convert your dicts directly to json. From there you would probably need to combine those objects and tune things so it makes sense to be saved as 1 denormalized document rather than a bunch of different tables.

Last edited by Shoe Lace; 07-25-2012 at 09:26 PM.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 09:24 PM
Quote:
Originally Posted by gaming_mouse
whoa, that doesn't sound good. why is stuff replicated and not factored out, and why is one file so long?
Its not actually that bad. The file is SQL queries only and looks a lot like:

Code:
        self.query['handActions'] = """
            SELECT
                      ha.actionNo,
                      p.name,
                      ha.street,
                      ha.actionId,
                      ha.allIn,
                      round(ha.amount / 100.0,2) as bet,
                      ha.numDiscarded,
                      ha.cardsDiscarded
                FROM
                      HandsActions as ha,
                      Players as p,
                      Hands as h
                WHERE
                          h.id = %s
                      AND ha.handId = h.id
                      AND ha.playerId = p.id
                ORDER BY
                      ha.id ASC
                """
When half the queries have one variable per line for readability/scm friendliness the file gets long quite quickly.

Replication happens when you try something like:
Code:
MySQL:
format( (sum(totalProfit)/100.0) / sum(HDs), 4)                 AS Profitperhand
v
SQLite:
round( (sum(totalProfit)/100.0) / sum(HDs), 4)                  AS Profitperhand
v
Postgres
to_char(sum(totalProfit/100.0) / (sum(HDs)+0.0), '990D0000')    AS Profitperhand
I suspect that half of this replication could be eliminated too.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 09:27 PM
Quote:
Originally Posted by gaming_mouse
just took a look at it, i see it's raw SQL. Just curious, why didn't you use an ORM to take care of building the vendor specific SQL for you? Are you taking advantage of special features that would prevent a general solution like that from working?
No good reason, except we didn't know. Starting the project again today I probably would.

When we had a user attempt to convert FPDB to using an ORM we were getting import speeds of 1-2 hands per second, so work in that direction stopped pretty quickly.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 09:33 PM
Quote:
Originally Posted by Shoe Lace
I know very little about Python. A dict is what, just a series of keys/values? You can probably convert your dicts directly to json. From there you would probably need to combine those objects and tune things so it makes sense to be saved as 1 denormalized document rather than a bunch of different tables.
Python + Mongo is easy.

So here's finding a user by email:

Code:
user = collection.find_one({'email': email})
And user will just be a dictionary that maps directly to the json from the database.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 09:38 PM
Quote:
Originally Posted by sorrow
SQL.py is the better file for looking at the schema. That file is 7k lines, but most of the stuff in there is replicated 3 times to support the different query formats of SQLite, Postgres and MySQL.
If nothing else, I think those queries would be a lot smaller with mongo.

I'm not actually saying that you should have used Mongo, and I'm definitely not saying you should switch now, but if I were starting from scratch with python, Mongo would be my leading contender.

Edit: Especially if you have something that parses the HH and creates a json object for them.

Edit2: Also - move the testing files out of the source files. That drives me crazy!
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 09:48 PM
Here's a good example. You have:

SQL.py:
Code:
        self.query['store_hands_actions'] = """insert into HandsActions (
                        handId,
                        playerId,
                        street,
                        actionNo,
                        streetActionNo,
                        actionId,
                        amount,
                        raiseTo,
                        amountCalled,
                        numDiscarded,
                        cardsDiscarded,
                        allIn
               )    
               values (
                    %s, %s, %s, %s, %s,
                    %s, %s, %s, %s, %s,
                    %s, %s
                )"""
Database.py:
Code:
        for a in adata:
            self.habulk.append( (hid,
                                 pids[adata[a]['player']],
                                 adata[a]['street'],
                                 adata[a]['actionNo'],
                                 adata[a]['streetActionNo'],
                                 adata[a]['actionId'],
                                 adata[a]['amount'],
                                 adata[a]['raiseTo'],
                                 adata[a]['amountCalled'],
                                 adata[a]['numDiscarded'],
                                 adata[a]['cardsDiscarded'],
                                 adata[a]['allIn']
                               ) )  
        if doinsert:
            q = self.sql.query['store_hands_actions']
            q = q.replace('%s', self.sql.query['placeholder'])
            c = self.get_cursor()
            self.executemany(c, q, self.habulk) #c.executemany(q, self.habulk)
Assuming that adata is a list of dictionaries (which it looks like). You could do all of this in mongo with something like:

Code:
if doinsert:
    #A couple of lines to get the collection object.
    hands_actions_collection.insert(adata)
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
07-25-2012 , 10:03 PM
Quote:
Originally Posted by jjshabado
Python + Mongo is easy.

So here's finding a user by email:

Code:
user = collection.find_one({'email': email})
And user will just be a dictionary that maps directly to the json from the database.
So a dict directly maps to json 1:1 without any encoding or decoding? That does makes things extremely easy if you can end up working with the db results directly and writing dicts straight to the db.
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** Quote
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD **
$25m Guaranteed WPM on CoinPoker
Join the action now
Daily Rewards • Splash Pots • CoinRaces
** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD **

      
m