SQL noob looking to de-identify PT4/PostgreSQL database

05-16-2013 , 03:51 PM
I have a set of hand histories that I've loaded into both HEM2 (Small Stakes) and PT4 (trial). I was wondering how hard it would be to de-identify my list of user names. For example, suppose that I was only interested in users whose names started with the letter "A". For de-identification, I would want to change the names of my users of interest (names that start with "A") to "Subject001", "Subject002", etc., and change the others to "Bystander001", "Bystander002", etc.

Is there an easy way to do this, and if so, what tools would I need? Right now, all I have is R (which has an SQL module I'd have to learn) and PostgreSQL itself.

I'm posting in this part of the forum because de-identifying a database is a problem that might come up in other areas of research and not just poker.

Thank you for any time and/or advice.
05-16-2013 , 10:22 PM
UPDATE players SET username = CONCAT('Subject', player_id);

Note:
This will permanently overwrite the actual usernames, so you might want to take a backup first.
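
The UPDATE above labels every player as a Subject and uses the raw player_id as the number. To get the Subject001/Bystander001 split asked for in the original post, here is a rough sketch in Python. Assumptions, none of which are confirmed for the real PT4 schema: the table is called players with columns player_id and username (taken from the statement above), the "starts with A" test ignores case, and an in-memory SQLite database with made-up names stands in for PostgreSQL so the example runs without a server. Against PostgreSQL you would connect through a driver such as psycopg2 and use %s placeholders instead of ?.

```python
import sqlite3


def build_pseudonyms(rows, is_subject):
    """Map player_id -> new name, numbering subjects and bystanders separately."""
    counters = {"Subject": 0, "Bystander": 0}
    mapping = {}
    for player_id, username in rows:
        label = "Subject" if is_subject(username) else "Bystander"
        counters[label] += 1
        # :03d zero-pads the counter to three digits (001, 002, ...)
        mapping[player_id] = f"{label}{counters[label]:03d}"
    return mapping


def deidentify(conn):
    """Overwrite usernames in the (assumed) players table with pseudonyms."""
    rows = conn.execute(
        "SELECT player_id, username FROM players ORDER BY player_id"
    ).fetchall()
    # Assumption: a player is "of interest" if the name starts with A or a.
    mapping = build_pseudonyms(rows, lambda name: name.upper().startswith("A"))
    conn.executemany(
        "UPDATE players SET username = ? WHERE player_id = ?",
        [(new_name, player_id) for player_id, new_name in mapping.items()],
    )
    conn.commit()
    return mapping


# Demo on throwaway data; the names here are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (player_id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?)",
    [(1, "Alice"), (2, "Bob"), (3, "aaron"), (4, "Carol")],
)
mapping = deidentify(conn)
result = conn.execute(
    "SELECT player_id, username FROM players ORDER BY player_id"
).fetchall()
print(result)
```

As with the UPDATE above, this overwrites the names in place, so run it on a copy of the database. The numbers are handed out in player_id order here; shuffle the rows before numbering if that ordering could itself give something away.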