Two Plus Two Publishing LLC
Old 05-17-2012, 08:08 PM   #16
old hand
 
Join Date: Dec 2006
Location: Oklahoma
Posts: 1,476
Re: How would you program this?

Even better than a set<string> would be a map<string, int> to store the number of occurrences of each word.
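
A minimal sketch of that idea in Python, where a dict (here `collections.Counter`) plays the role of map<string, int>. The `count_words` name and the sample text are made up for illustration, and lowercasing plus splitting on whitespace are assumptions, not something the thread settled on.

```python
from collections import Counter

def count_words(text):
    """Map each lowercased, whitespace-separated word to its number of occurrences."""
    return Counter(text.lower().split())

counts = count_words("the cat saw the other cat")
print(counts["the"])  # occurrences of a single word
print(len(counts))    # number of distinct words, which is what the set would give
```

The number of keys is the unique-word count, so the map covers what the set does and keeps the frequencies as well.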
Old 05-17-2012, 10:13 PM   #17
adept
 
Join Date: Apr 2011
Posts: 812
Re: How would you program this?

To solve any problem, the first step is to ask questions. Therefore:

Step #1 is to fully define the problem:
A. Is 1976 a word? Are aol and AOL different words? The answer to both questions is probably no, given your problem description.
B. What should the program do if the input file doesn't exist?
C. Does the instant message log require parsing to only return the words you've typed? From the description you don't want to count the other person's words.
etc.

Step #2 is to break the problem into potentially reusable parts:
I'd suggest something like:
1. Open file
2. Read line
3. Break line into words
4. Store the word in an appropriate data structure (this depends on what the language you're using supports; a set is the obvious answer for this problem)
5. Close input file
6. Count # of unique words
7. Output the count
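
The steps above can be sketched roughly as follows in Python. This is one possible reading, not a definitive solution: the word pattern, the function names, and returning None for a missing file are all assumptions made to answer questions A and B. Question C is left out because the log format isn't given.

```python
import re

# Assumption: a word is a run of letters, optionally with inner apostrophes.
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)*")

def words_in_line(line):
    """Step 3: break a line into lowercase words ("AOL" == "aol"; "1976" is skipped)."""
    return WORD_RE.findall(line.lower())

def unique_words(lines):
    """Steps 2-4: read each line and store its words in a set."""
    seen = set()
    for line in lines:
        seen.update(words_in_line(line))
    return seen

def count_unique_words(path):
    """Steps 1, 5 and 6: open the file, collect the words, close it, return the count."""
    try:
        with open(path, encoding="utf-8") as f:  # 'with' closes the file for us
            return len(unique_words(f))
    except FileNotFoundError:
        return None  # assumption: a missing file is reported to the caller, not fatal

# Step 7, shown on an in-memory sample instead of a real log file.
sample = ["I joined AOL in 1976", "aol was fun, wasn't it?"]
print(len(unique_words(sample)))
```

Keeping the line splitter separate from the file handling means the definition of a "word" can change without touching the rest.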
Old 05-18-2012, 12:06 AM   #18
old hand
 
Join Date: Dec 2007
Posts: 1,498
Re: How would you program this?

Quote:
Originally Posted by Freakin
FWIW, I actually had to do this at my workplace: parsing 25 GB of Exchange logs to find email DLs used more than once and cross-reference them against another list.
Were there any unexpected challenges you had to overcome given the log size, or was it trivial?
Old 05-18-2012, 02:17 PM   #19
bacon wannabe
 
Join Date: Sep 2004
Posts: 17,019
Re: How would you program this?

Quote:
Originally Posted by checkm8
Were there any unexpected challenges you had to overcome given the log size, or was it trivial?
Once I switched to Perl it was no problem. Trying to use PowerShell to read files in chunks was lolbad.
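
For logs of that size, a common approach is to stream the file line by line instead of chunking it by hand. A rough sketch in Python (not the Perl that was actually used): the one-name-per-line format and the names `names_used_more_than_once` and `other_list` are hypothetical, since the real Exchange log layout isn't described here.

```python
from collections import Counter
import io

def names_used_more_than_once(lines):
    """Count each non-empty name while streaming; memory grows with distinct names, not file size."""
    counts = Counter()
    for line in lines:
        name = line.strip().lower()
        if name:
            counts[name] += 1
    return {name for name, n in counts.items() if n > 1}

# Stand-in for an opened log file: file objects are also iterated lazily, line by line.
log = io.StringIO("sales-all\nit-team\nsales-all\n\nhr\nIT-Team\n")
other_list = {"it-team", "finance"}  # hypothetical second list to cross-reference

repeated = names_used_more_than_once(log)
print(sorted(repeated & other_list))  # repeated names that also appear in the other list
```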
Old 05-20-2012, 08:05 PM   #20
newbie
 
Join Date: May 2012
Posts: 20
Re: How would you program this?

Quote:
Originally Posted by gwp
It's a programming assignment,
This will be a hard class for OP then imo.

Edit: my 2c: the C# String class has a lot of options
http://msdn.microsoft.com/en-us/library/7wtc81z6
Old 05-20-2012, 08:37 PM   #21
Carpal 'Tunnel
 
Join Date: Oct 2005
Location: United States
Posts: 19,943
Re: How would you program this?

Quote:
Originally Posted by weevil
Code:
sort -u file-a > file-b
to sort and store all the unique words in file-a to file-b (sort -u compares whole lines, so this assumes one word per line), then
Code:
wc -w file-b
to get the number of words

Stuff like this is where the command-line shines, no need to mess around with writing a program or complicated script!
Using a Python dictionary was the first thought I had reading the OP, but after seeing he somehow didn't care about the frequency, the -u unique flag is clearly the simplest way to do it.
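
Since both the command-line route and the Python route come up here, a small comparison in Python may help. It is only a sketch: the first function mimics deduplicating whole lines, the second splits into words first and keeps no frequencies. The function names and sample text are invented for illustration, and Python's sort order may differ from the shell's locale-aware one.

```python
def unique_lines(text):
    """Roughly what a line-based unique sort yields: the distinct lines, sorted."""
    return sorted(set(text.splitlines()))

def distinct_words(text):
    """Distinct whitespace-separated words, sorted; a set is enough when frequency doesn't matter."""
    return sorted(set(text.split()))

text = "hello world\nhello there\nhello world\n"
print(unique_lines(text))
print(len(distinct_words(text)))
```

On this sample, counting the words left in the deduplicated lines would give 4, while there are only 3 distinct words, so the input needs to be one word per line for the two approaches to agree.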
All times are GMT -4.