C programming help?

05-05-2011 , 10:52 PM
If this is the wrong forum or not allowed, mods plz move or delete.


Hiya all, I am taking a programming class, and need help with a program. I have some of the code and could use some help though with some of the concepts. The language is C.

I am supposed to count how many times each word appears in the string and then print out each word on an individual line along with the number of times it appears next to it. I also need to keep all apostrophes, but remove all other punctuation.

I have figured out how to remove all punctuation, but I cannot figure out how to keep the apostrophes. Here is what I have so far

Code:
#include <stdio.h>

int main() {
  char str[40+1];

  while (scanf("%s",str) != EOF)
    printf ("'%s'\n",str);

  return 0;
}
I am using Linux redirection to input the data. I can get it to print each word on a separate line, but I am unsure of how to remove the punctuation from each word.
05-05-2011 , 11:11 PM
So, you're retrieving a string of some unknown length made up of words and punctuation, and you're supposed to print out each word that appears in the string along with how many times it appears?

So this input:

Here's a word, and another word, and one last word!

becomes this output?

Here's 1
a 1
word 3
and 2
another 1
one 1
last 1

Is that right?

Assuming it is, the first thing you should do before you start writing code is come up with a step by step plan of how to solve this problem. It's gonna wind up being a lot more complex than the code you have now, because every time you read a word, you're going to have to compare it to the words you've already encountered in the input to see if you're already tracking it.

How much about C have you learned so far in your class? Are you familiar with the nuts and bolts of how arrays and pointers work? Have you used/learned about the strcpy function? Structs?
05-05-2011 , 11:18 PM
Yes, that is exactly what I am supposed to do. Here is how my thought process is on completing the task.

1) loop string to remove any punctuation using ispunct
2) compare each word using either a sort or search method
3) print out how many times each word was encountered

I have learned the basics of arrays and pointers, but I am still pretty new to most of the str functions.
05-05-2011 , 11:23 PM
Yeah, that's what you'll want to do. You need to work out the specifics of steps 2 and 3 though - when you read a new word and you're gonna compare it, what are you gonna compare it to? (i.e. where/how are you storing the data you'll want to compare it to?) Don't bother sorting anything, it's overkill for this problem.

(also, moving to programming forum)
05-05-2011 , 11:28 PM
Oh, and for removing the punctuation - since character strings are stored like this (here's "string." as an example)

Code:
byte #    0  1  2  3  4  5  6  7 
value     s  t  r  i  n  g  .  \0
what you'll want to do is loop through the string and when you encounter punctuation other than an apostrophe, just replace it with \0 and ta da, you've set the end of the string to where the punctuation used to be. (ispunct() counts apostrophes as punctuation too, so either exclude '\'' yourself or do something like if character == '.' || character == '!' for all types of punctuation you might encounter.) This is assuming that you will only run into non-apostrophe punctuation at the end of a word...it gets a little more fun if it's at the start, like a double quote.
05-05-2011 , 11:33 PM
Quote:
Originally Posted by goofyballer
Yeah, that's what you'll want to do. You need to work out the specifics of steps 2 and 3 though - when you read a new word and you're gonna compare it, what are you gonna compare it to? (i.e. where/how are you storing the data you'll want to compare it to?) Don't bother sorting anything, it's overkill for this problem.

(also, moving to programming forum)
Ok well here is my next question. I was able to use this code to remove the punctuation, but it also removes all spaces. If I am going character by character and the letters are all bunched together with no spaces, I don't know how to split them into words and put them into an array.

Code:
while (scanf("%c",&c) != EOF) {
        if(isalpha(c)) {
                printf("%c",c);
        }
        else {
                printf("");
        }
}
05-05-2011 , 11:40 PM
Code:
1  #include <stdio.h>
2  #include <string.h>
3  int main( void )
4  {
5      char buf[64];
6      strcpy( buf, "\"string.\"" ); // copies "string." into buf
7      printf( "%s\n", buf );
8      char* str = buf + 1;           // str points at the second character of buf
9      printf( "%s\n", str );
10 }
The output of this program is:

"string."
string."

That's an example of how you can ignore punctuation at the start of a string. If line 8 is confusing at all let me know and I can explain what's going on there.
05-05-2011 , 11:42 PM
Quote:
Originally Posted by BamBam192
Ok well here is my next question. I was able to use this code to remove the punctuation, but it also removes all spaces. If I am going character by character and the letters are all bunched together with no spaces, I don't know how to split them into words and put them into an array.
<pulling code out of quote since it's hard to read in italics...>

Code:
while (scanf("%c",&c) != EOF) {
        if(isalpha(c) || c == ' ' || c == '\'' ) {
                printf("%c",c);
        }
        else {
                printf("");
        }
}
See the two conditions added on the 2nd line. Does that make sense?

I'm confused as to what you're trying to accomplish here though - are you trying to run multiple programs to solve this, like, one program that removes the punctuation, then sends the input to another program to count the words? Otherwise, printing each character out to the screen isn't that helpful, unless you're just trying to see what the code does.

I gotta run for a couple hours, hopefully some other people in the forum can help as well.
05-05-2011 , 11:49 PM
No I was just trying to see if doing it letter by letter made more sense than string by string.

Code:
#include <stdio.h>

int main() {

	char str[40+1];
	int i;


	while (scanf("%s",str) != EOF){
		for(i = 0; str[ i ]; i++){
			if(str[i] == "." || str[i] == '"'|| str[i] == '!'){	
				printf("");
		}
				else{
					printf("%s", str[i]);
				}

			}
	}

	
 



return 0;
}
I tried this, but it keeps telling me "comparison between pointer and integer" when I try to compile and run.
05-05-2011 , 11:51 PM
printf("%s", str[i] ) needs to be "%c" instead of "%s". %s expects a pointer to a whole string, so you can't use it to print a single character
05-05-2011 , 11:52 PM
Nevermind I figured out why I was getting the error, but the problem is when I try to run it I am now getting a segmentation fault. bleh
05-05-2011 , 11:54 PM
Quote:
Originally Posted by goofyballer
printf("%s", str[i] ) needs to be "%c" instead of "%s", you can't print a single character as a string
Ohhhh!!! I see I see. Hmmm but it still removes all of the spaces?
05-05-2011 , 11:55 PM
Oh, right - that's because when you do scanf( "%s", whatever ) it reads in an individual string, and it defines an individual string as something separated by whitespace (space, tab, or newline). So, scanf'ing a string will give you a string that has no spaces in it.
05-05-2011 , 11:58 PM
Quote:
Originally Posted by goofyballer
Oh, right - that's because when you do scanf( "%s", whatever ) it reads in an individual string, and it defines an individual string as something separated by whitespace (space or newline). So, scanf'ing a string will give you a string that has no spaces in it.
So I should scanf it by character? Since a character will keep its space?
05-06-2011 , 12:00 AM
You don't need to keep the spaces - your program's going to be reading words, removing punctuation, and then comparing them to words you've already read. If you want your output in the program above to keep spaces between the words, just add a printf( " " ); at the end of your while loop and it will print a space before it reads the next word. For the purpose of completing this assignment, you don't need to keep track of spaces once you've read a word.

Heading out now 4 reals, back in a couple hrs.
05-06-2011 , 12:22 AM
Quote:
Originally Posted by goofyballer
You don't need to keep the spaces - your program's going to be reading words, removing punctuation, and then comparing them to words you've already read. If you want your output in the program above to keep spaces between the words, just add a printf( " " ); at the end of your while loop and it will print a space before it reads the next word. For the purpose of completing this assignment, you don't need to keep track of spaces once you've read a word.

Heading out now 4 reals, back in a couple hrs.
Ok Ok I get what you are saying, I went ahead and added the space just so my meager mind can keep up lol. So punctuation besides the " ' " has been removed. So now I need to compare the words to see how many times they occur. I have tried copying the letters of a word into a single array slot, i.e. the = array[0]. This however did not work.

This is what I have so far if anyone wants to point me in the right direction and give advice

Code:
#include <stdio.h>

int main() {

	char str[40+1];
	int i;


	while (scanf("%s",str) != EOF){
		for(i = 0; str[ i ]; i++){
			if(str[i] == '.' || str[i] == ':'|| str[i] == '"'|| str[i] == '!'|| str[i] == ','){	
				printf("\0");
		}
			
				else{
					printf("%c", str[i]);
				}

			}
			printf(" ");
	}


return 0;
}
05-06-2011 , 02:14 AM
Quote:
Originally Posted by BamBam192
I have tried copying the letters of a word into a single array slot ie. the = array[0]. This however did not work.
Expand on this? That sounds like you're heading in the right direction but without seeing what you tried to do I can't say what you did wrong.
05-06-2011 , 03:49 AM
Here is what I have so far. The part where I try to put the words into a new array and print it out just prints 3 blank lines

Code:
#include <stdio.h>

int main() {

	char str[40+1];
	char a[1000];
	int i,j;


	while (scanf("%s",str) != EOF){
		for(i = 0; str[ i ]; i++){
			if(str[i] == '.' || str[i] == ':'|| str[i] == '"'|| str[i] == '!'|| str[i] == ','){	
				printf("\0");
		}
			
				else{
					str[i] = toupper(str[i]);
					printf("%c", str[i]);
				}

			}
			printf(" ");
	}


		// The following code puts each word into its own array element

	for(j = 0; j < str[i]; j++) {
		for(i = 0; str[i]; i++) {
			if(str[i] != ' ') {
				a[j] = str[i];
			}
					else {
						j = j+1;
					}
		}

	}
	
		// Prints the new array

	for(j = 0; a[j]; j++) {
		printf("%c\n",a[j]);
	}
	
	



return 0;
}
05-06-2011 , 04:02 AM
First things first, here are some tips for code readability:
- The closing brace for a block of code should have the same # of tabs in front of it as the line that opens that block of code.
- An else statement should have the same # of tabs in front of it as the if statement that it acts upon.

This is what that code should look like:

Code:
1  #include <stdio.h>
2  
3  int main() {
4  
5  	char str[40+1];
6  	char a[1000];
7  	int i,j;
8  
9  
10  	while (scanf("%s",str) != EOF){
11 		for(i = 0; str[ i ]; i++){
12 			if(str[i] == '.' || str[i] == ':'|| str[i] == '"'|| str[i] == '!'|| str[i] == ','){	
13 				printf("\0");
14 			}
15 			else{
16 				str[i] = toupper(str[i]);
17 				printf("%c", str[i]);
18 			}
19 		}
20 		printf(" ");
21 	}
22 
23 
24 	// The following code puts each word into its own array element
25 
26 	for(j = 0; j < str[i]; j++) {
27 		for(i = 0; str[i]; i++) {
28 			if(str[i] != ' ') {
29 				a[j] = str[i];
30 			}
31 			else {
32 				j = j+1;
33 			}
34 		}
35 	}
36 	
37 	// Prints the new array
38 
39 	for(j = 0; a[j]; j++) {
40 		printf("%c\n",a[j]);
41 	}
42 
43 
44 
45 	return 0;
46 }
In general - be consistent in your indentation. If you're opening a new code block (with a { brace), add another tab to the lines inside that block. If you're closing a code block, lose a tab for the code that follows. Makes it much easier for others to read.

Last edited by goofyballer; 05-06-2011 at 04:13 AM. Reason: adding line #s
05-06-2011 , 04:08 AM
Ok sorry about that :[ I have made the indentation changes to my current code.
05-06-2011 , 04:34 AM
Okay, as for the code itself (added line #s to post above so refer to that)...

- The for loop starting on line 26 only ever sees the last word of the input. What happens when this program executes is that the while loop in lines 10-21 runs over and over, reading a string from the input into 'str' until there are no strings left, and only then does it go to the for loop on line 26. At that point 'str' contains just the very last string read, because when scanf finally fails for lack of input (which is when it returns EOF) it leaves 'str' untouched. You can try putting printf( "str before for loop: %s\n", str ); on line 25 right before the for loop to see what str is going into that for loop; it should be the last word in the input.

- The for loop on lines 26-35 has a number of things going on with it:
- - On line 28 you're testing to see if the current character in str is not a space; this will always be true, because of what I said earlier about scanf() always taking spaces out of the strings that it reads in. So, the else code (j = j+1) will never run.
- - The condition on line 26, j < str[i], compares j to a character value rather than to a length. Worse, when the while loop finishes, i is still sitting on the '\0' at the end of the last word, so str[i] is 0, the condition is false right away, and the loop body never runs. That leaves 'a' uninitialized, which is why the print loop at the bottom prints whatever junk happens to be in memory.
- - Even with a sensible condition, the code probably isn't doing what you think it's doing - the execution would go like this as the two for loops run:
j = 0, i = 0; set a[j] = str[i], so the first letter in 'a' is set to the first letter in 'str'
j = 0, i = 1; set a[j] = str[i], so the first letter in 'a' is set to the second letter in 'str'
j = 0, i = 2; set a[j] = str[i], so the first letter in 'a' is set to the third letter in 'str'
...
j = 1, i = 0; set a[j] = str[i], so the second letter in 'a' is set to the first letter in 'str'
j = 1, i = 1; set a[j] = str[i], so the second letter in 'a' is set to the second letter in 'str'
etc.

If 'str' contained the string "word" and the outer loop ran once for each of its four letters, then afterwards the character array 'a' would start with "dddd", because every time the inner for loop runs, it sets a[j] to the last letter in str (eventually - as illustrated above, it sets it to the 1st letter, then the 2nd letter, etc. till it sets a[j] to the last letter of 'str', then increases j by 1 and repeats the process for the next character in 'a'). When your code isn't working how you expect, get in the habit of coming up with quick test cases (like "okay, so say that str has 'word' in it, let's go through this code step by step and see what it does in this situation...") to work through by hand in your code. Doing so is a valuable tool employed by every programmer when things go wrong.

Going into the final for loop on lines 39-41, that would just print a single character on each line. Have you studied two-dimensional arrays yet? Because this:

char buf[64]; // a character array with room for a 63-letter string and its null-terminating character

is an array of characters, also known as a string. What you need to complete the assignment is an array of strings, like this:

char stringarray[2][64]; // a two-dimensional character array with room for two different 63-letter strings

Does the latter look like something you've studied?
05-06-2011 , 04:50 AM
Wow great help! I ran the test like you suggested, and as you also said the 'str' in the for loops was the last word of the scanf. I get everything you are talking about on what the for loops are actually doing compared to what I thought they were doing. The only thing I am struggling with is the 2d array concept. So right now the string is stored in a 1d array, correct? Whereas I need to store it in a 2d one? So it should look like this

array[0][4] = 'w' 'o' 'r' 'd' '\0'
array[0][6] = 'w' 'o' 'r' 'd' '#' '2' '\0'

And if I need to do this before it gets to the EOF, could I store them into a 2d array to begin with, or would I need to keep it the way it is, where it takes out the punctuation, capitalizes the letters and then stores them in a 2d array?
05-06-2011 , 05:02 AM
Quote:
Originally Posted by BamBam192
So it should look like this

array[0][4] = 'w' 'o' 'r' 'd' '\0'
array[0][6] = 'w' 'o' 'r' 'd' '#' '2' '\0'
To correct your notation - array[0][4] refers to the fifth character in the first (0th) sub-array, and array[0][6] refers to the seventh character in the first sub-array. When you declare a two-dimensional array in code like this:

char array[2][32];

you're creating two consecutive arrays of 32 characters each. array[0] accesses the first of those two, array[1] accesses the 2nd. You can use array[0] or array[1] just like you'd use any normal character array (like, how you use 'str' in the code you posted) - for example, you could say scanf( "%s", array[0] ) and whatever scanf reads would get put into the first of the two character arrays. So, a more accurate way of putting what you said in the above quote is

array[0] = { 'w', 'o', 'r', 'd', '\0', ...27 unused bytes }
array[1] = { 'w', 'o', 'r', 'd', '#', '2', '\0', ...25 unused bytes }
array[0][4] = '\0'
array[0][2] = 'r'
array[1][4] = '#'

The distinction is important because array[0] still resolves to a character array when it's used in code; array[x][y] resolves to a single character. Knowing what the variables you use mean is very important, because as you learned with your segmentation fault earlier when you had the %s where a %c should have gone, there is a *big* difference between an individual character and a string of multiple characters.

Quote:
Originally Posted by BamBam192
And if I need to do this before it gets to the EOF, could I store them into a 2d array to begin with, or would I need to keep it the way it is, where it takes out the punctuation, capitalizes the letters and then stores them in a 2d array?
Yeah, you'll want to store each string into a 2d array as you go in that while loop. Otherwise, once you get out of the while loop, all the input has been read already and you have no way of accessing anything other than the very last word.

The main point of the assignment, I guess, is that before you store each word, you'll want to compare the word you just read with all the words you've already stored and see if this is a new word, or if it's already been used and you need to bump up the # of times it's been used by one.
05-06-2011 , 05:29 AM
Ok I tried to put it into a 2d array, while in the while loop, but no bueno. Did I put it in the wrong spot? I assumed it would work just fine since it's putting each character into the 2d array separately and when it gets to a space it puts that in there as well.

Code:
1. #include <stdio.h>
2.
3. int main() {
4.
5.	char str[40+1];
6.	char a[100][40];
7.	int i,j;
8.
9.
10.	while (scanf("%s",str) != EOF){
11.		for(j = 0; j < 100 ; j++){
12.			for(i = 0; i < str[i]; i++){
13.				if(str[i] == '.' || str[i] == ':'|| str[i] == '"'|| str[i] == '!'|| str[i] == ','){  //removes all punctuation except for '
14.					
15.					a[j][i] = '\0';
16.					printf("\0");
17.				}
18.			
19.				else{
20.					str[i] = toupper(str[i]);   // changes all letters to uppercase
21.					a[j][i] = str[i];
22.					printf("%c", str[i]);
23.					
24.					
25.				}
26.				
27.			}
28.			
29.		}
30.		
31.		printf("\n");
32.	}
33.	
34.		//prints new array
35.
36.	for(j = 0; j < 100; j++){
37.		for(i = 0; i < 40; i++){
38.				
39.			printf("%c", a[j][i]);
40.		}
41.	}
42.	
43.	
44.
45.
46.
47.  return 0;
48.  }
In the top loop it's printing out each word 100 times. I am unsure why. I have worked with 2d arrays before with integers and I found that to be very easy but for some reason when it comes to putting words into the arrays I am a fool.

Last edited by BamBam192; 05-06-2011 at 05:32 AM. Reason: numbers
05-06-2011 , 05:46 AM
Well, that makes sense, look at how your loops are nested:

- while we're still reading words into 'str'
- - loop over j 100 times
- - - loop over each letter in str, skip the punctuation and print each letter

Again - take a simple example (change the "100" to a "5", pick a simple input), work through the code step by step and think about what it's doing. Here's the execution path for input of "one two three":

- while loop reads "one" into str
- j = 0, set a[0] to "one", 1 letter at a time in inner loop; print it out
- j = 1, set a[1] to "one", 1 letter at a time in inner loop; print it out
- j = 2, set a[2] to "one", 1 letter at a time in inner loop; print it out
...etc till j = 99
- while loop reads "two" into str
- j = 0, set a[0] to "two", 1 letter at a time in inner loop; print it out
- j = 1, set a[1] to "two", 1 letter at a time in inner loop; print it out
...etc

You can see that isn't what you want to do. The problem here is with how your loops are organized - every time you read in one string with the scanf call, you're looping 100 times to fill in every character array in 'a' with the string you just read. What you want to do is just fill in one of the slots in 'a' with the string you just read. But which slot do you put it in? I guess that depends on how many strings you've read so far... *hint hint!*

Also, quick tip:

Code:
36.	for(j = 0; j < 100; j++){
37.		for(i = 0; i < 40; i++){
38.				
39.			printf("%c", a[j][i]);
40.		}
41.	}
This is a spot where you can take advantage of your character arrays - rather than printing each character individually, you can print out entire character arrays at once by doing this instead:

Code:
36.	for(j = 0; j < 100; j++){
37.		printf("%s\n", a[j] );
38.	}
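This is where what I said in that last post comes into play - a[j] resolves to a character array, so you can print it out with %s, which is designed for null-terminated character arrays. (Only loop over the slots you've actually filled, though; the unused ones contain garbage.)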

Also, if you're tired, note that programming when tired is extremely difficult. Sleep is advised. I myself am heading to bed, I'll check in on this thread at work tomorrow, hopefully some other people in this forum can help out too.

      