There are 3 things you need to do. I don't know what language you're writing in, but all 3 are probably available in it.
I use mongo with Python most often. First, you can iterate over a mongo find query without loading the whole result set into RAM. You can do something like
Code:
for x in mongo.db.collection.find(whatever):  # whatever = your query dict
    ...  # do something with x
This means you only have to hold one entry in memory at a time (the driver actually pulls results from the server in small batches), plus whatever structure aggregates the results.
Secondly, you can limit mongo queries to just the fields you want. In pure mongo, this is something like
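Here's a rough sketch of that pattern, assuming a PyMongo-style collection object. The function name count_by_field and the idea of tallying one field are just made up for illustration; the point is that only the Counter grows, not the list of documents.

```python
from collections import Counter


def count_by_field(collection, query, field):
    """Tally the values of `field` while streaming over a find() cursor.

    `collection` is assumed to behave like a PyMongo collection, i.e. its
    find() returns something you can iterate over lazily.
    """
    counts = Counter()
    # The loop sees one document at a time; nothing builds a list of results.
    for doc in collection.find(query):
        # doc.get() returns None when the field is missing from a document.
        counts[doc.get(field)] += 1
    return counts
```

You would call it as something like count_by_field(mongo.db.collection, {}, 'field1'). Memory use then scales with the number of distinct values, not the number of documents.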
Code:
find({..query..}, {'_id': 0, 'field1': 1, 'field2': 1})
which means "only fetch field1 and field2, and leave out the _id field". This cuts down a lot on the amount of data you pull over the wire and hold in memory.
And finally, you need to look into aggregation. Mongo has ways to process documents on the server to produce summary data. It's not the fastest thing ever and it's a little mind-bending, but you can probably do 90% of what you want via aggregation queries and not have to load all the data into RAM.
Long story short, it's time for you to actually learn how mongo works.
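In Python the projection is just a dict passed as the second argument to find(). A small sketch, where build_projection is a hypothetical helper name and field1/field2 are the placeholder fields from above:

```python
def build_projection(fields, include_id=False):
    """Build a projection dict that includes only the named fields.

    A value of 1 means "include this field". _id comes back by default,
    so it has to be switched off explicitly with 0.
    """
    projection = {name: 1 for name in fields}
    if not include_id:
        projection['_id'] = 0
    return projection


def find_fields(collection, query, fields):
    # `collection` is assumed to be a PyMongo-style collection whose
    # find(filter, projection) accepts the projection as its second argument.
    return collection.find(query, build_projection(fields))
```

So find_fields(mongo.db.collection, query, ['field1', 'field2']) should be equivalent to the find() call above.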
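To give a feel for what an aggregation query looks like, here's a minimal sketch that filters, then groups and sums on the server. The helper names and the choice of field1/field2 are assumptions for illustration; $match, $group and $sum are standard pipeline operators, and aggregate() is the PyMongo collection method that runs the pipeline.

```python
def build_summary_pipeline(query, group_field, sum_field):
    """Build a pipeline that counts documents and sums `sum_field` per group."""
    return [
        # Stage 1: filter first so later stages touch fewer documents.
        {'$match': query},
        # Stage 2: one output document per distinct value of group_field.
        {'$group': {
            '_id': '$' + group_field,
            'count': {'$sum': 1},
            'total': {'$sum': '$' + sum_field},
        }},
    ]


def summarize(collection, query, group_field, sum_field):
    # `collection` is assumed to be a PyMongo-style collection. Only the
    # per-group summary rows come back, not the underlying documents.
    pipeline = build_summary_pipeline(query, group_field, sum_field)
    return list(collection.aggregate(pipeline))
```

Calling summarize(mongo.db.collection, {}, 'field1', 'field2') would hand back one small dict per distinct field1 value, which is usually tiny compared to the raw collection.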