Quote:
Originally Posted by daveT
Those numbers are pretty sad for so-called "big data" software. I'm also curious why there was no index on this query in the first place.
The 100ms was for total page response time - most of our mongo queries are 5ms or under.
I didn't create the data structure, the queries, or the indexing, but quite possibly if I had, there wouldn't be an index there either. I add indexes sparingly, once it's proven they do some good.
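To illustrate what "proven" means to me in practice, here's a minimal pymongo sketch of the before/after check I'd do; the collection name, query shape, and field are all made up for illustration:

```python
from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://localhost:27017")
coll = client["appdb"]["deployments"]  # hypothetical collection

# First, prove the problem: explain() on the slow query should show
# a COLLSCAN stage in the winning plan if no usable index exists.
before = coll.find({"account_id": 42}).explain()
print(before["queryPlanner"]["winningPlan"])

# Only then add the index, and confirm the winning plan flips to IXSCAN.
coll.create_index([("account_id", ASCENDING)])
after = coll.find({"account_id": 42}).explain()
print(after["queryPlanner"]["winningPlan"])
```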
The worst part, if you ask me, is that this endpoint is completely useless, and yet it's our most "used" API endpoint. But that's just because every deployment of our main product polls it every few minutes, I think. I have no idea what they do with that information.
Regarding why it took us so long to get there: the problem was that literally every metric was off the charts, and the nginx and uwsgi logs in particular had entries that looked really damning. Those things turned out to be symptoms, not causes.
Also, unfortunately, none of us were really empowered to plow through the system and do what needed to be done, so it involved a lot of corralling people and letting things get bad enough that everyone would agree to act.
For example, my first instinct was to drain the web queue and reject connections or redirect to a maintenance page. IMO this was a no-brainer: we were dropping 80% of traffic on the floor, and probably *all* of the frontend traffic, because our frontend was unusable (our API was very slow and you'd have to retry multiple times, but it stayed up).
It's kind of a long story, but if my operating theory was correct, this would actually have solved the problem by itself. It didn't, though, so we ended up eliminating the likely causes and moving on to the less likely ones.
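For what it's worth, the maintenance-page idea doesn't have to be fancy. A gate like this in front of the WSGI app (we were on uwsgi) is roughly what I meant; this is just a sketch, and the MAINTENANCE flag and the app object are assumptions, not what we actually ran:

```python
# Sketch of a maintenance gate wrapped around a WSGI app.
# In real life MAINTENANCE would be a flag you can flip without a
# deploy (e.g. the existence of a file on disk); hardcoded for brevity.
MAINTENANCE = True

def maintenance_gate(app):
    def wrapped(environ, start_response):
        if MAINTENANCE:
            # Shed load cheaply: tell clients to back off and retry later.
            start_response("503 Service Unavailable",
                           [("Content-Type", "text/plain"),
                            ("Retry-After", "120")])
            return [b"Down for maintenance, back shortly.\n"]
        return app(environ, start_response)
    return wrapped

# application = maintenance_gate(real_wsgi_app)  # hypothetical app object
```

Returning a 503 with Retry-After at least tells well-behaved clients to back off instead of hammering a backend that's already drowning.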