** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD **

12-13-2016 , 11:35 AM
Quote:
Originally Posted by daveT
Those numbers are pretty sad for so-called "big data" software. I'm also curious why there was no index on this query in the first place.
I doubt it's a Mongo issue really. More of a configuration / system spec / usage issue. Agree, though, that an index should be a really early check when queries start going crazy.

Although in dumb Mongo news, creating an index by default locks your whole database... which can be irritating.

Quote:
Originally Posted by daveT
"We never had a database person on our team, but we ran into this situation that took us [3 days ] 2 weeks] to solve. Now, we don't expect you to figure out the answer, but what would be your thinking process on..."

The honest answer is drop the database and rebuild, go back to bed, and look at the logs in the morning, but no one ever wants to hear that one. This question is a serious pet peeve of many database people.
Wait, you should drop the database??? Isn't that really bad?
12-13-2016 , 11:56 AM
Quote:
Originally Posted by daveT
Those numbers are pretty sad for so-called "big data" software. I'm also curious why there was no index on this query in the first place.
The 100ms was for total page response time - most of our mongo queries are 5ms or under.

I didn't create the data structure or the queries, or the indexing, but possibly if I had, there wouldn't be an index there either. I add indexes sparingly, when it's proven they do some good.

The worst part is, if you ask me, this endpoint is completely useless, and it's our most "used" API endpoint. But that's just because every deployment of our main product queries it every few minutes I think. I have no idea what they do with that information.

Regarding why it took us so long to get there, the problem was that literally every metric was off the charts and the nginx and uwsgi logs in particular had some stuff in them that looked really damning. Those things turned out to be symptoms, not causes.

Also, unfortunately, none of us were really empowered to plow through the system doing what needed to be done. So it involved a lot of corralling people and letting things get bad enough that everyone would agree

For example, my first desire was to drain the web queue and reject connections or redirect to a maintenance page. IMO this was a no brainer - we were dropping 80% of traffic on the floor and probably *all* of the frontend traffic because our frontend was unusable (our API was very slow and you'd have to try multiple times but it stayed up)

It's kind of a long story but if my operating theory was correct then this would actually solve the problem by itself. It didn't, though, so we ended up eliminating the likely causes and moving to the less likely.
12-13-2016 , 12:39 PM
Quote:
Originally Posted by jjshabado
Hah, I don't think this either. Almost everything is a spectrum. I think you need to work on some people skills. And I suspect it's exacerbated in the stressful situation of an interview. I don't think you're a social moron or any sort of weirdo. This first sentence is a good example of what I'm talking about.
There is a difference between the casual writing I do here and how I conduct myself in a professional situation.

Quote:
Originally Posted by jjshabado
I doubt it's a Mongo issue really. More of a configuration / system spec / usage issue. Agree, though, that an index should be a really early check when queries start going crazy.

Although in dumb Mongo news, creating an index by default locks your whole database... which can be irritating.
Eh, Mongo isn't the only NoSQL database that does this. I'm not entirely sure how all architectures work, but many document stores are basically a single two column database table*. It would make sense that adding an index locks the entire database, because adding an index to a single table in a database locks that table.

That paragraph is wonky and I don't feel like rewriting it. Hope it makes sense.

* Datomic is nothing more than a single two-column table on MySQL (or other db engine).

Quote:
Wait, you should drop the database??? Isn't that really bad?
It isn't always a bad thing, though it certainly shouldn't be the first plan of action. Database / SOAP table bloat often has no other resolution, unfortunately, though this should all be prevented with proper configurations.

The answer was sarcastic.

Quote:
Originally Posted by RustyBrooks
The 100ms was for total page response time - most of our mongo queries are 5ms or under.

I didn't create the data structure or the queries, or the indexing, but possibly if I had, there wouldn't be an index there either. I add indexes sparingly, when it's proven they do some good.

The worst part is, if you ask me, this endpoint is completely useless, and it's our most "used" API endpoint. But that's just because every deployment of our main product queries it every few minutes I think. I have no idea what they do with that information.

Regarding why it took us so long to get there, the problem was that literally every metric was off the charts and the nginx and uwsgi logs in particular had some stuff in them that looked really damning. Those things turned out to be symptoms, not causes.

Also, unfortunately, none of us were really empowered to plow through the system doing what needed to be done. So it involved a lot of corralling people and letting things get bad enough that everyone would agree

For example, my first desire was to drain the web queue and reject connections or redirect to a maintenance page. IMO this was a no brainer - we were dropping 80% of traffic on the floor and probably *all* of the frontend traffic because our frontend was unusable (our API was very slow and you'd have to try multiple times but it stayed up)

It's kind of a long story but if my operating theory was correct then this would actually solve the problem by itself. It didn't, though, so we ended up eliminating the likely causes and moving to the less likely.
Sounds like a crazy situation. Learned a bit from this one.
12-13-2016 , 01:02 PM
I guess the thing I learned, which is something I learn over and over again, is that performance is discontinuous with respect to the various things that affect it. Apparently we went over a cliff in our performance function.

Our devops guys are not happy that the resolution was "I made an index" because they feel like there is still an underlying problem. Maybe there is.

I think it's funny that the 60 second queries were "slow" because they were doing a full column scan. But in order to make an index, you have to... do a full column scan. And making the index took maybe 100ms.

I am kind of wondering if it's not a subtle bug in the query optimizer or something, where it's doing something really dumb in this particular edge case.

Back on Oracle 8 or 9 or so, I remember that I routinely had to add "mydate > '1900-01-01'" to my queries, which would convince the query optimizer not to do something apparently stupid (all the mydate values were present day, i.e. the 1990s, so that predicate shouldn't have done anything useful, but often it did).
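
For anyone who never had the pleasure: the trick was literally just tacking on a predicate that can never filter anything out, presumably because it changes the optimizer's estimates enough to push it onto a different access path. A rough sketch of the shape of it, with an invented table and columns rather than anything I actually had:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class OptimizerNudge {
    // The query as written, which sometimes got a terrible plan on old Oracle.
    static final String PLAIN =
            "SELECT * FROM orders WHERE customer_id = ?";

    // Same query plus a predicate that is always true for this data
    // (every mydate is in the 1990s), added only to nudge the optimizer.
    static final String NUDGED =
            "SELECT * FROM orders WHERE customer_id = ? " +
            "AND mydate > TO_DATE('1900-01-01', 'YYYY-MM-DD')";

    static ResultSet fetchOrders(Connection conn, int customerId) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(NUDGED);
        ps.setInt(1, customerId);
        return ps.executeQuery();
    }
}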
12-13-2016 , 01:34 PM
Quote:
Originally Posted by daveT
Eh, Mongo isn't the only NoSQL database that does this. I'm not entirely sure how all architectures work, but many document stores are basically a single two column database table*. It would make sense that adding an index locks the entire database, because adding an index to a single table in a database locks that table.
There's an option to build the index in a way that doesn't lock the entire database. It's just not the default.
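
For reference, here's roughly what that looks like with the Java driver of that era. Just a sketch; the database, collection, and field names are made up, not anything from Rusty's system:

import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.IndexOptions;
import com.mongodb.client.model.Indexes;
import org.bson.Document;

public class BackgroundIndexBuild {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("localhost", 27017);
        MongoCollection<Document> coll =
                client.getDatabase("mydb").getCollection("events");

        // Default (foreground) build: fastest and most compact,
        // but it holds the lock for the duration of the build.
        // coll.createIndex(Indexes.ascending("deploymentId"));

        // Background build: slower and temporarily uses extra space,
        // but reads and writes keep flowing while it runs.
        coll.createIndex(Indexes.ascending("deploymentId"),
                new IndexOptions().background(true));

        client.close();
    }
}

Same thing in the shell is just db.events.createIndex({deploymentId: 1}, {background: true}).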
12-13-2016 , 01:36 PM
Quote:
Originally Posted by RustyBrooks
I think it's funny that the 60 second queries were "slow" because they were doing a full column scan. But in order to make an index, you have to... do a full column scan. And making the index took maybe 100ms.
I don't find this surprising. The reason raw queries are slow is that they can't be read until consistency is guaranteed, so all writes have to be applied before a read can be executed. This is why indexes lock the document, or in Mongo's case, the entire database.

There can be other reasons as well, but I don't want to talk about Mongo's architecture without authority, so I'll preface that I'm making assumptions here. The documents themselves may not be read in the raw; they are probably read from some intermediate write phase, from some caching, or other data structure in the system. This structure would be about 2 to 3 times larger than the document itself, but it is disorganized.

An index is an extra structure that is written one-to-one under a B-tree or other indexing strategy. Creating an index is a direct read of the documents, not an intermediate or async / wait read to another data table / structure.
12-13-2016 , 01:36 PM
Quote:
Originally Posted by RustyBrooks
I guess the thing I learned, which is something I learn over and over again, is that performance is discontinuous with respect to the various things that affect it. Apparently we went over a cliff in our performance function.

Our devops guys are not happy that the resolution was "I made an index" because they feel like there is still an underlying problem. Maybe there is.

I think it's funny that the 60 second queries were "slow" because they were doing a full column scan. But in order to make an index, you have to... do a full column scan. And making the index took maybe 100ms.

I am kind of wondering if it's not a subtle bug in the query optimizer or something, where it's doing something really dumb in this particular edge case.

Back on Oracle 8 or 9 or so, I remember that I routinely had to add "mydate > '1900-01-01'" to my queries, which would convince the query optimizer not to do something apparently stupid (all the mydate values were present day, i.e. the 1990s, so that predicate shouldn't have done anything useful, but often it did).
May actually be related to what I posted, in that the database was locked while creating the index.

It's been a while since I had to seriously worry about Mongo performance, but I think the "underlying issue" (or more aptly, the long-term solution) is to monitor the amount of memory Mongo is using and make sure it can at least contain your indexes, and ideally your data as well.
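
Something like this is the kind of check I mean (sketch only, "mydb" is a placeholder): dbStats reports dataSize and indexSize in bytes, and you want at least the indexes, and ideally most of the data, to fit in RAM.

import com.mongodb.MongoClient;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;

public class MongoMemoryCheck {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("localhost", 27017);
        MongoDatabase db = client.getDatabase("mydb");

        // dbStats returns sizes in bytes (default scale factor of 1).
        Document stats = db.runCommand(new Document("dbStats", 1));
        double dataGb  = ((Number) stats.get("dataSize")).doubleValue() / 1e9;
        double indexGb = ((Number) stats.get("indexSize")).doubleValue() / 1e9;

        System.out.printf("data: %.2f GB, indexes: %.2f GB%n", dataGb, indexGb);
        // Compare these against the RAM on the box (minus headroom for the OS
        // and everything else) and alert when the indexes no longer fit.

        client.close();
    }
}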
12-13-2016 , 01:42 PM
Quote:
Originally Posted by jjshabado
There's an option to build the index in a way that doesn't lock the entire database. It's just not the default.
Without good knowledge of Mongo, this sounds pretty sane. It sounds like it would prevent inconsistency and collisions? I'd have to research it more, but alas, my initial deep dive into NoSQL is Cassandra.
12-13-2016 , 01:45 PM
The only difference (as I understand it) is that it takes longer and it will use some extra space during creation.

Feels like if you're deciding which option to default to you should go with "slower, slightly inefficient space, safe" over "fast, efficient, will kill all applications using it".

Edit: Without looking into it, I suspect that the background option to do it safely was added later. And switching default behaviours for existing methods can be tricky.
12-13-2016 , 03:49 PM
Meeting subject just sent to me to accept or decline:

Quote:
BR Session: 10 VIPSC - Conv Ex: PMO/IT Resources (Source): Customer Orders PPV Requirements Session #5 - Troubleshooting and wrap up of BR sessions - 4 total = 6.55 hours
No context in the message body other than "Perry will be leading this". No idea who Perry is or what any of this means. Corporate life.

Oh yeah - it's a recurring meeting at 7am my time and 75 other people are on the invite. Pass.
12-13-2016 , 05:03 PM
my new year's resolution is to do 30 mins of programming problems a day

any good sites? I know of codingbat (did all the problems on there a long time ago), hackerrank, and codefights.
12-13-2016 , 05:13 PM
My buddy started codewars. It seems to be fairly popular.
12-13-2016 , 05:19 PM
I liked http://exercism.io/ the little bit I used it. A couple of hours after submitting, people were commenting on my code and providing suggestions.
12-13-2016 , 07:35 PM
Thanks, I'll check those out. My programming has gotten so rusty I feel this will be really good for me. C++ seems like a good language to focus on since it's my main one; should I try to branch out or just stick to one thing?

for simple problems it doesn't really matter I guess
12-13-2016 , 09:20 PM
Quote:
Originally Posted by Prickly Pear
you should have stopped right there

Candybar is right. They hire rich kids with finance degrees and train them on the job. A finance degree is basically an "I can solve a linear equation with 6th grade algebra" degree. That Wall Street also needs a small % of highly qualified quants is irrelevant to his point.
I worked in and around Wall St for 12 years and never met a developer with a finance degree. Most of them were STEM graduates of one flavour or another who were then given, usually excellent, training in the dark arts of investment bank products.

On an unrelated note, I also never met an accountant capable of performing the most basic arithmetic in their heads.
12-14-2016 , 07:35 PM
Ran into this issue at work today.

ArrayList entityList = getEntityList(entityid);
if(entityList != null){
//process

Doesn't process, but when I add in a logging statement,

ArrayList entityList = getEntityList(entityid);
log.info("entitylist: "+entityList;
if(entityList != null){
//process

It processes. The getEntityList(entityid) method returns the ArrayList and then does some cleanup in a finally block. Could that have any effect? Or could it be the way the ArrayList was instantiated? i.e., would it help to do something like:

ArrayList entityList;
entityList=getEntityList(entityid);

This is written in Java 5, I think, and it only happened in prod, so I couldn't really test in e2 at all. So the code is probably staying this way.
12-15-2016 , 04:46 AM
Quote:
Originally Posted by rtd353
Ran into this issue at work today.

ArrayList entityList = getEntityList(entityid);
if(entityList != null){
//process

Doesn't process, but when I add in a logging statement,

ArrayList entityList = getEntityList(entityid);
log.info("entitylist: "+entityList;
if(entityList != null){
//process

It processes. The getEntityList(entityid) method returns the ArrayList and then does some cleanup in a finally block. Could that have any effect? Or could it be the way the ArrayList was instantiated? i.e., would it help to do something like:

ArrayList entityList;
entityList=getEntityList(entityid);

This is written in Java 5, I think, and it only happened in prod, so I couldn't really test in e2 at all. So the code is probably staying this way.
The only way the log statement could matter is if it's sharing some resource with getEntityList. That seems unlikely, but I'd have to see the getEntityList code.

More likely it has nothing to do with logging, but you're seeing sporadic failure.
12-15-2016 , 06:51 AM
Even if they share resources somehow, the log statement still couldn't reassign the local entityList variable to null. Looks like pure coincidence to me.
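
The finally-block cleanup is still worth understanding, though, because it can matter in a different way: the finally runs after the return value is captured, so it can't null out the reference, but it can mutate the object the caller is about to receive. Toy example, with made-up names rather than the real getEntityList:

import java.util.ArrayList;

public class FinallyDemo {
    static ArrayList<String> getEntityList(int entityId) {
        ArrayList<String> result = new ArrayList<String>();
        result.add("entity-" + entityId);
        try {
            return result;      // the reference to be returned is fixed here
        } finally {
            result.clear();     // runs afterwards, mutating the same object
                                // the caller is about to receive
        }
    }

    public static void main(String[] args) {
        ArrayList<String> list = getEntityList(42);
        System.out.println(list != null);    // true - the null check passes
        System.out.println(list.isEmpty());  // true - but the data is gone
    }
}

So if the real cleanup happens to touch the collection it just returned, you could get a non-null but empty list that never gets processed, with the logging having nothing to do with it.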
12-15-2016 , 12:08 PM
Better add another log just to be safe.
12-15-2016 , 04:17 PM
Quote:
Originally Posted by maxtower
Better add another log just to be safe.
I actually did. With the first log I added, an entity that had one item in the ArrayList processed, but the rest, which had multiple items in the ArrayList, didn't. After I added another one, every entity I tested with processed.

So it seems to be related to the length of the DB query in getEntityList, which could explain why it only happened in prod.
12-15-2016 , 06:20 PM
I have been building a new box for some RL. Been a number of years since making my own machine but have enjoyed it so far.



Not complete yet as I am waiting on some of the new Samsung 960 Pros to ship, as well as some traditional data storage, and I am still undecided on what configuration to use for storage.

Also to be added is a Titan X Pascal to replace the placeholder $50 video card in there now, as well as another 4 sticks of 16GB RAM to bring the total to 128GB.

Had planned to keep the machine purely Linux, as I do not do much gaming, but I think I will have to add Windows 10 as well to mess around with a bit of gaming, just to satisfy my curiosity about what the full build can pump out.

Also, I will say that the Air 540 is a fantastic case for someone like myself without much build experience who hates cable management.
12-15-2016 , 08:26 PM
what do you do with that beast if you don't game?
12-15-2016 , 08:44 PM
I assumed RL meant Rocket League? (which runs pretty well on my 2 year old gaming laptop)

I used to be pretty into building my own machines but it's been a long time since I've even used a desktop, and it's hard to see myself going back to one. My last couple computers have been Sagers and I don't feel like I've had to sacrifice any power in the process. If you demand the absolute most in performance though then they'll always lag behind desktops.

Last edited by goofyballer; 12-15-2016 at 08:49 PM.
12-15-2016 , 09:41 PM
Quote:
Originally Posted by iversonian
what do you do with that beast if you don't game?
It might be powerful enough to run emacs.
12-15-2016 , 10:13 PM
128 gigs of ram!! Christ! My first thought is "why" but then, the first computer I paid for with my own money had 32 megs of ram and everyone was like WHY?? at the time (and believe me, I used all of it, regularly)

And yeah, I'm old. Our family computer when I was a kid had no hard drive, and then when we got one it was 10 megabytes, and I'm sure it was unholy expensive.