** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD **

01-17-2019 , 10:48 PM
Yeah, one of the first things I tried. Memory usage, CPU utilization, even the thread stacks they are using look pretty much the same.
01-17-2019 , 10:50 PM
Quote:
Originally Posted by jmakin
Yeah, one of the first things I tried. Memory usage, CPU utilization, even the thread stacks they are using look pretty much the same.


Where is the time being spent in the code though?
01-18-2019 , 12:16 AM
Can you break this into smaller pieces and test each in isolation - or are you already at the smallest piece?
01-18-2019 , 12:38 AM
How does it perform at a 10M and 100M object insert?

Given the issues you've described, I would think the worst case is your homegrown DB having some inherent performance issue at scale, so I would check whether it can handle an increase in scale in the performant environments.

If it stops at 300k/1M records at 300k records/s down to 10 records/s I would be very interested to see under what conditions I can **** with the "10 records/s" process. By figuring out how to move it from "10 records/s" to 8, you may figure out the issue.
01-18-2019 , 12:45 AM
It feels hard to debug this by just trying different things. If you profile the code you should see where time is being spent - which should narrow the problem down.

It's been a while, but Java has a bunch of free tools that work well for this kind of thing.
01-18-2019 , 12:58 AM
Yeah heatmap might be a good idea.

Also it could literally be an issue with your DB or the driver or something. We found a lot of weird performance bugs that turned out to be some kind of throttle or concurrent connection limiter between two boundaries like that - which was either baked into some 3rd party package or some kind of security feature in our cloud hosting provider. Our cloud hosting provider denied the latter for a couple days until they finally found it.

There was one we never found that we only fixed by putting our node and play servers on the same physical machine (and then dedicating that node cluster to only talk to that play server). Definitely some kind of rate-limiter at the network level or something, but rackspace couldn't figure it out, so we just did what we had to do to fix it.

Last edited by suzzer99; 01-18-2019 at 01:03 AM.
01-18-2019 , 02:18 PM
Quote:
Originally Posted by Larry Legend
How does it perform at a 10M and 100M object insert?

Given the issues you've described, I would think the worst case is your homegrown DB having some inherent performance issue at scale, so I would check whether it can handle an increase in scale in the performant environments.

If it stops at 300k/1M records at 300k records/s down to 10 records/s I would be very interested to see under what conditions I can **** with the "10 records/s" process. By figuring out how to move it from "10 records/s" to 8, you may figure out the issue.


It is not the DB. Compiling the exact same code on its own main method runs fine. It is something goofy with the maven build - it has to be. There’s no other explanation.

I am on the smallest piece. The slowdown happens after N insertions. Running the JVM pstack command (I forget what it's called) leads me to believe most of the time is being spent in GC, which is why I figured it was the heap approaching the limit set by the JVM. If the JVM senses it's running out of heap space, it aggressively tries to GC. My program is supposed to use a lot of memory; this is expected.
01-18-2019 , 03:33 PM
Quote:
Originally Posted by toots babos
Hi to all in the thread....

I am going for java associate exam(1z0-808) towards the end of this month and I'm wondering if anyone has any hints/tips in regards to how to approach code questions and how they analyse the questions.

Any insight at all from others could be a huge help

Thanks
Quote:
Originally Posted by toots babos
This is kind of a hard question to answer without knowing more. Do you have sample questions? Are there particular things you've struggled with when practicing?
01-18-2019 , 03:34 PM
I just did 200GB worth of insertions and no slowdown when compiled in its own main. Blaming the JVM and calling it a day, lol. What a waste of a few days.
01-18-2019 , 06:34 PM
Weird implementation of ecmascript.

var myObj = { }
myObj.items = [ ]
myObj.items.push( { myThing : _createThing } )

myObj.things = [ ]


myObj.things.push.apply(myObj.things, _createThing)
myObj.things.push.apply(myObj.things, myObj.items.myThing)

The first call throws, saying the second arg must be an array. The second call is accepted. In both cases the value looks like an array after some serialization back and forth. I can't see the underlying implementation of the language.

Am I correct in that this seems dumb?
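
For what it's worth, here is a minimal sketch of what is probably going on. It assumes a standard ES5+ engine (the engine in question may behave differently) and it assumes `_createThing` is a plain string, which the post never shows. `myObj.items` is an array, so `myObj.items.myThing` is `undefined`, and `apply` treats an `undefined` argument list as "no arguments". That would explain why the second call is accepted even though it pushes nothing.

```javascript
// Hypothetical stand-in: the thread never shows what _createThing is.
var _createThing = "thing-1";

var myObj = { items: [], things: [] };
myObj.items.push({ myThing: _createThing });

// myObj.items is an array, so it has no myThing property of its own;
// the property lives on the element at index 0.
var viaArray = myObj.items.myThing;       // undefined
var viaElement = myObj.items[0].myThing;  // the stored value

// apply treats an undefined (or null) argument list as "no arguments",
// so this call is accepted but pushes nothing.
myObj.things.push.apply(myObj.things, viaArray);
var lengthAfterUndefined = myObj.things.length;

// A primitive that is not undefined/null is rejected with a TypeError.
var threw = false;
try {
  myObj.things.push.apply(myObj.things, _createThing);
} catch (e) {
  threw = e instanceof TypeError;
}

// Either of these pushes the actual value.
myObj.things.push(viaElement);
myObj.things.push.apply(myObj.things, [viaElement]);
```

So under these assumptions the behavior is consistent rather than dumb: the call that "works" is silently doing nothing, and wrapping the value in an array (or calling `push` directly) is what actually adds it.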
01-18-2019 , 07:35 PM
I know that the command to open the repo browser in SourceTree is either Cmd-o or Cmd-0. I've thought about it and talked myself into Cmd-o about 25 times in a row. It's Cmd-0. I would be infinitely better off just flipping a coin.
01-18-2019 , 09:23 PM
Does anyone actually use a jsconfig.json file?
01-19-2019 , 12:29 AM
If I want to write a script to scrape a large thread and save either the entire thread one page at a time or, if it makes more sense, just the posts with a timestamp, author, and content, what would be the least painful way to go about it?
01-19-2019 , 12:47 AM
Python + Beautiful Soup + patience
01-19-2019 , 12:49 AM
I'm thinking it's a weekend project, am I delusional?

Java/JSoup is what I was thinking because then I don't need to brush up on Python.
01-19-2019 , 12:59 AM
Once you get this going please archive politics before it gets nuked.
01-19-2019 , 01:10 AM
I have 2+2 scraping code that I created a while back, in Scala using scala-scraper, which I think consumes Jsoup.

It's in a really crufty state because I just tacked it onto some other code for a different project, but I could clean it up and put it on github purely as a reference.

I think it's a roughly reasonable weekend project, although it would take a long time to archive politics if you're not hammering the server :P
01-19-2019 , 01:16 AM
I don't want to say if it's for politics or not, but that is a use case. I don't know if that violates some 2p2 TOS and I don't want to get banned. I was just going to open source it somewhere once I was satisfied it wouldn't embarrass me, as long as it doesn't get me in hot water.

The main reason it's been on my todo list was because of the OOT drama, ni.ma/bayareabeast BS looking very likely to possibly end up with me in a courthouse someday and I wanted an archive in case it got nuked. Google cache is not very reliable for these things, pages go missing very easily.

I also want to archive my blog, it's kind of an autobiography at this point and it'd really ****ing be a huge blow to me if the site disappeared or I got OTBC'd or something.
01-19-2019 , 01:20 AM
Oh yea feel free to pm me or post the repo btw thank you
01-19-2019 , 01:36 AM
Repo here: https://github.com/thewellnamed/2p2-scraper

You probably want to look mostly at the actual scraping code under https://github.com/thewellnamed/2p2-...ces/twoplustwo

You also probably have to install some old-ass version of scala and SBT to build this, but it compiled on my laptop. I stripped out a bunch of old stuff. No idea if it runs in the current state, I don't have the postgres db running anymore. :P
01-19-2019 , 01:38 AM
It's also possible they've changed some page template stuff since I last ran this. There's a lot of pretty specific DOM selectors. Who knows?
01-19-2019 , 02:44 AM
Quote:
Originally Posted by jmakin
I'm thinking it's a weekend project, am I delusional?

Java/JSoup is what I was thinking because then I don't need to brush up on Python.
Fair. I did this on a forum with about 600 pages in like 2 hours in python as a full time Java dev though. Python is dope
01-19-2019 , 02:44 AM
Quote:
Originally Posted by well named
Repo here: https://github.com/thewellnamed/2p2-scraper

You probably want to look mostly at the actual scraping code under https://github.com/thewellnamed/2p2-...ces/twoplustwo

You also probably have to install some old-ass version of scala and SBT to build this, but it compiled on my laptop. I stripped out a bunch of old stuff. No idea if it runs in the current state, I don't have the postgres db running anymore. :P
oh snap, someone else who has used Play!
01-19-2019 , 01:46 PM
Maybe I'll try Python. Setting up Java projects is a real pain. I don't know how anyone could do this without an IDE helping, and even that isn't that easy. I haven't done this in a while.

It's awesome they have vim plugins for eclipse though
01-19-2019 , 02:29 PM
You don't really need anything as complex as Java, Python, or Beautiful Soup to archive two threads.

The URLs are formed in a predictable manner, so grab the last page, e.g.

https://forumserver.twoplustwo.com/2.../index176.html

change 176 all the way down to 1, give urls to wget - there we go, job done.

Note: I opened the thread in an incognito window to get the default pagination, so there is no need for cookie handling.
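
A rough sketch of the URL-list step, written in JavaScript (Node) only because that is the language already used in this thread. `baseUrl` is a made-up placeholder (the real thread URL is elided above) and `buildPageUrls` is a hypothetical helper; the `index<N>.html` pattern and the page count of 176 are taken from the post.

```javascript
// Hypothetical placeholder: substitute the real thread URL prefix.
var baseUrl = "https://forum.example/some-thread/";

// Build one URL per page, from lastPage down to 1, following the
// index<N>.html pattern described in the post.
function buildPageUrls(base, lastPage) {
  var urls = [];
  for (var page = lastPage; page >= 1; page--) {
    urls.push(base + "index" + page + ".html");
  }
  return urls;
}

var urls = buildPageUrls(baseUrl, 176);

// Print one URL per line, suitable for redirecting into a file.
console.log(urls.join("\n"));
```

Redirect the output to a file (say `urls.txt`) and hand it to wget with `wget --wait=2 -i urls.txt`. The `--wait` option spaces out the requests so the server isn't hammered.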