** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD **

07-29-2018 , 06:40 PM
I work at least partially on the real-time backend of one of the most heavily trafficked services in the world, and I don't spend all that much time thinking about performance. Nearly all performance regressions are due to design issues or mundane bugs rather than anything algorithmic in nature. You never launch a new feature to everyone at once, and if you have any kind of decent monitoring in place, you don't have to detect performance issues analytically by looking at the code - you can just look at the data. Things like big-O are important as a conceptual building block and as shorthand for communication between engineers, but you don't need most engineers to care about that kind of stuff most of the time.

I talked about this maybe 2 years ago, but there's this weird notion shared by relatively inexperienced engineers and non-technical people that scaling is this super-hard problem that requires some kind of uber geniuses, CS wizards and special-purpose-built systems. Come on guys, this isn't 2005. All the primitives required for achieving decent performance even at true web scale are commodities now, and all the knowledge required to put the pieces together is a Google search away and is starting to make its way into YouTube videos that can easily be understood by high school students. I mean, don't get me wrong, your system will be full of race conditions and won't be consistent or available, but this will be true of everything built quickly. Only extensive real-world usage with extensive logging can uncover some of these issues, and it's not worth dealing with problems you don't have yet. Even Friendster got to millions of users before hitting scaling issues, and this was like 15 years ago, when most engineers had no idea what to do with respect to scaling, there wasn't much information on how to architect such systems, a lot of things had to be invented because frameworks and tools weren't quite there, and hardware was much more expensive and slower.

This is related to the whole microservices fad, but there's definitely a strong correlation between obsessing about performance and scale at the wrong level at the wrong time and never having worked on anything at scale before. This is also true of bragging about scale - there's this trend where small startups working on small problems with trivial amounts of data write self-congratulatory blog posts about how they achieved scale. Meanwhile their recruiters talk about how they are processing as many or more events/data/whatever than FAANGs, which is rarely true even on a per-engineer basis and comically implausible in aggregate.
07-29-2018 , 06:58 PM
Quote:
Originally Posted by suzzer99
Can you give an example or two? Are we talking short string traversal or potentially long array/hash manipulation, etc?
Everything. Inefficient algorithms, inefficient db schemas, unreadable code.

I spent 80% of last month refactoring database operations/schema. We were inserting 50k+ records 1 at a time using create! (Rails function). I spent weeks transforming/refactoring this into 1 insert, speeding it up from ~3 minutes to 15 seconds. I did this in like 4-5 different places.
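
Roughly what "transforming this into 1 insert" means, as a plain-Ruby sketch. This is not the actual code from that project: the `players` table, its columns, and the helper names `sql_literal` and `bulk_insert_sql` are made up for illustration, and a real Rails app would more likely go through the ORM or a bulk-insert library with proper parameter binding than hand-quote values like this.

```ruby
# Sketch only: collapse many single-row INSERTs into one multi-row statement.
# Table and column names are hypothetical.

# Render a value as a SQL literal (handles nil, integers and strings only).
def sql_literal(value)
  case value
  when nil     then 'NULL'
  when Integer then value.to_s
  else "'#{value.to_s.gsub("'", "''")}'"
  end
end

# Build one INSERT covering every row instead of one statement per row.
def bulk_insert_sql(table, columns, rows)
  raise ArgumentError, 'no rows to insert' if rows.empty?

  tuples = rows.map do |row|
    "(#{columns.map { |col| sql_literal(row.fetch(col)) }.join(', ')})"
  end
  "INSERT INTO #{table} (#{columns.join(', ')}) VALUES #{tuples.join(', ')}"
end

rows = [
  { name: "O'Brien", score: 10 },
  { name: 'Smith',   score: nil }
]
puts bulk_insert_sql('players', %i[name score], rows)
```

The point is that the database sees one statement and one round trip for the whole batch rather than one per record. For 50k+ rows you would probably still want to send it in chunks.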

During that time, I also encountered a situation where table 1 had a postgres array field referencing the ids of table 2. We were trying to query records from table 1 based on the ids of table 2. Despite the array field using a GIN index, this was awfully slow (and not conceptual/readable). I spent a week refactoring this by simply introducing a bridge table. This sped up query time from ~15 seconds to ~2, which is huge considering it was part of a view used by a UI.

I know you might be looking for inefficient algorithms, but here's an example of code I rewrote just last week. The code works, but it's just awful. I renamed some stuff before posting (e.g. MyModule, Operation1) and added comments. Without any context, you probably won't understand how bad and hacky this really is:

Code:
def status(session_id = nil)
  sq = session_id ?
         MyModule::MyClass.where(id: session_id) :
         MyModule::MyClass.where('created_at > ?', Time.zone.today)
  sessions = sq.to_a.map do |sess|
    evh = MyLibrary::Event.last('MyModule::MyClass', sess.id)
    next [sess.id, nil, nil, 0] unless evh 
    err = sess.last_event_error
    op = evh['operation']
    # gets ugly here
    use_op, color, tt = err ?
        [op + ' FAILED', 'RED-row', err] :
        [(op != 'Operation1' || evh['end_dt'].blank? ? op :
            sess.busy? ? 'Operation2' :
              MyModule::Scenario.where(session_id: sess.id).empty? ?
            'Operation3' : 'Complete'),
         '', nil]
    count = ['Operation3', 'Operation4'].include?(op) && MyModule::MyClass2.
                                where(session_id: sess.id).count
    [sess.id, use_op, count, color, tt] 
  end 
  sessions
end
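
Not the actual rewrite (that wasn't posted), just one way the nested ternary above could be untangled. It assumes the per-session lookups (`MyLibrary::Event.last`, `busy?`, the `Scenario` query) have already been done and collected into a plain struct. `SessionInfo`, `missing?`, `display_operation` and `row_style` are hypothetical names.

```ruby
# Hypothetical snapshot of the values the original method looks up per session.
SessionInfo = Struct.new(:op, :end_dt, :error, :busy, :has_scenarios,
                         keyword_init: true)

# Plain-Ruby stand-in for Rails' blank? check.
def missing?(value)
  value.nil? || value.to_s.strip.empty?
end

# Same decision tree as the nested ternary, one guard clause per case.
def display_operation(info)
  return "#{info.op} FAILED" if info.error
  return info.op if info.op != 'Operation1' || missing?(info.end_dt)
  return 'Operation2' if info.busy

  info.has_scenarios ? 'Complete' : 'Operation3'
end

# Row colour and tooltip depend only on whether there was an error.
def row_style(info)
  info.error ? ['RED-row', info.error] : ['', nil]
end

finished = SessionInfo.new(op: 'Operation1', end_dt: '2018-07-29',
                           error: nil, busy: false, has_scenarios: true)
puts display_operation(finished)
```

Each branch now reads top to bottom, and the styling decision is separated from the label decision, which makes the individual cases easy to test.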
07-29-2018 , 07:06 PM
Quote:
Originally Posted by jmakin
He doesn't have examples, he's just spouting crap he heard in his classes
Bahahaha, speaking of bad devs! You prove my point. Only a bad dev wouldn't recognize the importance of readable, efficient, and logically coherent code and dismiss it as purely academic. You're the exact type of person I would want to keep away.

I love how you passionately hate Java (and some other stuff), when you don't even know how to write code or ever know what you're talking about. Please post more:

Quote:
Originally Posted by jmakin

got a code golf problem for you guys

term frequency counter, where we want the top 25 terms in a text file, with a list of stop words to not include...

Code:
import java.io.*;
import java.util.*;
public class FrequencyCounter61 {
    public class Word implements Comparable < Word > {
        public String wo;public int fr;public Word() {}
        public Word(String s, int f) {
            wo = s;
            fr = f;
        }
        public int compareTo(Word w) {
            if (this.fr > w.fr) return -1;
            else if (this.fr == w.fr) return 0;
            else return 1;
        }
    }
    public static ArrayList < String > pi(Scanner i, ArrayList < String > s) {
        ArrayList < String > o = new ArrayList < String > ();
        i.useDelimiter("[^A-Za-z]+");
        while (i.hasNext()) {
            String w = i.next().toLowerCase();
            if (!s.contains(w) && w.length() > 1) o.add(w);
        }
        return o;
    }
    public static void main(String[] args) throws FileNotFoundException {
        ArrayList < String > p = pi(new Scanner(new File(args[0])),
            pi(new Scanner(new File("stop_words.txt")), new ArrayList < String > ()));
        FrequencyCounter61 fq = new FrequencyCounter61();
        ArrayList < Word > t = new ArrayList < Word > ();
        int count = 0;
        String s;
        while (p.size() > 0) {
            s = p.get(0);
            while (p.remove(s)) count++;
            t.add(fq.new Word(s, count));
            count = 0;
        }
        Collections.sort(t);
        for (int i = 0; i < 25; i++) {
            System.out.println(t.get(i).wo + " - " + t.get(i).fr);
        }
    }
}
07-29-2018 , 07:41 PM
Quote:
Originally Posted by :::grimReaper:::
I spent 80% of last month refactoring database operations/schema. We were inserting 50k+ records 1 at a time using create! (Rails function). I spent weeks transforming/refactoring this into 1 insert, speeding it up from ~3 minutes to 15 seconds. I did this in like 4-5 different places.
Just an FYI, but this isn't "refactoring" - it's way more/better than that. "Refactoring" typically means changes that improve the structure without improving functionality/performance/etc, or at least not motivated by such. Same with the example below. If you drastically improve the end user experience, refactoring is a massive understatement.

Quote:
During that time, I also encountered a situation where table 1 had a postgres array field referencing the ids of table 2. We were trying to query records from table 1 based on the ids of table 2. Despite the array field using a GIN index, this was awfully slow (and not conceptual/readable). I spent a week refactoring this by simply introducing a bridge table. This sped up query time from ~15 seconds to ~2, which is huge considering it was part of a view used by a UI.
This is all good stuff but I think this supports suzzer's point rather than your own, in that whether someone could fix/spot the problems isn't particularly related to one's understanding of runtime complexity. These are improvements in constants, not in complexity, and far more about the willingness to dig in and solve problems than about being able to code algorithms or state their runtime complexity.

Quote:
I know you might be looking for inefficient algorithms, but here's an example of code I rewrote just last week. The code works, but it's just awful. I renamed some stuff before posting (e.g. MyModule, Operation1) and added comments. Without any context, you probably won't understand how bad and hacky this really is:

Code:
def status(session_id = nil)
  sq = session_id ?
         MyModule::MyClass.where(id: session_id) :
         MyModule::MyClass.where('created_at > ?', Time.zone.today)
  sessions = sq.to_a.map do |sess|
    evh = MyLibrary::Event.last('MyModule::MyClass', sess.id)
    next [sess.id, nil, nil, 0] unless evh 
    err = sess.last_event_error
    op = evh['operation']
    # gets ugly here
    use_op, color, tt = err ?
        [op + ' FAILED', 'RED-row', err] :
        [(op != 'Operation1' || evh['end_dt'].blank? ? op :
            sess.busy? ? 'Operation2' :
              MyModule::Scenario.where(session_id: sess.id).empty? ?
            'Operation3' : 'Complete'),
         '', nil]
    count = ['Operation3', 'Operation4'].include?(op) && MyModule::MyClass2.
                                where(session_id: sess.id).count
    [sess.id, use_op, count, color, tt] 
  end 
  sessions
end
Like you said, this isn't really understandable without more context, though it looks like potentially expensive operations are being repeated inside of a loop for no reason (and the null id case seems qualitatively different enough that the logic may easily be incorrect). Again, this isn't really puzzly, and who writes such code and who cleans it up is almost entirely a matter of conscientiousness and motivation.

With that said, I think it's easier if you can assume that everyone knows everything in X for some reasonable set X that includes basic CS concepts. But it's certainly not because there's some set X you absolutely have to know to be productive or that there's an ideal set X that works better than others. It's more that at an organizational level, you have to implicitly decide what that X is, and the smaller that X is, the more friction you generate. A trivial example would be human language - you probably have to make sure that there's one common language that everyone speaks. This could be English, Russian or Mandarin or whatever depending on where you're starting the company, but it's going to be quite difficult if you have some engineers who only speak English and others who only speak Russian, whether for meetings, documentation or just impromptu chat for dealing with mundane operational stuff.
07-29-2018 , 07:50 PM
Quote:
Originally Posted by Larry Legend
Wrt the frontend performance convo, at my last company the toxic guy on my team would always suggest changes that made code significantly less readable because it "improved performance" and it was clear he had no idea wtf he was talking about.

We would have a function that loops through an array to spit out Components and a line at the top like "const isActive = list[index] === this.state.activeSelection" and add that as a prop on the Component like "active=isActive" and he would make us change it to "active=list[index] === this.state.activeSelection".

One time I was feeling tilted and did the math to determine exactly how much memory he was saving on each render with his proposed changes and it was like 212 bytes to remove clarity from the entire file.
Bolded is very often true imo fwiw but I concede there is no objective standard for code readability.

Is a bubble sort implementation easier to read than a heap sort implementation?

Last edited by adios; 07-29-2018 at 08:05 PM.
07-29-2018 , 08:09 PM
Quote:
Originally Posted by Larry Legend
Wrt the frontend performance convo, at my last company the toxic guy on my team would always suggest changes that made code significantly less readable because it "improved performance" and it was clear he had no idea wtf he was talking about.

We would have a function that loops through an array to spit out Components and a line at the top like "const isActive = list[index] === this.state.activeSelection" and add that as a prop on the Component like "active=isActive" and he would make us change it to "active=list[index] === this.state.activeSelection".

One time I was feeling tilted and did the math to determine exactly how much memory he was saving on each render with his proposed changes and it was like 212 bytes to remove clarity from the entire file.
To be honest, I think this is fairly neutral with respect to readability, and more experienced people would prefer the latter anyhow, but I don't think there's any performance implication here assuming even a decent compiler, which most browsers have. I'm not sure where you're coming up with 212 bytes but I don't think it's correct. The intermediate value needs to be stored either way, and the compiler should be able to inline it or at least avoid capturing it in a closure if it's not referenced elsewhere. Also, stack variables don't stay allocated forever, so even if this incurs an additional allocation, you don't keep incurring the same penalty in a loop. Since most people use Babel, which I don't think does anything fancy, there may be like a difference of ~2 bytes in terms of JS payload.
07-29-2018 , 08:20 PM
If you guys had to choose between really efficient but mostly unreadable code or generally inefficient (but not massively inefficient) but very readable code which would you choose?
07-29-2018 , 08:30 PM
Quote:
Originally Posted by Craggoo
If you guys had to choose between really efficient but mostly unreadable code or generally inefficient (but not massively inefficient) but very readable code which would you choose?
The latter until it's shown optimization is needed. This is true even in most embedded code.
07-29-2018 , 08:36 PM
Quote:
Originally Posted by candybar
Just an FYI, but this isn't "refactoring" - it's way more/better than that. "Refactoring" typically means changes that improve the structure without improving functionality/performance/etc, or at least not motivated by such. Same with the example below. If you drastically improve the end user experience, refactoring is a massive understatement.
Thanks, I'll keep that in mind during my performance review


Quote:
Originally Posted by candybar
This is all good stuff but I think this supports suzzer's point rather than your own, in that whether someone could fix/spot the problems isn't particularly related to one's understanding of runtime complexity. These are improvements in constants, not in complexity, and far more about the willingness to dig in and solve problems than about being able to code algorithms or state their runtime complexity.
I gave database examples because that's mostly what I worked on in the past month. In reality, I am, and work with, full stack devs. I could've easily given poor-performing model code, and I'm pretty sure I've had to rewrite model code at some point in my career. And the 50k record example isn't just a constant time improvement. Empirically, I'm pretty sure I could double the size to 100k inserts and the old way would be ~2x slower, and the new way would be maybe a few seconds (20%?) slower.

So for a full stack dev, time complexity and database questions are fair game. I don't know much about pure front-end development, so I wasn't asking a rhetorical question when I asked whether or not a front-end dev ever writes nontrivial algorithms. If not, then sure, time complexity may not be the best go-to interview question.


Quote:
Originally Posted by candybar
Like you said, this isn't really understandable without more context, though it looks like potentially expensive operations are being repeated inside of a loop for no reason (and the null id case seems qualitatively different enough that the logic may easily be incorrect). Again, this isn't really puzzly, and who writes such code and who cleans it up is almost entirely a matter of conscientiousness and motivation.
The issue here isn't whether or not it's expensive, it's how unreadable the code/flow is for a function that isn't doing anything too complicated. It's an example of "good code isn't defined by whether or not it works".
07-29-2018 , 08:38 PM
Quote:
Originally Posted by Craggoo
If you guys had to choose between really efficient but mostly unreadable code or generally inefficient (but not massively inefficient) but very readable code which would you choose?
Generally the latter - the runtime of any very inefficient system is usually dominated by a few hot spots or operations, so you just have to work on a small fraction of the code base. Mostly unreadable code likely requires refactoring the whole code base. This is unless ongoing maintenance is not necessary - if you don't have to change/improve the system and you're in a performance-sensitive domain, I suppose a currently more efficient system is more valuable. An extreme example would be if you had a super-efficient database system with very few well-defined primitives, but no source code. It may have some value in some specific domains where you can live with the feature limitations. Source code for a less efficient implementation may be entirely worthless if it's no better than open source alternatives.
07-29-2018 , 08:38 PM
Quote:
Originally Posted by Craggoo
If you guys had to choose between really efficient but mostly unreadable code or generally inefficient (but not massively inefficient) but very readable code which would you choose?
Depends if it's the bottleneck or not, and how critical performance is.
07-29-2018 , 08:44 PM
Quote:
Originally Posted by suzzer99
I made a silly blog post, mostly from my musings and rantings in this thread: https://medium.com/@suzzer99/how-not...h-1e99d96358c6
That was a great post. FWIW stating there are 2 types is an extreme oversimplification. Also, there are a lot of software disasters happening, so at some point companies will have to change how they go about hiring devs, again fwiw. Companies that make a lot of money and have a worldwide presence rectify disasters by brute force. They have seemingly unlimited resources for bringing in developers to address issues with troubled projects and can work on projects 24 hours a day, day after day, because they have developers in NA, Europe, and Asia. Point being that they have their share of disasters but can bring lots of resources to bear because they're wealthy, basically. Their hiring practices certainly emphasize the CS concepts we're discussing here, but that doesn't make them immune from problem software.
07-29-2018 , 08:50 PM
Quote:
Originally Posted by :::grimReaper:::
Thanks, I'll keep that in mind during my performance review
Sure, you should definitely put things in the best possible light and talk about the ultimate impact you're having. "Refactoring" is one of those things that engineers love but may not resonate the same way a couple of levels up.

Quote:
The issue here isn't whether or not it's expensive, it's how unreadable the code/flow is for a function that isn't doing anything too complicated. It's an example of "good code isn't defined by whether or not it works".
Gotcha, I see what you're saying but I think this is a bit of a tangent from the original point. I think you're mixing together too many things that aren't strongly related. Specifically, I don't think suzzer's lack of familiarity with algorithmic questions and runtime complexity concepts says much about his ability to structure code and not write nonsense like what you posted. Also, you should probably get a new job if you feel that your coworkers are not up to par. PM me if you're interested in a referral!
07-29-2018 , 11:03 PM
Quote:
Originally Posted by adios
Bolded is very often true imo fwiw but I concede there is no objective standard for code readability.

Is a bubble sort implementation easier to read than a heap sort implementation?
You work on embedded systems right? I always assumed performance trumps everything in that world.
07-29-2018 , 11:05 PM
Quote:
Originally Posted by Craggoo
If you guys had to choose between really efficient but mostly unreadable code or generally inefficient (but not massively inefficient) but very readable code which would you choose?
I'd find a way to make the efficient code more readable.
07-29-2018 , 11:14 PM
Quote:
Originally Posted by Larry Legend
Wrt the frontend performance convo, at my last company the toxic guy on my team would always suggest changes that made code significantly less readable because it "improved performance" and it was clear he had no idea wtf he was talking about.

We would have a function that loops through an array to spit out Components and a line at the top like "const isActive = list[index] === this.state.activeSelection" and add that as a prop on the Component like "active=isActive" and he would make us change it to "active=list[index] === this.state.activeSelection".

One time I was feeling tilted and did the math to determine exactly how much memory he was saving on each render with his proposed changes and it was like 212 bytes to remove clarity from the entire file.
I had a very similar situation with some toxic devs at one of my earlier jobs.

Whenever we challenged how they were doing something - they would throw up PERFORMANCE as some kind of insta-discussion killer. Then we'd be like - wait a minute - this app is internal and may have 200 users total, not concurrent, but total users. This thing we're talking about happens when a user submits a form, not a batch process or anything. Do we really care about shaving off a few milliseconds?

Those arguments would go on for days and there was no higher authority to appeal to. I felt like I was taking crazy pills.

Oh yeah lol - we were using EJBs because EJBs were cool - even though there was no clear business value, they added an order of magnitude more cruft to the code, and they sucked for performance. But EJBs!
07-29-2018 , 11:15 PM
Quote:
Originally Posted by suzzer99
I'd find a way to make the efficient code more readable.
Efficient code is by definition less readable. Inefficient code usually reads more like a story where you have a beginning, middle, and end. Efficient code reads more like a bunch of paragraphs that are in random order.
07-29-2018 , 11:19 PM
Captain Kirk found a way in the Kobayashi Maru. I'd find a way.
07-29-2018 , 11:27 PM


LOOOOOL at the mascot - from this article: https://medium.freecodecamp.org/why-...d-34b088e5a5db
07-30-2018 , 01:30 AM
Quote:
Originally Posted by Craggoo
Efficient code is by definition less readable. Inefficient code usually reads more like a story where you have a beginning, middle, and end. Efficient code reads more like a bunch of paragraphs that are in random order.
Ha, no. Plenty of counterexamples in this thread.
07-30-2018 , 02:07 AM
Quote:
Originally Posted by :::grimReaper:::
Ha, no. Plenty of counterexamples in this thread.
Perhaps you would like to link some if they are as abundant as you claim
07-30-2018 , 03:14 AM
Quote:
Originally Posted by Craggoo
Perhaps you would like to link some if they are as abundant as you claim
Just look at the jmakin code I quoted a few posts above. It's neither readable nor efficient.
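
For context on the efficiency half of that: the quoted Java counts each term by calling `p.remove(s)` on an `ArrayList` until it returns false, and checks stop words with `ArrayList.contains`. Both are linear scans, so the work grows roughly quadratically with the number of words. Here is a hash-based take on the same task, sketched in Ruby rather than Java. It's not a line-by-line translation, and `top_terms` and its parameters are names picked for illustration.

```ruby
require 'set'

# Count terms in a single pass with a hash, skipping stop words and
# one-letter tokens, then return the `limit` most frequent terms.
def top_terms(text, stop_words, limit = 25)
  stops = stop_words.to_set            # constant-time membership checks
  counts = Hash.new(0)

  text.downcase.scan(/[a-z]+/) do |word|
    next if word.length < 2 || stops.include?(word)
    counts[word] += 1
  end

  # Highest count first; ties broken alphabetically so output is stable.
  counts.sort_by { |word, count| [-count, word] }.first(limit)
end

top_terms('The cat and the hat and the bat', %w[and], 2).each do |word, count|
  puts "#{word} - #{count}"
end
```

Shorter and faster at the same time, which is the counterexample to "efficient code is by definition less readable".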
07-30-2018 , 06:52 AM
Quote:
Originally Posted by :::grimReaper:::
Everything. Inefficient algorithms, inefficient db schemas, unreadable code.

I spent 80% of last month refactoring database operations/schema. We were inserting 50k+ records 1 at a time using create! (Rails function). I spent weeks transforming/refactoring this into 1 insert, speeding it up from ~3 minutes to 15 seconds. I did this in like 4-5 different places.

During that time, I also encountered a situation where table 1 had a postgres array field referencing the ids of table 2. We were trying to query records from table 1 based on the ids of table 2. Despite the array field using a GIN index, this was awfully slow (and not conceptual/readable). I spent a week refactoring this by simply introducing a bridge table. This sped up query time from ~15 seconds to ~2, which is huge considering it was part of a view used by a UI.

I know you might be looking for inefficient algorithms, but here's an example of code I rewrote just last week. The code works, but it's just awful. I renamed some stuff before posting (e.g. MyModule, Operation1) and added comments. Without any context, you probably won't understand how bad and hacky this really is:
grim, I have never had to write code that reads/writes to a db or create a schema. but I think you are correct that efficiency is important there. we have some huge dbs, but then we have specific ppl that work on those dbs so it's pretty rare that any of our devs need to mess with that.

as for algorithms, I mostly deal in javascript and it is rare that I need to roll my own algo. I either use someone else's helper method or combine javascript array operators like map, filter, or reduce.

one codebase I work on imports lodash so I use that there sometimes. that is an optimization I suppose since I am not importing lodash everywhere, just in the place it already exists.

but neither of those are optimizing the algo. I have no idea how they work under the hood and no idea if it is fast/efficient. I did see an article that a loop is faster than .forEach but who knows if the guy had a clue. to me, lodash or .reduce is readable though. readability is what I go for and what it seems my team prefers.

still, I have my doubts that on a pretty basic angular app the way the front end combines and filters 2 lists of objects from a json is gonna be a deal breaker and require optimization. if it does, then we are in real trouble bc myself and the rest of the devs never care about that. but I suppose if it was causing issues, we could investigate and then implement better efficiency. it's just that the api calls, and ahead of time build, and caching, and tree shaking, and which server we route to are probably a ton more important than how we combine or sort two dinky lists in typescript.
07-30-2018 , 09:30 AM
Quote:
Originally Posted by :::grimReaper:::
Just look at the jmakin code I quoted a few posts above. It's neither readable nor efficient.
I can't speak to the efficiency of it since I don't know Java but if that is your definition of unreadable code I think you need to redefine unreadable.