Shoe, I hope someone with more knowledge answers your questions here, but this is my answer:
Quote:
Originally Posted by Shoe Lace
Dave, in your SQL examples isn't it up to the database platform to turn your readable but less efficient code into the more efficient path?
Yes, I think it should, but it is hard to fault the implementation in all cases. Some (most) database schemas are far from normalized, so it wouldn't be possible to optimize for all queries on all data sets.
I think that PostgreSQL does a pretty decent job of optimizing queries overall, though. I have quite a few not-so-good queries that run fine. A classic example is using the Cartesian-product (comma) syntax where an explicit inner join would clearly be the better way to write it.
As far as I've been able to measure, these two queries run in about the same time:
Code:
select fn.uid, fn.first_name, ln.last_name
from fnames fn, lnames ln
where fn.uid = ln.uid;
Code:
select fn.uid, fn.first_name, ln.last_name
from fnames fn
inner join lnames ln
on fn.uid = ln.uid;
The first one is written (visually at least) as a Cartesian product, which is basically like running two nested for loops, but the planner executes it as an inner join, which is closer to running one loop.
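You can verify this for yourself with EXPLAIN in PostgreSQL. As a self-contained sketch (using SQLite through Python's sqlite3 module rather than PostgreSQL, with a made-up fnames/lnames schema), both spellings come out with the identical plan:
Code:
```python
import sqlite3

# Hypothetical schema mirroring the fnames/lnames example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table fnames (uid integer primary key, first_name text);
    create table lnames (uid integer primary key, last_name text);
""")

comma_join = """
    select fn.uid, fn.first_name, ln.last_name
    from fnames fn, lnames ln
    where fn.uid = ln.uid
"""
inner_join = """
    select fn.uid, fn.first_name, ln.last_name
    from fnames fn
    inner join lnames ln on fn.uid = ln.uid
"""

# EXPLAIN QUERY PLAN is SQLite's rough analogue of PostgreSQL's EXPLAIN.
plan_comma = conn.execute("explain query plan " + comma_join).fetchall()
plan_inner = conn.execute("explain query plan " + inner_join).fetchall()

# The optimizer normalizes both forms into the same join plan.
print(plan_comma == plan_inner)
```
SQLite documents the comma and INNER JOIN syntaxes as equivalent, which is why the plans match row for row; in PostgreSQL the two EXPLAIN outputs should likewise be identical.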
Oddly, many beginners are told to use this one:
Code:
select distinct fn.uid, fn.first_name, ln.last_name
from fnames fn, lnames ln
where fn.uid = ln.uid;
It does produce the correct output just fine, but the "distinct" buys nothing here: uid is unique, so the join can never emit duplicate rows, and the clause only forces an extra deduplication step. Oh my....
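To see that the "distinct" is redundant, here is a small sketch (again SQLite via Python's sqlite3, with made-up rows): when the join key is unique on both sides, the result set is already duplicate-free, so DISTINCT changes nothing about the output:
Code:
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table fnames (uid integer primary key, first_name text);
    create table lnames (uid integer primary key, last_name text);
    insert into fnames values (1, 'Ada'), (2, 'Alan');
    insert into lnames values (1, 'Lovelace'), (2, 'Turing');
""")

plain = conn.execute("""
    select fn.uid, fn.first_name, ln.last_name
    from fnames fn, lnames ln
    where fn.uid = ln.uid
""").fetchall()

deduped = conn.execute("""
    select distinct fn.uid, fn.first_name, ln.last_name
    from fnames fn, lnames ln
    where fn.uid = ln.uid
""").fetchall()

# uid is a primary key on both sides, so the join cannot produce
# duplicates -- the DISTINCT is pure overhead, not a correctness fix.
print(sorted(plain) == sorted(deduped))
```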
Quote:
I remember some article on HN the other week about this where a guy compared a query on postgres, mysql, mssql and oracle. Oracle generated the best query plan given the current query which resulted in it completely destroying postgres and others in performance.
This wouldn't surprise me one iota, for many reasons. The most obvious is that Oracle is built by a large corporation and has been around longer than PostgreSQL. PostgreSQL is well-built, of course, but it is open source and driven by passion, and passion alone can't always match that level of engineering resources on a project like this.
The next reason is that PostgreSQL works in the opposite fashion from Oracle. With PostgreSQL, you have to consider the proper hardware and software foundation yourself, so it takes some dirty work to reach its full potential if you are running micro-benchmarks. Oracle compiles down to its own virtual machine, which is of course optimized to work with Oracle. Without information on how the machines were set up, these benchmarks don't mean much, and the same goes for MySQL and any other database that doesn't compile to a VM. This doesn't exactly contradict the query-optimization point, but it is something to consider when looking at benchmarks.
The final caveat is that Oracle also demands its own optimizations, and they are often at odds with what you would do on other databases, so a comparison like this can amount to cherry-picking queries to make a point (not an accusation, but extending one case to all cases sounds a bit disingenuous to me). I could just as easily pick a query that blows up an Oracle database to "prove" PostgreSQL is better at optimizing queries. A good book that explores various optimizations on different databases is Refactoring SQL Applications. The first thing the author writes is "don't take the benchmarks as proof that one database is better than another." The takeaway is that there are too many variables to consider.
Quote:
I think the moral of the story was you could get postgres to execute it just as fast, but it required a more complicated query to get the same plan.
I guess I sort of answered this one above.
Quote:
I'm not really sure if functional vs non-functional language styles even play a performance role in web development. Maybe in the 0.00000000001% case?
For example if you decide to use map vs a for loop in a language that supports both then maybe the map version will only do 20 million iterations per second while the for loop can do 50 million iterations per second but does that matter when as soon as you introduce IO it drops to 300 iterations per second for both?
Or if you're only drawing 50 elements on the screen, does it really matter that one of them completes 2 nanoseconds faster than the other?
I fear that too many developers take this attitude past some odd point of no return. I have a 30+ Mbps connection, and many sites still take forever to load and often cause my fan to start spinning. I run various programs to block resource downloads, and I even turn off site-defined fonts, just to speed up my browsing. I can't imagine the hell people on 5 Mbps are living with these days.
So, imagine how many times those developers said "meh," and now they have a crap site that is barely usable. Then imagine the time it would take to reverse-engineer that site into the top 80% for speed once it is sitting at sub-50%.
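On the map-versus-for-loop numbers in the quote above: the gap is easy to measure for yourself. A rough sketch (pure Python, a made-up squaring workload, and timings that will vary by machine):
Code:
```python
import timeit

data = list(range(100_000))

def square_map():
    return list(map(lambda x: x * x, data))

def square_loop():
    out = []
    for x in data:
        out.append(x * x)
    return out

# Both produce identical results; only the constant factor differs.
assert square_map() == square_loop()

t_map = timeit.timeit(square_map, number=20)
t_loop = timeit.timeit(square_loop, number=20)
print(f"map: {t_map:.3f}s  loop: {t_loop:.3f}s")

# Whatever the ratio, it's a constant factor per iteration; a single
# disk or network round-trip (often milliseconds) dwarfs it, which is
# the quote's point about IO flattening both to ~300 iterations/sec.
```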
As for writing an algorithm for some NxN matrix, I don't think you should eschew faster algorithms. A major point, in my reading, is to assume that N can be extremely large.