Quote:
Originally Posted by RustyBrooks
It depends on how data in a table is stored, but I think it's typical for rows to be stored sequentially in blocks. Sometimes rows have variable sizes. This means to count the rows by going through each one, you'd have to make a series of disk reads over a large section of disk - probably many different parts of the disk.
Not sure if you are talking about tables or indexes here, but if tables, the only stored ordering is the clustered primary key order, and as you know, a bare "select col" doesn't guarantee a sorted result.
An index is "sorted" in the sense that its keys are kept in order within blocks. In common parlance, "index" means a B-tree (a balanced tree), which is exactly what it sounds like in CS terms.
As for whether a read happens in memory or on disk, that depends on two factors:
1 - the size of the buffer pool -vs- the total size of the index
2 - how much of that data is already cached -vs- the cache's capacity
A table scan / sort / lookup will try to hold as much of the index / table in memory as it can, but if not enough space is allocated, it falls back to disk I/O.
I should also mention that if your times are all over the place, shifting drastically throughout the day, you should probably flush your caches and test again. In general, you don't want to take just the first 2 or 3 timings, since you are trying to capture the real-use case, which does involve caching. A new query needs a bit of time to warm up.
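A minimal sketch of separating cold-cache from warm-cache timings, assuming a 5.7-era MySQL where the query cache still exists (`mytable` is a placeholder name):

```sql
-- Bypass the query cache so every run measures real work.
-- Note: SQL_NO_CACHE skips the query cache, not the InnoDB buffer pool,
-- so repeated runs will still warm the buffer pool.
SELECT SQL_NO_CACHE COUNT(*) FROM mytable;

-- Clear cached result sets between test rounds (requires privileges;
-- the query cache was removed entirely in MySQL 8.0).
RESET QUERY CACHE;
```

Timing the first run versus the next few gives you a rough cold-vs-warm comparison.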
Quote:
The index should be very small.
On its face, an index is a horrible idea. It takes a lot of RAM to hold, requires maintenance, slows down writes, and so on. The benefit of an index is read speed, and that cost / benefit trade-off is the entire crux of tuning a database's indexes: deciding which indexes to create, how, and so on. If you have an index that is never used, it does nothing but add write time, etc.
Indexes require a lot more RAM than you might intuit. I was going to say that allocating 50% of your memory isn't totally out of bounds, but this MySQL tuning guide suggests up to 70%:
http://mysql.rjweb.org/doc.php/memory
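If you want to see how much space your indexes actually occupy, `information_schema` tracks it per table; a hedged sketch (the schema name is a placeholder):

```sql
-- Compare data size vs. index size per table, largest indexes first.
SELECT table_name,
       ROUND(data_length  / 1024 / 1024, 1) AS data_mb,
       ROUND(index_length / 1024 / 1024, 1) AS index_mb
FROM information_schema.tables
WHERE table_schema = 'your_db'   -- placeholder: your schema name
ORDER BY index_length DESC;
```

It's not unusual for `index_mb` to rival or exceed `data_mb` on heavily indexed tables, which is why the RAM budget matters.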
Quote:
In the case of things like primary keys the size of each index entry should be fixed.
The query I posted above suggests otherwise. I haven't tuned my MySQL database, but all of my indexes sit between 250 and 260. I'd guess yours are in a similar (sane) range.
Quote:
Whether the index is in memory or on disk, stepping through all of it should be faster than doing the equivalent on a table.
I'm not sure about that one. I don't recall anything that says counting N nodes in a non-perfect tree is faster than counting N items in an array, for example.
If I'm wrong on that, I'd definitely like to know about it.
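One way to see what MySQL actually does for a count is to ask the optimizer directly (table name is a placeholder); for InnoDB, a bare COUNT(*) is often answered from the smallest available secondary index, since fewer pages means fewer reads:

```sql
-- The 'key' column of the EXPLAIN output shows which index (if any)
-- the optimizer chose to satisfy the count.
EXPLAIN SELECT COUNT(*) FROM mytable;
```

If the `key` column names a small secondary index rather than the primary, that's the optimizer judging the index scan cheaper than walking the clustered table.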
Quote:
There are many cases in mysql where it does index scans instead of table scans. If they were equivalent, I don't think they'd do that. For example, if you do something like
select * from mytable where mycol like '%search%'
and mycol has an index on it, it will do a full index scan, instead of a full table scan.
This looks strange to me. Are you sure you are doing partial-string searches here?
A B-tree index is a series of scalar nodes. There is no mechanism (that I'm aware of) by which a B-tree holding 'bone', 'ball', and 'bat' would also store nodes 'one', 'all', and 'at' back-referencing the longer words.
According to the docs, an index can be used for a LIKE search only when the pattern is a prefix, i.e. the first few letters are literal characters rather than a leading wildcard.
https://dev.mysql.com/doc/refman/5.7...ate-index.html
However, I think full-text searches are better left to the proper index types (e.g., a FULLTEXT index).
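The prefix rule is easy to check yourself with EXPLAIN (table and column names hypothetical):

```sql
-- Prefix pattern: the B-tree index on mycol can be range-scanned,
-- so EXPLAIN should show the index in the 'key' column.
EXPLAIN SELECT * FROM mytable WHERE mycol LIKE 'search%';

-- Leading wildcard: no usable prefix, so expect a full scan instead
-- (key = NULL, or a full index scan if the index covers the query).
EXPLAIN SELECT * FROM mytable WHERE mycol LIKE '%search%';
```

Comparing the two plans side by side shows exactly where the index stops helping.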