Improve speed by using cache for counting the results

Started by balai, June 27, 2014, 19:33:34 PM

Previous topic - Next topic

balai

I realized that a big impact in perfomance is caused by the call of the _getListCount / getNumRows / mysql_num_rows.
What actually these functions do is to execute again the query without including the LIMIT and then count the found records.
So a query is executed once with the limits and once without them, which makes it heavier, just to get the total numbers of records/products.
The overhead is really important, especially in the category page and in shops with big tables.

The cases when the total number of products is changing are:
a.When a new product is added
b.When a product comes out of stock
c.When a product becomes published/unpublished
d.When the VM configuration concerning the stock is changed

The problem of avoiding the re-call of the _getListCount can be solved by using caching to store the total number of products of each category.
The only thing that needs to be done is some plugin trigers, which will be called: a. when a product comes out of stock b. when a new product is added. When any of these actions happen the cache should be deleted.

I am saying plugin triggers and not just the call of functions, because with the plugin triggers, this events can be used also by 3rd party developers who possibly need a similar functionality

Milbo

yes and that is the reason we do not use it.  Since vm2.0.0 times we use in vmmodel the function exeSortSearchListQuery, which uses $q = 'SELECT SQL_CALC_FOUND_ROWS to avoid this. So your complete post is without any relevance to VirtueMart. To call a trigger would be so unperformant, it is like saving seconds against hours.
Should I fix your bug, please support the VirtueMart project and become a member
______________________________________
Extensions approved by the core team: http://extensions.virtuemart.net/

balai

Hi

Even if SQL_CALC_FOUND_ROWS is faster than running 2 separate queries, it is much heavier to what i am suggesting.
My suggestion is to avoid completely the scanning of all the rows of a table which is done both by SQL_CALC_FOUND_ROWS and mysql_num_rows.
That scan will be done only once when the count changes. All the other times the cache will be used

Milbo

and how do you know that the amount changes?

Did you ever compare a call of a plugin with an execution of a simple query? What you seem not to understand is that the work is anyway done by mysql. If you want to change this, join the mysql team.
Should I fix your bug, please support the VirtueMart project and become a member
______________________________________
Extensions approved by the core team: http://extensions.virtuemart.net/

balai

Quoteand how do you know that the amount changes?

Please read my 1st post. I describe the cases when the count changes.

QuoteDid you ever compare a call of a plugin with an execution of a simple query? What you seem not to understand is that the work is anyway done by mysql. If you want to change this, join the mysql team.
Sorry but what i am stating has nothing to do with sql.

It mainly regards the use of joomla cache. The only thing that you have to do is to add some triggers and to read your cache to get the total amount of products per category/manufacturer.

Milbo

But it is quite interesting, I just read some more topics about it.
1. There seems to be a bug http://bugs.mysql.com/bug.php?id=18454
2. it seems that count is sometimes faster than SQL_CALC_FOUND_ROWS, so we could add a switch.
and 3. yes, it could spare some time to cache the result, but this is not easy todo. If we store it, we have to load it, find it, manage it. We would need another table for it. If we store just the total of any query, I think we also get a quite big table. If someone wants this for an extraordinary big store, I could write it.

But VM3 is already so much faster. An example, browsing of the category "Default Products" of the sample data needs in vm2 208 queries and in vm3 160 and you have the customfields correctly loaded in vm3. The load time is for sample data almost the same. It needs a store with maybe 10k products to really feel the difference (of course, depends also on your server ressources).
Should I fix your bug, please support the VirtueMart project and become a member
______________________________________
Extensions approved by the core team: http://extensions.virtuemart.net/

Milbo

Quote from: balai on June 27, 2014, 19:33:34 PM
a.When a new product is added
b.When a product comes out of stock
c.When a product becomes published/unpublished
d.When the VM configuration concerning the stock is changed

and different counts per shoppergroup, language, filter, ordering, search, manufacturer, permissions, ....

Using the joomla cache means most times to use the file system for caching. I doubt that makes sense.

It may make sense to add a switch so that we can use SQL_CALC_FOUND_ROWS or COUNT
Should I fix your bug, please support the VirtueMart project and become a member
______________________________________
Extensions approved by the core team: http://extensions.virtuemart.net/

balai

Hi

1 and 2. I am not so sure about that. The post is very old and there are different opinions for newer versions of mysql.
http://stackoverflow.com/questions/186588/which-is-fastest-select-sql-calc-found-rows-from-table-or-select-count
Check the newer posts at the bottom.

In any case some benchmarking will reveal the truth.
Quote
and different counts per shoppergroup, language, filter, ordering, search, manufacturer, permissions, ....

We can sort all the cases and find the solution. I think that the impcat of this modification will give Virtuemart a very strong boost since this is not a Jooml core feature that can be used by other competotors