Random url in products

Started by Sillero, March 29, 2023, 13:02:17 PM

Sillero

Hi, this is a problem that I have been seeing for a long time.

Any published product can be reached under a different URL when we use the full category tree, although the canonical URL stays correct. For example, take domain/category1/subcategory1/product1: if you change the category path to domain/category99/subcategory99/product1, the URL still shows the product and the canonical is right, but the breadcrumbs and the base URL are both wrong.

You may think that since the canonical is correct there are no problems with indexing, but there are. Google is somehow indexing hundreds of URLs with the wrong path even though the canonical is correct (the canonical is a recommendation, not a directive).

I don't know why these URLs are built or how Google gets to them. I already noticed in another post a problem with wrong URLs in product variants, and since then I have removed all Ajax updates of product variants and neighboring products: https://forum.virtuemart.net/index.php?topic=149438.0

I use an outdated version of VirtueMart, 3.2.12, but as I understand it this is not the cause.

Can someone guide me in the right direction?

Sillero

@GJC Web Design
You must have the same problem on this website that you gave as an example: https://www.escape-watersports.co.uk

If you change the route of any product, the page continues to exist and returns a 200 status. The canonical will be correct, but Google seems to crawl these products under different routes than it should. You really don't have a problem with this? Anyone?

I think this should give a 404 error and not give access to the product.

GJC Web Design

Hi,

Please can you give some actual URLs as examples of what you mean? I'm struggling to understand the problem you're discussing.

GJC Web Design
VirtueMart and Joomla Developers - php developers https://www.gjcwebdesign.com
VM4 AusPost Shipping Plugin - e-go Shipping Plugin - VM4 Postcode Shipping Plugin - Radius Shipping Plugin - VM4 NZ Post Shipping Plugin - AusPost Estimator
Samport Payment Plugin - EcomMerchant Payment Plugin - ccBill payment Plugin
VM2 Product Lock Extension - VM2 Preconfig Adresses Extension - TaxCloud USA Taxes Plugin - Virtuemart  Product Review Component
https://extensions.joomla.org/profile/profile/details/67210
Contact for any VirtueMart or Joomla development & customisation

pinochico

We had a problem with product URLs in Google Search Console:
- we hacked the canonical plugin from JSitemap Pro and set up a lot of URLs in JSitemap Pro for the sitemap
- we hacked the breadcrumbs module
- we hacked our rich snippets plugin for VirtueMart

Now all URLs in the sitemap and in GSC are right, and the product URL in the breadcrumb module stays the same, with the right canonical URL, whichever category I came from (product, category, articles).

We worked on this for about 40 hours, paid for 2 licences and 2 external developers, and rolled it out to our 8 shops.

Now you know the way :)

You can check on www.zelenazeme.cz
www.minijoomla.org  - new portal for Joomla!, Virtuemart and other extensions
XML Easy Feeder - feeds for FB, GMC,.. from products, categories, orders, users, articles, acymailing subscribers and database table
Virtuemart Email Manager - customs email templates
Import products for Virtuemart - from CSV and XML
Rich Snippets - Google Structured Data
VirtueMart Products Extended - Slider with products, show Others bought, Products by CF ID and others filtering products

Sillero

@GJC Web Design
This is what I mean:
https://www.escape-watersports.co.uk/clothing/drysuits/mens-drysuits/crewsaver-atacama-pro-suit-detail
https://www.escape-watersports.co.uk/equipment/helmets/crewsaver-atacama-pro-suit-detail
Both pages have the same canonical, but the base URL is different. Somehow Google is crawling hundreds of pages with the wrong path. Can you check your GSC reports? Especially the pages that are indexed but not submitted in the sitemap.

@pinochico
Thanks, now I'm worried a lot more...
I see that the product URLs have the root /eshop/ but the breadcrumb does have all the categories. It is a good example, and I can see that there is a lot of work behind it.

Both websites are awesome, congratulations.

So far the only major modification I have made is to stop the crawling and possible indexing of URLs that contain parameters, and of other unwanted URLs such as sorting and filtering URLs (product_name, product_price, dirDesc, keyword, manage, results...). I recommend that you set these URLs to noindex and remove their canonical; you will get a better crawl budget from Google. Example: https://www.escape-watersports.co.uk/equipment/helmets/kayaking-helmets/by,price?keyword=

I haven't been able to find much information on the forum. Do you know if this has been discussed?

balai

Given that the URL of a product is based on the category, I cannot think of how you can solve that from within VM.
Also, to my knowledge, this is how it works in other e-commerce platforms as well.

Indeed, the canonical is a hint and not a directive. Google can also choose another page as canonical.

Other methods to deal with duplicate content, as proposed by Google, are redirects and sitemap inclusion/exclusion:
https://developers.google.com/search/docs/crawling-indexing/consolidate-duplicate-urls

Studio 42

Most of the time Google indexes the first page it finds and uses the canonical link afterwards.
So it should not be an SEO problem to have a wrong link for some weeks.
I have a customer shop using more than 20 categories per product and have had no reports about duplicates from Google.

Sillero

@balai
Yes, this is how VM works if you use the full category tree option. I personally prefer working with the full path to the product, since it adds context to the product URL. Google understands this and has no problem with it, although it also recommends using short URLs, so as long as two category levels are not exceeded there should be no problem.

On other platforms I have been able to see different solutions. In several, when you try to change the category path there is a 301 redirect to the correct url. In others, a 404 is simply generated. The first option seems the most appropriate to me and the second could be a solution.
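The 301 behaviour described above can be sketched as a small PHP helper. This is only an illustration under assumptions: canonicalRedirectTarget is a hypothetical name, not a VirtueMart or Joomla function, and example.com stands in for the shop domain. It compares only the URL paths and ignores query strings.

```php
<?php
/**
 * Hypothetical helper (not part of VirtueMart or Joomla).
 * Returns the URL to redirect to, or null when the requested URL
 * already uses the canonical path.
 */
function canonicalRedirectTarget($requestedUrl, $canonicalUrl)
{
    // Compare paths only; a trailing slash is not treated as a difference.
    $requestedPath = rtrim((string) parse_url($requestedUrl, PHP_URL_PATH), '/');
    $canonicalPath = rtrim((string) parse_url($canonicalUrl, PHP_URL_PATH), '/');

    return $requestedPath === $canonicalPath ? null : $canonicalUrl;
}

// Example with a wrong category path in the request.
$target = canonicalRedirectTarget(
    'https://example.com/category99/subcategory99/product1',
    'https://example.com/category1/subcategory1/product1'
);
if ($target !== null) {
    // A real page would send "Location: $target" with a 301 status here.
    echo 'redirect to ' . $target . "\n";
}
```

Returning null for the matching case keeps the decision separate from the act of sending headers, so the same check could back either the 301 or the 404 variant.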

@Studio 42
Yes, Google usually indexes the first page it hits (sometimes even if it's blocked by robots.txt). It may not be a problem initially if the correct URL is picked up later, but a lot of crawl budget is wasted, as hundreds of URLs can be generated to crawl. Also, if the URL belonged to an alternative category that the product is also in, I wouldn't care so much.

The main problem is that I don't understand why Google builds those URLs. Where do they come from? As far as I know, when I had the multivariants refreshed through Ajax enabled, these URLs could be built, since the category of the previously viewed product was used. But I no longer have that enabled: I handle the multivariants with the products_horizon.php sublayout, and there is an HTTP request when you open a variant product.

Studio 42

The by,price?keyword= URL comes from Sort by (Product Name, Product Price).
This generates an href to invert the sort order.
The link should have rel='nofollow' so that only 1 page is indexed.
But in any case the canonical link does not include this information, so it should be safe.

Sillero

Yes, I set this a long time ago, but again this is a recommendation, not a directive. Google continues to crawl hundreds of pages despite this. In my case I have set all the filtering URLs (product_name, product_price...) to noindex and removed the canonical. So yes, I think you should worry about it when you have a large catalog of products. But it also has an easy solution.
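A rough sketch of how the "is this a sorting or filtering URL" decision could be made in PHP. The function name isSortOrFilterUrl is hypothetical, and the markers it checks (/by,, dirDesc, a keyword parameter) are assumptions taken from the URL shapes quoted in this thread; a real shop may need others.

```php
<?php
/**
 * Hypothetical helper (not part of VirtueMart or Joomla).
 * Returns true when the URL looks like a sorting or filtering
 * variant of a category page that should not be indexed.
 */
function isSortOrFilterUrl($url)
{
    $path  = (string) parse_url($url, PHP_URL_PATH);
    $query = (string) parse_url($url, PHP_URL_QUERY);

    // "/by,price" or "/by,product_name": sort-by path segments.
    if (strpos($path, '/by,') !== false) {
        return true;
    }
    // "dirDesc" marks the inverted sort direction.
    if (strpos($path, 'dirDesc') !== false) {
        return true;
    }
    // A "keyword" query parameter, even an empty one, marks a search request.
    parse_str($query, $params);

    return array_key_exists('keyword', $params);
}

$url = 'https://www.escape-watersports.co.uk/equipment/helmets/kayaking-helmets/by,price?keyword=';
if (isSortOrFilterUrl($url)) {
    echo "noindex this page\n";
}
```

In a category template the result could be used to decide whether to output a robots noindex meta tag for the current request.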

What I'm concerned about is why those urls are being crawled and indexed if they don't exist and aren't linked (supposedly) from any other page.

pinochico

because your robots.txt is not set up right
because your views are not set up right for nofollow, noindex
because of a lot of other things :)

This is a complex problem, not only one URL.
We have been working hard on SEO with GSC for two years, and in that time Google changed the rules 3 times :D

Sillero

Yes, the robots.txt rules can be tricky. Once you discover the problems, you first have to deindex and then block by robots.txt. Every website is different. Your robots.txt file is very interesting.

There is a lot of work in setting nofollow links and noindexing some URLs in a new VirtueMart setup. I think this aspect should be improved by giving more consideration to SEO.
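As an illustration of the "deindex first, then block" order: a crawler cannot read a noindex tag on a URL it is not allowed to fetch, so blocking rules only make sense once the URLs have left the index. The rules below are hypothetical, written for the sorting and filtering URL shapes mentioned in this thread, and rely on the * wildcard that Google supports in robots.txt.

```text
# Hypothetical rules; add them only after the affected URLs are deindexed.
User-agent: *
# Sort-by segments such as /by,price or /by,product_name
Disallow: /*/by,
# Inverted sort direction
Disallow: /*dirDesc
# Search requests where keyword is the first query parameter
Disallow: /*?keyword=
```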

I was able to find a possible solution to my problem and I want to share it. I am not a programmer, and it may not work for other websites.
Since the canonical URL is always generated correctly, and in my case I don't have the same product in several categories, I can do a 301 redirect from the base URL that was reached to the canonical URL. This is my code, implemented in the view .../templates/.../productsdetails/default.php

I would like to hear your comments.

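// Find the canonical URL among the head links of the document.
$document = JFactory::getDocument();
$canonical = null;
foreach ($document->_links as $url => $attributes) {
    if ($attributes['relation'] == 'canonical') {
        // Remember the canonical URL itself, not the last key of the loop.
        $canonical = $url;
        break;
    }
}
// Redirect only when the base URL that was reached is not the canonical one.
if ($canonical !== null && $canonical != $document->base) {
    header("HTTP/1.1 301 Moved Permanently");
    header("Location: " . $canonical);
    header("Connection: close");
    exit; // stop rendering the page with the wrong path
}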

pinochico

default.php is a frontend (FE) view. This is a place where there shouldn't be any development, only the view.

The right place is a model or a system plugin; some small things can go in view.html.php.

But best of all is to use a system plugin for the canonical URL and customize that plugin.

Quote: I don't have the same product in several categories

This is a special case; a lot of shops are different.

For views that we don't want indexed and that are not menu items, we set noindex, nofollow or noindex, follow in the template:


//SEO Analyse
$document = JFactory::getDocument();
$document->setMetaData('robots', "noindex, nofollow");
//END

Sillero

Thanks for the tip. Like I said, I'm not a programmer ;) but I will try.

That other code is the one I use in the category view when certain conditions are met, and thus many unwanted URLs are deindexed.

I see that in GSC many URLs of product variants are being indexed, but with the wrong route. I repeat, the canonical is correct (the URL of the parent product). Can anyone point me to how I can make the base URL the correct one as well, since I can't prevent those URLs from being generated?
When I output all the data for $this and compare it with the right path, I only find that [Itemid] and [categoryId] are wrong.

pinochico

There must be only one Itemid, the top-level one, but we use ArtioSEF, and for the one shop without Artio we are now developing a solution :)
The categoryId comes from the canonical URL of the product.

Sorry, but it's complex and not for a forum :(
And this is my job (work for money :)

I can show you the way, but you have to develop it yourself.
Or buy some support on minijoomla.org