Any Path leads to an existing page...really any path

Started by Cococoder, June 07, 2018, 10:39:16 AM


Cococoder

Hi Guys,

Let me briefly explain my issue. Let's say I have a category called test-category:
https://store.seobytes.eu/index.php/test-category

I can type anything between index.php and test-category and it still leads to the same page. Example:
https://store.seobytes.eu/index.php/banana/potatoes/test-category/

Examples are live.

That sounds quite bad for SEO in my opinion. It should give a 404 or a redirect to the home page.

I tried with .htaccess and URL rewriting enabled, with the same result, and also consulted various threads such as https://forum.virtuemart.net/index.php?topic=69544.0

Any feedback on the matter would be appreciated.

Jörgen

Which VM version, Joomla version, etc.?

Regards

Jörgen @ Kreativ Fotografi
Joomla 3.9.18
Virtuemart 3.4.x
Olympiantheme Hera (customized)
This reflects current status when viewing old post.

Cococoder

Sorry for that!
Joomla! 3.8.8 Stable
VirtueMart 3.2.14
PHP Version   5.6.35
Currently on Beez3

Note: tested on another VirtueMart site with similar results.

Cococoder

Tested with and without the default .htaccess, with the same result.

jenkinhill

The VM 404 error handling is on by default, to avoid any potential loss of sales in case joe shopper should type in some stupid URL. The important URL is, of course, the canonical, which remains the same for that page.
Kelvyn
Lowestoft, Suffolk, UK

Retired from forum life November 2023

Please mention your VirtueMart, Joomla and PHP versions when asking a question in this forum

Cococoder

Thanks for your clear and concise answer. I understand how useful this kind of error handling can be, although nowadays I don't know anyone who types full URLs. I noticed that some odd URLs had been indexed by Google. The result is a page that shows a product in the home page layout under a URL like /404/productname, which makes for a bad user experience and a potential duplicate content issue. I will play with the error handling option and keep you posted.

Cococoder

OK, I did some further checks:
Enable VirtueMart 404 error handling: tested with and without on two different sites (in VM config)
Use URL Rewriting: tested with and without on two different sites (in Joomla config)

Same behavior.

I tested on a Joomla blog article and it returns the expected 404. So the behavior is indeed implemented in VirtueMart.

Any pointers on how to disable or "fix" this "all paths lead to Rome" behavior and get a normal "Page not found, how can we help you?" approach?


Studio 42

Cococoder, Google should never see links that you type in manually, so it's not a real problem.
https://store.seobytes.eu/index.php/banana/potatoes/test-category/ is not giving a 404 because test-category is a valid slug.
In your case banana/potatoes should resolve to a menu ID in Joomla. Since it does not, VirtueMart tries to get the menu ID from the DB and falls back to the root category menu ID.

Cococoder

Hi, thanks for the reply,
I raised the concern because Google did index some wacky slugs.
/whatever/younameit/validPage is a valid slug if the CMS treats it as a valid slug.
I'd like pointers on how to go back to the default Joomla behavior, which doesn't treat such slugs as valid, and any other community advice on how to handle the case.

Hope I can get some help on that.

Thanks guys

Studio 42

I don't think this can be invalidated directly.
But you can add some rules in .htaccess to redirect your bad links using
RewriteRule ^/?whatever/(.*)$ newfolder/$1 [R=301,L]
or to your 404 page
RewriteRule ^/?whatever/(.*)$ /my404page [R=404,L]
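
A slightly fuller sketch of both rules for a Joomla .htaccess, assuming mod_rewrite is enabled and the rules are placed before Joomla's own SEF rewrite block. The prefixes whatever and wacky, the target newfolder and the page /my404page are placeholders, not real paths on the site. One detail from the Apache mod_rewrite documentation: when the R flag carries a status outside the 3xx range, the substitution is dropped, so a dash is used as the target here and the 404 page itself would be set with ErrorDocument.

```apache
RewriteEngine On

# Placeholder prefix: permanently redirect /whatever/... to /newfolder/...
RewriteRule ^/?whatever/(.*)$ /newfolder/$1 [R=301,L]

# Placeholder prefix: answer 404 for anything under /wacky/
# "-" means no substitution; with a non-3xx status Apache ignores the target anyway
RewriteRule ^/?wacky/ - [R=404,L]

# Optional custom error page (hypothetical path, needs AllowOverride FileInfo)
ErrorDocument 404 /my404page
```

This only covers prefixes you already know about, so it suits cleaning up specific indexed URLs rather than rejecting every invalid path.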

jenkinhill

If Google has indexed those strange URLs then it must be indexing your access log. An SE bot follows links; it should not "type in" stupid URLs. AFAIK.

GJC Web Design

Just to add my 2 pennies worth....

I have also seen Google index nonsense URLs on some sites I run. How they got indexed is not that interesting to me. IMHO, if the URL is not valid it should return a 404 so that it eventually drops out of the index.

But currently with VM these just revert to the root category view, so Google thinks they are valid and keeps the URLs indexed.

A while ago (so I don't remember the full scenario exactly) I added this snippet to the VM router.php, around line 750.

Search for the string  if (!isset($vars['virtuemart_category_id'])){

I added:

/* GJC: check that there is a category segment */
$catseg = false;
foreach ($segments as $segment) {
    if ($segment == 'category') {
        $catseg = true;
        break;
    }
}
// Original condition, kept for reference:
//if (!isset($vars['virtuemart_category_id'])){
// Only fall back to the root category when a "category" segment is present
if (!isset($vars['virtuemart_category_id']) && $catseg){
/* GJC: end of category segment check */


As the comment says, this checks whether there is a category segment in the non-SEF URL. From memory, the code works like this: after passing various tests, the default treats the first segment as a category and, if it is not found, sends the request to the root category.

Now nonsense URLs return a 404.

I haven't fully tested this, but it works for me. I especially had problems with a vmextended plugin, where the new "view" was wrongly seen as a category, so the plugin wasn't usable with SEF on.
GJC Web Design
VirtueMart and Joomla Developers - php developers https://www.gjcwebdesign.com

Cococoder

Hey Thanks GJC Web Design!

I am going to test that asap and give feedback here!

Cococoder

Hey!

Thanks GJC Web Design!

Well, I tested your solution but it didn't change anything for me. It still reverts to the home page view, with or without SEF URLs activated and with VirtueMart 404 handling enabled or disabled.

Any other settings I should be aware of?

@Jenkinhill: That is an interesting point, but it would mean I had typed this URL in the first place, whereas it is the other way around.

Cococoder

Well, the solution was easy: just disable VirtueMart 404 error handling to fall back on Joomla's 404 handling, which returns a proper 404 for wacky URLs.