Author Topic: Any Path leads to an existing page...really any path  (Read 624 times)

Cococoder

  • Beginner
  • *
  • Posts: 30
  • A beginner
Any Path leads to an existing page...really any path
« on: June 07, 2018, 10:39:16 am »
Hi Guys,

Let me briefly explain my issue. Let's say I have a category called test-category:
https://store.seobytes.eu/index.php/test-category

I can type anything between index.php and test-category and it still leads to the same page. Example:
https://store.seobytes.eu/index.php/banana/potatoes/test-category/

The examples are live.

That sounds quite bad for SEO in my opinion. It should give a 404 or a redirect to the home page.

I tried with .htaccess and with URL rewriting turned on, with the same result, and also consulted various threads such as https://forum.virtuemart.net/index.php?topic=69544.0

Any feedback on the matter would be appreciated.

Jörgen

  • Global Moderator
  • Full Member
  • *
  • Posts: 1601
    • Kreativ Fotografi
  • VirtueMart Version: 3.2.12
Re: Any Path leads to an existing page...really any path
« Reply #1 on: June 07, 2018, 11:01:24 am »
VM version, Joomla version, etc.?

Regards

Jörgen @ Kreativ Fotografi
Joomla 3.8.3
Virtuemart 3.2.12
Olympiantheme Hera (customized)

Cococoder

« Reply #2 on: June 07, 2018, 11:15:50 am »
Sorry for that!
Joomla! 3.8.8 Stable
VirtueMart 3.2.14
PHP Version   5.6.35
Currently on Beez3

Note: tested on another VirtueMart site with similar results.

Cococoder

« Reply #3 on: June 07, 2018, 11:52:34 am »
Tested with and without the default .htaccess, with the same result.

jenkinhill

  • UK Web Developer & Consultant
  • Global Moderator
  • Super Hero
  • *
  • Posts: 26915
  • Always on vacation
    • Jenkin Hill Internet
Re: Any Path leads to an existing page...really any path
« Reply #4 on: June 07, 2018, 16:35:32 pm »
The VM 404 error handling is on by default, to avoid any potential loss of sales in case joe shopper should type in some stupid URL. The important URL is, of course, the canonical, which remains the same for that page.
Kelvyn

Jenkin Hill Internet,
Lowestoft, Suffolk, UK

Unsolicited PMs/emails will be ignored.

Please mention your VirtueMart, Joomla and PHP versions when asking a question in this forum

Currently using VM.3.2.15.9866 on Joomla 3.8.10 PHP 7.0.30

Testing VM.3.2.15.9898 on J3.8.10
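
For anyone wanting to check the canonical point from reply #4 on their own shop, here is a minimal Python sketch that pulls the rel="canonical" link out of a page's HTML. The sample markup is hypothetical (not copied from the live site), and the helper names are made up for illustration. If the odd URL and the clean URL both report the same canonical, the duplicate-content risk should be lower, though that is an assumption about how search engines treat it rather than a guarantee.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Ignore everything except the first canonical <link>.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in html, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical markup for illustration only.
sample = (
    '<html><head>'
    '<link rel="canonical" href="https://store.seobytes.eu/index.php/test-category">'
    '</head><body></body></html>'
)
print(find_canonical(sample))
```

Feeding it the HTML of both the clean URL and the padded URL and comparing the two results is the intended use.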

Cococoder

« Reply #5 on: June 08, 2018, 12:27:33 pm »
Thanks for your clear and concise answer. I understand how useful this kind of error handling can be, although nowadays I don't know anyone who types full URLs. I realized some fancy URLs were indexed by Google. The result is a page that shows a product with the home page layout under a URL like /404/productname, leading to a bad user experience and a potential duplicate content issue. I will play with the error handling option and keep you posted.

Cococoder

« Reply #6 on: June 08, 2018, 12:52:32 pm »
OK, I did some further checks:
Enable VirtueMart 404 error handling: tested with and without on two different sites (in the VM config)
Use URL Rewriting: tested with and without on two different sites (in the Joomla config)

Same behavior.

I tested on a Joomla blog article and it returns the expected 404. So the behavior does indeed come from VirtueMart.

Any pointers on how to disable or "fix" this "all paths lead to Rome" behavior in favor of a normal "Page not found, how can we help you?" approach?


Studio 42

  • Contributing Developer
  • Sr. Member
  • *
  • Posts: 3241
  • Joomla & Virtuemart addon developper
    • Studio 42 - Virtuemart & Joomla extentions
  • VirtueMart Version: 2.6 & 3.0.x.y
Re: Any Path leads to an existing page...really any path
« Reply #7 on: June 08, 2018, 13:43:42 pm »
Cococoder, Google should never see a link that you typed in manually, so it's not a real problem.
https://store.seobytes.eu/index.php/banana/potatoes/test-category/ is not giving a 404 because test-category is a valid slug.
In your case banana/potatoes should map to a menu ID in Joomla, so VirtueMart tries to look up the menu ID in the DB and falls back to the root category menu ID.
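
The behavior described in reply #7 can be pictured with a toy Python model. This is only a sketch under stated assumptions: the slug table, the category IDs and both function names are hypothetical, and it is not VirtueMart's router code. It contrasts a lenient resolver, where only the last segment matters and anything unknown falls back to the root category, with a strict one that would signal a 404.

```python
# Toy model only: slug table, ids and function names are hypothetical,
# not taken from VirtueMart's router.
CATEGORY_SLUGS = {"test-category": 12}
ROOT_CATEGORY_ID = 0


def split_segments(path):
    """Split a URL path into its non-empty segments."""
    return [segment for segment in path.strip("/").split("/") if segment]


def lenient_resolve(path):
    """Only the last segment decides; unknown paths fall back to the root category."""
    segments = split_segments(path)
    if not segments:
        return ROOT_CATEGORY_ID
    return CATEGORY_SLUGS.get(segments[-1], ROOT_CATEGORY_ID)


def strict_resolve(path):
    """Return a category id for an exact one-segment match, else None (treat as 404)."""
    segments = split_segments(path)
    if len(segments) == 1:
        return CATEGORY_SLUGS.get(segments[0])
    return None


for path in ("test-category", "banana/potatoes/test-category/", "banana/nothing"):
    print(path, lenient_resolve(path), strict_resolve(path))
```

In the lenient version every path ends up on some page, which is why nothing ever returns a 404.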

Cococoder

« Reply #8 on: June 09, 2018, 13:41:19 pm »
Hi, thanks for the reply,
I raised the concern because Google did index some wacky slugs.
/whatever/younameit/validPage is only a valid slug if the CMS treats it as a valid slug.
I'd like some pointers on how to go back to the default Joomla behavior, which doesn't treat such slugs as valid, and other community advice on how to handle this case.

Hope I can get some help on that.

Thanks guys

Studio 42

« Reply #9 on: June 09, 2018, 15:04:18 pm »
I don't think this can be invalidated directly.
But you can add some rules in .htaccess to redirect your bad links, using
RewriteRule ^/?whatever/(.*)$ newfolder/$1 [R=301,L]
or to your 404 page
RewriteRule ^/?whatever/(.*)$ /my404page [R=404,L]
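
Before relying on rules like these, it may help to check which paths the pattern actually catches. The Python sketch below uses the re module purely as a stand-in for Apache's regex engine, and it assumes the rule is tested against the request path relative to the .htaccess directory. The rewrite helper is hypothetical. Under those assumptions, a path that still carries the index.php/ prefix (as in the original post) would not match unless the prefix is added to the pattern.

```python
import re

# Pattern copied from the rule above; Python's re stands in for Apache's regex engine.
PATTERN = re.compile(r"^/?whatever/(.*)$")


def rewrite(path):
    """Return the rewritten path, or None if the pattern does not match."""
    match = PATTERN.match(path)
    if match is None:
        return None
    # Mirrors the "newfolder/$1" substitution from the first rule.
    return "newfolder/" + match.group(1)


for candidate in (
    "whatever/younameit/validPage",
    "index.php/whatever/younameit/validPage",
):
    print(candidate, "->", rewrite(candidate))
```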

jenkinhill

« Reply #10 on: June 09, 2018, 16:28:53 pm »
If Google has indexed those strange URLs then it must be indexing your access log. A SE bot follows links, it should not "type in" stupid URLs. AFAIK.
Kelvyn

GJC Web Design

  • 3rd party VirtueMart Developer
  • Super Hero
  • *
  • Posts: 7874
  • Virtuemart, Joomla & php developer
    • GJC Web Design
  • VirtueMart Version: 2.6.22 & 3.2.14
Re: Any Path leads to an existing page...really any path
« Reply #11 on: June 09, 2018, 23:37:00 pm »
Just to add my 2 pennies' worth...

I have also seen Google index nonsense URLs on some sites I run. How they got indexed is not that interesting to me. IMHO, if the URL is not valid it should return a 404, so that it eventually drops out of the index.

But currently with VM these just revert to the root category view, so Google thinks they are valid and keeps the URLs indexed.

A while ago now (so I don't remember the full scenario exactly) I added this snippet to the VM router.php, around line 750.

Search for the string  if (!isset($vars['virtuemart_category_id'])){

I added

Code:
/* GJC: check that there is a category segment */
$catseg = '';
foreach ($segments as $segment) {
    if ($segment == 'category') {
        $catseg = '1';
    }
}
// Original line: if (!isset($vars['virtuemart_category_id'])){
if (!isset($vars['virtuemart_category_id']) && $catseg) {
/* GJC: end of category segment check */

As the comment says, this checks whether there is a category segment in the non-SEF URL. From memory, the logic is: after passing various tests, the default treats the first segment as a category, and if it is not found, sends the visitor to the root category.

Now nonsense URLs return a 404.

I haven't fully tested this, but it works for me. I especially had problems using a vmextended plugin where the new "view" was wrongly seen as a category, so the plugin wasn't usable with SEF on.
GJC Web Design
VirtueMart and Joomla Developers - php developers http://www.gjcwebdesign.com

Cococoder

« Reply #12 on: June 11, 2018, 11:28:38 am »
Hey Thanks GJC Web Design!

I am going to test that asap and give feedback here!

Cococoder

« Reply #13 on: June 11, 2018, 11:33:48 am »
Hey!

Thanks GJC Web Design!

Well, I tested your solution but it didn't change anything for me. It still reverts to the home page view, with or without SEF URLs activated, and with VirtueMart 404 handling enabled or disabled.

Any other settings I should be aware of?

@Jenkinhill: That is an interesting point, but it would mean I had typed this URL in the first place, whereas it is the other way around.

Cococoder

« Reply #14 on: June 18, 2018, 09:01:34 am »
Well, the solution was easy: just disable VirtueMart 404 error handling to fall back on Joomla's 404 handling, which returns a proper 404 for wacky URLs.
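
To confirm a change like this, a small Python sketch along the following lines could compare the status codes a site returns against what is expected. The function names are hypothetical, and note that urllib follows redirects, so a 301 would show up as the status of the final page rather than as 301.

```python
import urllib.error
import urllib.request


def fetch_status(url, opener=urllib.request.urlopen):
    """Return the HTTP status code for url (urllib raises HTTPError for 4xx/5xx)."""
    try:
        with opener(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def mismatches(expected, fetch=fetch_status):
    """Return {url: (expected, actual)} for every URL whose status differs."""
    found = {}
    for url, want in expected.items():
        got = fetch(url)
        if got != want:
            found[url] = (want, got)
    return found
```

For the URLs in this thread, the call would be mismatches({"https://store.seobytes.eu/index.php/test-category": 200, "https://store.seobytes.eu/index.php/banana/potatoes/test-category/": 404}). An empty result would mean the clean URL still loads and the padded one now returns a 404.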