Author Topic: Google crawl errors  (Read 7499 times)

zazu

  • Jr. Member
  • **
  • Posts: 59
Google crawl errors
« on: March 31, 2010, 14:42:26 pm »
I'm getting a lot of Google crawl errors. One in particular mystifies me, and there are several hundred in this format. I would appreciate any explanation and suggestions for a fix.

/index.php/childrens-young-adult-educational/childrens-young-adult-fiction-true-stories/animal-stories-childrensya/index2.php?page=shop.recommend&product_id=6770&pop=1&tmpl=component&option=com_virtuemart&Itemid=71

I'm using sh404SEF. There are several issues here:

Why does index2.php etc. get appended to the URL?
What's with pop=1? I also notice that this and pop=0 get appended to the next and previous URLs on the flypage.
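For anyone reading along, here is a minimal Python sketch that takes the reported URL apart. It also shows one possible way such a URL can arise: a relative index2.php link resolved against an SEF page address. That second part is an assumption for illustration only, not a diagnosis; the host www.example.com and the last path segment "some-product" are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs, urljoin

# The crawl-error URL quoted in the post, copied verbatim.
CRAWL_URL = (
    "/index.php/childrens-young-adult-educational/"
    "childrens-young-adult-fiction-true-stories/"
    "animal-stories-childrensya/index2.php"
    "?page=shop.recommend&product_id=6770&pop=1"
    "&tmpl=component&option=com_virtuemart&Itemid=71"
)


def describe(url):
    """Return the path segments and the query parameters of a URL."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # parse_qs returns lists; each key occurs once here, so keep the first value.
    params = {k: v[0] for k, v in parse_qs(parts.query).items()}
    return segments, params


segments, params = describe(CRAWL_URL)
print("scripts in path:", [s for s in segments if s.endswith(".php")])
for name, value in params.items():
    print(f"{name} = {value}")

# Assumption: a product page under the SEF path (hypothetical address)
# contains a *relative* link to index2.php with the same query string.
HYPOTHETICAL_PAGE = (
    "http://www.example.com/index.php/childrens-young-adult-educational/"
    "childrens-young-adult-fiction-true-stories/"
    "animal-stories-childrensya/some-product"
)
RELATIVE_LINK = "index2.php?" + urlsplit(CRAWL_URL).query

# Standard relative-URL resolution replaces only the last path segment.
resolved = urljoin(HYPOTHETICAL_PAGE, RELATIVE_LINK)
print(resolved)
```

If the resolved address matches what Google reports, the place to look would be wherever the flypage builds its email/print/PDF links without a leading slash or full site address.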

I think this is an issue with the PDF, print and email options on the flypage. The PDF option doesn't work either; it flags up an error message like:

FPDF error: Alpha channel not supported: /home/.sites/***/site2/web/components/com_virtuemart/shop_image/vendor/mydomain.com_4ae802f2e989e.png
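The message suggests that the PDF generator refuses PNG files that carry an alpha channel. A small Python sketch for checking a PNG, assuming the rejected files are those whose header declares colour type 4 or 6 (the two alpha types in the PNG format); the helper names are made up for this example.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# PNG colour types that include a full alpha channel.
ALPHA_COLOUR_TYPES = {4: "greyscale with alpha", 6: "truecolour with alpha"}


def png_colour_type(data):
    """Return the colour type stored in the IHDR chunk of PNG data.

    Only the first 26 bytes of the file are needed.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    if data[12:16] != b"IHDR":
        raise ValueError("IHDR chunk missing")
    # IHDR payload starts at byte 16: width(4) height(4) depth(1) type(1).
    _width, _height, _depth, colour_type = struct.unpack(">IIBB", data[16:26])
    return colour_type


def has_alpha_channel(data):
    """True if the PNG header declares an alpha colour type."""
    return png_colour_type(data) in ALPHA_COLOUR_TYPES


def sample_header(colour_type):
    """Build a 1x1 PNG header for demonstration (not a complete image)."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, colour_type, 0, 0, 0)
    return PNG_SIGNATURE + struct.pack(">I", len(ihdr)) + b"IHDR" + ihdr


print(has_alpha_channel(sample_header(6)))  # truecolour with alpha
print(has_alpha_channel(sample_header(2)))  # plain truecolour
```

For a real file, read the first 26 bytes in binary mode and pass them to has_alpha_channel. If the vendor image turns out to have an alpha channel, re-saving it without transparency is the obvious thing to try.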

jenkinhill

  • UK Web Developer & Consultant
  • Global Moderator
  • Super Hero
  • *
  • Posts: 27779
  • Always on vacation
    • Jenkin Hill Internet
Re: Google crawl errors
« Reply #1 on: March 31, 2010, 15:33:28 pm »
When I see issues like this I can see why Google Webmaster Central has once again recommended that rewriting of URLs should not be used.

See Question 4 - Would you advise to use the rewrite URL function?

See also http://googlewebmastercentral.blogspot.com/2008/09/dynamic-urls-vs-static-urls.html

For sh404SEF questions you could ask on the "official" support forum: http://dev.anything-digital.com/Forum/43-Virtuemart/
Kelvyn

Jenkin Hill Internet,
Lowestoft, Suffolk, UK

Unsolicited PMs/emails will be ignored.

Please mention your VirtueMart, Joomla and PHP versions when asking a question in this forum

Currently using VM 3.8.4.10335 on Joomla 3.9.19 PHP 7.3.18

stinga

  • Contributing Developer
  • Full Member
  • *
  • Posts: 872
    • Squangle ltd
Re: Google crawl errors
« Reply #2 on: March 31, 2010, 16:26:51 pm »
That URL does not look correct to me.
I don't think I have seen an index.php and an index2.php in the same URL.

You can block index.php and index2.php in your robots.txt if you are using SEF and some sort of sitemap.
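Before relying on a robots.txt rule it is worth checking which addresses it actually covers. A hedged sketch using Python's standard urllib.robotparser; the robots.txt content and the example.com addresses are hypothetical, and only the index2.php rule is shown.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with a single rule for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index2.php
"""


def is_allowed(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


checks = [
    # Non-SEF popup link at the site root.
    "http://www.example.com/index2.php?page=shop.recommend&pop=1",
    # index2.php appended to an SEF path, as in the crawl error.
    "http://www.example.com/index.php/animal-stories/index2.php?pop=1",
    # An ordinary rewritten page.
    "http://www.example.com/animal-stories.html",
]
for url in checks:
    print(is_allowed(ROBOTS_TXT, url), url)
```

This parser matches rules as plain path prefixes, so the rule covers /index2.php at the site root but not an index2.php buried under /index.php/. Also note that on a site whose SEF addresses themselves begin with /index.php/, a Disallow for /index.php would cover the rewritten pages as well.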

I posted on how to get rid of pop=0; it is not needed.
I also posted on how to remove keyword=, since that will cause duplicate URLs.

I think pop=1 means show that page without menus etc.
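To see how pop=0 and an empty keyword= produce duplicate addresses, here is a short Python sketch that normalises URLs from a crawl report. It is not the VirtueMart change itself (that lives in the PHP templates and is not reproduced here); the function name and the sample URLs are invented for the example, and pop=1 is deliberately left alone because it changes what the page shows.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def drop_noise_params(url):
    """Remove pop=0 and empty keyword= from a URL's query string."""
    parts = urlsplit(url)
    kept = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == "pop" and value == "0":
            continue  # pop=0 is the default view, so it adds nothing
        if key == "keyword" and value == "":
            continue  # an empty search keyword adds nothing
        kept.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical variants of one browse page.
variants = [
    "/index.php?page=shop.browse&category_id=5",
    "/index.php?page=shop.browse&category_id=5&pop=0",
    "/index.php?page=shop.browse&category_id=5&keyword=",
    "/index.php?page=shop.browse&category_id=5&pop=0&keyword=",
]
unique = {drop_noise_params(u) for u in variants}
print(len(variants), "variants ->", len(unique), "distinct page(s)")
```

Run over the URLs in a crawl-error export, this gives a quick count of how many entries are really the same page.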
Stinga.
614869 products in 747 categories with 15749 products in 1 category.
              Load Time  First Byte  Start Render  Document Complete    Fully Loaded
                                                   Time     Requests    Time     Requests
First View    2.470s     0.635s      1.276s        2.470s   31          2.470s   31
Repeat View   1.064s     0.561s      1.100s        1.064s   4           1.221s   4

zazu

Re: Google crawl errors
« Reply #3 on: March 31, 2010, 21:16:17 pm »
Thanks for the replies

The sh404SEF forum has advised me that the problem is a VM problem. However, I will consider the Google advice about rewriting URLs.

I did change some code in the flypage to get rid of the pop= issue, but it didn't work. I found the code in a post on this forum.


Forrest

  • Full Member
  • ***
  • Posts: 1972
  • Me and my baby
    • Web Developer
Re: Google crawl errors
« Reply #4 on: March 31, 2010, 21:30:34 pm »
A VM problem? My instinct is: not likely. I have not encountered this issue with VM without SEF, nor with SEF using other SEF components.

Therefore it seems to me like a sh404SEF problem. After all, it is sh404SEF that is rewriting your URLs.

stinga

Re: Google crawl errors
« Reply #5 on: April 01, 2010, 11:55:34 am »
G'day,

To be honest, this sort of error is not going to be easy to fix via forums; there are too many variables.
I shouldn't say this (since I am currently unemployed), but if you give me access I can have a look.

There are two parts to this:
1 - the URL rewrite
2 - Google getting the URL.

Where is Google getting the URL from? It might not be a problem with your site at all.
Are you using a sitemap?

sh404SEF works fine for me; the major problem was with sefservicemap2, which creates duplicate URLs and other junk and slows your site down.
Stinga.

zazu

Re: Google crawl errors
« Reply #6 on: April 01, 2010, 13:50:42 pm »
My site has turned to custard in the past few hours.

I disabled sh404SEF and everything worked as it should. I read about the hack for VM meta tags, then downloaded and installed it. It seemed to work fine. Then I read about router.php, which worked OK up to a point: I got internal server errors with .htaccess, so I disabled both that and router.php.

The problem now is that my browse pages are all blank.

Forrest

Re: Google crawl errors
« Reply #7 on: April 01, 2010, 21:52:00 pm »
I suggest you re-enable .htaccess, then try removing the router.php file. Wait a few minutes and try navigating the site.


zazu

Re: Google crawl errors
« Reply #8 on: April 01, 2010, 22:07:27 pm »
Thanks

I get an Internal Server Error when I enable .htaccess - I am waiting for advice from my host.

I've since purged the Joomla cache and disabled caching and the site seems to have returned to normal.

I'll keep testing and reload the meta tag hack, as that was a super innovation.

zazu

Re: Google crawl errors
« Reply #9 on: April 09, 2010, 01:21:55 am »
I am gradually fixing these crawl errors, but I have one that is proving somewhat elusive to track down. I get a 404 error on this URL:

http://www.mysite.co.nz/index.html and Google says there are 100 pages linking to it. If I look at the source code of any of these 100 pages, there is no ../index.html. There are, however, plenty of references to ../index.php.
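Searching the source for the literal text "index.html" can miss links that only resolve to it once the browser (or crawler) applies the page address. A rough Python sketch that collects every href on a page and reports those that end up at /index.html; the page address and HTML snippet below are invented test data, and the sketch ignores any <base href> tag, which would change how relative links resolve.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect href values from a, area and link tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag in ("a", "area", "link"):
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_resolving_to(page_url, html, target_path="/index.html"):
    """Return (href, absolute URL) pairs whose resolved path is target_path."""
    collector = LinkCollector()
    collector.feed(html)
    hits = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)
        if urlsplit(absolute).path == target_path:
            hits.append((href, absolute))
    return hits


# Invented sample: one page in a subfolder with three links.
SAMPLE_PAGE = "http://www.mysite.co.nz/shop/page.html"
SAMPLE_HTML = (
    '<a href="../index.html">Home</a>'
    '<a href="../index.php">Shop</a>'
    '<a href="index.html">Section</a>'
)
for href, absolute in links_resolving_to(SAMPLE_PAGE, SAMPLE_HTML):
    print(href, "->", absolute)
```

Feeding it the saved source of a few of the 100 pages Google lists should show whether any link really lands on /index.html, or whether the report comes from somewhere other than the current page source.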

Forrest

Re: Google crawl errors
« Reply #10 on: April 09, 2010, 02:30:46 am »
Well, an easy fix is to get the Jredirect plugin (on the JED) and redirect index.html to index.php.