?cff_ pattern in crawled addresses for sitemap

Started by NoOneLt, March 21, 2020, 09:49:27 AM


NoOneLt

Hello, can someone help me understand why the crawler picks up strange addresses on my site? There are a lot of these links, and I would like to understand the reason.

I'm using the latest Joomla and VM versions as of today.

<url><loc>https://www.mysite.lt/products?cff_3%5B0%5D=536175736169206f646169</loc><lastmod>2020-03-21</lastmod></url>
<url><loc>https://www.mysite.lt/products?cff_3%5B0%5D=2050726965c5a1206f646f732073656ec4976a696dc485</loc><lastmod>2020-03-21</lastmod></url>
<url><loc>https://www.mysite.lt/products?cff_3%5B0%5D=204d69c5a172696169204f646169</loc><lastmod>2020-03-21</lastmod></url>
<url><loc>https://www.mysite.lt/products?cff_3%5B0%5D=204272616e64c5be696169206f646169</loc><lastmod>2020-03-21</lastmod></url>
<url><loc>https://www.mysite.lt/products?cff_3%5B0%5D=20506c61756b616d73</loc><lastmod>2020-03-21</lastmod></url>
<url><loc>https://www.mysite.lt/products?cff_3%5B0%5D=2050726965c5a1207069676d656e746163696ac485</loc><lastmod>2020-03-21</lastmod></url>
<url><loc>https://www.mysite.lt/products?cff_3%5B0%5D=204f646f7320c5a176656974696d6173</loc><lastmod>2020-03-21</lastmod></url>
<url><loc>https://www.mysite.lt/products?cff_3%5B0%5D=2050726965c5a120616b6ec49920</loc><lastmod>2020-03-21</lastmod></url>
<url><loc>https://www.mysite.lt/products?cff_9%5B0%5D=20c5a0616d70c5ab6e6173</loc><lastmod>2020-03-21</lastmod></url>
<url><loc>https://www.mysite.lt/products?cff_9%5B0%5D=20536572756d6173</loc><lastmod>2020-03-21</lastmod></url>
<url><loc>https://www.mysite.lt/products?cff_9%5B0%5D=205069656e656c6973</loc><lastmod>2020-03-21</lastmod></url>
<url><loc>https://www.mysite.lt/products?cff_9%5B0%5D=4b72656d6173</loc><lastmod>2020-03-21</lastmod></url>
<url><loc>https://www.mysite.lt/products?cff_9%5B0%5D=204b61756bc497</loc><lastmod>2020-03-21</lastmod></url>

Thank you!


NoOneLt

Not mine, but the same template. :)

And it seems you are right, it's filtering.
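
For anyone wondering what the long values are: they look like hex-encoded UTF-8 text, i.e. the filter option labels. Below is a minimal Python sketch for decoding them. It assumes every `cff_` value is plain hex-encoded UTF-8, which holds for the URLs quoted above but is not confirmed by the filter's documentation. The function name `decode_cff` is made up for illustration.

```python
from urllib.parse import urlsplit, parse_qsl

def decode_cff(url):
    """Return {parameter: decoded text} for every cff_ query parameter.

    Assumption: each value is hex-encoded UTF-8 (true for the sample URLs).
    """
    decoded = {}
    # parse_qsl also percent-decodes the key, so cff_9%5B0%5D becomes cff_9[0]
    for key, value in parse_qsl(urlsplit(url).query):
        if key.startswith("cff_"):
            decoded[key] = bytes.fromhex(value).decode("utf-8")
    return decoded

print(decode_cff("https://www.mysite.lt/products?cff_9%5B0%5D=4b72656d6173"))
# prints {'cff_9[0]': 'Kremas'}
```

So each of these URLs is just the product list with one filter option preselected, which is why the crawler finds so many of them.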

GJC Web Design

Yes, it's the VirtuePlanet filter thing.

try

Disallow: /*?cff

in your robots.txt


Ventsi Genchev

I had a similar problem, so I use the following in robots.txt to keep the unwanted sorting and pagination URLs from being crawled.

Disallow: /?start=*
Disallow: /*by,product_name*
Disallow: /*by,created_on*
Disallow: /*by,product_price*
Disallow: /*dirDesc*
Disallow: /*dirAsc*
Disallow: /*results,*


Of course, by,xxxx depends on the sorting options used.
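
To check which URLs a set of rules like the ones in this thread would block, here is a rough Python sketch of wildcard matching. It assumes the common crawler semantics: a rule is matched from the start of the path, `*` matches any run of characters, and a trailing `$` anchors the end. It ignores `Allow` rules, `User-agent` groups and percent-encoding normalisation, and Python's built-in `urllib.robotparser` is not used because it does not expand `*` wildcards. The helper names are made up for illustration.

```python
import re

def rule_to_regex(rule):
    """Translate one Disallow pattern into a compiled regex (sketch only)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal parts and let each * match any run of characters
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path, rules):
    """True if any rule matches the path (including query) from its start."""
    return any(rule_to_regex(rule).match(path) for rule in rules)

rules = [
    "/*?cff",
    "/?start=*",
    "/*by,product_name*",
    "/*by,created_on*",
    "/*by,product_price*",
    "/*dirDesc*",
    "/*dirAsc*",
    "/*results,*",
]

print(is_disallowed("/products?cff_9%5B0%5D=4b72656d6173", rules))
print(is_disallowed("/products/by,product_price", rules))
print(is_disallowed("/products", rules))
```

Under these assumptions, `/?start=*` only matches URLs that begin with `/?start=`, so a paginated category such as `/products?start=20` would not be covered by it; a pattern like `/*?start=` would be needed for that.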