Forum Moderators: Robert Charlton & goodroi


How can I fix abnormal Googlebot crawling behavior on my site?

         

matthee

7:04 am on Jun 3, 2016 (gmt 0)

10+ Year Member



Hi all,

My site is seeing abnormal crawling behavior from Googlebot:
Within a single day, the crawlers crawl continuously for a few hours, then stop for a few hours and crawl nothing. This pattern repeats often.
What puzzles me is that the bots sometimes crawl a considerable number of pages in a few hours (e.g., 10 requests in one second), then stop for a few hours or slow down.
This pattern looks very unhealthy for my site, but I don’t know why it happens. It has also already affected indexing, and traffic is unstable.

Logs and crawl data are as follows:
[lh3.googleusercontent.com...]

[lh3.googleusercontent.com...]

Is there something wrong with my site? I ask because crawling on my other site is relatively normal: the crawlers crawl it continuously. Have you encountered this situation? How should I analyze my site?
PS: I tried submitting sitemaps to Google on May 8, but it had no positive effect.
[lh3.googleusercontent.com...]

Thanks for discussing this question.

aakk9999

12:23 pm on Jun 3, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have seen such spikes in cases where Googlebot discovers something new on the site and it is then busy consuming all these new pages. For example, launching a new sub-section on the site or as a result of site redevelopment when URLs change.

What does look odd in the crawl stats is that the number of pages crawled per day goes from 500-ish (which seems to be the level most of the time) to over 140,000 when there is a spike.

Here are some questions:
- How many pages does the site have?
- The URLs that are crawled when these spikes occur: are they all distinct URLs, or is the same URL crawled repeatedly?
- Do you maybe have (or did you have) a technical problem of a so-called "infinite URL space", where a relative link within the "href=" causes the page to create a huge number of URLs for the same page (e.g. example.com/?parm=1&parm=1&parm=1)?

In other words, I would worry first about the huge difference in the number of pages crawled rather than the actual timing of the crawls.
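
If it helps to answer the questions above from the raw server logs, here is a rough Python sketch. It assumes the logs are in the common Apache/nginx "combined" format, and it picks out Googlebot simply by the user-agent string (which can be spoofed). The names `LOG_LINE`, `has_repeated_param` and `summarize` are made up for this example. It counts Googlebot hits per hour, counts how often each URL is requested, and flags URLs where the same parameter name repeats, as in the example.com/?parm=1&parm=1&parm=1 case.

```python
import re
from collections import Counter
from datetime import datetime
from urllib.parse import urlsplit, parse_qsl

# Assumption: Apache/nginx "combined" log format. Adjust the pattern if yours differs.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] "(?:GET|HEAD|POST) (?P<url>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def has_repeated_param(url):
    """Return True if the same query parameter name appears more than once."""
    keys = [k for k, _ in parse_qsl(urlsplit(url).query, keep_blank_values=True)]
    return len(keys) != len(set(keys))


def summarize(lines):
    """Return (hits per hour, hits per URL, URLs with repeated parameters)."""
    per_hour = Counter()
    per_url = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        # Skip unparseable lines and anything that does not claim to be Googlebot.
        if not m or "Googlebot" not in m.group("ua"):
            continue
        ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        per_hour[ts.strftime("%Y-%m-%d %H:00")] += 1
        per_url[m.group("url")] += 1
    suspicious = sorted(u for u in per_url if has_repeated_param(u))
    return per_hour, per_url, suspicious
```

To use it, pass an open log file, e.g. `summarize(open("access.log"))` (the file name is a placeholder). Comparing `len(per_url)` with `sum(per_url.values())` for a spike day shows whether the spike is many distinct URLs or the same URLs fetched again and again.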

tangor

12:14 am on Jun 4, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Is this a CMS or a hand-coded site? Sounds like a configuration error in a CMS to me.

3zero

12:50 am on Jun 4, 2016 (gmt 0)



Yeah, a lot more detail on the site please, in order to help. I suspect Googlebot is crawling parameters, and you can control that in WSC.
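
One way to check the parameter theory before changing any settings is to tally which query parameter names show up in the URLs Googlebot requested. This is a minimal sketch: `param_usage` is a made-up helper name, and it expects a plain list of crawled URLs, however you pull those out of your logs.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def param_usage(urls):
    """Count in how many crawled URLs each query parameter name appears."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # Use a set so a name repeated inside one URL is counted once for that URL.
        names = {k for k, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts
```

Calling `param_usage(crawled_urls).most_common(10)` lists the most frequent parameter names. If a handful of them account for most of the crawled URLs, those are the ones to look at first.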