Forum Moderators: Robert Charlton & goodroi
I recall discussing last year whether adding too many pages suddenly might trigger a flag or some kind of "Sandboxing". And we were guessing at that time.
However, our kind fellow member Matt Cutts recently posted a very interesting remark on his blog [mattcutts.com] which might confirm what we were guessing:
"We saw so many urls suddenly showing up on spaces.live.com that it triggered a flag in our system which requires more trust in individual urls in order for them to rank (this is despite the crawl guys trying to increase our hostload thresholds and taking similar measures to make the migration go smoothly for Spaces). We cleared that flag, and things look much better now."
So it seems we should be very careful in future about adding too many pages at the same time; otherwise there is a high possibility that our new pages get sandboxed!
Thoughts?
The problem is with "unnatural" growth, I believe. Adding comments or ratings won't make your site double in size at once.
Imagine a linking campaign: if you have something like 100 BLs, would you add another 100 BLs at once?
My policy is to be far, far, away from google radar.
only my 0,002
Sure it would. Each page would have that...
It will if you unknowingly include those links in your sitemap. I think everyone understands that if you add loads of BLs it's going to cause a problem, but that really isn't what the discussion is about. It's about adding thousands of pages to your own site, with or without a BL campaign.
Technically, if you have a site with only 500 of its 2000-5000 pages indexed because you didn't have proper titles, metas, etc., and you fix the site, submit a sitemap and add thousands of pages, it certainly doesn't look natural.
[edited by: iProgram at 2:54 pm (utc) on Sep. 5, 2006]
Here is a prime example of the paranoia that Google has created with their constant updates, beta programs, lackluster public relations and a host of other problems.
People aren't even confident in following instructions straight from Matt Cutts anymore.
[edited by: Aforum at 3:09 pm (utc) on Sep. 5, 2006]
Everything is relative. ;-)
Funny sentence, don't you think? I could swear someone has been telling us that Google results changes were completely automated...
Penalties (or the removal of penalties) aren't completely automated. If they were, Matt Cutts and GoogleGuy wouldn't keep mentioning "reinclusion requests."
Here is a prime example of the paranoia that Google has created with their constant updates, beta programs, lackluster public relations and a host of other problems.
I don't think Google or Matt Cutts can be blamed for the fact that some Webmasters have reasons for not wanting to draw attention to themselves, or that others have guilty consciences. :-)
Penalties (or the removal of penalties) aren't completely automated. If they were, Matt Cutts and GoogleGuy wouldn't keep mentioning "reinclusion requests."
Not to mention actual foul-ups in Google's code: there has to be some way both to detect the problem and to mitigate its effects.
Software is well known for having "bugs" (actually wetware [wetware be us] brain f@rts) and hardware sometimes can have undetected failures (frequently because of wetware issues as well).
Would you care to explain that?
I didn't blame Matt Cutts. I actually posted what he said in his video, which states that you will NOT get penalized unless it's extreme.
[edited by: Aforum at 7:57 pm (utc) on Sep. 5, 2006]
If you're an old site like webmasterworld.com that has TrustRank, I would not worry. If you are a new site, then it will take time.
That is nothing new in Google, and it has been that way for a while. Some people call it the sandbox...
I recently launched a site into a domain that had PR0.
Thousands of pages. All of them were indexed, and since there is actual content, they come up for some things but...
Even after being indexed several times, the PR won't move and backlinks won't show. The original PR0 state was due to a completely irrelevant admin panel with zero backlinks that Googlebot saw accidentally (pre-launch).
I thought the site's PR might go up once it was indexed with its real and relevant content, through the links that point to it... or once it was reindexed... or indexed for the 3rd or 4th time now (almost three months have passed), but nothing happens.
This could be the problem.
I mean the too many pages at once flag.
Or dupe content. Or how could anyone be sure.
P!
A customer of mine (a large German publishing house) has an archive unseen on the Web so far. Published, it would create 2-3 million new pages on his domain, which has 2-3 thousand pages right now. That content is highly unique; parts of it can only be accessed from German universities if you pay for it. Maybe we play it safe and publish it in waves every 4 weeks over a period of a year? That is a pain!
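For what it's worth, here is a back-of-the-envelope sketch of the "waves" idea in the post above. It only does the arithmetic: the function name `plan_waves` and the assumption of equal-sized waves over a 52-week year are mine, not anything Google or the poster has specified, and nobody outside Google knows where the actual threshold sits.

```python
def plan_waves(total_pages, weeks_total=52, weeks_per_wave=4):
    """Split total_pages into equal-sized publication waves (hypothetical helper)."""
    # Assumption: a "year" is 52 weeks, so 4-week waves give 13 waves.
    waves = weeks_total // weeks_per_wave
    base, extra = divmod(total_pages, waves)
    # The first `extra` waves carry one additional page so the sum is exact.
    return [base + 1 if i < extra else base for i in range(waves)]


for total in (2_000_000, 3_000_000):
    sizes = plan_waves(total)
    print(f"{total:,} pages -> {len(sizes)} waves, largest wave {max(sizes):,} pages")
```

Even split into 13 waves, each wave would still add roughly 150,000 to 230,000 URLs to a domain that has 2-3 thousand pages today, so staging by itself does not turn this into small, gradual growth.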
It wouldn't be surprising if "trustrank" were a factor. And why not? An established publishing house probably deserves to be cut more slack than yet-another-new-keyword-driven-user-review-site-with-a-million-empty-pages.com.
The more a site in a given category fails to conform to certain averages for the category, the greater the risk. And generally there are some transcendent ways that a site can create risk for itself: too many links too fast, too many pages too fast, and moving to a new domain are three examples. All of these are high risk under certain conditions. All of them are doable within certain parameters. CNN, for example, will not disappear from the SERPs if they move to a new domain. ;-)
I agree and I also think the subject matter is starting to widen a bit.
There certainly is a major difference between starting a new site on a new domain with thousands of pages and an established site adding pages. I was more concerned about the latter, because I believe starting a new site has many more obstacles to overcome and the number of pages is low on the list. I think it's going to have the sandbox effect whether you have 50 or 1000 pages, so basing an opinion on new sites isn't conclusive. Even then you have the curve ball that the domain spammers throw at you, which blows the sandbox effect out of the water.
I think the biggest question I have is what type of trigger there is for established sites. You could have a site that basically ignored the sitemap program and then all of a sudden submits thousands of links, adding thousands of pages of content. The question still remains: is there a penalty based on the number of pages? Is it solely based on TrustRank? Popular opinion here seems to think yes; Matt Cutts specifically says no. Who would you believe? :)
[edited by: Aforum at 12:35 am (utc) on Sep. 6, 2006]
Technically, if you have a site with only 500 of its 2000-5000 pages indexed because you didn't have proper titles, metas, etc., and you fix the site, submit a sitemap and add thousands of pages, it certainly doesn't look natural.
what type of trigger there is for established sites. Is it solely based on TrustRank?
It is also very clear that the signals of quality being given off by a site by virtue of its age, size, growth pattern, backlink quality, and many other factors, have an impact on lots of things, almost certainly including the issue being discussed in this thread.
Do we think that if a high quality affiliate site quadrupled in size tomorrow that it would suffer in some respects? Eh, probably. Maybe even very probably.
Do we think that if the White House's site quadrupled in size tomorrow that it would lose rankings? Eh, prolly not. ;-)
The same question remains. Who do you believe?
It would be nice to get some sort of confirmation on a range pertaining to the topic. Is that vague enough? :)
Do we think that if the White House's site quadrupled in size tomorrow that it would lose rankings? Eh, prolly not
Someone want to tell the geniuses over at G that this is exactly what the spammers are doing?
Currently I see 3 "trusted" .edu domains spamming and ranking with (hundreds of) thousands of newly added pages.
Oh wait. One doesn't need a .edu to rank with millions of new pages, do they? (cough. bad data push 5 million page subdomain spam)
I'm unsure why people continue to listen to what MC says, when the results are right there in the SERPS on what works and doesn't work.
If MC knew, there wouldn't be any spam ranking, would there?
In this case, one comment that is worth paying some attention to, I think, is
So this is not something that a typical site owner needs to think about or worry about if they're not adding hundreds of thousands or millions of URLs very quickly.
Here he gives some sense of the scale that would trip this particular flag -- and it's not in the thousands of urls, or even the tens of thousands. Certainly seasonal changes at completely legitimate ecommerce sites can require a couple thousand new urls at once. And Google would also have a historical record of that kind of seasonal change to help boost their confidence and trust.
And then there's the phrase "typical site owner". That may be just a bit cloudier in meaning. But the first thing that comes to mind for me is that it is NOT common to introduce 6 or 7 digits worth of new urls at one time. I've worked with some pretty big companies with lots of indexed URLs, and I can't recall any new release on a domain that ever approached that level. First of all, marketing and editorial would never have the resources to oversee and approve that much new content all at once.
I think many people who listen to Matt are those who are trying to rank "legitimately" in Google.
And I say, they'd be better off meticulously analyzing "what works" and "what doesn't" by examining the SERPS.
MC has said a lot of things about "what to do" and "what not to do".
And the list of things that G is "supposedly" penalizing for but that still actually work is much longer than the list of things they warn about and sites actually get punished for.
By no means am I promoting a "black hat" philosophy, but let's get real here for a sec. I'm tired of the follow-the-G-rules mindset of some of the webmasters here, wondering why MC's latest "warning" hasn't been applied to their competitors while their lily-white (or grey) site is tanking.
'Cause it's simply a lot of bark vs. bite, and scare tactics.
If more people would take MC's comments with a grain (or two) of salt and then look for actual proof that the newest algo is applying those new standards, it would save most webmasters here a lot of grief from worrying and fretting over the interpretation of MC's comments.
This whole discussion is case in point.
"How much is too much?"
"With what (unknown) Trustrank can I add X number of pages?"
By definition, if one is trying to "rank legitimately" and needs to add 100k "legitimate" pages to their 2k-page site, then I say do it.
Those pages will rank as they should in time. Sandboxed or not.