Forum Moderators: Robert Charlton & goodroi


Matt Cutts: Adding Too Many URLs Triggers A Flag!

Shouldn't We Be More Careful When Adding New Content?

         

reseller

10:19 pm on Sep 3, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hi Folks

I recall discussing last year whether adding too many pages suddenly might trigger a flag or some kind of "Sandboxing". And we were guessing at that time.

However, our kind fellow member Matt Cutts recently posted a very interesting remark on his blog [mattcutts.com] which might confirm what we were guessing:

"We saw so many urls suddenly showing up on spaces.live.com that it triggered a flag in our system which requires more trust in individual urls in order for them to rank (this is despite the crawl guys trying to increase our hostload thresholds and taking similar measures to make the migration go smoothly for Spaces). We cleared that flag, and things look much better now."

So it seems that we should be very careful in future when adding too many pages at the same time, otherwise sandboxing of our new pages would be a high possibility!

Thoughts?

Alex70

1:47 pm on Sep 5, 2006 (gmt 0)

10+ Year Member



>>Imagine if you add an "Add," "Rate," "Comment," or "Send" feature and your pages quadrupled?<<

The problem is with "unnatural" growth I believe. Add comments or rate wont make your site double at once.
Imagine a linking campaign: you have something like 100 BLs. Would you add another 100 BLs at once?
My policy is to stay far, far away from Google's radar.
only my 0,002

walkman

2:18 pm on Sep 5, 2006 (gmt 0)



>> Add comments or rate wont make your site double at once.

sure it would. Each page would have that...

Aforum

2:19 pm on Sep 5, 2006 (gmt 0)

10+ Year Member



"Add comments or rate wont make your site double at once. "

It will if you unknowingly include those links in your sitemap. I think everyone understands that if you add loads of BLs it's going to cause a problem, but that really isn't what the discussion is about. It's about adding thousands of pages to your own site with or without a BL campaign.

Technically if you have a site that has 500 pages out of 2000-5000 pages indexed because you either didn't have proper titles, meta's ,etc... and you fix this site, submit a sitemap and add thousands of pages it certainly doesn't look natural.

iProgram

2:53 pm on Sep 5, 2006 (gmt 0)

10+ Year Member



So how do you release a new content site? A new content site is built on a dev computer, and its pages are linked to each other for reference. We have to build some content before releasing it, so that there won't be many empty categories. Now we hear G will flag this new site, so how do you release a new site? Add 1 page per day and send 404s to G bot?

[edited by: iProgram at 2:54 pm (utc) on Sep. 5, 2006]

Aforum

3:08 pm on Sep 5, 2006 (gmt 0)

10+ Year Member



If you believe Matt, just launch it.

Here is a prime example of the paranoia that Google has created with their constant updates, beta programs, lackluster public relations and a host of other problems.

People aren't even confident in following instructions straight from Matt Cutts anymore.

[edited by: Aforum at 3:09 pm (utc) on Sep. 5, 2006]

caveman

4:18 pm on Sep 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



reseller, IMHO, your comments reflect a good understanding of the situation. And without getting into great detail, I can confirm that the issue is real. Also, the reason some did not believe it for a long time is that -- as is true with so many of G's algo elements -- things are co-dependent. So the addition of 1,000 pages for site A will not have the same effect as the addition of 1,000 pages for site B, and the difference is not just limited to the pre-existing number of pages on each site. Many other factors are involved.

Everything is relative. ;-)

europeforvisitors

5:05 pm on Sep 5, 2006 (gmt 0)



Funny sentence, don't you think? I could swear someone has been telling us that Google results changes were completely automated...

Penalties (or the removal of penalties) aren't completely automated. If they were, Matt Cutts and GoogleGuy wouldn't keep mentioning "reinclusion requests."

Here is a prime example of the paranoia that Google has created with their constant updates, beta programs, lackluster public relations and a host of other problems.

I don't think Google or Matt Cutts can be blamed for the fact that some Webmasters have reasons for not wanting to draw attention to themselves, or that others have guilty consciences. :-)

theBear

5:49 pm on Sep 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>>>

Penalties (or the removal of penalties) aren't completely automated. If they were, Matt Cutts and GoogleGuy wouldn't keep mentioning "reinclusion requests."

<<<<

Not to even mention actual foulups in Google's code, there has to be some way to both detect the problem and to provide a means of mitigating the effects of the problem.

Software is well known for having "bugs" (actually wetware [wetware be us] brain f@rts) and hardware sometimes can have undetected failures (frequently because of wetware issues as well).

ashear

6:28 pm on Sep 5, 2006 (gmt 0)

10+ Year Member



I can tell you that if you release too many pages at once, it will hurt. You will get traffic for about a month, then it will vanish.

Aforum

7:40 pm on Sep 5, 2006 (gmt 0)

10+ Year Member



"I don't think Google or Matt Cutts can be blamed for the fact that some Webmasters have reasons for not wanting to draw attention to themselves, or that others have guilty consciences. :-) "

Would you care to explain that?

I didn't blame Matt Cutts. I actually posted what he said in his video, which states you will NOT get penalized unless it's extreme.

[edited by: Aforum at 7:57 pm (utc) on Sep. 5, 2006]

trinorthlighting

9:19 pm on Sep 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, releasing too many pages at once will kill a site. You have to look at spam and how it's created: most of it is automated, and a true spam site can create hundreds or thousands of pages a day.

If you're an old site like webmasterworld.com that has trust rank, I would not worry. If you are a new site, then it will take time.

That is nothing new in Google and it has been that way for a while. Some people call it the sandbox...

Aforum

9:33 pm on Sep 5, 2006 (gmt 0)

10+ Year Member



I understand that, but Matt says otherwise, hence the dilemma.

photopassjapan

9:59 pm on Sep 5, 2006 (gmt 0)

10+ Year Member



Hmm.

I recently launched a site into a domain that had PR0.
Thousands of pages. All of them were indexed, and since there is actual content, they come up for some things but...

Even after being indexed several times, the PR won't move and backlinks won't show. The original PR0 state was due to a completely irrelevant admin panel with 0 backlinks that Googlebot saw accidentally (pre-launch).

Thought that the site's PR might go up once it's indexed with its real and relevant content, through the links that point to it... or once it's reindexed... or indexed for the 3rd or 4th time now... ( almost three months have passed ), but nothing happens.

This could be the problem.
I mean the too many pages at once flag.
Or dupe content. Or how could anyone be sure.

pontifex

10:15 pm on Sep 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There are a lot of IFs and COULDs, but the average gut feeling indicates that most of you believe that adding 1000s of pages even to established sites can hurt... That would mean that Google is actually slowing down the whole web: first it gathers everybody who searches for something, then it punishes the publishers of large portions of content. A customer of mine (a large German publishing house) has an archive unseen on the Web so far. If published, it would create 2-3 million new pages on his domain, which has 2-3 thousand pages right now. That content is highly unique; parts of it can only be accessed from German universities if you pay for it. Maybe we play it safe and publish it in waves of 4 weeks over a period of a year? That is a pain!

P!
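
To put rough numbers on the "waves of 4 weeks over a year" idea, here is a minimal sketch. It assumes a 52-week year split into 4-week waves (13 waves) and uses the 2-3 million page range from the post; the function name `pages_per_wave` and its parameters are made up for illustration and say nothing about any threshold Google actually uses.

```python
import math


def pages_per_wave(total_pages: int, weeks_total: int = 52, weeks_per_wave: int = 4) -> int:
    """Pages to publish per wave so the whole archive is live by the last wave.

    Hypothetical helper: the wave length and total duration are assumptions
    taken from the post, not known limits of any search engine.
    """
    waves = weeks_total // weeks_per_wave  # 52 // 4 = 13 waves
    return math.ceil(total_pages / waves)  # round up so no pages are left over


if __name__ == "__main__":
    for total in (2_000_000, 3_000_000):
        print(f"{total:,} pages -> {pages_per_wave(total):,} per wave")
```

Under those assumptions each wave would still carry roughly 150,000 to 230,000 new URLs, on a domain that currently has 2-3 thousand pages.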

fischermx

10:19 pm on Sep 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I will launch a yellow-pages-like website within a few months. It will be preloaded with a couple hundred thousand listings ...
I am dead, right?

trinorthlighting

10:29 pm on Sep 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Another yellow pages type of site? There are already a ton of them.

europeforvisitors

10:34 pm on Sep 5, 2006 (gmt 0)



A customer of mine (a large German publishing house) has an archive unseen on the Web so far. If published, it would create 2-3 million new pages on his domain, which has 2-3 thousand pages right now. That content is highly unique; parts of it can only be accessed from German universities if you pay for it. Maybe we play it safe and publish it in waves of 4 weeks over a period of a year? That is a pain!

It wouldn't be surprising if "trustrank" were a factor. And why not? An established publishing house probably deserves to be cut more slack than yet-another-new-keyword-driven-user-review-site-with-a-million-empty-pages.com.

fischermx

10:36 pm on Sep 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Not in my country, my dear!
Even the yellow pages sponsored by the monopoly national telephone company is lame!
I have a chance! LOL

hutchins13

10:43 pm on Sep 5, 2006 (gmt 0)

10+ Year Member



caveman,

I'm curious to know if your site discussed in the 2004 thread "Can a Load of New Pages Hurt an Existing Site?" recovered. If so, did you make further changes which may have helped or wait it out and how long did it take?

caveman

11:57 pm on Sep 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes it did eventually. Certain kinds of links were the solution IMHO, because they served to confirm for G that the site had validity. (We are sure that it was not dup filters that caused the issue outlined in that thread.) But, you should not take that to mean that links cure all, they don't. I believe it's all about seeming 'natural' ... or as my team refers to it, "mapping the category."

The more a site in a given category fails to conform to certain averages for the category, the greater the risk. And generally there are some transcendent ways that a site can create risk for itself: too many links too fast, too many pages too fast, and moving to a new domain are three examples. All of these are high risk under certain conditions. All of them are doable, within certain parameters. CNN, for example, will not disappear from the SERPs if they move to a new domain. ;-)

Aforum

12:34 am on Sep 6, 2006 (gmt 0)

10+ Year Member



"There are a lot of IFs and COULDs, but the average gut feeling indicates that most of you believe that adding 1000s of pages even to established sites can hurt..."

I agree and I also think the subject matter is starting to widen a bit.

There certainly is a major difference between starting a new site on a new domain with thousands of pages and an established site adding pages. I was more concerned about the latter because I believe starting a new site has many more obstacles to overcome, and the number of pages is low on the list. I think it's going to have the sandbox effect whether you have 50 or 1000 pages, so basing an opinion on new sites isn't conclusive. Even then you have the curve ball that the domain spammers throw at you that blows the sandbox effect out of the water.

I think the biggest question I have is what type of trigger is there to established sites. You could have a site that basically ignored the sitemap program and then all of a sudden submits thousands of links, adding thousands of pages of content. The question still remains: is there a penalty based on the number of pages? Is it solely based on trustrank? Popular opinion here seems to think yes; Matt Cutts specifically says no. Who would you believe? :)

[edited by: Aforum at 12:35 am (utc) on Sep. 6, 2006]

caveman

1:24 am on Sep 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Technically if you have a site that has 500 pages out of 2000-5000 pages indexed because you either didn't have proper titles, meta's ,etc... and you fix this site, submit a sitemap and add thousands of pages it certainly doesn't look natural.

Let's be clear on terms. If you have a site with 2000-5000 pages and many pages don't have proper titles or meta's, there is a very good chance that the pages are all still indexed. You can check if they are indexed by simply searching on the URL. Fixing the titles and meta's is almost certain to help the site overall.

what type of trigger is there to established sites. Is it solely based on trustrank?.

Only G knows for sure. But it's pretty clear, if one has been paying attention to what has been happening in the last several years and what the SEs say at the conferences, etc., that adding too many pages too fast can trigger filters or algo elements that either hurt a site, or raise red flags for further inspection.

It is also very clear that the signals of quality being given off by a site by virtue of its age, size, growth pattern, backlink quality, and many other factors, have an impact on lots of things, almost certainly including the issue being discussed in this thread.

Do we think that if a high quality affiliate site quadrupled in size tomorrow that it would suffer in some respects? Eh, probably. Maybe even very probably.

Do we think that if the White House's site quadrupled in size tomorrow that it would lose rankings? Eh, prolly not. ;-)

Aforum

3:31 am on Sep 6, 2006 (gmt 0)

10+ Year Member



I agree with a lot of what you are saying, and considering the number of variables involved I do agree that many of the problems are webmaster generated. The problem that I see now is that even though MC has attempted to clear up a lot of issues with his current collection of videos, especially on this specific topic, it's falling on deaf ears or people simply don't believe him.

The same question remains. Who do you believe?

It would be nice to get some sort of confirmation on a range pertaining to the topic. Is that vague enough? :)

caveman

3:38 am on Sep 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It would be nice to get some sort of confirmation on a range pertaining to the topic.

Ah, yes. Well, mathematics is a good field for that. Very related to SEO too. ;-)

whitenight

3:51 am on Sep 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Do we think that if the White House's site quadrupled in size tomorrow that it would lose rankings? Eh, prolly not

Someone want to tell the geniuses over at G that this is exactly what the spammers are doing?

Currently I see 3 "trusted" .edu domains spamming and ranking with (hundreds of) thousands of newly added pages.

Oh wait. One doesn't need a .edu to rank with millions of new pages, do they? (cough. bad data push 5 million page subdomain spam)

I'm unsure why people continue to listen to what MC says, when the results are right there in the SERPS on what works and doesn't work.

If MC knew, there wouldn't be any spam ranking, would there?

caveman

4:03 am on Sep 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Infinite chess game, white knight. ;-)

tedster

4:09 am on Sep 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think many people who listen to Matt are those who are trying to rank "legitimately" in Google. That is, they are mostly concerned about sending false positives, or getting caught accidentally in a Google net intended to catch spammers. This is one good reason to listen to what Matt says -- there is usually enough meat there to at least help you steer clear of big problem areas, and sometimes problem areas that only recently rose in importance.

In this case, one comment that is worth paying some attention to, I think, is

So this is not something that a typical site owner needs to think about or worry about if they're not adding hundreds of thousands or millions of URLs very quickly.

Here he gives some sense of the scale that would trip this particular flag -- and it's not in the thousands of urls, or even the tens of thousands. Certainly seasonal changes at completely legitimate ecommerce sites can require a couple thousand new urls at once. And Google would also have a historical record of that kind of seasonal change to help boost their confidence and trust.

And then there's the phrase "typical site owner". That may be just a bit cloudier in meaning. But the first thing that comes to mind for me is that it is NOT common to introduce 6 or 7 digits worth of new urls at one time. I've worked with some pretty big companies with lots of indexed URLs, and I can't recall any new release on a domain that ever approached that level. First of all, marketing and editorial would never have the resources to oversee and approve that much new content all at once.

theBear

4:24 am on Sep 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Caveman

Causality breakdown, operation failure, event ordering error .... bad data pop. ;)

[edited by: theBear at 4:25 am (utc) on Sep. 6, 2006]

caveman

4:37 am on Sep 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



theBear, now, there's no need to get into my personal life. :P

whitenight

4:50 am on Sep 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think many people who listen to Matt are those who are trying to rank "legitimately" in Google

And I say, they'd be better off meticulously analyzing "what works" and "what doesn't" by examining the SERPS.
MC has said a lot of things about "what to do" and "what not to do".

And the list of things that G is "supposedly" penalizing for but still actually work is much longer than the things they warn about and sites get punished for.

By no means am I promoting a "black hat" philosophy, but let's get real here for a sec. I'm tired of the follow-the-G-rules mindset of some of the webmasters here wondering why MC's latest "warning" hasn't been applied to their competitors while their lily white site (or grey) is tanking.

Cause it's simply a lot of bark vs. bite and scare tactics.

If more people would take MC's comments with a grain (or two) of salt and then look for actual proof that the newest algo is applying those new standards, that would save most webmasters here a lot of grief - worrying and fretting over the interpretation of MC's comments.

This whole discussion is case in point.
"How much is too much?"
"With what (unknown) Trustrank can I add X number of pages?"

By definition, if one is trying to "rank legitimately" and needs to add 100k "legitimate" pages to their 2k page site, then I say do it.
Those pages will rank as they should in time. Sandboxed or not.
