Forum Moderators: Robert Charlton & goodroi


Matt Cutts: Adding Too Many URLs Triggers A Flag!

Shouldn't We Be More Careful When Adding New Contents?

         

reseller

10:19 pm on Sep 3, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hi Folks

I recall discussing last year whether adding too many pages suddenly might trigger a flag or some kind of "Sandboxing". And we were guessing at that time.

However, our kind fellow member Matt Cutts recently posted a very interesting remark on his blog [mattcutts.com] which might confirm what we were guessing:

"We saw so many urls suddenly showing up on spaces.live.com that it triggered a flag in our system which requires more trust in individual urls in order for them to rank (this is despite the crawl guys trying to increase our hostload thresholds and taking similar measures to make the migration go smoothly for Spaces). We cleared that flag, and things look much better now."

So it seems that we should be very careful in future when adding too many pages at the same time, otherwise sandboxing of our new pages would be a high possibility!

Thoughts?

walkman

5:21 am on Sep 6, 2006 (gmt 0)



>> A customer of mine (large german publishing house)
In your case, where the site is legit and well known, an email to Google, or any other form of contact will solve any potential problems. Google loves to have more good content indexed, and you have nothing to hide.

buckworks

5:25 am on Sep 6, 2006 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



but still actually work

It's wise to keep an eye on what works today, but it's even wiser to try to understand what will still work next week, or next year. That's where I want to focus my efforts.

I interpret Matt's comments as a signpost indicating the direction they're trying to go, rather than a milestone for how far they've come so far. If you take his advice you'll be fairly close to the right path a year from now.

reseller

11:04 am on Sep 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hi Folks

First off, it isn't my intention in this particular post to start a Google-Bashing campaign. Rather I wish to share with you a few thoughts and hopefully trigger some objective feedback.

It seems that the majority here agree that Google might sandbox sites that suddenly add a relatively big portion of files/contents.

Reasons for such action could be fighting back against spam and/or resolving capacity problems of Google-boxes. And of course there could be other reasons too.

Regardless of the reasons, the results of such a Google policy could be limiting the legitimate contents growth of legitimate websites. And accordingly limiting the growth and exchange of available information on the web. I.e. Google might be playing the role of Information-growth-Inhibitor instead of "to organize the world's information and make it universally accessible and useful", unfortunately. And I really doubt that Google's founders Larry Page and Sergey Brin will accept or encourage such a policy once they are made aware of it.

That matter could be a sign of a very serious threat to the availability and flow of information on the web. And I don't wish to accuse Google of anything at the moment. But I do wish that Google would pay serious attention to such a serious, or rather critical, matter.

Therefore we need at this point of discussion some feedback from Google employees as to how to resolve the problem of sandboxing legitimate sites once they suddenly add a relatively large number of legitimate files, thereby limiting the availability of legitimate contents on the web.

europeforvisitors

3:47 pm on Sep 6, 2006 (gmt 0)



...the results of such a Google policy could be limiting the legitimate contents growth of legitimate websites. And accordingly limiting the growth and exchange of available information on the web. I.e. Google might be playing the role of Information-growth-Inhibitor instead of "to organize the world's information and make it universally accessible and useful"

- Just because a site is "legitimate" doesn't mean its content is intrinsically valuable or unique. For Google, usable, uncluttered search results are more important than an indexed site's legitimacy as a business.

- SERPs, like most other things in life, involve trade-offs. If Google's experience and statistical analysis suggested that the addition of 10,000 or 100,000 new pages overnight was a negative "signal of quality," then it wouldn't be unreasonable for Google to judge those pages by a higher standard than it might do under normal circumstances.

- I personally doubt that a site is going to get whacked only because it added a bunch of pages at once--or even that large numbers of new pages from every site will be sandboxed. And if either of those things is happening, it's probably only until Google's profiling of "whole buncha pages added" sites is refined.

reseller

10:24 pm on Sep 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



SERPs, like most other things in life, involve trade-offs. If Google's experience and statistical analysis suggested that the addition of 10,000 or 100,000 new pages overnight was a negative "signal of quality," then it wouldn't be unreasonable for Google to judge those pages by a higher standard than it might do under normal circumstances.

You assume that those filters that trigger flags are intelligent ones, which I doubt is the case. IMO, we are dealing with very primitive filters which flag files based on proportional numbers .. not the quality of files.

In his article, Matt confirmed that it was a numbers game, not a quality one.

We saw so many urls suddenly showing up on spaces.live.com that it triggered a flag in our system

WebPixie

10:39 pm on Sep 6, 2006 (gmt 0)

10+ Year Member



I have some recent experience with this issue. A few months ago we added a database of local retail shops that sell products related to the products that we sell online. We went back and forth on this choice but eventually decided that it was a useful resource for the visitors to our site, so we added the database. The net result was adding around 15k pages to a site that previously had under 2k pages.

We had guessed that adding these pages might help with Y! and MSN but worried that Google would see it as spam. So far it appears that Google has not penalized us at all. We've even climbed a few positions on our main search phrase, though I obviously can't point only to the added pages as the reason. But it appears that adding 15k related pages to a site with under 2k pages does not trigger the alarm, at least in our case.

decaff

10:43 pm on Sep 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You assume that those filters that trigger flags are intelligent ones, which I doubt is the case. IMO, we are dealing with very primitive filters which flag files based on proportional numbers .. not the quality of files.

Google has historical data on every page and associated site...if a site (legitimate/established in the SERPs) suddenly shows an unusual spike of new urls being either crawled or submitted (who submits pages these days?)...then this could certainly trigger a filter (suppression, penalty, exclusion) and possibly even a manual review...as to what is going on with this established site and why there are suddenly 100,000 (example) new pages...

(How are these new pages adding value to the respective sector?...and from a usability perspective ... for positive click thru indicators from Google's SERPs-to-page traffic tracking techniques...)

Regarding the quality of the files...this can be determined when the files are crawled and run through the ranking algo variables...do these pages indicate over optimization...or are they set up for the human visitor? (easy for Google to get this)...

texasville

2:34 am on Sep 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This is only one more symptom of their flag throwing. Think how many sites (one of mine included) have been "sandboxed" due to a redesign.
I believe it is to prevent spammers from buying old sites and taking advantage of existing "trust" and pr.
If...as Matt seems to indicate with his statement, it draws human eyes and then things are put back on track, that is fine. The exception I take with it is that it seems to take so long. Google exhibits a real problem with understaffing. They seem to be able to invest in a lot of projects but not in the basics of what brought them to the dance.

reseller

3:09 pm on Sep 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The problem in Google's approach to the subject of this thread, as I see it, is:

- "Trusted Sites" are able to add too many files suddenly without triggering a flag.

- The rest of the sites can't add the same number of "too many files" without triggering a flag and accordingly being sandboxed. I.e. such sites have the inherited disadvantage of being equated to spam sites! I.e. such sites are judged guilty until they prove otherwise!

It's that way of thinking from our friends at Googleplex which I question. As such, this thread isn't only about technicality, filters and flags; rather it's also about moral issues which the friends at the plex need to address!

soapystar

3:44 pm on Sep 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This is something I've pointed out continually. These types of sites can spam on a large scale and at worst just have those factors ignored. For the small guy and mom and pops, simply falling on the wrong side of some grey hat micro onsite filter line that moves around its own axis can bin your entire site. The effect may be just what they were looking for. Bringing big brands to the fore gives a cleaner look to the serps. So I guess they have decided the end justifies the means, or at least the effect anyway. It may not have been a process they aimed at, but like the sandbox the effect is desirable so no need for adjustment.

europeforvisitors

4:26 pm on Sep 7, 2006 (gmt 0)



As such, this thread isn't only about technicality, filters and flags; rather it's also about moral issues which the friends at the plex need to address!

What moral issues?

If some sites (or types of sites) are more trusted than others, what's immoral about that?

It's legitimate to question whether "whole buncha pages added" filters are implemented correctly, or whether they work as well as they should, but what's morally wrong about discriminating between pages that are statistically likely to be junk and pages that aren't? Webmasters need to get over the idea that they're entitled to listings no matter what. Search results are based on editorial judgments (whether made by humans or by algorithms that use criteria set by humans), and SEs like Google have the right to make editorial decisions just as we do.

Aforum

5:31 pm on Sep 7, 2006 (gmt 0)

10+ Year Member



"but what's morally wrong about discriminating between pages that are statistically likely to be junk and pages that aren't?"

I don't think anyone is saying it isn't. What they are saying is that the criteria can cause massive amounts of collateral damage.

It's pretty tough to follow all the rules when the rules are constantly changing. It's even tougher when you have to read between the cryptic lines of the unofficial blog. On top of that it seems everyone thinks the rules are different based on trustrank, a concept we really know nothing about. Throw that all together and you have a lot of confused people who are being told that Matt Cutts' videos aren't really accurate or truthful, according to some of the experts on this site.

decaff

7:47 pm on Sep 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's pretty tough to follow all the rules when the rules are constantly changing.

The "rules" are constantly changing because the demand in the marketplace at Google's level is constantly changing...

One rule that never changes...know everything you can about your target audience...and speak to them directly through your web design, information design, copy...etc..etc..stop chasing the algos...and you think smoking shortens your life?

It just so happens that Google is targeting everyone on the planet...so their need to constantly adjust their approach to satisfy this is important...plus their insatiable appetite to reap staggering profits every quarter..

Aforum

8:49 pm on Sep 7, 2006 (gmt 0)

10+ Year Member



I understand why they are changing. I am simply stating it's extremely frustrating for those who don't have 5 year old sites to rely upon and have to listen to others for guidance when it appears even they don't know what is acceptable. According to MC I shouldn't be listening to people in those "SEO" forums.

:)

[edited by: Aforum at 8:49 pm (utc) on Sep. 7, 2006]

steveb

10:26 pm on Sep 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The ability to register on a website doesn't make someone an expert.

This is yet another example where there is nothing to be confused about, but webmaster fud starts on a Chicken Little spiral.

Adding a few hundred thousand pages in a day will probably get them viewed as what they are: not very important.

theBear

10:42 pm on Sep 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



steveb, in general I agree about the adding a couple hundred thousand a day, but golly gee, MSN isn't exactly a fly by night spamming operation.

Nor are a lot of folks who set up a site only to discover that they are being treated as spammers.

There are too many cooks in the process and the chicken has been over spiced and under cooked.

I know of many members here who could all of a sudden look like they were spammers.

In short we have met the enemy, and they are we.

Aforum

12:04 am on Sep 8, 2006 (gmt 0)

10+ Year Member



" The ability to register on a website doesn't make someone an expert."

If someone is confident enough to contradict Matt Cutts on an indexing issue, they have labeled themselves.

:)

[edited by: Aforum at 12:20 am (utc) on Sep. 8, 2006]

steveb

1:08 am on Sep 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Um, no they haven't.

MSN's free host pages are "fly by night". The addition of hundreds of thousands of pages on MSN spaces is simply the addition of free host pages, which should always be viewed very, very skeptically.

The real problem of course is Google's TrustSpam algo continues to love these free host pages. It seems suddenly they woke up, duh, and noticed their index is full of putrid garbage on free web hosts, that ranks due to hundreds of thousands of links from other free webpages/blogs.

They like free hosts. They like blogs. They are so far up their wrong end in this regard they will need a year or more to extract their heads. Finally noticing, duh again, that spammers are exploiting their incredibly stupid algo leads to a simplistic response wholly inadequate to deal with the problem they have created.

whitenight

1:20 am on Sep 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If someone is confident enough to contradict Matt Cutts on an indexing issue, they have labeled themselves.

I'll assume you mean me. And as oft uttered, The SERPS speak for themselves.

MC is not God, nor even all-knowing as to how the 1000s of algo factors interact with each other.

So I'll repeat again, anyone can be an "expert" on what really works if they study the various SERPS for various keywords thoroughly, whether it's for white, grey, or black hat techniques.

--------
Note- if you're worried about "where" you rank for any given page and trying to "avoid" sandboxes, filters, or whatever, by definition you are trying to "game" Google. I have no problem with that per se, but at least be honest about it.

True "legitimate" white hatters focus on building authority sites that can live off of word-of-mouth, bookmarks, natural memes, etc. The top rankings come naturally... whenever they come.

decaff

1:41 am on Sep 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



if one is trying to "rank legitimately" and needs to add 100k "legitimate" pages to their 2k page site....

So you have a relatively small 2K site...and you discover overnight that in order for your site to rank in your sector you need to add 100K pages immediately..(and forget "legitimate")...adding this many pages constitutes spam...plain and simple...it is not for your user base...but as you describe...for ranking purposes....you are spamming the index ... plain and simple...and this should trigger some sort of suppression filter (at the least)...

whitenight

1:55 am on Sep 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



but as you describe... for ranking purposes....you are spamming the index ... plain and simple

Lol, you put quite a few words in my mouth.

1.) I never said it was spamming or not spamming.
2.) When did I say for "ranking purposes"?
3.) Who says that G is going to see it as spamming?
MC says it will be seen as spamming, but the results for certain sites currently ranking using this technique say different.

My point is...make your own informed judgement call on what will happen. Don't take MC's proclamation as the final decision.

I recently added 50k pages to a 2k "authority" site as I wanted to add a backend "cafepress-like" store to it.
No ranking issues for the existing pages.
New pages are ranking probably as they should considering how new they are.
Will this work for your site?
How the heck do I know!?
Do your research and take an educated guess.
Do you know if your site is "trusted" by G?
Are your pages able to pass a visual inspection?

Or take MC's comments as gospel and throw the pages up on a different domain and be sandboxed anyways.

I can't make anybody do their own research for what will or will not work for their site.

Aforum

2:48 am on Sep 8, 2006 (gmt 0)

10+ Year Member



"I'll assume you mean me. And as oft uttered, The SERPS speak for themselves."

Actually, no, I didn't at all. It's more of a general statement as I've seen numerous people contradict what Matt says.

[edited by: Aforum at 2:48 am (utc) on Sep. 8, 2006]

Aforum

2:50 am on Sep 8, 2006 (gmt 0)

10+ Year Member



" Um, no they haven't."

Um, I believe they have. Think what you want, doesn't matter to me.

[edited by: Aforum at 2:52 am (utc) on Sep. 8, 2006]

whitenight

2:55 am on Sep 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Actually, no, I didn't at all. It's more of a general statement as I've seen numerous people contradict what Matt says.

haha, no worries.

Just remember, no matter how nice (or knowledgeable) a guy he may be, his paycheck is signed by G. And their interests and goals don't always coincide with the people on this board.

He has every motive to "claim" they are cracking down on sites adding gobs of pages because currently that's what every good black hatter is getting away with.

Whether it's true or not may be a different story.

Aforum

2:58 am on Sep 8, 2006 (gmt 0)

10+ Year Member



I agree.

I think the situation in which we sit here and question which is right when videos have been released to answer these specific questions shows how truthful and trustworthy people think their advice is. It's a problem they created for themselves and will only continue to get worse IMO.

Aforum

3:01 am on Sep 8, 2006 (gmt 0)

10+ Year Member



And to add on that...

The whole point was "IF" a sudden influx can create a sandbox effect or trigger a flag. Most experienced people here think it does.

I just hate to see those who were told to submit sitemaps and a sudden influx of links from a site that has previously not used one be penalized for following Google's own advice.

reseller

6:39 am on Sep 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hi Folks

Let's take a look again at Matt's statement:

By the way, it looks like the primary issue with the Windows Live Writer blog was the large-scale migration from spaces.msn.com to spaces.live.com about a month ago. We saw so many urls suddenly showing up on spaces.live.com that it triggered a flag in our system which requires more trust in individual urls in order for them to rank (this is despite the crawl guys trying to increase our hostload thresholds and taking similar measures to make the migration go smoothly for Spaces). We cleared that flag, and things look much better now.

There is no doubt whatsoever that so many urls suddenly showing up on a site triggered a flag in Google's "system".

Then we have a site (Windows Live Writer) which was sandboxed because of that.

Then somebody wrote about the problem on two popular blogs. The same matter was discussed by a person visiting Googleplex.

Then the kind folks at the plex resolved the problem MANUALLY.

Now... allow me just to ask you simple questions:

Do you think the same problem would have been resolved in the same manner if the owner of the site was a public-mom or a public-pop?

Do you think that if the same problem happened to your site, the folks at the plex would resolve it as they did in the case of Windows Live Writer?

steveb

7:43 am on Sep 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The irony is that what they had in place worked -- spaces.msn.com was one of the worst spam holes on the Internet. Creation of those pages on spaces.live.com was the largest scale spam incident in the history of the Internet, and Google flagged it.

So what do they do? Panic because they just LOVE this sort of utterly useless puke in their index. So what happens? spaces.msn.com spam had been put a great deal under control after being so dominant earlier this year, but now... Google races to highly rank a lot of spam on spaces.live.com

What kind of kool aid did they serve down there at the plex the first week of July that got everybody to go along with the group lobotomy?

hvacdirect

8:48 am on Sep 8, 2006 (gmt 0)

10+ Year Member



Reseller,

The answers to your questions are of course no. Mom-and-pop, small business, small site, whatever...does not have that kind of access to the plex to get a manual review over anything. Had the same thing happened, as I'm sure it has thousands of times, to a webmaster without the ability to go visit google, he/she would be posting here, there, and everywhere. They'd be told to write good original content and get great links that are not bought or traded, check their titles and descriptions, correct the 301 redirect...blah blah.

Of course if my site gets banned, then I fix it, it probably won't be back in the index tomorrow either, like the BMW incident, as google doesn't accept phone calls or emails regarding this sort of request, they only have a reinclusion request that does not give any feedback whether it was read, denied, ignored, or deleted.

reseller

3:11 pm on Sep 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



hvacdirect

Thanks for a truthful reply. Much appreciated.

I must say that I'm both surprised and disappointed at Google's discriminatory policy at the moment.

What Google considers spamming the index is allowed if you are a multi-million company or a popular figure or have a popular website. You can suddenly add all the files you wish or even add all the gateway pages you can think of. In that case you aren't only allowed to spam but you will proudly receive a free promotion on a Google employee's blog.

You might even be invited to Googleplex to discuss your issues.

While if you are just a public-mom or public-pop, nobody cares. No invitation to Googleplex. No discussion of issues. Filing a reinclusion request is all you get.

Talk about Google needing to address a few moral issues!

[edited by: reseller at 3:18 pm (utc) on Sep. 8, 2006]
