Forum Moderators: Robert Charlton & goodroi


Is Thin Content Bad - Always?

         

Dan01

8:09 am on Apr 3, 2011 (gmt 0)

10+ Year Member



I usually see people say that thin content is bad. But a few times I have noticed answer sites taking some top rankings, and there are more of them than just Yahoo Answers.

Talk about thin content! They are usually a question and then an answer (or three answers with one deemed "best").

In some cases I think thin is better - or at least it is in Google's eyes.

kd454

5:48 pm on Apr 3, 2011 (gmt 0)

10+ Year Member



Thin content is only bad if you were a Panda target.

If you're a big-name, big-brand, big-money site, thin content is a good thing, as it is ranking all over the place.

Just ask Martha Stewart (queen of thin content) or About and the list goes on and on.

Dan01

12:10 am on Apr 4, 2011 (gmt 0)

10+ Year Member



The Huff Post also has thin content, but they rank.

Swanny007

12:19 am on Apr 4, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I think if your site has a lot of good, unique, quality content, then thin pages are probably acceptable in some cases. If your site has very little unique content, then you're probably going to get hit a bit harder by things like Panda.

tedster

12:20 am on Apr 4, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's hard to answer this question because I don't think we all have the same idea of what thin content actually means.

It's worth noting that Google described Panda as targeting "shallow" content, not "thin". They usually use the word "thin" to describe affiliate sites whose pages only reproduce the program's RSS feed or the manufacturer's descriptions without adding any unique value.

Dan01

2:39 am on Apr 4, 2011 (gmt 0)

10+ Year Member



What is shallow content compared to thin? Thin means fewer words, right? Shallow must mean few words but.... ?

The reason I started this thread is because I produce quite a bit of content. The tendency is to do exhaustive research, add custom graphics, videos etc.

Sometimes I wonder if it is better to break these Wikipedia-style pages into separate topics. I go back and forth on that: sometimes I split a page up and sometimes I don't.

tedster

3:07 am on Apr 4, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The way I hear Matt Cutts and Amit Singhal talk about "shallow content," they're not talking so much about length or even uniqueness; they're talking about a subjective sense. Does the content really say anything? Would you trust the advice? Would you give them your credit card number?

Those sets (the yes and no answers) were the seed sets where Google's machine learning began - and what the machine learned about those almost gut-level qualities became the Panda Update.

[edited by: tedster at 4:18 am (utc) on Apr 4, 2011]

netmeg

4:13 am on Apr 4, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have pages with thin - or shallow - content. They simply have as much content as the users require and no more. They also get shared and linked to a lot, so, so far at least, I haven't suffered for it. If that changed, I'd probably have to start taking them out.

Dan01

4:14 am on Apr 4, 2011 (gmt 0)

10+ Year Member



I guess shallow content would depend on the query.

Shatner

1:57 am on Apr 5, 2011 (gmt 0)

10+ Year Member



Tons of sites with thin, non-existent, or irrelevant content are ranking very, very high in the SERPs, often above thicker, more relevant content.

So no, thin content isn't necessarily bad at all.

ken_b

2:07 am on Apr 5, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I guess shallow content would depend on the query.

And probably Google's "best guess" at the intent of the actual query.

Dan01

4:08 am on Apr 5, 2011 (gmt 0)

10+ Year Member



Yeah, Ken, that is what I was thinking. Perhaps a question will take you to an answer site.

tedster

4:10 am on Apr 5, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Shallow content for shallow queries?

Dan01

5:01 am on Apr 5, 2011 (gmt 0)

10+ Year Member



The way I hear Matt Cutts and Amit Singhal talk about "shallow content," they're not talking so much about length or even uniqueness; they're talking about a subjective sense. Does the content really say anything? Would you trust the advice? Would you give them your credit card number?

Those sets (the yes and no answers) were the seed sets where Google's machine learning began - and what the machine learned about those almost gut-level qualities became the Panda Update.


How did they determine whether a site was trustworthy? What criteria?

Dan01

5:11 am on Apr 5, 2011 (gmt 0)

10+ Year Member



Here is what I found at SEW (Search Engine Watch):

What's it all mean for link builders? Well, it's time we say goodbye to low quality link building altogether.


A lot of low quality sites have tons of outbound links. They are the article directories used for SEO.

Too many ads for the amount of content - or is it just too many ads? I don't know.

tedster

5:55 am on Apr 5, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How did they determine whether a site was trustworthy? What criteria?

That's what a machine learning algorithm is all about. Standard practice would be to establish seed collections (both positive and negative) and then let the learning algorithm have free rein over ALL the data points you've collected about websites.

That algorithm would gradually build a document classifier that uses decision trees to weight the factors it was noticing. Some would get thrown out, some would have only a small weight, some would be weighted only if some other characteristic was present in the document, and so on.

Crunching factors this way, eventually the machine learning algorithm would establish a set of rules that classify sites in a way that was very strongly correlated with the original seed sets, both positive and negative. This would be the Panda "document classifier." Then it goes live and MORE machine learning continues to modify it, based on the metrics of user response to the results.
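To make the seed-set idea concrete, here is a minimal sketch in Python of a single decision tree learned from positive and negative seed pages. Everything in it is an assumption for illustration: the feature names (ads, words, outlinks), the numbers, and the labels are invented, and nothing here is claimed to be what Google actually measures or how Panda is built. It only shows the general technique of letting the learner pick which factors matter.

```python
from collections import Counter

# Hypothetical seed sets: (features, label). Feature names and values are
# invented for illustration; they are not known Panda signals.
SEED = [
    ({"ads": 8, "words": 150, "outlinks": 40}, "shallow"),
    ({"ads": 6, "words": 200, "outlinks": 35}, "shallow"),
    ({"ads": 7, "words": 900, "outlinks": 50}, "shallow"),
    ({"ads": 1, "words": 1200, "outlinks": 5}, "quality"),
    ({"ads": 2, "words": 300, "outlinks": 3}, "quality"),
    ({"ads": 0, "words": 80, "outlinks": 1}, "quality"),
]

def gini(rows):
    """Gini impurity of the labels in rows (0.0 means every label agrees)."""
    counts = Counter(label for _, label in rows)
    return 1.0 - sum((n / len(rows)) ** 2 for n in counts.values())

def partition(rows, feature, threshold):
    """Split rows into (feature <= threshold, feature > threshold)."""
    left = [r for r in rows if r[0][feature] <= threshold]
    right = [r for r in rows if r[0][feature] > threshold]
    return left, right

def best_split(rows):
    """Return (weighted impurity, feature, threshold) of the best split, or None."""
    best = None
    for feature in rows[0][0]:
        for threshold in sorted({f[feature] for f, _ in rows}):
            left, right = partition(rows, feature, threshold)
            if not left or not right:
                continue  # a split that leaves one side empty separates nothing
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    return best

def build_tree(rows, depth=0, max_depth=3):
    """Grow a tree: inner nodes are (feature, threshold, left, right), leaves are labels."""
    split = best_split(rows) if gini(rows) > 0 and depth < max_depth else None
    if split is None:
        # Pure group, depth limit reached, or no useful split: emit the majority label.
        return Counter(label for _, label in rows).most_common(1)[0][0]
    _, feature, threshold = split
    left, right = partition(rows, feature, threshold)
    return (feature, threshold,
            build_tree(left, depth + 1, max_depth),
            build_tree(right, depth + 1, max_depth))

def classify(tree, features):
    """Walk from the root to a leaf and return that leaf's label."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if features[feature] <= threshold else right
    return tree

tree = build_tree(SEED)
print(tree)
print(classify(tree, {"ads": 1, "words": 100, "outlinks": 2}))
```

In this toy data the learner never uses word count, because other features separate the seed sets cleanly, so a very short page with few ads still lands on the "quality" side. A real system would presumably use far more data points and an ensemble of many trees rather than a single one.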

It's interesting to me that Panda (the engineer) is being credited with a breakthrough in this area - and Biswanath Panda has published some advanced papers, including Massively Parallel Learning of Tree Ensembles with MapReduce [research.google.com].

Dan01

6:07 am on Apr 5, 2011 (gmt 0)

10+ Year Member



What is important is the "data points".

It sounds like they are interested in outbound links and the number of ads, according to SEW.

tedster

6:28 am on Apr 5, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You're talking about what data points the machine learning algorithm eventually settled on. I'm sure it's quite a complex collection.

The machine learning process would have ranged over every bit of back-end data Google has tagged for every URL, whether it was being used in the then-active algorithm or not. Outbound links might well be involved, but I would assume some very fine-tuned factor(s) related to outbound links.

IMO, Panda did not settle on big broad brush strokes. But right now we have very little hard data to go on when we try to pick out the individual data points and how they interrelate.

--------

I feel pretty solid about my own gut sense of the big picture - I know what shallow content is, whether or not some of it is being misclassified right now. What I'm hoping to get a better sense of is how and when thin URLs begin to poison the other URLs on the domain. If there's any area where I think Panda is currently being tweaked, that's it. And I've been a bit baffled about how that part of the Update was originally created too.

My answer to this thread's title would be "no - not always." One characteristic of shallow content is that it is built to rank. If a site has the occasional shallow page that never ranked and clearly isn't "built to rank" (for instance, an informational pop-up or a URL created to display in an iframe) I don't see evidence that it spreads any kind of "poison" to the rest of the domain.