Forum Moderators: Robert Charlton & goodroi
Google Updates PageRank Patent
What is claimed is:
1. A method, comprising: obtaining data identifying a set of pages to be ranked, wherein each page in the set of pages is connected to at least one other page in the set of pages by a page link; obtaining data identifying a set of n seed pages that each include at least one outgoing link to a page in the set of pages, wherein n is greater than one; accessing respective lengths assigned to one or more of the page links and one or more of the outgoing links; and for each page in the set of pages: identifying a kth-closest seed page to the page according to the respective lengths, wherein k is greater than one and less than n, determining a shortest distance from the kth-closest seed page to the page; and determining a ranking score for the page based on the determined shortest distance, wherein the ranking score is a measure of a relative quality of the page relative to other pages in the set of pages.
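To make the claim above concrete, here is a minimal sketch of the steps it lists: compute shortest distances from every seed page, take the distance from the kth-closest seed for each page, and turn that distance into a score. The graph, the page names, the link lengths, and the 1 / (1 + distance) scoring transform are all assumptions made for illustration. The claim only says the score is "based on" the shortest distance, so this is one possible reading, not Google's implementation.

```python
import heapq
import math


def shortest_distances(graph, source):
    """Dijkstra: shortest link distance from source to each reachable page."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, page = heapq.heappop(heap)
        if d > dist.get(page, math.inf):
            continue  # stale heap entry, a shorter path was already found
        for target, length in graph.get(page, {}).items():
            nd = d + length
            if nd < dist.get(target, math.inf):
                dist[target] = nd
                heapq.heappush(heap, (nd, target))
    return dist


def kth_seed_distance(graph, seeds, pages, k):
    """For each page, the shortest distance from its kth-closest seed."""
    per_seed = [shortest_distances(graph, seed) for seed in seeds]
    result = {}
    for page in pages:
        # Unreachable pages get an infinite distance from that seed.
        distances = sorted(d.get(page, math.inf) for d in per_seed)
        result[page] = distances[k - 1]
    return result


def ranking_scores(distances):
    """Assumed transform: shorter distance gives a higher score."""
    return {page: 1.0 / (1.0 + d) for page, d in distances.items()}


# Hypothetical graph: graph[page][target] is the length of the link.
EXAMPLE_GRAPH = {
    "s1": {"a": 1.0},
    "s2": {"a": 2.0, "b": 1.0},
    "s3": {"b": 1.0},
    "a": {"b": 1.0, "c": 1.0},
    "b": {"c": 3.0},
}
EXAMPLE_SEEDS = ["s1", "s2", "s3"]  # n = 3
EXAMPLE_PAGES = ["a", "b", "c"]

# k = 2 satisfies the claim's "greater than one and less than n".
example_distances = kth_seed_distance(EXAMPLE_GRAPH, EXAMPLE_SEEDS, EXAMPLE_PAGES, 2)
example_scores = ranking_scores(example_distances)
```

Using the kth-closest seed rather than the closest one means a single link from one seed is not enough to score well: in this sketch a page has to be near at least k seeds.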
One possible variation of PageRank that would reduce the effect of these techniques is to select a few "trusted" pages (also referred to as the seed pages) and discover other pages which are likely to be good by following the links from the trusted pages.
[edited by: goodroi at 4:50 pm (utc) on Apr 25, 2018]
How Does this Affect Link Building?
This changes the game for link building. Actually, the game has been changed for a while now. The algorithms described here are closely tied to what we know about the Penguin algorithm.
This affects link building because the algorithm calculates link distances between an authoritative, spam-free site and the sites it links to. These links are also divided by topic.
For link building, the ideal link is going to be a link from a site that is as close as possible to the most authoritative and high quality site in that niche. The difference is that the high quality sites are different for every niche. This changes what is meant by an authority site.
Google’s Patent Does Not Use the Word Trust
Google’s patent doesn’t even use the word trust. And they are not using a thing called Trust to calculate PageRank.
The difference is that the high quality sites are different for every niche.
Would I be wrong if I said the "high quality" sites in a niche are those sites that rank consistently for keywords in that niche, short or long-tailed? And is the only way to win the SEO game through link building to somehow get linked by such sites, nofollow or not?
Would I be wrong if I said the "high quality" sites in a niche are those sites that rank consistently for keywords in that niche, short or long-tailed?
All this patent does is create a reduced link graph, a starting point to begin the ranking calculations.
In order to rank you have to be in the reduced link graph.
How does a new site (or a site not yet part of this coterie) gatecrash this reduced link graph? I would assume it must get linked by sites within this link graph (besides being relevant and popular among users). The shorter the link distance from niche-authority sites to this wannabe site, the more closely knit it is within the group? Or am I missing something?
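One way to picture the question above is the following sketch, which treats a page as being "in" the reduced link graph when at least k seed pages can reach it by following links. That membership rule, the page names, and the links are assumptions for illustration only; the thread does not spell out how the reduced graph is built.

```python
from collections import deque


def reachable_from(links, start):
    """Breadth-first search: every page reachable from start via outgoing links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def reduced_graph_members(links, seeds, k):
    """Pages that at least k seeds can reach (assumed membership rule)."""
    counts = {}
    for seed in seeds:
        # A seed does not count toward its own membership.
        for page in reachable_from(links, seed) - {seed}:
            counts[page] = counts.get(page, 0) + 1
    return {page for page, count in counts.items() if count >= k}


# Hypothetical link graph: links[page] lists the pages it links out to.
links_before = {
    "seed1": ["hub"],
    "seed2": ["hub"],
    "hub": ["blog"],
    "newsite": ["blog"],  # an outbound link alone does not bring newsite in
}
members_before = reduced_graph_members(links_before, ["seed1", "seed2"], 2)

# The new site earns an inbound link from a page already inside the graph.
links_after = dict(links_before, blog=["newsite"])
members_after = reduced_graph_members(links_after, ["seed1", "seed2"], 2)
```

Under these assumptions the new site joins only once a page already inside the graph links to it, which matches the guess in the post: linking out to members does nothing, being linked from them does.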
Being spam free or authoritative about an entire niche or slice of a niche topic does not make a site useful for answering questions across a wide range of user intents within that niche slice. It's simply spam free and trustworthy for being spam free. Thus it is useful for being a seed site.
Thanks for explaining. Now I have more questions :-)
...and from among the n seed pages, a kth-closest seed page to a first web page in the plurality of web pages according to the lengths of the links, wherein k is greater than one and less than n;...
does not make a site useful for answering questions across a wide range of user intents within that niche slice
Absolutely. A page can't conceivably answer all questions, no matter how narrow the niche is. That requires Google to assess the quality of the plurality of webpages, so that when a query is typed that the seed pages cannot sufficiently answer, it looks in the plurality of webpages and fishes out pages that are:
1. Relevant, in that they answer the question sufficiently
2. Trustworthy, as determined by the distance from the kth-closest seed page
If a question is ever typed into Google that the seed pages themselves answer sufficiently, you will find the plurality of pages always trailing the seed pages. Makes sense?
over the course of the 12 months from launch, a regular website undertaking regular business activities will at some point acquire natural links from "seed" or "near-seed" sites
Makes sense, though it is not strictly a function of time but of the rate of link acquisition from seed sites or from "seeded" sites. A newly launched official Commonwealth Games website, for instance, might acquire links from seed pages on day one and start ranking from the first week.
...why can't we assume the sites that are consistently ranking high and wide for keywords in the niche as actually the seed sites?
The criteria for being chosen as a seed set are vastly different from the criteria for being chosen to rank in the SERPs. Two different criteria.
Agreed, my surmise is based on deduction and reasoning and not on any written evidence.
...what other factors (spam-free, trust, relevancy) make a page a seed page.
That's a reasonable question. Let's speculate. :)
Clean, on-topic inbound links
Clean, on-topic outbound links
If the page was about products and the user intent is shopping, then a shopping page would outrank it.
That may be so if you start with the premise that seed pages need not be the pages that rank high for a given query. That is, Google somehow designates a pool of pages as seed pages for every niche and sub-niche.