Hi aakk9999, thanks for making an exception!
Regarding your question about tabbed pages or hidden text behind "Read more": Usually such functionality is implemented by some JavaScript that inserts the content on a click event. Unfortunately, the Googlebot doesn't do clicks. It just loads pages, extracts the links, and follows the links by loading those pages in separate requests.
A common technique to implement tabs or a "Read more" link is to give each tab or expanded state its own route.
I.e. for a tabbed page with tabs "A", "B", and "C" you would use the routes http://mysite.com/mypage#!A, http://mysite.com/mypage#!B, and http://mysite.com/mypage#!C. Now, the tabs themselves don't have an onClick event anymore but instead contain a real <a href="...">.
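To make the route idea concrete, here is a minimal sketch of how the tab hrefs could be built and how the route could be read back from a URL. The helper names routeFromUrl and tabHref are made up for illustration and do not come from any particular routing lib.

```javascript
// Hypothetical helper: extract the route that follows "#!" in a URL.
function routeFromUrl(url) {
  const hash = new URL(url).hash; // e.g. "#!A", or "" if there is no hash
  return hash.startsWith("#!") ? hash.slice(2) : null;
}

// Hypothetical helper: build the href that goes into a tab's <a> element.
function tabHref(pageUrl, tab) {
  return `${pageUrl}#!${tab}`;
}

// One real link per tab instead of one onClick handler per tab.
const links = ["A", "B", "C"].map((tab) => tabHref("http://mysite.com/mypage", tab));
```

Each entry in links would be used as the href of the matching tab anchor.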
If the page is served to a human it will work like this:
- The human opens http://mysite.com/mypage.
- The human clicks on tab A; to be precise, on <a href="http://mysite.com/mypage#!A">...</a>.
- The browser changes the URL hash to #!A according to the anchor the human clicked on.
- A routing JS library recognizes the hash change.
- The routing lib invokes the JS code that activates tab A according to the hash. (BTW, it invokes the exact same code that otherwise was bound to the onClick of the tab.)
- The human can now see tab A and its content.
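The steps above could be sketched roughly as follows. This is not the code of any specific routing lib; createHashRouter, activateTab, and fakeWindow are names invented for this sketch. In a browser you would pass the real window object, and fakeWindow only stands in for it so the snippet also runs outside a browser.

```javascript
// Minimal hash router sketch. `win` is `window` or any object that offers
// `location.hash` and `addEventListener`; `routes` is a Map of name -> handler.
function createHashRouter(win, routes) {
  function dispatch() {
    const hash = win.location.hash;
    const route = hash.startsWith("#!") ? hash.slice(2) : "";
    const handler = routes.get(route);
    if (handler) handler(); // unknown routes are ignored
    return route;
  }
  win.addEventListener("hashchange", dispatch);
  return { dispatch };
}

// Hypothetical tab activation; a real page would show or hide DOM elements here.
let activeTab = null;
function activateTab(name) {
  activeTab = name;
}

// Stand-in for `window` (assumption, only for running the sketch without a browser).
const fakeWindow = {
  location: { hash: "" },
  listeners: {},
  addEventListener(type, fn) {
    this.listeners[type] = fn;
  },
};

createHashRouter(fakeWindow, new Map([
  ["A", () => activateTab("A")],
  ["B", () => activateTab("B")],
  ["C", () => activateTab("C")],
]));

// Simulate the click on <a href="http://mysite.com/mypage#!A">:
// the browser updates the hash and then fires "hashchange".
fakeWindow.location.hash = "#!A";
fakeWindow.listeners.hashchange();
```

After the simulated click, activeTab holds "A", which is the same state an onClick handler on the tab would have produced.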
If the page is served to the Googlebot, with exactly the same sources a human would get, it will work like this:
- The Googlebot opens http://mysite.com/mypage.
- The Googlebot crawls the page for anchors and finds those for the tabs.
- The Googlebot repeats the following steps for each tab link:
  - The Googlebot makes a separate request to e.g. http://mysite.com/mypage#!A.
  - While the page loads, the routing JS library is initialized and recognizes the hash for tab A.
  - The routing lib immediately invokes the code that activates tab A according to the hash.
  - The Googlebot can now crawl tab A and its content.
BTW, this procedure is exactly the same as when a human opens http://mysite.com/mypage#!A directly in a browser. So this code is not implemented just for the sake of crawling.
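A small sketch of that "direct open" case, assuming a hand-rolled router rather than a specific lib: the same function is registered for later hash changes and is also called once during initialization, so a page opened with #!A in the URL shows tab A right away. startRouting and deepLinkedWindow are hypothetical names; in a browser you would pass window instead.

```javascript
// Apply the current hash once at startup and again on every later hash change.
function startRouting(win, activateTab) {
  const applyHash = () => {
    const hash = win.location.hash;
    if (hash.startsWith("#!")) activateTab(hash.slice(2));
  };
  win.addEventListener("hashchange", applyHash); // later clicks on tab links
  applyHash(); // page was opened directly with a hash in the URL
}

// Stand-in for a `window` whose page was opened as http://mysite.com/mypage#!A.
const activated = [];
const deepLinkedWindow = {
  location: { hash: "#!A" },
  addEventListener() {}, // no events fire in this simulation
};

startRouting(deepLinkedWindow, (tab) => activated.push(tab));
```

Here activated ends up containing "A" without any click or hashchange event, which is the situation of a visitor (human or crawler) who enters through the tab link itself.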
In my example I used hashbangs. Nowadays you would usually use pushState by default (i.e. http://mysite.com/mypage/A) and use a routing lib that falls back to hashes if the browser doesn't support pushState. Crawlers cope with pushState URLs by design, because such a URL is an ordinary URL that can be requested like any other link.
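A rough sketch of such a fallback, under the assumption that support is detected by checking for history.pushState. The function names and the two stand-in window objects are made up for this example; a real routing lib will differ in the details.

```javascript
// Feature check: does this window-like object offer history.pushState?
function supportsPushState(win) {
  return Boolean(win.history && typeof win.history.pushState === "function");
}

// Build the tab URL in whichever style the browser supports.
function tabUrl(win, basePath, tab) {
  return supportsPushState(win) ? `${basePath}/${tab}` : `${basePath}#!${tab}`;
}

// Change the URL for a tab without reloading the page.
function navigateToTab(win, basePath, tab) {
  const url = tabUrl(win, basePath, tab);
  if (supportsPushState(win)) {
    win.history.pushState({ tab }, "", url); // real path, no page reload
  } else {
    win.location.hash = `#!${tab}`; // older browsers: hashbang route
  }
  return url;
}

// Stand-ins for a modern and a legacy browser window (assumptions for the demo).
const modern = {
  lastUrl: null,
  history: { pushState(state, unused, url) { modern.lastUrl = url; } },
  location: { hash: "" },
};
const legacy = { history: {}, location: { hash: "" } };

const modernUrl = navigateToTab(modern, "/mypage", "A"); // "/mypage/A"
const legacyUrl = navigateToTab(legacy, "/mypage", "A"); // "/mypage#!A"
```

Note that calling pushState does not fire an event by itself, so in the pushState branch the router would have to invoke the tab activation code directly after changing the URL.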