apple.bot

massive requests

         

johnhh

11:12 pm on Jan 28, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Anybody got experience of massive apple.bot requests? It seems official.

lucy24

2:26 am on Jan 29, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Do you mean Applebot from 17.138 and 17.142?
Mozilla/5.0 (compatible; Applebot/0.3; +http://www.apple.com/go/applebot)
AND
Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Applebot/0.3; +http://www.apple.com/go/applebot)

Or some unrelated entity really called "apple.bot"?
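A rough way to tell the two cases apart is to check that a request both claims to be Applebot and comes from Apple's address space. The sketch below assumes the 17.0.0.0/8 block (which covers the 17.138 and 17.142 ranges mentioned above); the function name `looks_like_applebot` is made up for illustration, and a prefix check like this is only a first filter, not proof of identity.

```python
import ipaddress

# Assumption: Apple's 17.0.0.0/8 block, which contains 17.138.x.x and 17.142.x.x.
APPLE_NET = ipaddress.ip_network("17.0.0.0/8")


def looks_like_applebot(ip: str, user_agent: str) -> bool:
    """Return True if the UA claims Applebot and the IP is in Apple's block."""
    claims_applebot = "applebot/" in user_agent.lower()
    from_apple = ipaddress.ip_address(ip) in APPLE_NET
    return claims_applebot and from_apple


ua = "Mozilla/5.0 (compatible; Applebot/0.3; +http://www.apple.com/go/applebot)"
print(looks_like_applebot("17.138.55.240", ua))  # Apple range, Applebot UA
print(looks_like_applebot("203.0.113.5", ua))    # same UA from elsewhere
```

A visitor that sends the Applebot string from outside that block would be the "unrelated entity" case.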

johnhh

9:04 am on Jan 29, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@lucy24

After a look through yesterday's rather large log:

17.138.55.240 - - [28/Jan/2016:22:10:48 +0000] "GET /example.htm?id=xxx HTTP/1.0" 200 35124 "-" "Mozilla/5.0 (compatible; Applebot/0.3; +http://www.apple.com/go/applebot)"

It is acting as a bot: no page elements were requested.

Thanks.
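For anyone wanting to pull these entries out of a large log, here is a minimal sketch that parses a line in the format shown above. The regular expression assumes the usual combined log layout; `LOG_RE` and `parse_line` are illustrative names, not part of any tool.

```python
import re

# Assumed layout: ip ident user [time] "request" status size "referer" "agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line: str):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_RE.match(line)
    return match.groupdict() if match else None


sample = (
    '17.138.55.240 - - [28/Jan/2016:22:10:48 +0000] '
    '"GET /example.htm?id=xxx HTTP/1.0" 200 35124 "-" '
    '"Mozilla/5.0 (compatible; Applebot/0.3; +http://www.apple.com/go/applebot)"'
)
entry = parse_line(sample)
if entry and "Applebot" in entry["agent"]:
    print(entry["ip"], entry["path"], entry["status"])
```

Filtering on the `agent` field and then looking at which paths turn up makes it easy to confirm that only pages, and no images or stylesheets, were fetched.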

lucy24

9:25 pm on Jan 29, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There's a thread about it here [webmasterworld.com], with supplementary information and a link here [webmasterworld.com]. (Note the comments about Googlebot robots.txt directives in the main thread.) I've noticed it increasingly often in recent months. It seems to be primarily interested in one specific directory, which suggests that it's paying attention to someone else's links and/or RSS feed.

There are two versions, vanilla and mobile. It does not ask for robots.txt as often as one would like. (My personal preference is at the beginning of every visit for sporadic robots, or at least once every 24 hours for regular crawlers.)

johnhh

11:27 pm on Jan 29, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



But is it a bad bot or a good bot?

lucy24

3:37 am on Jan 30, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



That's what we would all like to know :(

The other thread does point out some benefits in the specific context of local businesses. I don't know what it's good for (or not-good for) when it comes to purely informational sites. With me, it has primarily been visiting ebooks. No idea what it does with them.

johnhh

1:05 am on Feb 3, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The bot was back again today. I have decided not to block it. It seems to go to a sub-directory, fetch about 8 pages, then move on. No idea why. We are a UK info site (for reference).

w3bmastine

7:23 am on Feb 3, 2016 (gmt 0)

10+ Year Member



I block Applebot. They don't stick to the rules outlined in my robots.txt. For me that's enough to treat them as a bad bot. Blocked by User-agent.
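The post does not say how the block is implemented (most likely in the server configuration). Purely as an illustration of blocking by User-agent, here is a sketch written as a Python WSGI wrapper; `block_applebot` and `demo_app` are hypothetical names.

```python
def block_applebot(app):
    """Wrap a WSGI app so requests whose User-Agent mentions Applebot get a 403."""
    def wrapper(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if "applebot" in user_agent.lower():
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Anything else is passed through untouched.
        return app(environ, start_response)
    return wrapper


def demo_app(environ, start_response):
    """Stand-in application used only for this example."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


application = block_applebot(demo_app)
```

A User-agent match blocks both the vanilla and the mobile variant, since both strings contain "Applebot".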

keyplyr

10:30 am on Feb 4, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Rumor has it Apple is building an index, but the exact purpose is so far unknown.

I've not seen any issues with this bot and I keep a diligent watch.

w3bmastine

10:49 am on Feb 4, 2016 (gmt 0)

10+ Year Member



@keyplyr Applebot does react to User-agent: sections where Googlebot is mentioned.
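To make the behaviour described here concrete: a bot that finds no group of its own falls back to another bot's group before using the catch-all. The sketch below is a simplified robots.txt reader written for this illustration (only User-agent, Allow and Disallow are handled); `parse_groups` and `rules_for` are made-up names, and the fallback order is an assumption based on this thread.

```python
def parse_groups(robots_txt: str) -> dict:
    """Map each lowercase User-agent token to its list of (field, value) rules."""
    groups = {}
    current_agents = []
    in_rules = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                current_agents = []
                in_rules = False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in current_agents:
                groups[agent].append((field, value))
    return groups


def rules_for(groups: dict, bot: str, fallbacks=()) -> list:
    """Pick the bot's own group, then any fallback group, then the '*' group."""
    for token in (bot, *fallbacks, "*"):
        if token.lower() in groups:
            return groups[token.lower()]
    return []


robots = """
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow:
"""
groups = parse_groups(robots)
print(rules_for(groups, "Applebot", fallbacks=("Googlebot",)))
print(rules_for(groups, "bingbot"))
```

With this file, the first call picks up the Googlebot rules while the second gets the catch-all group. Adding an explicit `User-agent: Applebot` group would take the fallback out of play.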

keyplyr

11:03 am on Feb 4, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Some bots implement robots.txt differently. Yes, robots.txt is a web standard, but sadly how it is implemented is open to very wide interpretation.

I use wildcards. Google, Yandex & DuckDuckGo read & obey them just fine. Bing does not. Go figure.
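As an illustration of why results differ between crawlers, the sketch below compares a wildcard-aware match (`*` for any run of characters, a trailing `$` for end of URL) with a plain prefix match, which is how a crawler without wildcard support would read the same rule. The function names are hypothetical and the rule syntax is the commonly used extension, not something taken from any one crawler's documentation.

```python
import re


def wildcard_match(pattern: str, path: str) -> bool:
    """Match a robots.txt path pattern, honouring '*' and a trailing '$'."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' in place of each '*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    regex = re.compile(body + ("$" if anchored else ""))
    return regex.match(path) is not None


def literal_match(pattern: str, path: str) -> bool:
    """What a crawler without wildcard support would do: plain prefix match."""
    return path.startswith(pattern)


rule = "/*.pdf$"
for path in ("/docs/a.pdf", "/docs/a.pdf?x=1", "/docs/a.html"):
    print(path, wildcard_match(rule, path), literal_match(rule, path))
```

The prefix match never fires for a rule like `/*.pdf$`, so a crawler that reads rules literally behaves as if the rule were absent.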

w3bmastine

7:02 pm on Feb 4, 2016 (gmt 0)

10+ Year Member



Bing does not respect the wildcard? Wow, that's weak.
At least Bing does not follow Googlebot instructions if instructions don't explicitly mention Bing... one plus over Applebot.

jtbell

8:55 pm on Feb 7, 2016 (gmt 0)

10+ Year Member



On my site at least, Applebot doesn't seem to recognize the 'base href' tag. For example, on a page http://www.example.com/foo/bar/ I have

<base href="http://www.example.com/foo/">

and internal links like

<a href="baz/">

which are supposed to refer to http://www.example.com/foo/baz/, but Applebot tries to crawl http://www.example.com/foo/bar/baz/ instead. I had a whole slew of these yesterday.
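The two outcomes can be reproduced with Python's standard `urllib.parse.urljoin`. This is only a sketch of the resolution step; the helper name `resolve` is made up for the example.

```python
from urllib.parse import urljoin

page_url = "http://www.example.com/foo/bar/"
base_href = "http://www.example.com/foo/"
link = "baz/"


def resolve(page_url, href, base_href=None):
    """Resolve href against the <base href> if one is given, else the page URL."""
    base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(base, href)


print(resolve(page_url, link, base_href))  # honours <base href>
print(resolve(page_url, link))             # ignores it
```

The first call gives http://www.example.com/foo/baz/ and the second gives http://www.example.com/foo/bar/baz/, which matches the bad requests described above.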

keyplyr

10:57 pm on Feb 7, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"applebot doesn't seem to recognize the 'base href' tag."

A lot of agents don't support base href.

lucy24

11:07 pm on Feb 7, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have

<base href="http://www.example.com/foo/">

and internal links like

<a href="baz/">

You may as well play it safe and use site-absolute links beginning in / (slash) or in this case /foo/. Then the robot will have no excuse for misunderstanding. Some robots will still get it wrong (I've seen them), but this will be due purely to the robot's own gratuitous stupidity. Save the relative links for URLs that you know will always be in the same directory, even if the whole directory packs up and moves.
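A quick check of why site-absolute links are the safe option, again using `urllib.parse.urljoin` as a stand-in for whatever a crawler does internally: the result is the same whether or not the base href is taken into account.

```python
from urllib.parse import urljoin

site_absolute = "/foo/baz/"

# Resolved against the page URL (a crawler that ignores <base href>) ...
from_page = urljoin("http://www.example.com/foo/bar/", site_absolute)
# ... and against the declared base (a crawler that honours it).
from_base = urljoin("http://www.example.com/foo/", site_absolute)

print(from_page)
print(from_base)
```

Both lines print http://www.example.com/foo/baz/.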