Facebook bot: Be warned

         

dolcevita

1:19 pm on Sep 8, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Two days ago I noticed roughly 5x the usual number of pageviews in my AdSense account and, at the same time, far worse CTR and RPM than in the previous days and months. The number of impressions stayed approximately the same.

I went straight to the Cloudflare analytics and saw three variants of Facebook bots: two with the user agents facebookexternalhit/1.0 and facebookexternalhit/1.1, and one with meta-externalfetcher/1.1.
The log showed that in just half an hour they made 3.5k visits, all from AS32934, which belongs to FB.

So the sudden addition of the FB bot to CF's list of verified bots let the unruly FB bot consume excessive resources, index everything, slow down the server's responses and drastically cut AdSense income. I don't understand how the CF team could add such a bot, even one from FB, to the list of verified bots!?

The solution was simple. I blocked the three user agents above and AS32934 (which belongs to FB) with a CF rule and put that rule in first place, since CF added the FB bot to its list of verified bots a few days ago.
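
For anyone who wants to see the logic of that rule spelled out, here is a minimal Python sketch. It is not Cloudflare's rule syntax, just the same decision written as a function. The names `should_block`, `BLOCKED_UA_PREFIXES` and `BLOCKED_ASNS` are made up for illustration, and it assumes that either condition (a matching user agent or the FB ASN) is enough to block.

```python
# Hypothetical helper that mirrors the blocking rule described above.
# Assumption: a request is blocked if EITHER the ASN or the user agent matches.
BLOCKED_UA_PREFIXES = (
    "facebookexternalhit/1.0",
    "facebookexternalhit/1.1",
    "meta-externalfetcher/1.1",
)
BLOCKED_ASNS = {32934}  # AS32934, which belongs to FB


def should_block(user_agent: str, asn: int) -> bool:
    """Return True when the request matches the block list."""
    ua = user_agent.strip().lower()
    return asn in BLOCKED_ASNS or ua.startswith(BLOCKED_UA_PREFIXES)


if __name__ == "__main__":
    print(should_block("facebookexternalhit/1.1", 32934))
    print(should_block("Mozilla/5.0 (X11; Linux x86_64)", 64500))
```

Matching on the ASN as well as the user agent means that requests from the FB network are caught even when they present a browser-like user agent.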

FB is not a search engine, so I get no benefit from FB indexing my website and using the content for AI training. I also don't think the block has any effect on content being shared on FB or on visits from potential FB users.

CommandDork

1:58 pm on Sep 8, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks for the heads-up. Haven't noticed anything on my end just yet...but will keep an eye out for this nonsense.

dolcevita

4:57 am on Sep 11, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What's strange is that today I scrolled through the list of CF verified bots and there is no Facebook bot there at all, but CF still lets it through as a verified bot:
[radar.cloudflare.com...]

universenet

2:34 pm on Sep 11, 2024 (gmt 0)

Top Contributors Of The Month



dolcevita wrote: "What's strange is that today I just scrolled through the list of CF verified bots and there is no facebook bot there at all, but CF still lets it go as a verified bot:"

AI says it is from Facebook; maybe it is a new name.

not2easy

3:21 pm on Sep 11, 2024 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



There is a newer FB bot. It was detailed in the UA/Spiders forum on 9/5 here: [webmasterworld.com...]

The UA is meta-externalagent/1.1

dolcevita

4:45 pm on Sep 11, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



From my CF log

User Agents:
facebookexternalhit/1.1 - 1.54k
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php) - 1.05k
meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler) - 1.02k
Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Mobile/15E148 Safari/604.1 - 751
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/128.0.6613.84 Safari/537.36 - 704
Mozilla/5.0 (Linux; Android 11; Pixel 5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.6312.4 Mobile Safari/537.36 - 703
facebook_www_script - 3
facebookexternalhit/1.1;line-poker/1.0 - 2
facebookexternalhit/1.1; kakaotalk-scrap/1.0; +https://devtalk.kakao.com/t/scrap/33984 - 1



Source ASNs
32934 - FACEBOOK - 5.77k
38631 - LINE LINE Corporation - 2
38099 - KAKAO-AS-KR Kakao Corp - 1
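
A quick way to cross-check the two lists is to add them up. The sketch below does that in Python with the figures copied from the log above. `parse_count` is a hypothetical helper, and the comparison assumes both lists cover the same period and that the "k" figures are rounded.

```python
def parse_count(text: str) -> int:
    """Convert a dashboard count such as '1.54k' or '751' to an integer."""
    text = text.strip().lower()
    if text.endswith("k"):
        return round(float(text[:-1]) * 1000)
    return int(text)


# Counts copied from the log above, in the order listed.
FB_LABELED_UA_COUNTS = ["1.54k", "1.05k", "1.02k", "3", "2", "1"]
BROWSER_LIKE_UA_COUNTS = ["751", "704", "703"]
ASN_COUNTS = {"32934": "5.77k", "38631": "2", "38099": "1"}

fb_labeled = sum(parse_count(c) for c in FB_LABELED_UA_COUNTS)
browser_like = sum(parse_count(c) for c in BROWSER_LIKE_UA_COUNTS)
ua_total = fb_labeled + browser_like
asn_total = sum(parse_count(c) for c in ASN_COUNTS.values())

if __name__ == "__main__":
    print("FB-labeled user agents:", fb_labeled)
    print("Browser-like user agents:", browser_like)
    print("All user agents:", ua_total)
    print("All ASNs:", asn_total)
```

The two totals agree to within rounding, and AS32934 alone accounts for more requests than the Facebook-labeled user agents do. If the assumptions hold, that suggests most of the browser-like user agents (iPhone, HeadlessChrome, Pixel 5) also came from the FB network, which would explain why a user-agent block alone may not be enough.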

dolcevita

4:52 pm on Sep 11, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@universenet @not2easy or anybody else...

Any idea why there is no FB bot on the CF list of bots? At least I cannot find any FB bot on the list.

not2easy

5:25 pm on Sep 11, 2024 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Only CF can tell you that, though from their known habits, FB bots request robots.txt and then ignore your settings there.
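
For reference, this is roughly what a crawler that honors robots.txt would conclude. It is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content and the `allowed` helper are hypothetical examples, not anyone's real file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows the two FB user agents.
ROBOTS_TXT = """\
User-agent: meta-externalagent
Disallow: /

User-agent: facebookexternalhit
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler with this UA may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for ua in ("meta-externalagent/1.1", "facebookexternalhit/1.1", "Mozilla/5.0"):
        print(ua, allowed(ua, "https://example.com/page"))
```

robots.txt is only a request, so a bot that fetches the file and crawls anyway has to be stopped at the server or the CDN instead.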

universenet

8:03 pm on Sep 11, 2024 (gmt 0)

Top Contributors Of The Month



dolcevita wrote: "@universenet @not2easy or anybody else... Any idea why there is no FB bot on the CF list of bots? At least I cannot find any FB bot on the list."



@dolcevita

Maybe the list is not updated, maybe it is a way to escape robots.txt (or .htaccess), or they just do not care about anything anymore.
I asked ChatGPT (the best AI for me) and the answer was:

Reasons to Block facebookexternalhit:
Bandwidth concerns: If Facebook is consuming too much bandwidth and causing performance issues, blocking it might help.
No value from Facebook shares: If you don't need content to be shared on Facebook or don't want Facebook to access your site, blocking it can prevent unnecessary crawls.

Reasons to Not Block facebookexternalhit:
Link previews: If your content is being shared on Facebook, blocking this bot will prevent proper previews (e.g., images, titles) from appearing when someone shares a link to your site.
Traffic from Facebook: If a significant portion of your traffic comes from Facebook, you may want to keep the bot unblocked to ensure accurate previews, which can attract more visitors.

I have a bandwidth limit of 10 TB, so I do not care, and even if someone takes content from my website I do not care either.

dolcevita

5:25 am on Sep 12, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@universenet
Thanks for the information. It's not about bandwidth, but about dramatically worse RPM and CTR than usual.

azlinda

11:00 pm on Nov 1, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I found five visits from meta-externalagent/1.1 today. How do we block this if we are not on CF? Thanks.

not2easy

11:18 pm on Nov 1, 2024 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



azlinda wrote: "How do we block this..."
See this thread: [webmasterworld.com...] It could help.
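
As a generic illustration only (the linked thread is not reproduced here, so this is not necessarily what it recommends): if the site happens to run as a Python WSGI application, a user-agent block could be sketched like this. `BlockBots`, `hello_app` and `BLOCKED_UA_TOKENS` are hypothetical names.

```python
# Assumption: the site is served by a Python WSGI application.
BLOCKED_UA_TOKENS = ("meta-externalagent",)


def hello_app(environ, start_response):
    """Stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


class BlockBots:
    """WSGI middleware that answers 403 to blocked user agents."""

    def __init__(self, app, tokens=BLOCKED_UA_TOKENS):
        self.app = app
        self.tokens = tuple(token.lower() for token in tokens)

    def __call__(self, environ, start_response):
        ua = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in ua for token in self.tokens):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return self.app(environ, start_response)


application = BlockBots(hello_app)
```

A user-agent check only stops bots that identify themselves honestly. On other stacks the same idea is usually expressed in the web server's own configuration.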

azlinda

12:19 am on Nov 2, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thank you, not2easy. That does help.

Dimitri

2:39 pm on Nov 3, 2024 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



There is something weird with the facebookexternalhit bot.

It's supposed to be related only to sharing pages on Facebook, and it used to behave that way. Then, earlier this year, this bot started to hit thousands of pages per hour on my sites, and I don't have that many pages shared on Facebook.

It's not supposed to be used for Facebook AI training, since there is another user agent for it.

Then, 2 weeks ago, all of a sudden the facebookexternalhit bot returned to its normal behavior (a couple of pages per hour).

So I don't know if Facebook used its facebookexternalhit bot to train its AI, or what.


lucy24

4:09 pm on Nov 3, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Dang! Does that mean we have to unblock it, and then check regularly to make sure they’re not back to their old misbehavior?

ashleydent4u

3:22 am on Nov 26, 2024 (gmt 0)



Wow, that’s a frustrating situation, but kudos to you for catching it and acting fast! It’s crazy how the FB bots, even verified ones, can wreak havoc on server resources and skew metrics like CTR and RPM. Blocking those user agents and AS32934 through Cloudflare was a smart move—totally agree there’s no real benefit to having FB index everything, especially if it’s just for AI training or unnecessary crawling. Hopefully, this tweak keeps things running smoother and gets your AdSense metrics back on track!