LanaiBot Aggressive Scraping

         

WebOpz

8:27 pm on Apr 8, 2021 (gmt 0)

5+ Year Member Top Contributors Of The Month



I'm seeing a new bot that calls itself LanaiBot, or LanaiBotapr1 in its most recent user agent string. It keeps changing its user agent string and is aggressively scraping my site. It looks to be LanaiBot.

Anyone have any experience with LanaiBot?

TIA



[edited by: not2easy at 2:04 am (utc) on Apr 9, 2021]
[edit reason] snipped link [/edit]

not2easy

2:21 am on Apr 9, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Hi WebOpz and welcome to WebmasterWorld [webmasterworld.com]

You could block the UA even if it is using variations. See the general method here: [webmasterworld.com...] and add something like 'lanai' to your UA blocks.
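
The linked method is not reproduced here. As a rough illustration of the matching idea only, a case-insensitive substring test catches the name variations mentioned above. This is a minimal Python sketch; `BLOCKED_UA_FRAGMENTS` and `is_blocked_user_agent` are hypothetical names, and a real block would normally live in the web server configuration rather than in application code.

```python
# Hypothetical list of lowercase fragments to block; 'lanai' is the one suggested above.
BLOCKED_UA_FRAGMENTS = ("lanai",)


def is_blocked_user_agent(user_agent, fragments=BLOCKED_UA_FRAGMENTS):
    """Return True if any blocked fragment appears in the user agent, ignoring case."""
    ua = (user_agent or "").lower()
    return any(fragment in ua for fragment in fragments)


if __name__ == "__main__":
    for ua in ("LanaiBotapr1", "LanaiBot", "Mozilla/5.0 (X11; Linux x86_64)"):
        print(ua, "->", "block" if is_blocked_user_agent(ua) else "allow")
```

Matching on a fragment rather than the full string is what lets one rule cover both LanaiBot and LanaiBotapr1.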

lucy24

5:54 am on Apr 9, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If anyone recognizes the name, I'd like to hear about it. It sounds naggingly familiar, but I couldn't find it anywhere in archived logs (going back almost 10 years) :(

jmccormac

8:28 am on Apr 9, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Don't recognise the UA. Is it coming from a single IP address or a range of IPs?

Regards...jmcc

not2easy

12:56 pm on Apr 9, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



"I couldn't find it anywhere in archived logs"
Me too neither.

I searched for it (DDG) and found that the earliest mentions anywhere are from this month. I am hoping WebOpz will be able to share more insight from their own experience.

WebOpz

1:17 pm on Apr 9, 2021 (gmt 0)

5+ Year Member Top Contributors Of The Month



Basically it was scraping my entire site from the following IPs:

52.33.58.74    ec2-52-33-58-74.us-west-2.compute.amazonaws.com    [Amazon AWS]
52.12.168.229  ec2-52-12-168-229.us-west-2.compute.amazonaws.com  [Amazon AWS]
52.12.13.44    ec2-52-12-13-44.us-west-2.compute.amazonaws.com    [Amazon AWS]

Anyone else seen it?

lammert

2:08 pm on Apr 9, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I haven't seen them yet in the logs. Lanai is a Hawaiian island and us-west-2 is the Amazon region in Oregon. Could be a false flag, but it looks like something operating from the Pacific side of the US. A university research project or a Silicon Valley operation?

wilderness

6:24 pm on Apr 9, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"52.33.58.74 ec2-52-33-58-74.us-west-2.compute.amazonaws.com [Amazon AWS]"

Amazon has 52.0 through 52.79.
Just block the range.

WebOpz

6:31 pm on Apr 9, 2021 (gmt 0)

5+ Year Member Top Contributors Of The Month



@wilderness Indeed I did do that already. =) I was hoping someone might have some insight into what this aggressive scraper is up to...

not2easy

8:46 pm on Apr 9, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Those other IPs are 'Direct Allocation' within AWS Dogfish Routing, whatever that is.
ARIN shows "Updated: 2021-02-10"

Range: 52.0.0.0 - 52.79.255.255
CIDR: 52.0.0.0/10 and 52.64.0.0/12

If the bot is not requesting or complying with robots.txt, it is a good bet that it is not one you want to permit.
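
For anyone who wants to double-check the range arithmetic above, Python's standard `ipaddress` module can convert the range into CIDR blocks and confirm that the three addresses reported earlier in the thread fall inside it. This is only a sketch; `covering_block` is a hypothetical helper name, not part of any library.

```python
import ipaddress

# The range quoted above, converted to the smallest set of CIDR blocks.
first = ipaddress.ip_address("52.0.0.0")
last = ipaddress.ip_address("52.79.255.255")
blocks = list(ipaddress.summarize_address_range(first, last))

# The three addresses reported earlier in the thread.
seen = ["52.33.58.74", "52.12.168.229", "52.12.13.44"]


def covering_block(ip, networks):
    """Return the first network that contains ip, or None if none does."""
    addr = ipaddress.ip_address(ip)
    for net in networks:
        if addr in net:
            return net
    return None


if __name__ == "__main__":
    print([str(b) for b in blocks])
    for ip in seen:
        print(ip, "->", covering_block(ip, blocks))
```

Blocking a range this wide also shuts out everything else hosted in it, so it is worth checking what legitimate traffic arrives from there first.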

jmccormac

10:17 pm on Apr 9, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If it is changing its user agent then it may be an ordinary scraper rather than a declared crawler. Having the IPs clustered on Amazon would indicate that it is a fairly low quality scraper. Look for other unusual activity in your logs coming from data centre IP ranges. One of the signals is the same page being requested within a narrow time frame by a number of IPs. Such bots generally scrape the HTML only and do not load the images or CSS files, so it is possible to check the logs for this kind of activity.

Regards...jmcc
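
The two log signals described above (clients that fetch pages but never images or CSS, and one page requested by several IPs in a short time) could be checked with a short script along these lines. It is a sketch under assumptions: it expects Combined Log Format, the thresholds are arbitrary, all function names are hypothetical, and the sample lines are made up using documentation-only addresses.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Combined Log Format: ip ident user [timestamp] "METHOD path PROTO" status ...
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)
# Requests treated as static assets rather than pages (an assumed, incomplete list).
ASSET_RE = re.compile(r"\.(?:css|js|png|jpe?g|gif|svg|ico|webp|woff2?)(?:\?|$)", re.IGNORECASE)


def parse(lines):
    """Yield (ip, timestamp, path) for every line that matches LOG_RE."""
    for line in lines:
        m = LOG_RE.match(line)
        if m:
            yield m["ip"], datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z"), m["path"]


def html_only_ips(records, min_pages=3):
    """IPs that fetched at least min_pages pages and not a single static asset."""
    pages, assets = defaultdict(int), defaultdict(int)
    for ip, _, path in records:
        (assets if ASSET_RE.search(path) else pages)[ip] += 1
    return sorted(ip for ip, n in pages.items() if n >= min_pages and assets[ip] == 0)


def multi_ip_bursts(records, window=timedelta(seconds=60), min_ips=3):
    """Pages requested by at least min_ips distinct IPs within one time window."""
    by_path = defaultdict(list)
    for ip, ts, path in records:
        if not ASSET_RE.search(path):
            by_path[path].append((ts, ip))
    flagged = {}
    for path, hits in by_path.items():
        hits.sort()
        for i, (start, _) in enumerate(hits):
            ips = {ip for ts, ip in hits[i:] if ts - start <= window}
            if len(ips) >= min_ips:
                flagged[path] = sorted(ips)
                break
    return flagged


# Made-up sample lines for illustration only (RFC 5737 documentation addresses).
SAMPLE = [
    '192.0.2.1 - - [09/Apr/2021:13:00:00 +0000] "GET /a.html HTTP/1.1" 200 512 "-" "ExampleBot"',
    '192.0.2.2 - - [09/Apr/2021:13:00:10 +0000] "GET /a.html HTTP/1.1" 200 512 "-" "ExampleBot2"',
    '192.0.2.3 - - [09/Apr/2021:13:00:20 +0000] "GET /a.html HTTP/1.1" 200 512 "-" "ExampleBot3"',
    '192.0.2.1 - - [09/Apr/2021:13:00:30 +0000] "GET /b.html HTTP/1.1" 200 512 "-" "ExampleBot"',
    '192.0.2.1 - - [09/Apr/2021:13:00:40 +0000] "GET /c.html HTTP/1.1" 200 512 "-" "ExampleBot"',
    '198.51.100.7 - - [09/Apr/2021:13:05:00 +0000] "GET /a.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [09/Apr/2021:13:05:01 +0000] "GET /style.css HTTP/1.1" 200 128 "-" "Mozilla/5.0"',
]
records = list(parse(SAMPLE))

if __name__ == "__main__":
    print("HTML-only clients:", html_only_ips(records))
    print("Multi-IP bursts:", multi_ip_bursts(records))
```

Neither check is proof on its own: text-mode browsers and clients with warm caches also skip assets, so the results are a list of candidates to look at, not a blocklist.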