Mb2345Browser/9.0

I'm pretty sure this is actually a crawler, not a browser

         

notriddle

11:24 pm on Jul 31, 2019 (gmt 0)

5+ Year Member



This UA has been showing up with a bunch of 404's.


Mozilla/5.0(Linux;Android 5.1.1;OPPO A33 Build/LMY47V;wv) AppleWebKit/537.36(KHTML,link Gecko) Version/4.0 Chrome/42.0.2311.138 Mobile Safari/537.36 Mb2345Browser/9.0


It escapes stuff that oughtn't be escaped. For example, there are pages with paths like 'SomeScript.aspx?id=NNN' (they're all legacy redirects, actually, but the site used that scheme for a long time and still gets plenty of hits to that path). It requests 'SomeScript.aspx%252525253Fid%252525253DNNN' instead, which you'll notice is not only wrong, it's quadruple-escaped! I also see double-escaping in my logs, but quadruple is funnier.
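One way to measure how deeply a logged path has been escaped is to percent-decode it repeatedly until it stops changing. The sketch below does that in Python; `escape_depth` is a hypothetical helper name, not something from the thread, and the sample path is the one quoted above.

```python
from urllib.parse import unquote


def escape_depth(path: str, max_rounds: int = 10) -> int:
    """Count the percent-decoding passes needed before the path stops changing."""
    rounds = 0
    while rounds < max_rounds:
        decoded = unquote(path)
        if decoded == path:
            # Nothing left to decode.
            break
        path = decoded
        rounds += 1
    return rounds


sample = "SomeScript.aspx%252525253Fid%252525253DNNN"
print(escape_depth(sample))                    # 5
print(escape_depth("SomeScript.aspx?id=NNN"))  # 0
```

By this count the sample takes five passes to get back to "?": one for an ordinary "%3F", plus four extra layers of "%25".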

I'm also pretty sure it's actually a crawler, because it keeps requesting different legacy URLs constantly without ever visiting the home page. Plus, it claims to be based on Chrome, but it obviously has bugs that Chrome doesn't.

According to whois, it all seems to be coming from China.

iamlost

2:04 am on Aug 1, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



mb2345browser identifies mobile browser for Chinese web directory 2345.com.

Once upon a time 2345.com ran a notorious adware browser hijacker but the browser appears to be more legitimate :)

lucy24

3:51 am on Aug 1, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'm also pretty sure it's actually a crawler
:: detour to raw logs ::

The fact that the only request I find is for a page without supporting files is also strongly redolent of robotitude.

:: further exploration ::

###! I forgot to update the relevant site's htaccess for /includes/ when the site went to 2.4, so I can't check headers and confirm what grounds got them blocked. But they did get blocked.

notriddle

9:21 pm on Aug 1, 2019 (gmt 0)

5+ Year Member



mb2345browser identifies mobile browser for Chinese web directory 2345.com.


According to [handsetdetection.com...] the real 2345browser has "like Gecko" in its UA string. This one has "link Gecko".

lucy24

11:04 pm on Aug 1, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This one has "link Gecko".
LOL. My Deny list includes a whole string of misspellings, not only in the UA but in header names: “Referrer” *, “Useragent” (not to be confused with the UA string that begins with the literal text “User-Agent:”), “X-Fowarded-For” ...


* Hmph.

notriddle

8:21 pm on Aug 2, 2019 (gmt 0)

5+ Year Member



Welp, I added a rule blocking based on "link Gecko". Apparently, getting 403's caused it to send in the clones, and it started crawling so hard that it caused a noticeable slowdown on the site! (literally the thing I was trying to prevent... sigh)

So now I've added a block rule based on "%253F" and "%2525" (404's this time). It's still hammering me from a wide variety of UA's and IP's (enough that my IP-based rate limit doesn't kick in), but most of the requests aren't reaching my application backend. Eventually, it'll at least give up crawling the broken URLs.

kode54

1:13 am on Sep 10, 2019 (gmt 0)

5+ Year Member



I added blocks to try to direct 403 errors for some of this mess, and fail2ban did absolutely nothing. I guess my logs are not to fail2ban's liking. It only made them angrier. They started flooding ten times as many requests, and a lot of them more "properly" formatted.

I have no idea how to make Nginx outright drop all traffic requests with "PHPSESSID" in the URL. I block it with tests for $arg_PHPSESSID, and it still gets through from the idiots sending requests for "/index.php%3FPHPSESSID%3D..."
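A likely reason the $arg_PHPSESSID test misses those requests: when the "?" arrives as "%3F", there is no literal query-string separator, so the whole thing is treated as path and no query arguments exist to test. The Python sketch below illustrates the same effect with the standard URL parser. It is an analogy, not Nginx's own code; `has_session_arg` is a hypothetical helper and "abc123" is a made-up session value.

```python
from urllib.parse import urlsplit, parse_qs


def has_session_arg(request_uri: str) -> bool:
    """True if PHPSESSID appears as a real query argument (after a literal '?')."""
    query = urlsplit(request_uri).query
    return "PHPSESSID" in parse_qs(query)


# Literal "?": the parser sees a query string containing PHPSESSID.
print(has_session_arg("/index.php?PHPSESSID=abc123"))      # True

# Escaped "?": everything is path, the query string is empty.
print(has_session_arg("/index.php%3FPHPSESSID%3Dabc123"))  # False
print(urlsplit("/index.php%3FPHPSESSID%3Dabc123").path)
```

Catching the escaped form therefore needs a match against the raw request URI rather than against the parsed arguments.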

notriddle

2:51 am on Sep 10, 2019 (gmt 0)

5+ Year Member



My Nginx has these in it. You can add "%3FPHPSESSID".


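# Return 404 when the raw request URI contains a doubly escaped "%" ...
if ($request_uri ~* "%2525") {
    return 404;
}
# ... or a doubly escaped "?".
if ($request_uri ~* "%253F") {
    return 404;
}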
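Before deploying rules like these, it can help to replay some logged request URIs against the same patterns and see what would be caught. The sketch below does that in Python, assuming case-insensitive substring matching as with `~*`. `BLOCK_PATTERNS` and `would_block` are hypothetical names, and the third pattern is the "%3FPHPSESSID" addition suggested above.

```python
import re

# The two patterns from the config above, plus the suggested "%3FPHPSESSID".
BLOCK_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"%2525", r"%253F", r"%3FPHPSESSID")
]


def would_block(request_uri: str) -> bool:
    """True if any block pattern occurs anywhere in the raw request URI."""
    return any(p.search(request_uri) for p in BLOCK_PATTERNS)


for uri in (
    "/SomeScript.aspx%252525253Fid%252525253DNNN",  # over-escaped crawler request
    "/SomeScript.aspx?id=NNN",                      # normal legacy URL
    "/index.php%3FPHPSESSID%3Dabc123",              # escaped session ID (made-up value)
):
    print(uri, would_block(uri))
```

A request with a literal "?PHPSESSID=" is not caught by these patterns; that case is what the $arg_PHPSESSID test is for.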