Mojeek crawler is active?

         

SumGuy

1:39 am on Jul 24, 2023 (gmt 0)

5+ Year Member Top Contributors Of The Month



I'm blocking 5.102.173.71 in my router (not sure what CIDR it's part of) and I'm seeing today that it was trying to hit my web server. The IP's host name is crawl-5-102-173-71.mojeek.com.

So I throw mojeek.com into my address bar and it claims to be "a growing independent search engine which does not track you".

I've never heard of it, but I might try it.

I'll probably un-block at least the /24 and see what it does.

I take it that it was known here in the past, but doesn't have much of a profile as an active crawler?

dstiles

9:10 am on Jul 24, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Mojeek is a valid UK SE with an address in Sussex. I allow:
MojeekBot
from 5.102.173.64/28
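
For anyone wondering whether the blocked address falls inside that /28, here is a minimal sketch using Python's standard `ipaddress` module. It only uses the address and range quoted in this thread; the /24 shown is simply the block that encloses the address, not a range Mojeek is known to publish.

```python
import ipaddress

# Address and range taken from the posts above.
addr = ipaddress.ip_address("5.102.173.71")
allowed = ipaddress.ip_network("5.102.173.64/28")

# The /24 that encloses the address; strict=False masks off the host bits.
slash24 = ipaddress.ip_network("5.102.173.71/24", strict=False)

print(addr in allowed)              # True: .71 lies between .64 and .79
print(allowed[0], allowed[-1])      # 5.102.173.64 5.102.173.79
print(slash24)                      # 5.102.173.0/24
print(allowed.subnet_of(slash24))   # True: the /28 sits inside the /24
```

So un-blocking the whole /24 opens up 256 addresses, while the /28 covers only 16.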

Peter_S

9:45 am on Jul 24, 2023 (gmt 0)

5+ Year Member Top Contributors Of The Month



Hi,

[en.wikipedia.org...]

It's not a big player, but I think it's good to keep an eye on it.

I've also noticed a surge of activity over the past month.

brotherhood of LAN

12:09 pm on Jul 24, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Employee of Mojeek here.

Yes, we are very small fry compared to G and B. Mojeek respects robots.txt and rate limits requests. We also provide results to other search engines.
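
Since robots.txt is the control point here, a site owner can sanity-check their rules locally with Python's standard `urllib.robotparser`. This is only a sketch: the robots.txt content below is hypothetical, the paths are made up, and the `Crawl-delay` line is there to show how the parser reads it (the thread does not say whether MojeekBot honours that directive).

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: paths and delay are illustrative only.
ROBOTS_TXT = """\
User-agent: MojeekBot
Crawl-delay: 10
Disallow: /private/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# MojeekBot may fetch everything except /private/.
print(parser.can_fetch("MojeekBot", "/index.html"))
print(parser.can_fetch("MojeekBot", "/private/page"))

# Any other agent falls through to the catch-all group.
print(parser.can_fetch("SomeOtherBot", "/index.html"))

# Delay declared for MojeekBot in the file above.
print(parser.crawl_delay("MojeekBot"))
```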

Peter_S

1:43 pm on Jul 24, 2023 (gmt 0)

5+ Year Member Top Contributors Of The Month



Keep up the good work, brotherhood of LAN (and your co-workers)!

brotherhood of LAN

2:00 pm on Jul 24, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Cheers. The work is mostly that of Marc Smith, who started it in 2004. He can claim the world's first 'no tracking policy', well before other 'privacy' engines. Appreciate any support for alternative views into the web beyond big tech (like allowing our crawler). Makes for a healthier web.

lucy24

3:42 pm on Jul 24, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"doesn't have much of a profile as an active crawler"
It's harmless. Within the present calendar year I find a whopping total of 22 Mojeek visits across three sites, each of them robots.txt + one page. One of those was a 410 from ten years back, another was for an URL that has been redirected for pretty exactly two years--both suggesting that they're following some very, very old links.

mack

3:52 pm on Jul 24, 2023 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



When it comes to blocking anything I would tend to see if you have a reason to do so first. If we were all to block things simply because we don't know of them it would be incredibly hard for any new search company to build an index.

As for Mojeek, I have been watching them for several years and they do exactly what it says on the tin.

Mack.

Peter_S

4:45 pm on Jul 24, 2023 (gmt 0)

5+ Year Member Top Contributors Of The Month



For me, if:

- the UA includes a robot ID + a URL to read about it (the page must exist)
- it accesses and respects robots.txt
- its IP or IP range is identifiable by reverse DNS, or listed on the about page

then it's welcome :-)

brotherhood of LAN

4:56 pm on Jul 24, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



reverse DNS is definitely the finer point, after all anyone can proclaim a UA.
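
The usual way to act on this is a forward-confirmed reverse DNS check: look up the host name for the IP, make sure it sits under the crawler's domain, then resolve that name back and confirm it returns the same IP. Below is a minimal sketch. The function name `verify_crawler` is made up for illustration, and the `.mojeek.com` suffix is an assumption inferred from the host name quoted at the top of this thread, not from any official documentation. The resolver arguments exist only so the logic can be exercised without network access.

```python
import socket

def verify_crawler(ip, allowed_suffix=".mojeek.com",
                   reverse=socket.gethostbyaddr,
                   forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for a claimed crawler IP (IPv4)."""
    try:
        hostname = reverse(ip)[0]            # reverse lookup: IP -> host name
    except OSError:
        return False
    # The leading dot in the suffix rejects look-alikes such as "evilmojeek.com".
    if not hostname.lower().rstrip(".").endswith(allowed_suffix):
        return False
    try:
        addresses = forward(hostname)[2]     # forward lookup: host name -> IPs
    except OSError:
        return False
    # The reverse record only counts if the forward lookup agrees with it.
    return ip in addresses

# Example call (performs live DNS queries):
# verify_crawler("5.102.173.71")
```

The forward step matters because whoever controls the reverse zone for an IP can make it claim any host name; only the domain owner controls what that name resolves to.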

Peter_S

6:33 pm on Jul 24, 2023 (gmt 0)

5+ Year Member Top Contributors Of The Month



"reverse DNS is definitely the finer point, after all anyone can proclaim a UA."


Yes, this is a concern I mentioned here [webmasterworld.com...]

tangor

10:54 pm on Jul 27, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Not on my hit list ... has always been respectful!

lucy24

7:08 pm on Jul 28, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Anyone can put on a fake name. But not many UAs are actually spoofed; 99 times out of 100, if someone claims to be {name}, that’s who they are.

Remember when fake Googlebots were everywhere? Over the years they’ve plummeted--to the point where in ten years’ time it may not even be necessary to check for them. There are lots of fake Baidus, for all the good it does them. Other names can probably be counted on the fingers of one hand.

Peter_S

8:58 pm on Jul 28, 2023 (gmt 0)

5+ Year Member Top Contributors Of The Month



This is very true. I wonder where all these scrapers are now. Maybe some of Google's refinements to its algorithms put plenty of the sites that were rehashing scraped content out of business. Then again, in the age of the holy AI, maybe there is no more need to scrape.

jmccormac

12:59 am on Jul 29, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Well behaved and active.

Regards...jmcc