thum.io

Website Screenshot Generator

         

SumGuy

3:39 pm on Jan 28, 2022 (gmt 0)



This IP, which is blocked at my router, hit me about 50 times before giving up.

52.87.44.246 (url.thum.io). It is an Amazon IP address (I block all of Amazon from hitting my server).

Thum.io is apparently a Website Screenshot Generator. Because it never got through to my web server, I don't know how this bot actually behaves (for example, whether it honors robots.txt) or what its user-agent is.

Heads up for anyone who doesn't want these things crawling their site.
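
For anyone who would rather do the "is this an Amazon address?" check in a script than at the router, here is a minimal sketch using only the Python standard library. It assumes you have already loaded the JSON that AWS publishes at https://ip-ranges.amazonaws.com/ip-ranges.json into a dict. The function names and the SAMPLE data are made up for illustration: the /16 below is not copied from the published list, so substitute the real prefixes before relying on it.

```python
import ipaddress


def aws_prefixes(ip_ranges):
    """Pull the IPv4 CIDR strings out of a parsed ip-ranges.json document."""
    return [entry["ip_prefix"] for entry in ip_ranges.get("prefixes", [])]


def in_blocked_ranges(ip, cidrs):
    """Return True if ip falls inside any of the given CIDR blocks."""
    addr = ipaddress.ip_address(ip)
    for cidr in cidrs:
        net = ipaddress.ip_network(cidr, strict=False)
        # Only compare like with like (IPv4 against IPv4, IPv6 against IPv6).
        if addr.version == net.version and addr in net:
            return True
    return False


# Illustrative stand-in for the real file: same shape, invented contents.
SAMPLE = {
    "prefixes": [
        {"ip_prefix": "52.87.0.0/16", "region": "example", "service": "EC2"},
    ]
}
```

With that sample, in_blocked_ranges("52.87.44.246", aws_prefixes(SAMPLE)) returns True, and an address outside the listed block returns False. The same check works against any list of CIDR strings, so it is not tied to Amazon.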

Dimitri

10:55 pm on Jan 28, 2022 (gmt 0)



A site that makes screen captures of other sites without authorization is committing copyright infringement. Monetizing those captures is worse. And not even identifying the company or people behind the service is a no-no...

blend27

6:18 pm on Jan 31, 2022 (gmt 0)



Actual data from their request:

IP: 52.87.44.246 (url.thum.io)

HEADERS:

sec-ch-ua-mobile: ?0
Accept-Language: en-US,en;q=0.9
user-agent: Mozilla/5.0 (X11; Linux aarch64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: none
sec-ch-ua: "Chromium";v="97", " Not;A Brand";v="99"
host: example.com
Sec-Fetch-User: ?1
connection: keep-alive
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate, br
Upgrade-Insecure-Requests: 1
sec-ch-ua-platform: "Linux"
content-length: 0
Sec-Fetch-Dest: document

request_method: GET
server_protocol: HTTP/1.1
http_content:


The UA is theirs, and the "Sec-Fetch-" headers are present even though I used IE11 (which does not transmit Sec-Fetch- headers) to hit the URL below, so my assumption is that both are set on their end.

This data was captured by visiting https://image.thum.io/get/width/600/crop/600/https://example.com/ (this generates a PDF with the screenshot).
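
If you want to trigger the same request against your own site and watch what arrives, the URL can be assembled like this. It is a sketch that only mirrors the shape of the one URL quoted in this post (a width segment, a crop segment, then the target). The function name is hypothetical, and I have not checked which other values or options the service accepts.

```python
def thum_capture_url(target, width=600, crop=600):
    """Build a capture URL shaped like the one quoted in the post."""
    if not target.startswith(("http://", "https://")):
        raise ValueError("target must be an absolute http(s) URL")
    # The target is appended as-is, exactly as in the quoted example.
    return f"https://image.thum.io/get/width/{width}/crop/{crop}/{target}"
```

Calling thum_capture_url("https://example.com/") gives back the same URL as above. Opening it in a browser is what makes their fetcher hit the target.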

P.S.
ColdFusion code to capture this data on the server side:
<!--- Grab the raw request once, before building the report --->
<cfset x = GetHttpRequestData()>
<cfsavecontent variable="info"><cfoutput>
<pre>
----------------------------------------
#cgi.remote_addr#
<b>HTTP Request item:</b> <b>Value</b>
<!--- One "name: value" line per request header --->
<cfloop collection="#x.headers#" item="http_item">#http_item#: #StructFind(x.headers, http_item)# #chr(13)#</cfloop>
request_method: #x.method#
server_protocol: #x.protocol#
<b>http_content:</b>#x.content#
</pre></cfoutput></cfsavecontent>
<!--- Append the report to a text file next to this template --->
<cffile action="append" file="#GetDirectoryFromPath(GetCurrentTemplatePath())#these_nuts_have_headers.txt" output="#info#" addnewline="true">
<!--- Echo the same report back in the response --->
<cfoutput>#info#</cfoutput>
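
For those not running ColdFusion, roughly the same capture can be sketched as a plain WSGI app in Python. This is not a port of anything thum.io or this forum provides, just an assumption-laden equivalent: the log file name and function names are hypothetical, and header names come out lowercased because WSGI does not preserve their original case.

```python
LOG_FILE = "request_headers.txt"  # hypothetical log file name


def format_request(environ):
    """Render remote address, headers, method and protocol as plain text."""
    lines = ["-" * 40, environ.get("REMOTE_ADDR", "")]
    # WSGI exposes request headers as HTTP_* keys in the environ dict.
    for key, value in sorted(environ.items()):
        if key.startswith("HTTP_"):
            name = key[5:].replace("_", "-").lower()
            lines.append(f"{name}: {value}")
    # These two headers live outside the HTTP_ namespace in WSGI.
    for key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
        if environ.get(key):
            lines.append(f"{key.replace('_', '-').lower()}: {environ[key]}")
    lines.append(f"request_method: {environ.get('REQUEST_METHOD', '')}")
    lines.append(f"server_protocol: {environ.get('SERVER_PROTOCOL', '')}")
    return "\n".join(lines) + "\n"


def app(environ, start_response):
    """Append the request report to LOG_FILE and echo it back as text."""
    info = format_request(environ)
    with open(LOG_FILE, "a", encoding="utf-8") as fh:
        fh.write(info)
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [info.encode("utf-8")]
```

To try it locally, the standard library's wsgiref.simple_server.make_server("", 8000, app).serve_forever() should be enough. The response is sent as text/plain, so header values are not interpreted as HTML when echoed back.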