Using cURL to find end result of a 301 redirect

         

csdude55

12:19 am on May 13, 2022 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I found this on how to embed TikTok videos:

https://developers.tiktok.com/doc/embed-videos

This requires that the link to be embedded look like:

https://www.tiktok.com/@scout2015/video/6718335390845095173 (real link, from their guide)

And that's great, but I have a few users on my site that link videos more like this:

https://vm.tiktok.com/AbCdEfGhI/?k=1 (fake link, just used as an example)
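As a rough illustration of telling the two link shapes apart, here is a minimal sketch. The function name isCanonicalTikTokUrl is hypothetical, and the pattern is only inferred from the single example in the embed guide, so it may be too strict or too loose for other valid links.

```php
<?php
// Hypothetical helper: true when $url already has the shape shown in the embed guide.
function isCanonicalTikTokUrl(string $url): bool
{
    // Assumed pattern, inferred from one documented example:
    // https://www.tiktok.com/@<username>/video/<numeric id>
    // An optional trailing slash, query string or fragment is tolerated.
    $pattern = '~^https://www\.tiktok\.com/@[^/?#]+/video/\d+/?(?:[?#].*)?$~';

    return preg_match($pattern, $url) === 1;
}
```

Anything that fails this check (such as a vm.tiktok.com short link) would then need to be resolved before it can be embedded.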

When I plug this into cURL, it returns a "Moved Permanently" response that links to the format that TikTok wants. It doesn't explicitly say "301", just "Moved Permanently", so I'm assuming it's a 301.

Any suggestions on how to take https://vm.tiktok.com/AbCdEfGhI/?k=1 and figure out (programmatically) that it will redirect to https://www.tiktok.com/@scout2015/video/6718335390845095173 ?

Here's the PHP code I'm using:

function getFile($url, $getInfo=false) {
    $t = false;

    if (strpos($url, 'http') === 0) {
        $ch = curl_init(filter_var($url, FILTER_VALIDATE_URL));
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36');
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
        curl_setopt($ch, CURLOPT_TIMEOUT, 180);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

        $t = curl_exec($ch);

        // not actually used here, it's part of the general script that I sometimes use
        // to make sure that an external file is responding
        if ($getInfo && $t) {
            $arr = curl_getinfo($ch);
            $t = $arr['http_code'] === 200 ? $arr[$getInfo] : false;
        }

        curl_close($ch);
    }

    return $t;
}

$url = 'https://vm.tiktok.com/AbCdEfGhI/?k=1';
$data = getFile($url);
echo $data;




phranque

4:29 am on May 13, 2022 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



i would try the following:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
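For illustration, a minimal sketch of how that option can be combined with curl_getinfo() to learn where the redirect chain ended. The function name resolveFinalUrl and the redirect limit of 5 are assumptions, not part of the original script, and some sites may also need the CURLOPT_USERAGENT setting from the code above.

```php
<?php
// Hypothetical helper: follow redirects and report the URL the chain ended on.
function resolveFinalUrl(string $url, int $maxRedirects = 5): ?string
{
    // Reject anything that is not a syntactically valid http(s) URL
    // before making any network request.
    if (!preg_match('~^https?://~i', $url) || filter_var($url, FILTER_VALIDATE_URL) === false) {
        return null;
    }

    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_FOLLOWLOCATION => true,          // let cURL chase Location headers
        CURLOPT_MAXREDIRS      => $maxRedirects, // assumed cap, guards against loops
        CURLOPT_RETURNTRANSFER => true,          // keep the body out of the output
        CURLOPT_CONNECTTIMEOUT => 30,
        CURLOPT_TIMEOUT        => 180,
    ]);

    if (curl_exec($ch) === false) {
        return null; // transport error, or too many redirects
    }

    // CURLINFO_EFFECTIVE_URL is the last URL cURL actually requested.
    $finalUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);

    return is_string($finalUrl) && $finalUrl !== '' ? $finalUrl : null;
}
```

If the short link redirects as described, calling resolveFinalUrl() on it would be expected to return the long-form link, without parsing any headers by hand.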

csdude55

4:47 am on May 13, 2022 (gmt 0)


I'm getting closer!

I added these just before $t = curl_exec($ch);:

curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);


[php.net...]

This returns 55 lines of data that I don't need, but the 5th line is the location that it's supposed to redirect to! So this hack job works:

function getFile($url, $getInfo=false, $getRedirect=false) {
    $t = false;

    if (strpos($url, 'http') === 0) {
        $ch = curl_init(filter_var($url, FILTER_VALIDATE_URL));
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36');
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
        curl_setopt($ch, CURLOPT_TIMEOUT, 180);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

        if ($getRedirect) {
            curl_setopt($ch, CURLOPT_HEADER, 1);
            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
        }

        $t = curl_exec($ch);

        if ($getInfo && $t) {
            $arr = curl_getinfo($ch);
            $t = $arr['http_code'] === 200 ? $arr[$getInfo] : false;
        }

        curl_close($ch);
    }

    return $t;
}

// not a real link, this is just a dummy for the thread
$url = 'https://vm.tiktok.com/AbCdEfGhI/?k=1';

$data = getFile($url, false, true);

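$arr = explode("\n", $data);

// ltrim() treats its second argument as a set of characters, not a prefix,
// so cut the header name off by length and trim the trailing "\r" instead
$arr[4] = trim(substr($arr[4], strlen('Location: ')));

// urlencode() keeps any "?" or "&" in the destination from being read as oEmbed parameters
$final_url = 'https://www.tiktok.com/oembed?url=' . urlencode($arr[4]);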

// should return the response data
$newData = getFile($final_url);


I obviously don't love doing two cURLs like that. And I can't be 100% sure that I'll always get the right information on the 5th line. So I think I'm "closer", but not quite at the finish line yet.
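One way to stop depending on the 5th line is to look the header up by name. The sketch below is an illustration only: extractLocationHeader is a hypothetical name, and it assumes it is handed the raw text that curl_exec() returns when CURLOPT_HEADER is on.

```php
<?php
// Hypothetical helper: return the Location value from the first block of
// raw response headers, or null when that block has no Location header.
function extractLocationHeader(string $rawHeaders): ?string
{
    // Header lines normally end in \r\n, but tolerate bare \n or \r too.
    foreach (preg_split('/\r\n|\n|\r/', $rawHeaders) as $line) {
        // A blank line ends the first header block; stop before any body
        // or any later response in a followed redirect chain.
        if ($line === '') {
            break;
        }

        // Header names are case-insensitive, so compare without regard to case.
        if (stripos($line, 'Location:') === 0) {
            return trim(substr($line, strlen('Location:')));
        }
    }

    return null;
}
```

This works wherever the header appears in the block and whatever its capitalization, so a server adding or reordering headers would not break it.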

csdude55

4:48 am on May 13, 2022 (gmt 0)


Haha @phranque, I was typing that up while you posted :-) We were on the same page, at least! LOL

phranque

7:28 am on May 13, 2022 (gmt 0)


"I obviously don't love doing two cURLs like that."

you cannot avoid making two GET requests if you want to follow the redirect chain.
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1)

this option should have done both GET requests in a single curl_exec.
was $getRedirect true when you called this function?

csdude55

6:07 pm on May 13, 2022 (gmt 0)


"this option should have done both GET requests in a single curl_exec.
was $getRedirect true when you called this function?"

It is set to true, yes, in this line:

$data = getFile($url, false, true);

But what happens is that I first have to convert https://vm.tiktok.com/AbCdEfGhI/?k=1 to https://www.tiktok.com/@scout2015/video/6718335390845095173, then I convert that to https://www.tiktok.com/oembed?url=https://www.tiktok.com/@scout2015/video/6718335390845095173 for the second request.

I've discovered that I can actually eliminate FOLLOWLOCATION and just use:

curl_setopt($ch, CURLOPT_HEADER, 1);


This returns 20 lines instead of 55, so it should be faster to process. I didn't bench test it.

I also figured out that I could do this with the original script, without using either of the new setopts! I just needed to modify this line:

$t = $arr['http_code'] === 200 ? $arr[$getInfo] : false;

curl_getinfo was returning an 'http_code' of 301, but the array it returns also has a 'redirect_url' key! That means I can read the destination straight from the response info instead of parsing the output, which should be the fastest option.

Here's where I am now, and I think this will be the final version:

// added $http_code=200
function getFile($url, $getInfo=false, $http_code=200) {
    $t = false;

    if (strpos($url, 'http') === 0) {
        $ch = curl_init(filter_var($url, FILTER_VALIDATE_URL));
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36');
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
        curl_setopt($ch, CURLOPT_TIMEOUT, 180);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

        $t = curl_exec($ch);

        // compare against false rather than testing truthiness, so a redirect
        // response with an empty body still reaches the curl_getinfo() lookup
        if ($getInfo && $t !== false) {
            $arr = curl_getinfo($ch);

            // changed "=== 200" to "=== $http_code"
            $t = $arr['http_code'] === $http_code ? $arr[$getInfo] : false;
        }

        curl_close($ch);
    }

    return $t;
}

// not a real link, this is just a dummy for the thread
$url = 'https://vm.tiktok.com/AbCdEfGhI/?k=1';

// defining $getInfo as 'redirect_url' and $http_code as 301
// this should return the final destination link
$data = getFile($url, 'redirect_url', 301);

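// urlencode() keeps any "?" or "&" in the destination from being read as oEmbed parameters
$final_url = 'https://www.tiktok.com/oembed?url=' . urlencode($data);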

// should return the final response data
$newData = getFile($final_url);
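The version above still downloads the response body of the short link. A possible variation, sketched here under assumptions, asks for headers only and accepts any redirect status rather than just 301. The name getRedirectTarget is hypothetical, and a server is free to answer a HEAD request differently from a GET, so this would need checking against real short links.

```php
<?php
// Hypothetical helper: make a header-only request and return the redirect
// target, or null when the response is not a redirect.
function getRedirectTarget(string $url): ?string
{
    // Reject anything that is not a syntactically valid http(s) URL
    // before making any network request.
    if (!preg_match('~^https?://~i', $url) || filter_var($url, FILTER_VALIDATE_URL) === false) {
        return null;
    }

    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,  // HEAD request, no body transferred
        CURLOPT_FOLLOWLOCATION => false, // redirect_url is only reported when not following
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => 30,
        CURLOPT_TIMEOUT        => 180,
    ]);

    if (curl_exec($ch) === false) {
        return null; // transport error
    }

    $status = curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    $target = curl_getinfo($ch, CURLINFO_REDIRECT_URL);

    // A short link might answer with 302, 307 or 308 instead of 301.
    $isRedirect = in_array($status, [301, 302, 303, 307, 308], true);

    return $isRedirect && is_string($target) && $target !== '' ? $target : null;
}
```

Like the 'redirect_url' lookup, this only reports the first hop, so a chain of several redirects would need a loop or CURLOPT_FOLLOWLOCATION.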

phranque

9:45 pm on May 13, 2022 (gmt 0)


apparently setting CURLOPT_FOLLOWLOCATION isn't working as advertised in your configuration.

csdude55

5:18 am on May 14, 2022 (gmt 0)


I don't think it's really the setopt's problem, it's just a weird situation.

The URL that I need to get the data from starts with https://www.tiktok.com/oembed?url=, but if I put that in front of the shortened link then it doesn't return anything. So I need to find the final destination for the shortened URL before prepending it.
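The prepend step and the reading of the result can be sketched as two small helpers. Both names are hypothetical, and the second one assumes the oEmbed response is JSON with an "html" field holding the embed markup, which should be confirmed against TikTok's documentation.

```php
<?php
// Hypothetical helper: build the oEmbed request URL for a resolved video link.
function buildOembedUrl(string $videoUrl): string
{
    // http_build_query() percent-encodes the value, so a "?" or "&" inside
    // $videoUrl cannot be mistaken for part of the oEmbed query string.
    return 'https://www.tiktok.com/oembed?' . http_build_query(['url' => $videoUrl]);
}

// Hypothetical helper: pull the embed markup out of an oEmbed JSON response.
// Assumes an "html" field; returns null when the JSON is invalid or lacks it.
function extractEmbedHtml(string $json): ?string
{
    $data = json_decode($json, true);

    if (!is_array($data) || !isset($data['html']) || !is_string($data['html'])) {
        return null;
    }

    return $data['html'];
}
```

The string returned by buildOembedUrl() would be what gets passed to the second getFile() call, and extractEmbedHtml() would be applied to whatever that call returns.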

FOLLOWLOCATION works, it just doesn't return the final destination without some work. So fetching the headers instead seems to be the fastest way to get the final destination, then I can prepend it and do a second cURL.

phranque

9:51 pm on May 14, 2022 (gmt 0)


sorry i misunderstood the problem.
i hadn't read the tiktok embed doc and was assuming this url was part of the redirect chain.

csdude55

7:43 pm on May 15, 2022 (gmt 0)


Making it redirect properly would definitely help! LOL I'm surprised that TikTok doesn't do things more logically, and if it were any other company I would have abandoned it and not cared. But more and more I'm seeing users post TikTok videos instead of YouTube videos, so I have to work around whatever nonsense they've created :-/