Benchmark surprise: strpos() is faster than substr()

         

csdude55

5:42 pm on Apr 1, 2020 (gmt 0)




I was surprised by this, so I wanted to share it and get some input.

I thought that using substr() would be faster since I'm only looking at a defined number of characters, whereas strpos() searches the whole string. But my results after 10,000 iterations are:

strpos(): 0.0023541450500488
substr(): 0.0035181045532227

strstr() was also faster than substr(), and only slightly slower than strpos():

strstr(): 0.0024318695068359

Direct link:
[sandbox.onlinephpfunctions.com...]

The benchmark code:
// Test 1
$test = false;
$url = 'https://www.example.com/foo/this-is-a-test/12345';

$start_time = microtime(TRUE);

for ($i = 0; $i < 10000; $i++) {
    if (strpos($url, 'http') !== false)
        $test = 1;
}

$end_time = microtime(TRUE);

echo 'strpos(): ';
echo "#$test#\n";
echo $end_time - $start_time;
echo "\n\n";

// Test 2
$test = false;
$start_time = microtime(TRUE);

for ($i = 0; $i < 10000; $i++) {
    if (substr($url, 0, 4) === 'http')
        $test = 1;
}

$end_time = microtime(TRUE);

echo 'substr(): ';
echo "#$test#\n";
echo $end_time - $start_time;

JayDub

6:01 pm on Apr 1, 2020 (gmt 0)




Nice ... Thanks for sharing!

JorgeV

6:20 pm on Apr 1, 2020 (gmt 0)




Hello,

This is not surprising at all.

- "strpos" stops as soon as it finds the string, it does not necessarily browse the whole string.

- "substr" extracts a fragment of the string, which means that it has to allocate memory for a new string and copy the fragment into it.
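
To illustrate the allocation point, here is a small sketch of prefix checks that compare in place instead of building a new string first. It reuses the $url from the original benchmark; strncmp() and substr_compare() are standard PHP functions, but this snippet does not claim either one is measurably faster than strpos() in practice.

```php
<?php
$url = 'https://www.example.com/foo/this-is-a-test/12345';

// substr() builds a new 4-byte string, which is then compared to 'http'.
$via_substr = substr($url, 0, 4) === 'http';

// strncmp() compares at most the first 4 bytes of both strings in place.
$via_strncmp = strncmp($url, 'http', 4) === 0;

// substr_compare() compares 4 bytes of $url, starting at offset 0, to 'http'.
$via_substr_compare = substr_compare($url, 'http', 0, 4) === 0;

// All three checks agree for this $url.
var_dump($via_substr, $via_strncmp, $via_substr_compare);
```

Note that strpos() only stops early when the needle is actually found; when "http" is missing, it has to scan the whole string, while the three checks above always look at the first four bytes only.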

w3dk

9:15 pm on Apr 1, 2020 (gmt 0)




With regard to performance, the PHP docs only seem to mention strpos() in relation to strstr():


If you only want to determine if a particular needle occurs within haystack, use the faster and less memory intensive function strpos() instead.
[php.net...]


(Although I do rather like strstr() for its brevity.)




Aside:


if (strpos($url, 'http') !== false)
if (substr($url, 0, 4) === 'http')


Note that these are testing two different things: the first is successful if "http" is found anywhere in the $url, whereas the second is only successful if "http" is found at the start. For the equivalent using strpos(), you would need:


// Is "http" at the start of $url?
if (strpos($url, 'http') === 0)
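
A short sketch of the difference between the two conditions, for anyone who wants to see it on concrete inputs. The helper names contains_http() and starts_with_http() are made up for this example. str_starts_with() is built into PHP 8.0 and later; the polyfill below is only an assumption about how one might cover older versions.

```php
<?php
// Polyfill sketch for PHP < 8.0; on PHP 8.0+ the built-in is used instead.
if (!function_exists('str_starts_with')) {
    function str_starts_with(string $haystack, string $needle): bool
    {
        return strncmp($haystack, $needle, strlen($needle)) === 0;
    }
}

// Hypothetical helper: true if "http" occurs anywhere in the string.
function contains_http(string $s): bool
{
    return strpos($s, 'http') !== false;
}

// Hypothetical helper: true only if the string begins with "http".
function starts_with_http(string $s): bool
{
    return strpos($s, 'http') === 0;
}

$inputs = [
    'https://www.example.com/foo/this-is-a-test/12345',
    'see http://example.com for details',
    'ftp://example.com/file.txt',
];

foreach ($inputs as $input) {
    printf(
        "%s\n  anywhere: %s, at start: %s\n",
        $input,
        var_export(contains_http($input), true),
        var_export(starts_with_http($input), true)
    );
}
```

The second input is the interesting one: it contains "http" but does not start with it, so only the !== false test passes.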

w3dk

9:20 pm on Apr 1, 2020 (gmt 0)




- "strpos" stops as soon as it finds the string, it does not necessarily browse the whole string.


Try the same test, but with "http" at the end of the source string - or remove it completely?

EDIT: I just did...
strpos(): 0.00060105323791504
substr(): 0.00050592422485352

Note, however, even testing your original script with "http" at the start of the $url, I do find the timings a bit "variable" (using that online tool). Occasionally, substr() comes out marginally quicker than strpos()!?
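
Since single runs on a shared sandbox are this noisy, one option is to repeat each measurement several times and compare the minimum and median instead of one number. The bench() helper below is a hypothetical sketch, not the script used in this thread; it assumes PHP 7.3 or later for hrtime(), and the closure call adds the same overhead to every candidate.

```php
<?php
/**
 * Hypothetical helper: run $fn $iterations times per run, for $runs runs,
 * and return the fastest and the median run time in nanoseconds.
 */
function bench(callable $fn, int $iterations = 10000, int $runs = 15): array
{
    $samples = [];
    for ($r = 0; $r < $runs; $r++) {
        $start = hrtime(true); // monotonic clock, nanoseconds
        for ($i = 0; $i < $iterations; $i++) {
            $fn();
        }
        $samples[] = hrtime(true) - $start;
    }
    sort($samples);

    return [
        'min_ns'    => $samples[0],
        'median_ns' => $samples[intdiv(count($samples), 2)],
    ];
}

$url = 'https://www.example.com/foo/this-is-a-test/12345';

$candidates = [
    'strpos' => function () use ($url) { return strpos($url, 'http') === 0; },
    'substr' => function () use ($url) { return substr($url, 0, 4) === 'http'; },
];

foreach ($candidates as $label => $fn) {
    $result = bench($fn);
    printf("%s(): min %d ns, median %d ns\n", $label, $result['min_ns'], $result['median_ns']);
}
```

If the two medians swap places from one invocation to the next, the difference is probably within the noise of the machine rather than a property of the functions.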

csdude55

10:05 pm on Apr 1, 2020 (gmt 0)




Great points! I tried testing last night and a few times today, but of course it's a live server so I'm sure the results vary based on current server load. Every time I ran it, though, the result was consistently in favor of strpos().

I also tried matching "foo" instead of "http", and strpos() was still faster.

Your speed results are a LOT lower than mine, though, so I wonder if the user's internet speed impacts the results? In theory it shouldn't, unless there's something on the tool's end that could cause it.

Great catch on matching strpos() === 0 instead of !== false! That made it slightly faster, too. In my case I'm just testing to see if the string at least looks like a link before attempting to use cURL, so I'm only interested in the 0 position. But you're right, that means my original sample wasn't an apples-to-apples comparison.

JorgeV

9:12 am on Apr 2, 2020 (gmt 0)




Hello,

In my case I'm just testing to see if the string at least looks like a link before attempting to use cURL


You should use filter_var. This might be slightly slower than strpos, but it ensures the string is a valid URL (scheme, domain, path, query string, etc.):
filter_var($url, FILTER_VALIDATE_URL)

[php.net...]

Do not forget to sanitize the string, otherwise a malicious combination of characters could exploit a security flaw.
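
As a rough sketch of how this could look: FILTER_VALIDATE_URL returns the input string when it is a valid URL and false otherwise, and it also accepts schemes other than http and https, so an extra scheme check may be wanted before handing the URL to cURL. The helper name is_http_url() is invented for this example.

```php
<?php
/**
 * Hypothetical helper: true only for syntactically valid http(s) URLs.
 */
function is_http_url(string $candidate): bool
{
    // filter_var() returns the string itself when valid, false when not.
    if (filter_var($candidate, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // A valid URL may still use another scheme, such as ftp.
    $scheme = parse_url($candidate, PHP_URL_SCHEME);

    return in_array(strtolower((string) $scheme), ['http', 'https'], true);
}

$samples = [
    'https://www.example.com/foo/this-is-a-test/12345',
    'ftp://example.com/file.txt',
    'httpfoo',
];

foreach ($samples as $sample) {
    printf("%s => %s\n", $sample, var_export(is_http_url($sample), true));
}
```

The last sample, "httpfoo", is the kind of input a plain strpos($url, 'http') === 0 check lets through.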

csdude55

6:56 pm on Apr 2, 2020 (gmt 0)




That's a great one, I hadn't seen that one before!

I did a speed test and it is considerably slower; after 10,000 iterations:

strpos(): 0.0021529197692871
filter_var(): 0.0082831382751465

If it passes the condition, the next thing I do with $url is:

$ch = curl_init($url);

I can't find out whether this automatically sanitizes the URL, but I don't see any examples where they do anything else to sanitize it. What do you think, should I use both? E.g.,

if (strpos($url, 'http') === 0) {
    // using filter_var() to sanitize before processing
    $ch = curl_init(filter_var($url, FILTER_VALIDATE_URL));
    ...
}
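
One thing to watch with the snippet above: FILTER_VALIDATE_URL validates rather than sanitizes, and it returns false for an invalid URL, so that false would be passed straight to curl_init(). (FILTER_SANITIZE_URL is a separate filter that strips disallowed characters.) A possible way to combine the cheap check with the stricter one is sketched below; checked_url() is a made-up helper name, and the function_exists() guard is only there so the sketch also runs without the cURL extension.

```php
<?php
/**
 * Hypothetical helper: return the URL if it passes both checks, else null.
 */
function checked_url(string $url): ?string
{
    // Cheap test first: skip filter_var() for strings that cannot qualify.
    if (strpos($url, 'http') !== 0) {
        return null;
    }

    // Stricter test: false here means the string is not a valid URL.
    $valid = filter_var($url, FILTER_VALIDATE_URL);

    return $valid === false ? null : $valid;
}

$url = 'https://www.example.com/foo/this-is-a-test/12345';
$checked = checked_url($url);

if ($checked !== null && function_exists('curl_init')) {
    // Only a validated string ever reaches cURL.
    $ch = curl_init($checked);
    // ... set options and execute the request here ...
}
```

With this ordering, the slower filter_var() call only runs for strings that already start with "http".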