Forum Moderators: phranque
utm_\w+?
cid
ocid
trkid
gclid
fbclid
refer+er
share
mkt_tok
mkwid
pgrid
ptaid
_*source
amp_\w+?
usqp
ref_src
ref_url
mtrref
gwh
gwt
subsource
refcode[0-9]*
s#(\?|&(amp;)?)(utm_\w+?|...|refcode[0-9]*)=[^&]*#$1#gi
I typically remove tracking IDs
s#(\?|&(amp;)?)(utm_\w+?|...|refcode[0-9]*)=[^&]*#$1#gi
refer+er=
ref_src
ref_url
refcode[0-9]*
how are you checking to ensure it is actually a tracking parameter?
i once worked on a cms where the cid parameter was the "content id".
have you tested this on urls with several parameters in the query string?
if i'm reading this correctly, this:
http://www.example.com/some-path?cid=123&parameter2=xyz
will result in this:
http://www.example.com/some-path&parameter2=xyz
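One way to check this reading is to run the first substitution against the example URL. The sketch below uses a shortened alternation (`utm_\w+`, `cid`, `gclid`, `fbclid`) as a stand-in for the thread's elided list, and `strip_v1` is a hypothetical helper name. In this sketch the replacement `$1` puts the captured separator back, so the `?` survives but a stray `&` follows it, and a parameter removed from the middle leaves a doubled `&&`.

```perl
use strict;
use warnings;

# Shortened stand-in for the thread's alternation list (an assumption).
my $names = qr/utm_\w+|cid|gclid|fbclid/;

# Hypothetical helper: applies the first substitution from the thread.
sub strip_v1 {
    my ($url) = @_;
    # $1 is the separator that preceded the parameter ("?" or "&").
    $url =~ s#(\?|&(amp;)?)($names)=[^&]*#$1#gi;
    return $url;
}

print strip_v1('http://www.example.com/some-path?cid=123&parameter2=xyz'), "\n";
# prints http://www.example.com/some-path?&parameter2=xyz

print strip_v1('http://www.example.com/p?a=1&cid=2&b=3'), "\n";
# prints http://www.example.com/p?a=1&&b=3
```

Both results are still usable by most servers, but they need a cleanup pass for the leftover separators.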
s#(\?|&(amp;)?)(utm_\w+?|...|refcode[0-9]*)=[^&]*#?$1#gi;
s#(?)|(?&)#?#;
ref(er+er|_src|_url|code[0-9]*)
Sorry. I just couldn’t help myself.
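A quick way to confirm that the merged `ref(...)` form is equivalent to the four separate alternatives is to compare both against a handful of names. This is only a sketch: the sample names are made up for illustration, and both patterns are anchored with `^` and `$` here so that whole names are compared.

```perl
use strict;
use warnings;

# The four alternatives as they appear in the list, and the merged form.
my $separate = qr/^(?:refer+er|ref_src|ref_url|refcode[0-9]*)$/;
my $merged   = qr/^ref(?:er+er|_src|_url|code[0-9]*)$/;

# Sample names are assumptions chosen to cover matches and non-matches.
my @samples = qw(referer referrer ref_src ref_url refcode refcode12 ref reference);

# Collect any name on which the two patterns disagree.
my @disagree = grep {
    my $in_separate = ($_ =~ $separate) ? 1 : 0;
    my $in_merged   = ($_ =~ $merged)   ? 1 : 0;
    $in_separate != $in_merged;
} @samples;

print @disagree ? "differ on: @disagree\n" : "patterns agree on all samples\n";
# prints patterns agree on all samples
```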
What’s the significance of the two \w+? constructions? The ? (“capture the smallest number possible that still enables a match, leaving room for other stuff to follow”) would seem to be superfluous, since = is in any case a non-word character.
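The point about the lazy quantifier can be checked directly. In the sketch below (the sample query string is an assumption), `\w+?` and `\w+` capture the same name, because `\w` cannot consume the `=` that must follow either way.

```perl
use strict;
use warnings;

# Hypothetical sample query string.
my $qs = 'utm_source=news&x=1';

# Lazy and greedy forms, each followed by the literal "=".
my ($lazy)   = $qs =~ /(utm_\w+?)=/;
my ($greedy) = $qs =~ /(utm_\w+)=/;

print "lazy=$lazy greedy=$greedy\n";
# prints lazy=utm_source greedy=utm_source
```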
Any interest in keeping a list of tracking ID parameters that can safely be removed?
s#(\?[^&]*)(&(amp;)?)?(utm_\w+|...|refcode[0-9]*)=[^&]*#?$1#gi;
Are you dealing with tracking ids, or just providing the actual true URL and all the rest is just a bad dream nightmare you should wake up from and say "whew!"?
s#https?://[\w-]+\.cdn\.ampproject\.org/v/s/(.*?)/amp/?.*#$1#i;
$pattern = '(\?[^&]*)(&(amp;)?)?(utm_\w+|...|refcode[0-9]*)=[^&]*&?';
foreach (@_) {
    # 10/4/19, AMP links don't work, not sure why, so let's fix them here
    s#https?://[\w-]+\.cdn\.ampproject\.org/v/s/(http.*?)/amp/?.*#$1#i;
    # not sure that I should use /g here, I might be making it a tad slower for no reason
    while (m#$pattern#gi) {
        $_ =~ s#$pattern#$1$2#gi;
    }
    # shouldn't have any repeating &, but just in case
    s#(&(amp;)?&(amp;)?)+#&#;
    # might as well remove any trailing ? or &
    s#(\?|&(amp;)?)+$##;
}
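An alternative to a single large substitution is to split the query string into parameters, drop the unwanted ones by name, and rejoin the rest. The sketch below is not the code from the thread: `strip_tracking` and the short name list are assumptions for illustration, and the list should be adjusted per site (as noted earlier, `cid` can be a real content ID). It handles a tracking parameter in any position, whereas a pattern anchored at the `?` appears to reach only parameters near the start of the query.

```perl
use strict;
use warnings;

# Hypothetical, shortened name list; the thread's full list is longer.
my $tracking = qr/^(?:utm_\w+|cid|gclid|fbclid|ref_src|ref_url)$/i;

# Hypothetical helper: removes tracking parameters from one URL.
sub strip_tracking {
    my ($url) = @_;

    # Split into base, optional query and optional fragment.
    my ($base, $query, $fragment) = $url =~ /^([^?#]*)(?:\?([^#]*))?(#.*)?$/s;
    return $url unless defined $query;

    # Keep parameters whose name is not in the tracking list.
    # Empty pieces (from "&&" or a trailing "&") are dropped as well.
    my @kept = grep {
        my ($name) = split /=/, $_, 2;
        defined $name && length $name && $name !~ $tracking;
    } split /&(?:amp;)?/, $query;

    my $out = $base;
    $out .= '?' . join('&', @kept) if @kept;
    $out .= $fragment if defined $fragment;
    return $out;
}

print strip_tracking('http://www.example.com/p?a=1&b=2&utm_source=x&c=3#top'), "\n";
# prints http://www.example.com/p?a=1&b=2&c=3#top
```

Note that this version rejoins with a plain `&`, so any `&amp;` separators in the input are normalized, and no separate cleanup of doubled or trailing separators is needed.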
# not sure that I should use /g here, I might be making it a tad slower for no reason
“while” and “g” do seem redundant. If it’s a perfectly coded /g/ there should be no need for the repeating iterations of “while”, and contrariwise if “while” is found to be the only way to do it, then the /g/ would seem to be superfluous (though I don’t know if in this case it would actually affect processing speed).
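The question can be tested with a small comparison. The sketch below uses a shortened alternation in place of the thread's elided list, writes the optional separator group so that it always captures (to avoid undefined-value warnings), and uses hypothetical helper names. Because the pattern starts at the `?`, which occurs only once in a URL, a single `/g` pass can match at most once, so here the loop does the work and `/g` adds nothing.

```perl
use strict;
use warnings;

# Shortened stand-in for the thread's pattern (an assumption).
my $pattern = qr#(\?[^&]*)((?:&(?:amp;)?)?)(?:utm_\w+|cid|gclid)=[^&]*&?#i;

# One pass with /g and no loop.
sub strip_single_pass {
    my ($url) = @_;
    $url =~ s#$pattern#$1$2#g;
    return $url;
}

# Repeat a non-/g substitution until nothing more matches.
sub strip_repeated {
    my ($url) = @_;
    1 while $url =~ s#$pattern#$1$2#;
    return $url;
}

my $url = 'http://www.example.com/p?utm_source=a&utm_medium=b&id=7';

print strip_single_pass($url), "\n";
# prints http://www.example.com/p?utm_source=a&id=7

print strip_repeated($url), "\n";
# prints http://www.example.com/p?id=7
```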
while (m#$pattern#gi)