Forum Moderators: martinibuster
One thing that has always made me curious was the fact that -for us- daily data (EPC, eCPM, revenue) is almost random and unpredictable. Yet when we look at longer terms, sometimes there seems to be no impact on the numbers whatsoever. It's almost like Adsense is on autopilot, following a pre-determined route no matter what.
For example: EPC
Doing a little variance analysis over the past few years, I found that the individual daily data is indeed almost random (variance coefficient > 13%). It's no surprise that a variance analysis performed on the long-term 200 day averages results in a much more robust data set (variance coefficient < 1.5%).
But what amazes me is the fact that over the past couple of months, the individual data has become even more unpredictable (variance coefficient usually > 20%) while the 200 day average set has become EVEN BETTER in consistency (variance coefficient <= 0.4%).
In other words: long-term EPC seems to act as if nothing has happened since September (remember - there is an economic crisis going on!). I have to assume that there has been a major shakeup of advertisers; we're seeing shrinking CTRs. We're seeing shrinking eCPM. Yet long-term average EPC is rock solid while individual daily data suggests major movements in the market.
Who or what is the driving force here? Why do daily values violently go up and down without any effect on long-term averages? Could it be that Adsense is adjusting daily EPC according to a long-term value (i.e. not based on advertisers' bids at all)? Are there other explanations?
I'm puzzled.
As for what creates the noise - there's going to be a random characteristic to visitors, and I suspect Google are trying things out with the algo on a daily basis too. Add into that delays in reporting and it should account for day to day variance.
Isn't that just noise in the data?
There's too much "noise" in that data for my taste, and I wonder why.
The other day I had my best day (EPC-wise) of 2008, and the NEXT day I had the worst day of 2008! I do not believe that this can be explained by "natural" market forces.
To me it looks like our sites have been assigned a certain EPC value, and this is being achieved no matter what traffic hits the site. It looks like the pre-determined long-term EPC value will be achieved by tweaking daily values. That would explain the extreme volatility of daily data in conjunction with increasing robustness of long-term data.
Call it smart pricing... whatever... and then ASA has recently stated emphatically that 'There is No Ceiling'.
All our 'evidence' is for naught..especially given the various variables at play..
Today the Net has died on my sites, virtually no traffic whatsoever yet my average EPC is at an all-time average high.
I know I am new to this - do not have years of data but wonder what is going on.
Who or what is the driving force here? Why do daily values violently go up and down without any effect on long-term averages? Could it be that Adsense is adjusting daily EPC according to a long-term value (i.e. not based on advertisers' bids at all)? Are there other explanations?
Well, it's possible Google is doing it, but to answer your question, there are probably many other plausible explanations.
I'd actually like to see some better statistical analysis here, since a) I'm not familiar with a "variance coefficient", and I used to make my living in data analysis, and b) I suspect it's a poor measure.
However, while I don't have time to go through the issue in detail, you have to look at a) regression to the mean, and b) advertiser behavior that is actually quite consistent over time but is inconsistent for any given day or portion of day.
Remember too that there are compensation factors going on, so as ctr goes down, for whatever reason, clicks become overvalued in the system due to supply and demand therefore creating an equilibrium over time.
On the regression to mean, remember that for any outlier day point on the low side, there will be a similar one on the plus side and vice versa. More extensive stats analysis would probably provide more insight but probably not that much.
First thing I'd look at is whether the S.D.s are changing over time for daily stats.
The seeming volatility that Zett is experiencing may be statistical "noise," but it may also be affected by the number of pages, amount of traffic, diversity of keywords and advertisers, etc. A 100-page site about red widget fasteners for the doughnut-machine industry probably won't look as stable as a 5,000-page site about all kinds of widget-related products in dozens of different industries.
Afterwards, the overall general trend, be it up or down, resumes without spikes.
If you disagree, fine. I am only guessing, and I know next to nothing about the math of probability and statistics.
Personally, I think I am close to giving up any hope of making serious money with AdSense, and I am working hard to reduce their share of advertising to below 25%. But nevertheless I watch with interest all that they do. There seems to be no rhyme or reason.
Not sure if I subscribe to the theory of the glass ceiling, but I can not help but notice that my eCPM trajectory over the past 5 years is always downwards, long term. Now I can look at several years' worth of data, I'd say my eCPM gets roughly halved every 18 months or so. Sometimes, with traffic growth, it is possible to make up for this and at least stay flat year-on-year, but lately the fall-off has been faster than traffic growth, hence my fatalistic attitude.
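For anyone who wants to put a yearly figure on "halved every 18 months", here is a quick back-of-the-envelope sketch. It assumes the decline is a smooth exponential, which is my simplification and not something the poster claimed; the 18-month figure is the poster's rough estimate. Under that assumption the loss works out to about 37% per year, and traffic would need to grow by about 59% per year just to stay flat.

```python
# Assumption: eCPM decays smoothly and halves every HALVING_MONTHS months.
HALVING_MONTHS = 18  # the poster's rough estimate, not a measured constant

# Fraction of eCPM that survives one year of decay.
yearly_factor = 0.5 ** (12 / HALVING_MONTHS)

# Share of eCPM lost per year.
yearly_decline = 1 - yearly_factor

# Traffic growth needed per year so that revenue (traffic x eCPM) stays flat.
traffic_growth_needed = 1 / yearly_factor - 1

print(f"eCPM kept per year:      {yearly_factor:.1%}")
print(f"eCPM lost per year:      {yearly_decline:.1%}")
print(f"traffic growth to break even: {traffic_growth_needed:.1%}")
```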
The range of pageviews would vary only 1000 +/- but the clicks stay the same.
I had high pageviews low CTR same as before - med eCPM but had the best days I had seen when I was rudely interrupted and cut off like someone turned off the lights.
I also don't buy that there is a 1:1 correlation between clicks reported during a dump and income during that dump. Sometimes it seems that the income will be reported one dump before or after the clicks are reported.
I find it easier to follow the daily numbers, but not worry about them, because they always seem to work out.
I've never bought Google's daily data, that's why I don't really care about it. Back when adlogger worked it became very clear that they didn't have just the little intraday click dumps, but that some clicks disappeared into the system for several days, and that the "best days" were the ones where they cleared the filters.
BigDave, if you're right, that actually explains a lot. Including persistent discrepancies that I see between AdSense and Analytics. By discrepancies I mean that Analytics is reporting impressions and clicks on a certain page and AdSense simply says, in reference to the page's channel "nope, nothing today". (There's only one channel per page and every page has both A/S and Analytics code). One of them has got to be wrong!
But the "flushing out after filtering" theory would explain that and certain other irregularities I noticed (like, a mysterious high-paying click materializing in Analytics tied to a certain page... several days after the page was removed from the site!)
I'm not familiar with a "variance coefficient"
The variance coefficient is an accepted method of testing the consistency of a given data set. E.g. if two shooters fire, and one hits the bull's-eye twelve times while the other scatters his shots across the board, then the first one will have a much lower variance coefficient than the second one. ("Variance coefficient" is in principle the ratio of the standard deviation of a data set to its average.)
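The measure described here is more commonly called the coefficient of variation. A minimal sketch of the calculation follows, using only Python's standard library. The function name and the EPC figures are made up for illustration, and I am assuming the population standard deviation; the poster did not say which variant was used.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Return standard deviation divided by mean, as a fraction (0.13 means 13%)."""
    avg = mean(values)
    if avg == 0:
        raise ValueError("mean is zero; the ratio is undefined")
    return pstdev(values) / avg

# Hypothetical daily EPC figures in dollars, invented for this example.
steady_epc = [0.20, 0.21, 0.19, 0.20, 0.20, 0.21, 0.19]
erratic_epc = [0.10, 0.35, 0.15, 0.30, 0.08, 0.28, 0.14]

print(f"steady:  {coefficient_of_variation(steady_epc):.1%}")
print(f"erratic: {coefficient_of_variation(erratic_epc):.1%}")
```

The "steady" series plays the role of the shooter who keeps hitting the same spot, and the "erratic" one the shooter who scatters his shots.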
What I have been looking at are two sets of data, "moving" across the entire lifetime of my account.
SET A consists of 30 consecutive days of individual data (the "raw data")
SET B consists of the same 30 consecutive days of 200 day averages (the "long-term data")
In theory I'd expect (and accept) that when the inconsistency of SET A increases (i.e. higher volatility), then SET B should follow (to a much smaller degree, as we're talking about averages).
In other words: I expect that short-term effects to the data set (caused by big market movements, e.g. advertisers changing their behaviour dramatically) also impact the long-term trend. But it does not. In fact, when plotting the 200 day moving average for EPC on the data set, I see a straight flat line since May/June.
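A rough sketch of how SET A and SET B could be built and compared. The daily series is simulated (constant true EPC, with noise stepping up from 13% to 20% halfway through), so it only illustrates the procedure and says nothing about real AdSense data. All function names are my own, not the poster's.

```python
import random
from statistics import fmean, pstdev

def cv(values):
    """Standard deviation divided by mean."""
    return pstdev(values) / fmean(values)

def moving_average(series, window):
    """Trailing averages: one value per day once `window` days are available."""
    return [fmean(series[i - window:i]) for i in range(window, len(series) + 1)]

def rolling_cv(series, window):
    """Coefficient of variation over each run of `window` consecutive values."""
    return [cv(series[i - window:i]) for i in range(window, len(series) + 1)]

# Simulated daily EPC (an assumption, not real data): the true mean stays at
# 0.20 while the relative noise rises from 13% to 20% after day 400.
rng = random.Random(42)
daily = []
for day in range(800):
    noise = 0.13 if day < 400 else 0.20
    daily.append(rng.gauss(0.20, 0.20 * noise))

long_term = moving_average(daily, 200)  # the 200-day averages
set_a = rolling_cv(daily, 30)           # SET A: 30 days of raw data
set_b = rolling_cv(long_term, 30)       # SET B: the same 30 days of averages

print(f"latest SET A variance coefficient: {set_a[-1]:.2%}")
print(f"latest SET B variance coefficient: {set_b[-1]:.2%}")
```

Swapping the simulated `daily` list for real figures exported from the reports would reproduce the comparison described above.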
Just to confirm this, I took today's EPC long-term value (200 day average for the previous 200 days) and calculated the deviation of all the previous 200 day averages against this value. Setting today's value to 100, I see (since early June 2008) a minimum of 98.2 (-1.8%) and a maximum of 100.7 (+0.7%) - I'd say my long-term EPC is rock solid and very very predictable.
But the raw data does not support this at all. I agree that the mysterious click-dump process influences this further. The raw data has become even more unpredictable recently. This points to heavy market movements going on, suggesting that supply and demand are at work (sometimes I see such a high volatility that it is not realistic any more). This would be quite logical if the whole market adjusts to a huge economic crisis.
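The indexing step can be sketched like this. The list of 200-day averages is hypothetical (it is not the poster's data) and the function name is mine.

```python
def index_against_latest(averages):
    """Rescale a series so that its most recent value equals 100."""
    latest = averages[-1]
    return [100 * (value / latest) for value in averages]

# Hypothetical 200-day EPC averages in dollars, oldest first.
long_term_epc = [0.2003, 0.1990, 0.2021, 0.2035, 0.2014, 0.2021]

indexed = index_against_latest(long_term_epc)
low, high = min(indexed), max(indexed)
print(f"min {low:.1f} ({low - 100:+.1f}%), max {high:.1f} ({high - 100:+.1f}%)")
```

A narrow band around 100 means the earlier long-term averages never strayed far from today's value.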
But the long-term data does not indicate this.
Again - all I am saying is that my analysis indicates that something is wrong with the daily data, i.e. with the data Google presents to us as "reports".
A 100-page site about red widget fasteners for the doughnut-machine industry probably won't look as stable as a 5,000-page site about all kinds of widget-related products in dozens of different industries.
Consider our site to be in the latter category, i.e. many thousands of pages with unique, exclusive content with clear value to the visitor, attracting visitors from across the world, just organically. The main site covers many dozens of topics instead of being focused on a single niche.
I see similar patterns with display advertising and affiliate sales.
This information by itself puts a further nail in the coffin of the glass ceiling theory--unless you choose to believe that these other companies also set ceilings. I personally believe Google when they say they don't set a ceiling or an eCPM for a site or sites, but I also believe that the way their very complex system works can produce effects that suggest a glass ceiling....
In one of your earlier posts you mentioned the trends you note do not include ctr, right?
I'd have to reread everything carefully, but off the top of my head, I'd need to see more data. I'd graph standard deviations, actually, since eyeballing is often a really good way to get a feel for what the data is doing.
I'm unsure why you think that short term volatility from day to day should affect long term results for anything, to be honest. But I'm probably missing something you've said.
I'm not clear, either, on what units of analysis are best, statistically, for this kind of thing. I'm not sure you are looking at it correctly (not to say you are wrong, I'm just not sure). I still think the ideas I set forth before are almost certainly operating to push stability over the long term.
It's fun stuff, but even a proper analysis would exclude too much information we are lacking, and I agree that "daily" numbers probably aren't, anyway.
This information by itself puts a further nail in the coffin of the glass ceiling theory--unless you choose to believe that these other companies also set ceilings.
Actually, they do (ad agencies do). Or rather they have mechanisms in place that operate as if there is a ceiling having to do with allocating inventory. I suspect, and have said before, that I believe google also does "things" that create the appearance of ceilings, even though their intent is not to create a ceiling.
As an aside, burstmedia is reporting that their average cpms this year doubled (primarily for display ads).
In one of your earlier posts you mentioned the trends you note do not include ctr, right?
Yes, for this thread, I am just looking at EPC, because here I see this, er, strange behaviour. I will run the same analysis for the other values to see what's up with them.
I'm unsure why you think that short term volatility from day to day should affect long term results for anything
I just think it is strange to see daily data becoming less predictable while long-term data gets MORE predictable and solid. With the market changing heavily (advertisers enter the program, advertisers leave the program, advertisers change their bids), I'd expect the long-term EPC trends to become also less predictable. But I am not seeing this. And I wonder why.
Or rather they have mechanisms in place that operate as if there is a ceiling having to do with allocating inventory.
I've often said that in the forum's discussions of "earnings caps." It's unrealistic to think that Google would simply serve up ads on a "first-come, first-serve until the advertiser's budget is drained" basis. Other ad networks don't do that, so why would anyone expect Google to do it?
I just think it is strange to see daily data becoming less predictable while long-term data gets MORE predictable and solid. With the market changing heavily (advertisers enter the program, advertisers leave the program, advertisers change their bids), I'd expect the long-term EPC trends to become also less predictable. But I am not seeing this. And I wonder why.
You have to take into account the characteristics of data, reversion to the mean, and the interactive influences of the variables, which you aren't doing.
Further, it's very easy to predict what the average temperature will be for a year in a specific location (+ or - an error factor), but it's not so easy to predict what the temperature will be for that same location on July 7, 2009 (sorry if this is a simplification but sue me.)
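To put rough numbers on the temperature analogy: if daily values scatter independently around a fixed mean, the spread of an n-day average shrinks by a factor of the square root of n. The simulation below is a sketch under exactly that assumption (independent, normally distributed days with a fixed true mean), which real AdSense data may well violate. The function name, the 0.20 mean and the trial count are all made up.

```python
import random
from statistics import fmean, pstdev

def spread_of_averages(daily_cv, days=200, trials=500, true_mean=0.20, seed=1):
    """Variance coefficient of many independent `days`-long averages."""
    rng = random.Random(seed)
    averages = [
        fmean(rng.gauss(true_mean, true_mean * daily_cv) for _ in range(days))
        for _ in range(trials)
    ]
    return pstdev(averages) / fmean(averages)

for daily_cv in (0.13, 0.20):
    expected = daily_cv / 200 ** 0.5   # theory: daily spread / sqrt(n)
    observed = spread_of_averages(daily_cv)
    print(f"daily {daily_cv:.0%}: theory {expected:.2%}, simulated {observed:.2%}")
```

Under these assumptions, a jump in daily noise from 13% to 20% moves the spread of a 200-day average only from roughly 0.9% to roughly 1.4%. Consecutive 200-day averages also share 199 of their 200 days, so 30 of them in a row will differ from each other by even less than that.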
Long term aggregated data is almost ALWAYS going to exhibit much more stability than short term data
Please note that I did take that into account. Of course, long-term data can be expected to be more solid than short-term data.
it's very easy to predict what the average temperature will be for a year in a specific location (+ or - an error factor), but it's not so easy to predict what the temperature will be for that same location on July 7, 2009
I agree, but the weather is not driven by a market (i.e. by supply and demand). It's random, sort of. Also, if your weather station is next to a volcano, and the volcano explodes and messes up your data set, you WILL see an effect on the long-term figures.
But I did some further number crunching - the same analysis for revenue, CTR, and eCPM as well. Then I plotted the figures into the same chart, beginning with June data.
Surprising results:
1) "Revenue" was below 1% most of the time. It left the 1% corridor early September, got worse every day (i.e. a more random data set with less predictability), now around 2.5%. Never came back until now.
2) "eCPM" left the 1% corridor end October, got worse every day, never came back. This value has been below 0.5% before.
3) "CTR" left the 1% corridor end November to join the eCPM curve. This value has been below 0.5% before.
4) "EPC" - was typically above 1%. Since early June, it's well below 0.5%. Apparently, this value got more predictable while all the other metrics have become less predictable.
Just looking at averages helps filter out the massive noise that seems to be poisoning the individual data.
I find it interesting that users seem to adapt to the economic crisis (CTR becomes more unreliable) while advertisers do not change their behaviour (EPC becomes even more predictable, alas for our sites).