Category Archives: Ad Tech

Winter is Coming (to Ad Tech)


These past couple of weeks have been full of great news in ad tech. The Trade Desk went public and immediately saw its stock price pop 60%. AppNexus raised a small round, presumably setting itself up for an IPO. Krux was acquired for $650-750M by Salesforce. HookLogic was acquired by Criteo for $250M.

Is the future of ad tech rosy, or is this the sunshine before a long winter?

Well, as a proud owner of Rocket Fuel stock (bought at $25, now hovering around $2.5, do the math) I am looking at this as glass-is-half-empty. Don’t get me wrong, I love ad tech. Tech IPOs are a good thing and these companies have built real, tangible businesses that can withstand the scrutiny of the SEC and the public market (I’m looking at you, Uber and Airbnb). The problem is, beyond the wall(ed garden), a massive disruptive force is rising up, looking to swallow up the whole industry. Facebook.

We all know Game of Thrones is a thinly veiled metaphor for the ad tech industry. We are all fighting over market share, going after each other’s advertisers and their budgets. New houses emerge, alliances are made, innovative weapons are developed, all to claim the throne, which Google is sitting on. House Salesforce has acquired Krux as they go to war against House Oracle and House Adobe. House Criteo has acquired HookLogic to strengthen their arsenal. House Amazon has fucking dragons.

When Bezos had long hair

Meanwhile beyond the wall, Facebook has evolved into a completely different creature. They were initially happy staying on the other side of the wall, but the Audience Network (where FB allows advertisers to buy non-FB mobile ad inventory using FB data) is the breach of the wall. They are coming and no one standing between them and the advertiser is safe.

Zuck delivering his sales pitch to advertisers

How formidable is Facebook?

Let’s look at the numbers. Let’s see… in Q2 2016 FB had:

  • 1.7B MAU
  • $6.2B in revenue
  • $2B in Net Income

First of all, MAU and quarterly financials in Billions is ridiculous. They also have $5B in cash, so if they wanted to, they could just buy Criteo ($2.1B) and The Trade Desk ($1B) and AppNexus (prob $1B) and MediaMath (prob $1B) without borrowing money. Not that they would, but they could. Facebook’s market cap ($370B) is bigger than Salesforce ($48B), Oracle ($159B), and Adobe ($54B) combined, plus you can throw in IBM ($110B) for good measure.

Let’s look at FB from a capability standpoint. At a high level, points of differentiation in ad tech are:

  • inventory (eyeballs = scale)
  • data (what you know about the eyeballs)
  • algorithm (to optimize ROI)
  • access to ad budget

The four capabilities reside in different parts of the ad tech ecosystem today. Publishers own the inventory and SSPs consolidate it. Data sits in DMPs, and the DSPs provide the algorithm and connect the dots with the SSPs & DMPs. Agencies have access to the budget. Most players in the Lumascape do one of the four. Some players can check off multiple boxes; for example, AppNexus could be considered an SSP + DSP, and they also have direct advertiser relationships.

What is scary about FB is that they do all four and they do it 10x better.

  • Inventory: 1.7B MAU is a good start + nothing stopping FB from expanding the Audience Network to more inventory
  • Data: This one is obvious. FB knows more about their users than anyone. (although Amazon does have purchase data = the aforementioned dragons).
  • Algorithm: Army of top data scientists working with massive, proprietary data is a winning formula
  • Access to budget: FB has improved the Ad Account and Business Manager to make it easier for advertisers to manage campaigns themselves. The number of advertisers taking FB ads in-house is only going to increase, chipping away at the agencies’ business.

To top it all off, because FB can directly connect the advertiser budget to an impression without the slew of vendors taking a margin, FB can do all this at a lower price. Traditionally, a dollar an advertiser spent would become 50 cents by the time it reached the publisher, as the money changes hands throughout the Lumascape, whereas in the FB ecosystem, the dollar is essentially pure margin.

What’s Next

Digital ads will continue to grow and eat away at TV and other marketing budgets, so ad tech will not go away overnight. But while ad tech players are busy fighting each other (like the houses in Westeros), a formidable force has started its march south. FB will continue to invest in new inventory through partnerships and acquisitions like Instagram, and will further its self-service agenda. There will be further consolidation in the Lumascape to connect the ad dollars to eyeballs more efficiently (as in lower margins for everyone). A lot of small players will get crushed. The time to join forces with rivals to beat the common, great threat has passed. Every house must now answer – why do they deserve to survive when winter is here? What value are they adding that Facebook does not / will not?


CPM Pricing in a Post-Last Click World


“How much should one bid on a specific impression?” is a question I have been trying to answer for years and have shared my thoughts on this blog. Today, we are approaching CPM bid price as lazy financiers approach asset valuation; assume other similar transactions were done rationally and base our pricing on their results. Well what if everyone else had made the same assumption? Are we sure the market value of an impression is equal to its intrinsic value?

CPM Pricing in the Last Click World

In the last click world, life was relatively simple. As a merchant, I know my margins and I can tell you how much % of revenue I am willing to part with for a sale. Given that information, I can back into a CPM price I am willing to pay, using best guesses for click through rate (CTR), conversion rate (CVR), average order value (AOV), and adjusting for risk (or how confident you are in your guesses). More details in this post.  To summarize in one formula, the concept looks like this:

CPM=ƒ(μCTR*μCVR*μAOV*% Sales, σCTR, σCVR, σAOV)

We have always used eCPM as the common denominator to compare performances of different pricing models, which ignored two important things; the quality of the impressions, and the value of risk transfer. The formula takes these two key points into account. Better quality of impression can be defined as an impression with a higher mean CTR and/or CVR and/or AOV, and you should be willing to pay more. Higher risk for the advertiser can be defined as a higher standard deviation for CTR and/or CVR and/or AOV and you should be willing to pay less.
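The formula leaves ƒ unspecified, so here is only a rough sketch of one way to read it. It assumes CTR, CVR, and AOV are independent and roughly normal (clipped at zero), simulates the value of 1,000 impressions, and treats "adjusting for risk" as bidding a low percentile of the simulated value instead of the mean. The function name, the percentile rule, and every number below are hypothetical, not something from the original post.

```python
import random
import statistics


def simulate_cpm_bid(mu_ctr, sd_ctr, mu_cvr, sd_cvr, mu_aov, sd_aov,
                     pct_sales, percentile=0.25, n=20000, seed=7):
    """Return (mean_cpm, risk_adjusted_cpm) in dollars per 1,000 impressions.

    Assumption: independent normal draws, clipped at zero, and a
    percentile-based risk adjustment. The post does not specify either.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        ctr = max(0.0, rng.gauss(mu_ctr, sd_ctr))
        cvr = max(0.0, rng.gauss(mu_cvr, sd_cvr))
        aov = max(0.0, rng.gauss(mu_aov, sd_aov))
        # Value of 1,000 impressions at this draw of CTR, CVR, and AOV
        values.append(1000 * ctr * cvr * aov * pct_sales)
    values.sort()
    mean_cpm = statistics.fmean(values)
    risk_adjusted_cpm = values[int(percentile * (n - 1))]
    return mean_cpm, risk_adjusted_cpm


# Same means, different standard deviations (all numbers made up)
safe = simulate_cpm_bid(0.01, 0.001, 0.02, 0.002, 80.0, 5.0, pct_sales=0.10)
risky = simulate_cpm_bid(0.01, 0.005, 0.02, 0.010, 80.0, 30.0, pct_sales=0.10)
print("predictable impression:", safe)
print("unpredictable impression:", risky)
```

Both impressions have the same mean CTR, CVR, and AOV, but the one with the wider standard deviations ends up with the lower risk-adjusted bid, which is the "pay less for higher risk" point above.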

Then Came Attribution

The last click model is linear but the consumer journey is not. You can’t look at each marketing channel and campaign in a vacuum, since that is not how people experience your brand and end up becoming customers. The concept of attribution makes total sense. The issue is, in reality it is impossible to attribute a sale “accurately”. Not even the consumer will be able to tell you why she ended up buying that item on that day.

For all we know, the customer bought the item because she had a bad day, had one too many glasses of wine, and saw a handbag she liked displayed in the storefront on the way home. So we should attribute 30% of the sale to her ex-boyfriend, 40% to the bar (20% to bartender, 20% to the guy buying her drinks), 20% to the store, and of course 10% to Google because she searched for the product name to get to the purchase page.

By trying to illustrate why accurate attribution is impossible, I also just illustrated why last click is pretty much always wrong. In the above case, all these things contributed to the sale but Google just happened to be conveniently located as the last click hub of anyone knowing what she wants to buy. Where is the value in that as a marketing channel? Why would Google deserve 100% of the credit? (hint: it shouldn’t)

So we know last click is (almost) always wrong and attribution is never accurate. That’s why a whole slew of attribution companies are out there touting their attribution logic as the most accurate. “Black box” guys hire PhDs and crunch numbers based on consumer touch points and conversions and other data points (“Can’t explain exactly why it’s accurate but trust me, I’m a doctor”) and some are more open, incorporating the advertiser’s requests (“Yea, we have no idea either. Let’s prove your boss was right all along”).

No matter how imperfect attribution is, it is reality for digital marketing today because never accurate sure beats always wrong. So what does attribution mean to our little formula? Obviously, the formula needs to be adjusted but just how?

Influence is the New Click

When we review the formula, we notice that two things happen outside of the advertiser’s domain: the impression and the click. CVR and AOV happen on the advertiser side and the advertiser obviously has control over the % of Sales to be paid. In the post-last-click world, what we care about is “influence”. If accurate attribution were available, we would be measuring how much influence this particular ad unit had on this consumer’s purchase. Clicks used to be the go-to proxy for influence because they required user action and intent. But what about video ads? Interactive ads that don’t need you to click? Audio ads? A really well made static display ad that captivates you? There are so many things that can happen between an impression and a click. The click as a measurement of influence is flawed and, by association, CTR is also flawed. What matters is the conversion rate between who the ad campaign reached and who visited the advertiser site. This is measured in unique users, so a user coming back to the site via a retargeted ad would still count as one visitor. Let’s call this conversion rate RUR, for Reach to Unique Ratio.

The reason we should look at RUR from a unique user perspective is that the consumer journey does not end when the user visits the site. It ends when the user converts or the marketer gives up. So when a user visits a product page and leaves to do additional research, the retargeting ad that brought her back should not take the whole credit for the sale. The brand touch points that brought her to the site in the first place should also be credited. There’s another variable that we have to consider: how many touch points it takes for the user to become a unique visitor to the site (TP as in touch point, not toilet paper).

The New Formula

Given all these thoughts, I can finally put together the theoretical CPM pricing formula taking attribution into consideration.

CPM=ƒ(μRUR/μ#TP *μCVR*μAOV*% Sales,
σRUR, σ#TP, σCVR, σAOV)

Let me try to explain this verbally. The CPM you should be willing to bid on a given impression has to do with the average % of users you reach who turn out to be unique visitors to your site, how many touch points it takes to get this user to visit your site, and once she has visited, what the conversion rate and AOV are, and of course, how much of the revenue you are willing to give up. Oh, and don’t forget to adjust for risk.

You may have noticed the definition of CVR actually changed slightly in this formula because we are allowing for the eventual return of this user outside of the current session, so CVR is actually “eventual CVR”.
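To make the mean part of the formula concrete, here is a small sketch. It only covers the μ terms (the risk adjustment via the σ terms is left out), it multiplies by 1,000 because CPM is quoted per thousand impressions, and the function name and all inputs are made up for illustration.

```python
def cpm_with_touch_points(rur, touch_points, eventual_cvr, aov, pct_sales):
    """Mean-only CPM: RUR / #TP * eventual CVR * AOV * % Sales, per 1,000 impressions.

    Assumption: each touch point is one impression, so the value of the
    journey is spread evenly across the average number of touch points.
    """
    if touch_points <= 0:
        raise ValueError("touch_points must be positive")
    value_per_impression = rur / touch_points * eventual_cvr * aov * pct_sales
    return 1000 * value_per_impression


# Hypothetical inputs: 5% of reached users visit, after 4 touch points on average,
# 3% eventually convert, $80 AOV, and I am willing to give up 10% of sales.
bid = cpm_with_touch_points(rur=0.05, touch_points=4, eventual_cvr=0.03,
                            aov=80.0, pct_sales=0.10)
print(f"CPM bid: ${bid:.2f}")
```

With these made-up inputs the bid works out to roughly a $3 CPM. Double the number of touch points needed and the bid per impression is cut in half.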

Ah, Crap

I hope you didn’t notice I took a major short cut. Because I just did and I want to go to bed but promised myself I would finish this post today. Attribution. Yes it is still haunting me. The assumption I made in the formula is that every impression is equal and I can use the average number of touch points to get to the CPM price. This means, if we average 4 touch points before a user visits the site or we give up, I am attributing 25% of the credit to each of the touch points.  If the whole attribution movement taught us anything, it taught us that each touch point is unique in how influential it will be. Let’s call that variable % Infl. By definition, the % infl will add up to 100% when the user visits the site or the marketer gives up.

CPM TP1→N =ƒ(μRUR * %Infl TP1→N *μCVR*μAOV*% Sales,
σRUR, σ%Infl, σCVR, σAOV)
where sum (% Infl TP1→N) = 100%

I don’t even know the mathematically correct way to denote this but the spirit is there. I think I actually used an Excel formula near the end, but whatever, I had 3 hours of sleep last night. When you are bidding on an impression, you have to know where in the consumer journey this person is and what state of mind she is in, and what touch points she has had with your brand. Given all that and with the magic of big data, and your future intent to buy this consumer’s impression, you buying this next impression will have a certain effect on this person’s psyche and hopefully purchase behavior in the future. When the % Infl adds up to 100%, theoretically the user has RUR% chance of visiting the site, CVR% chance of becoming a customer, and spending $AOV.
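Since I admitted I don't know the right notation, here is the same idea as a rough Python sketch instead. It only handles the mean terms, and it assumes the % Infl shares are already known for each touch point, which in reality is the hard part. The function name and the numbers are hypothetical.

```python
import math


def cpm_per_touch_point(rur, influence_shares, eventual_cvr, aov, pct_sales):
    """Split the value of one journey across touch points by % Infl.

    Returns one CPM (dollars per 1,000 impressions) per touch point.
    Assumption: influence_shares are given and must sum to 100%.
    """
    if not math.isclose(sum(influence_shares), 1.0, abs_tol=1e-9):
        raise ValueError("influence shares must add up to 100%")
    journey_value = rur * eventual_cvr * aov * pct_sales
    return [1000 * journey_value * share for share in influence_shares]


# The shortcut version: 4 touch points, 25% credit each
equal = cpm_per_touch_point(0.05, [0.25] * 4, 0.03, 80.0, 0.10)

# A made-up version where later touch points are more influential
weighted = cpm_per_touch_point(0.05, [0.10, 0.20, 0.30, 0.40], 0.03, 80.0, 0.10)

print(equal)
print(weighted)
```

Both versions spend the same total across the journey. The only thing that changes is which impression gets the bigger bid.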

In Conclusion, So Many Freaking Caveats

Well at least I got to some kind of formula, no matter how weird it looks. There are a boat load of caveats and three come to mind immediately.

First, as I mentioned above, there is so much predictive modeling that needs to happen to get to the RUR and % Infl variables. But this is the nature of being able to bid on the most granular level possible to mankind, the impression. Now most bidding currently happens on the segment level and my thought process assumes the personal level, so maybe this is the kind of thing we need to think about in the future.

Another caveat, not sure if you caught this, is the question of incrementality. I went through all this trouble to calculate how much this impression is worth, but in reality, the advertiser already has some organic traffic that converts at a certain rate. Shouldn’t the advertiser spend only on the difference of the effect? There is a lot of argument that can be made for both sides but maybe some kind of discount needs to happen. Also remember way earlier in this post I mentioned that the quality of the impression not only increases CTR but potentially also CVR and AOV.
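One possible shape for that discount, purely as an assumption on my part: scale the bid by the share of conversions that would not have happened organically. The function name, the lift-share rule, and the rates below are all hypothetical.

```python
def incremental_cpm(cpm, exposed_rate, organic_rate):
    """Discount a CPM bid by the share of conversions that are incremental.

    Assumption: exposed_rate is the conversion rate of users who saw the ad,
    organic_rate is the rate they would have converted at anyway.
    """
    if exposed_rate <= 0:
        raise ValueError("exposed_rate must be positive")
    lift_share = max(0.0, (exposed_rate - organic_rate) / exposed_rate)
    return cpm * lift_share


# Made-up numbers: a $3 CPM, 0.15% conversion with the ad, 0.05% without it
discounted = incremental_cpm(cpm=3.0, exposed_rate=0.0015, organic_rate=0.0005)
print(f"incremental CPM: ${discounted:.2f}")
```

Here two thirds of the conversions are incremental, so the $3 bid shrinks to about $2. If the ad adds nothing over organic traffic, the bid goes to zero.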

One final caveat. Won’t advertisers care about the lifetime value of the customer instead of the one sale? Yes, most certainly, and I think the formula can be adjusted to take that into consideration. At some point in the near future, I’ll explore this idea.

Man, the future of digital marketing is full of formulas and quantitative intellectual reasoning. There has got to be someone better at this than me. I failed calculus 18 years ago (I was a terrible student back then)!


The Intrinsic Value of an Impression


In the finance world, there are roughly two ways of calculating the value of an asset: the market approach and the cash flow approach. You can valuate company X by comparing it to a similar company Y, by taking the ratio between the market value and operating profit or EBITDA or other financial measurements. Alternatively, you can come up with the best guess estimate of future cash flow from the company and use a discount rate to calculate the net present value. Now, calculating future cash flows is painful and calculating the discount rate can be even more of a pain in the ass, whereas the multiple approach is much quicker because it makes one key assumption: the market value of the comparison asset is correct. The flaw with the market approach is that when the comparison asset is not valuated appropriately, you are never going to get your valuation right. When the market is wrong, everyone is wrong.

So, this being (sometimes) a blog about ad tech, why am I talking about this? Because I wanted to pose a question: When you are making your bid for an impression, what are you basing it on? Are we assuming that whatever the market is willing to pay for the impression is the “right” price? Are we bidding the current price plus a penny for an impression we want? You can already guess where I am going with this. What if we are pricing everything wrong? What if everyone is overpaying for those impressions?

Conceptually the cash flow method is simple. We should be valuating the impression based on how much the impression is expected to bring us in terms of revenue (or profit). Say you have a $100 product you have an ad for and you are willing to spend $10 to get someone to purchase it. If you knew for certain that a person has a 10% chance of buying your product by showing your ad, you should be willing to spend $1 on that impression. Now, as discussed in my older post An Efficient Market for Online Ads there are a few variables at play here:

  • Click Through Rate (CTR)
  • Conversion Rate (CVR)
  • Average Order Value (AOV): the average value of the basket
  • % of Sales I am willing to spend for a conversion

How much I am willing to spend on an impression can be expressed as: CPM = CTR*CVR*AOV*%Sales (strictly, this is the value of a single impression; multiply by 1,000 for a quoted CPM). These variables all have different variances, meaning some variables are more predictable than others, and how confident I am in my estimates will affect how much I am willing to pay. The % of sales I am willing to spend is an internal decision that depends on my cost structure, capital structure, and appetite for risk. Given all this, the formula looks like this:

CPM=ƒ(μCTR*μCVR*μAOV*% Sales, σCTR, σCVR, σAOV)

The above assumes last click attribution, which is really under fire these days (for good reason). I will share my thoughts on attribution and CPM pricing in another post, but the point here is, you should not assume whatever price the market is bidding for an impression is the price you should be willing to pay. Theoretically there is a “right” price you should be willing to bid for every impression, given the characteristics of the impression and your situation. Whether you act based on the market value or intrinsic value of that impression is up to you.
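The $100 product example from earlier in this post can be written out as a tiny sketch. The function names are mine, and the CTR/CVR split in the second call is a made-up way of getting to the same 10% purchase probability.

```python
def impression_value(purchase_probability, order_value, pct_sales):
    """Dollars one impression is worth: P(purchase) * order value * % of sales."""
    return purchase_probability * order_value * pct_sales


def cpm_from_rates(ctr, cvr, aov, pct_sales):
    """Last-click version: CTR * CVR * AOV * % Sales, quoted per 1,000 impressions."""
    return 1000 * ctr * cvr * aov * pct_sales


# $100 product, willing to spend $10 (10% of sales), 10% chance of purchase
value = impression_value(purchase_probability=0.10, order_value=100.0, pct_sales=0.10)

# Hypothetical split of that 10% into a 50% CTR and a 20% CVR
cpm = cpm_from_rates(ctr=0.50, cvr=0.20, aov=100.0, pct_sales=0.10)

print(f"value of one impression: ${value:.2f}")
print(f"equivalent CPM: ${cpm:.2f}")
```

The single impression comes out to about $1, matching the example. The formula itself gives the value of one impression, so the sketch multiplies by 1,000 to get a number comparable to quoted CPMs.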


Next Step for Ad Tech: Product Data Innovation


We’ve reached the most granular level in online advertisement targeting technology: an SKU to an impression. So where will the next innovation come from? I’m thinking the next big step in ad tech will be on the product side, helping advertisers pick which SKU to show a particular impression, at a specific bid price.

An impression is more than just the age and gender of the person. It is the person at the exact time and place. It includes the demographic, psychographic, and geographic information, the taste, mood, and everything about who the person is, adjusted for the context of what she is looking at. We probably can’t combine all those data for a given impression yet, but we’ve made a lot of progress in painting a picture of this person at the exact state in time.

What about the SKU side? When an impression shows up on a website, how much should an advertiser bid on this impression and more importantly, which product should he show?
I believe an impression has a fair intrinsic value that is a function of the predicted click through rate, conversion rate, the average order value, and what % of sales the advertiser is willing to pay, adjusted for risk. In a formula, this looks like this:

CPM=ƒ(μCTR*μCVR*μAOV*% Sales, σCTR, σCVR, σAOV)

I’ve outlined my thought process in a previous entry, found here.

My question is, of all the advertisers that are bidding on impressions, how many are looking at the product side in a data-driven way? By data-driven, I don’t mean “this product is designed for single urban males who like cars”. I mean “analyzing all the transaction history involving this product, the calculated CTR & CVR of this impression is X & Y, with standard deviation of Z”. The point is, advertisers are choosing which product to show without the rigorous past purchase analysis that should point them to the optimal SKU to show at the optimal bidding price. If you sell hundreds or thousands or even millions of SKUs, how can you be sure that the ad you decide to serve is the best one in your product portfolio? Also, if you are not Amazon or Walmart, do you have enough data to really make an intelligent decision?

The next innovation in ad tech will effectively collect this purchase data across merchants and transform it in a way that is useful to advertisers. With information asymmetry out of the way, advertisers will bid on impressions based on predicted CTR, CVR, and AOV. So what will win an impression when multiple merchants are selling identical products? It is the % of sales the advertiser is willing to give up, thus ensuring the publisher will get the maximum $ for the impression. The user will see a calculated, optimized ad, which should match his profile so well that the ad will be less of a distraction and more like content. Of course the advertiser who was willing to pay the most will get to show the ad, so we have a win-win-win situation between the user, the publisher, and the advertiser. Moreover, the product side innovation should drive automation even further. Ultimately, a user showing up to a website will trigger a process that will sift through millions of products across advertisers to find the optimal one. That sounds like a more efficient market to me.


An Efficient Market for Online Ads


This is another old post from Tumblr.

In my last post, my argument was essentially: The difference between CPM/CPC/CPA is risk transfer, and the fair price can be determined by looking at the mean and standard deviation of the three risks that are being shared between the advertiser and the publisher. The three risks being: click through rate (CTR), conversion rate (CVR), and average order value (AOV).

I was thinking option pricing may work to get to the right price but upon further thinking, this looks more like a job for Crystal Ball or some other statistical simulation tool. So, if you have all the data points, you can plug them in and boom, you should have an idea of the fair relative price for each model. If you are bidding on an impression at a certain price, how much should you be willing to pay in CPA? This equation can be expressed like this:


This assumes that whatever bid price you have for an impression is the right price. In a perfectly efficient market where the bid price is always right, this makes sense. But we live in an imperfect world and advertising is an especially inefficient market, so this should be the other way around. Our bid price for an impression should be driven by how much % of sales for this certain product you are willing to give up in order to show an ad in front of this person. So the equation that we are actually trying to solve should be:


If I am making a 5% gross profit margin on a speaker, I am only willing to give up 0.5% of the price. But if I sell my own brand of super expensive shoes at a GP margin of 70%, I may be willing to pay 10% of sales. So the % of sales you are willing to give up is purely an internal decision driven by the nature of the product and your risk aversion. Let’s say for this one user looking at this blog, I know the likelihood of him clicking on the ad, the likelihood of him buying something, and how much he will spend, plus I know I am willing to pay 10% of sales to get this guy to purchase my product. Then I know exactly how much this impression is worth to me and how much I should bid.

The only problem is, we don’t know the exact CTR and CVR and have no idea on their standard deviation. This is where big data comes in. If we can identify this user and look at his past behavior, we can make an intelligent guesstimate of the variables. We may also take context into consideration. If you are looking at a funeral website, you probably won’t be clicking on an ad for a sports car.  But what is the problem we are trying to solve? It’s not just that the more data the better. We need to predict this user’s behavior as accurately as possible.

So the point of this post boils down to this. The holy grail of online advertisement is predicting intrinsic demand.

That means I want to show you an ad with a product that:

  1. You don’t already have
  2. You don’t know you want (because you would be on Google or Amazon searching for that product if you wanted it now)
  3. You really really want

I think in the online advertisement market, we are finally trying to figure out #3. Retargeters do this by saying, “hey you almost bought this product, maybe you would actually buy it if I give you free shipping”. The problem with that is scale. You don’t abandon shopping carts every day, so the amount of ads you can serve with this method is limited or you will be annoyingly repetitive. So, in order to understand what you really really want that you don’t know already, we want to understand who you are and your taste.

Let’s say I sell leather pants (which I don’t and have no plans to). There is a user looking at a blog and I have an opportunity to show an ad.
Here is the user profile:

  1. Male
  2. 30’s
  3. stable job
  4. married with children

I’m quite indifferent about this guy. Not willing to pay much at all.
But what if we also find out he loves metal, Harley-Davidsons, and wears leather jackets? This is a guy I want my ad to be shown to and I am willing to pay for it. So the more specific information I have about this guy and the better the match I see with my product, the more I am willing to pay.

The race for the new new thing in online advertisement is who can predict a user’s taste and match that impression to the advertiser with a product that maximizes his desire. At this point, advertisements should become less of an annoyance and more like content. Amazon has a whole bunch of information on what you’ve bought before and can infer what you might like. But the cool thing about Pinterest is that it can connect a whole bunch of seemingly random products that you like, or that someone with similar taste likes, and create a taste map. This can be used to understand the user but also to understand the product.

With the granularity of decision making reaching the logical extremes of “one impression” to “one product”, it looks like the field is finally set to start the data collection & algorithm battle. There are so many inefficiencies for companies to profit from and it will be interesting to see if giants like Google/Amazon/eBay will figure it out first or a newcomer like AppNexus/Pinterest will come out on top. I’m excited to see Pinterest taking a stab at this with their unique asset.


CPM, CPC, CPA, and the Transfer of Risk


This is a post I wrote on Tumblr a long time ago (July 2011!) but no one read it.
The point is still valid, so I’m copying it here with slight updates.

I am currently employed in the CPA side of the online advertisement industry, putting me solidly in a minority. There are millions of sites that explain the definition of CPM, CPC, and CPA individually, but not many describe them relative to each other. Just to go over the basics:

  • CPM: Cost Per Mille. Advertiser pays the publisher per 1,000 impressions, i.e. per 1,000 times the advertisement is shown. Cost = # of Impressions / 1000 * CPM
  • CPC: Cost Per Click. Advertiser pays the publisher for each click on the advertisement. Cost = # of Clicks * CPC
  • CPA: Cost Per Action. Advertiser pays the publisher for each desired action such as a percentage of sales or a filled out form. Cost = # of Actions * CPA

So how do these different pricing models relate to each other?

  • # of Clicks = # of Impressions * Click Through Rate (CTR)
  • # of Actions = # of Clicks * Conversion Rate (CVR)
  • So, # of Actions = # of Impressions * CTR * CVR

If you are paying $100,000 for a campaign and you get 1,000,000 impressions, CTR = 10% and CVR=10%, what are the CPM, CPC, and CPA?

  • CPM = $100,000 / (1mm / 1000) = $100 (or $0.1 per impression)
  • CPC = $100,000 / (1mm*10%) = $1
  • CPA = $100,000 / (1mm*10%*10%) = $10
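The same arithmetic as a short sketch, in case you want to plug in your own campaign numbers. The function name is mine; the inputs are the ones from the example above.

```python
import math


def pricing_metrics(cost, impressions, ctr, cvr):
    """Return the CPM, CPC, and CPA implied by one campaign's cost and volume."""
    clicks = impressions * ctr   # # of Clicks = # of Impressions * CTR
    actions = clicks * cvr       # # of Actions = # of Clicks * CVR
    return {
        "cpm": cost / (impressions / 1000),
        "cpc": cost / clicks,
        "cpa": cost / actions,
    }


m = pricing_metrics(cost=100_000, impressions=1_000_000, ctr=0.10, cvr=0.10)
print({name: round(price, 2) for name, price in m.items()})

# All three describe the same spend, so the cost of one impression lines up:
# CPM / 1000 = CPC * CTR = CPA * CTR * CVR
per_impression = m["cpm"] / 1000
assert math.isclose(per_impression, m["cpc"] * 0.10)
assert math.isclose(per_impression, m["cpa"] * 0.10 * 0.10)
```

The two checks at the end express the three prices as the cost of a single impression, which is what makes them comparable in the first place.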

This means we can relate the three this way:

CPM / 1000 = CPC * CTR = CPA * CTR * CVR

Except, we did not consider one thing, and that is this guy…


Both click through rate and conversion rate will have a standard deviation, meaning those numbers are never constant. Sometimes the numbers will be above average and sometimes they will fall below average. Even if you know the mean CTR and CVR for the publisher, advertiser, and the advertisement, that’s not always going to happen (in fact that will almost never happen). The wider the distribution curve, the more likely the CTR and CVR will diverge from the mean, which means higher risk for either the advertiser or the publisher.

Changing the pricing model from CPM to CPC to CPA is the act of transferring risk from the advertiser to the publisher. Let’s take a leap of faith and assume that the advertiser wants to drive sales.

In a CPM model, the advertiser is bearing both the CTR risk and the CVR risk. From the publisher’s perspective, all you need to do is drive traffic and you’ll get paid. If you decide to run a yarmulke ad on a Mormon website, you’ll still get paid. The advertiser is bearing all the risk.

In CPC, the advertiser transfers the CTR risk to the publisher. Now that yarmulke ad is not going to do too well. The incentive for the publisher is to show advertisements that are relevant to the audience so they can generate clicks.

In CPA, the advertiser transfers not only the CTR risk but also the CVR risk. So even if the publisher is able to generate traffic to the advertiser website, they won’t get paid unless the user actually purchases something or fills out a form.

That is asking a publisher to do a lot. If you think about a percent of sale offer, the publisher is taking more risks than just CTR and CVR. If the user only spends $2 on the website, the publisher will only get a tiny payout. So the publisher is also taking on the risk of the average order value (AOV). In fact, CPA is basically risk-free for the advertiser and it should not even be considered a marketing expense. It is more of a cost of goods sold expense.
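To see the risk transfer in numbers, here is a rough simulation sketch. It prices the three models so the publisher's expected payout is the same $100,000 under each, then looks at how much the payout swings from campaign to campaign. The normal distributions, the standard deviations, the $50 AOV, and treating the rates as one draw per campaign are all assumptions I made up for illustration.

```python
import random
import statistics


def publisher_payouts(n_campaigns=5000, impressions=1_000_000, seed=11):
    """Simulate the publisher's payout per campaign under CPM, CPC, and CPA.

    Assumption: CTR, CVR, and AOV are independent normal draws (clipped at
    zero), one draw per campaign. Prices are set so the expected payout is
    equal across the three models.
    """
    rng = random.Random(seed)
    mu_ctr, mu_cvr, mu_aov = 0.10, 0.10, 50.0
    cpm = 100.0
    cpc = cpm / 1000 / mu_ctr            # $1 per click
    pct_sales = cpc / (mu_cvr * mu_aov)  # 20% of each sale
    payouts = {"CPM": [], "CPC": [], "CPA": []}
    for _ in range(n_campaigns):
        ctr = max(0.0, rng.gauss(mu_ctr, 0.02))
        cvr = max(0.0, rng.gauss(mu_cvr, 0.03))
        aov = max(0.0, rng.gauss(mu_aov, 12.5))
        clicks = impressions * ctr
        sales = clicks * cvr * aov
        payouts["CPM"].append(impressions / 1000 * cpm)  # paid regardless of outcome
        payouts["CPC"].append(clicks * cpc)              # publisher carries CTR risk
        payouts["CPA"].append(sales * pct_sales)         # plus CVR and AOV risk
    return payouts


payouts = publisher_payouts()
spread = {model: statistics.pstdev(values) for model, values in payouts.items()}
for model in ("CPM", "CPC", "CPA"):
    mean = statistics.fmean(payouts[model])
    print(f"{model}: mean payout ${mean:,.0f}, std dev ${spread[model]:,.0f}")
```

Under these assumptions the CPM payout never moves, the CPC payout swings with CTR, and the percent-of-sales payout swings the most because it stacks CTR, CVR, and AOV risk on the publisher.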

So, in order for the publisher to take on more and more risk, the following must hold true: the publisher must be rewarded with a higher payday with CPA compared to CPC, which in turn will be more expensive than CPM.

Exactly how much more expensive should CPA be? That’s the million (billion?) dollar question. We are valuing risk based on standard deviation which, to my knowledge, sounds awfully like an option…