“Half the money I spend on advertising is wasted; the trouble is, I don’t know which half.”

-John Wanamaker (marketer and department store magnate)

 

Even though this quote is over 100 years old, it resonates with today’s marketers. They know that a portion of what they invest in advertising is wasted, but it’s not always clear how much waste there is or how to identify it.

Marketers understand that Google Analytics relies on last-click attribution, disregarding the rest of the funnel. They know that multiple advertising platforms take credit for the same conversions. They are aware that platforms use arbitrary attribution windows (e.g., 7-day click + 1-day view), overlooking customers who convert after 8 days and over-crediting view-through conversions that likely would have occurred regardless. They know that ad blockers and browser privacy features prevent pixels from tracking roughly 40% of their conversions.

But although the problem is clear, the solution is not.

 

First-Party Tracking Is A Must, But It’s Not Attribution

Some marketers have transitioned from third-party pixels to first-party tracking. This is highly recommended in today’s digital world, and tracking more conversions helps ad platforms learn which creatives and audiences work best, but it doesn’t solve the attribution problem. While it’s helpful to see that a conversion path involves three channels, arbitrarily assigning 33% of the credit to each is barely an improvement over giving all the credit to the last or first click. That’s like crediting Michael Jordan with 20% of the Bulls’ success just because he was one of five players on the court. True attribution should measure each player’s impact, not merely their presence on the court.

 

Attribution Should Measure Impact

In marketing, a channel’s impact should be measured by how much it lifts conversions. Imagine you own a taco stand with 100 customers a day: half come from the west corner and half from the east corner. You hire a promoter to stand on the west corner with a sign advertising your tacos, and you notice that while you still get 50 customers from the east corner, you now get 70 from the west corner. That’s your lift: 20 additional customers. It doesn’t matter that all 70 people saw the sign; 50 of them were going to buy anyway. In this case, pixel-based attribution (aka multi-touch attribution) would credit the promoter with all 70 customers, whereas impact-based attribution (aka media mix modeling) would credit only the 20-customer lift actually produced.

 

Building Accurate Attribution Models Isn’t Easy

More than half of US-based companies’ marketing budget is spent on ads. If measuring and optimizing performance is so critical, why isn’t everyone using media mix modeling?

Building a great model is very hard, and most companies don’t have a team of data scientists to do it well. A great model needs to capture the true causal relationship between your marketing and your conversions, and accurately predict the outcome of any given scenario, even the ones it hasn’t seen before. What would happen to your conversions if you turned off all your ads? What if you increased Facebook remarketing by 10% and decreased Google Performance Max by 25%?

Because our clients move around millions of dollars to optimize their media mix based on our models’ insights, it’s crucial that their models be properly configured, validated, and tested. 

When done well, modeling is four times more accurate than pixel-based attribution and helps our clients save, on average, over 20% of their advertising budgets. Most of our clients then reinvest those savings into their best channels and campaigns, given that a robust model allows them to predict, with a very high degree of accuracy, how many additional conversions they’ll receive from every dollar invested in each media.

One of the most powerful aspects of modeling is its ability to measure what pixels cannot: TV / OTT / CTV, radio, podcasts, influencers, and sales from Amazon, Walmart and other retail stores. Models also allow marketers to measure marketing’s impact on a number of different metrics, such as new customer acquisition, retention and lifetime value. That’s very important when the customer lifetime value is much higher than the customer’s first transaction.

In this whitepaper, I’ll walk you through our modeling process step-by-step, so you can understand how we build, validate and test our models.

 

Why Bayesian Models Are the Future

You may have noticed that I’ve been talking about models, not model. That’s because we build a model for each client. Because a custom model takes into account your customer mix, product/service mix and media mix, it will always outperform a one-size-fits-all model.

The perfect model doesn’t exist. The world is very complex, and it’s impossible to capture every single factor that could influence conversions. Nobody could have predicted COVID, or that Jane Doe would convince all her friends to buy from you. So, the best model is the one that captures reality as closely as possible.

We use Bayesian models, which are very powerful because they perform well even with small datasets and allow us to incorporate domain knowledge as priors, which are later updated based on the data received.

Suppose you are asked to estimate the percentage of vehicles on the road that are trucks. Based on your experience, you say it’s about 20%. Then you go for a drive, and out of 100 vehicles you see, 17 are trucks. That day’s experience isn’t the ultimate truth, but based on this new data point, you update your belief to 19%. If you repeat that a few days in a row, you’ll update your belief each time. If you log 17% of vehicles being trucks multiple days in a row, your new belief will be 17%, even though you started at 20%. If you move from the city to the countryside, you’ll notice that 33% of the vehicles you now see are trucks. However, you know that it’s different in the city, so your new belief, assuming that half of all vehicles are in the city and half in the countryside, will be the average of the two: 25%.

Marketing operates in a similar manner. Let’s say that Facebook Ads reports a ROAS of 5. We know that’s not correct, but it’s useful as a prior: a starting belief that will later be updated as new data is received. If you increase your spend by $10,000 and get $40,000 in revenue, that would indicate a ROAS of 4, which lets us update our prior belief into a posterior. That posterior is then updated again the following day. If the ROAS figures you get are 4.0, 4.1, and 3.9, your new ROAS estimate will be very close to 4, regardless of what the initial belief was.
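
To make this concrete, here’s a minimal sketch of that prior-to-posterior update, assuming a Normal prior on ROAS and Normally distributed daily ROAS observations (the numbers are illustrative, not from an actual model):

    import numpy as np

    # Prior belief about ROAS (e.g., platform-reported ROAS of 5, held loosely)
    prior_mean, prior_sd = 5.0, 1.5

    # Observed daily ROAS estimates (revenue lift / incremental spend)
    observations = np.array([4.0, 4.1, 3.9])
    obs_sd = 0.5  # assumed day-to-day noise

    # Conjugate Normal-Normal update: a precision-weighted average of prior and data
    prior_prec = 1 / prior_sd**2
    data_prec = len(observations) / obs_sd**2
    post_mean = (prior_prec * prior_mean + data_prec * observations.mean()) / (prior_prec + data_prec)
    post_sd = (prior_prec + data_prec) ** -0.5

    print(f"Posterior ROAS: {post_mean:.2f} +/- {post_sd:.2f}")
    # The posterior lands close to 4: the data outweighs the initial belief of 5.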

Although Bayesian statistics requires working with priors, it also allows us to assign different weights to those priors based on our degree of certainty. If we have plenty of historical data showing definitive patterns, we can define narrower priors and give them more weight. In most cases, however, we choose to work with wider priors to minimize the weight of our opinions or our clients’ opinions, letting the data speak for itself.

Our models measure the impact of marketing channels (independent variables/inputs) on conversions (dependent variable/output). They quantify how changes in the inputs affect the output. If you double your Google Ads budget and the number of conversions doesn’t increase, that would suggest that Google doesn’t influence your conversions much. If you turn off Facebook Ads and your conversions drop substantially, that would suggest that Facebook has a big influence on conversions.

One instance of either wouldn’t be conclusive evidence, but if every time you turn up a channel your conversions increase, and every time you turn it down your conversions drop, that would be a very strong impact indicator.

 

Natural Experiments and Incrementality Experiments

Whenever possible, we use geo-segmentation to multiply the number of daily observations. That is, we study how inputs influence outputs for every state, ZIP code, or designated market area (DMA). Going back to the taco stand example: if you own 10 of them and you notice a 20-customer lift across all of them, you’ll be much more certain of the effect than if you only own one.

Our models study natural experiments and incrementality experiments. Natural experiments are all the budget increases and decreases you’ve had in the past, even though they weren’t deliberately designed as experiments. Incrementality experiments are essentially randomized controlled trials (RCTs), the gold standard of science. They have a treatment group (a set of geographic segments for which we shut off or scale up a given channel for a period of time) and a control group (for which we maintain budgets). The experiment then uses a synthetic control method (SCM) to measure how much conversions in the treatment group changed relative to the conversions we would have expected given the control group.

For example, let’s say that during the experiment design we determine that Oregon and Nebraska are the two ideal states to run an experiment: they’re the best representation of the whole US market, but they only account for a very small portion of all conversions. Our goal is to maximize the statistical significance of the experiment while minimizing the negative effects on the business. The experiment is designed to last 22 days, as that would give us a large enough sample. During those 22 days of shutting off Facebook Ads, conversions in those two states dropped 20%, while conversions in the other states dropped 5%. We could then infer that 5% of the drop was due to market conditions and other factors, and 15% was due to Facebook Ads.
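
Here’s a minimal sketch of that inference using the illustrative numbers above. A full synthetic control model would weight control geos to build the counterfactual; this simplified version just applies the control group’s trend to the treatment group’s baseline:

    # Daily conversions before and during the 22-day Facebook holdout (illustrative)
    treatment_before, treatment_during = 1000, 800   # Oregon + Nebraska: -20%
    control_before, control_during = 20000, 19000    # all other states: -5%

    # Counterfactual: what the treatment geos would have done following the control trend
    control_trend = control_during / control_before        # 0.95
    expected_treatment = treatment_before * control_trend   # 950

    incremental = expected_treatment - treatment_during     # conversions attributable to Facebook
    lift_pct = incremental / treatment_before
    print(f"Estimated incremental conversions from Facebook: {incremental:.0f} ({lift_pct:.0%} of baseline)")
    # ~150 conversions, i.e. ~15% of the treatment baseline, matching the 20% - 5% reasoning above.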

Because we used a random sample (i.e. there’s nothing suggesting that consumers in those two states are any different than those in any other state) and both groups were subjected to the same conditions (same time of the year, same exposure to all the other marketing channels), the performance difference can be attributed to the variable we isolated: Facebook Ads.

We recommend running at least one annual experiment for every channel where you’re investing over $250,000 a year. Our platform designs your experiments automatically, and it lets you incorporate the results in your model with a high degree of confidence given that the data comes from an experiment rather than a personal belief.

Econometrics is the branch of statistics that studies cause and effect. Just because two things are correlated, it doesn’t mean that one causes the other. Capturing the true causal relationships in the data is the most important and the most challenging aspect of modeling complex phenomena. RCTs are the most powerful tool in an econometrician’s toolbox, but incrementality experiments aren’t always possible. Sometimes it’s for technical reasons, and sometimes it’s because companies don’t want to mess with their marketing mix. Our platform doesn’t require incrementality experiments and it’s designed to learn from natural experiments as well.

 

Automated Data Pipelines

Our clients need to make decisions in real time, so our models need to be always up-to-date. We set up pipelines to get the data we need from over 250 platforms via API, then load it to a secure data warehouse, and run a few processes to clean it up and fix any potential issues. There’s a sequence of jobs that run daily so models are always working with the latest data.

 

Feature Engineering

In machine learning, all the input variables (e.g. media spend) are known as features. Selecting and engineering features is a key aspect of building a great model. Most companies have hundreds or even thousands of campaigns, so we need to group them into categories based on common traits.

The most common way to do this is to group the campaigns in each channel by top of funnel (TOFU), middle of funnel (MOFU), and bottom of funnel (BOFU). This helps you answer two key questions to optimize your media buying.

  1. Am I investing too much or too little in any of these three areas? Am I, for example, spending too much retargeting the same users multiple times, and not enough targeting new users with brand awareness ads?
  2. Which channels are most effective for brand awareness, which ones for nurturing, and which ones for conversions? Am I finding, for example, that Facebook and TikTok are great to get in front of new users, and then Google and Bing are ideal for retargeting them?

TOFU ads need to get the user’s attention with a strong hook and generate interest with your primary value proposition, MOFU ads need to provide additional value propositions and social proof, and BOFU ads need to have a clear call to action and invoke a sense of urgency. Not all channels do all these things equally well. Your marketing performance depends on determining the best media type for each stage and on the optimal balance among TOFU / MOFU / BOFU budgets.

This is the most common setup our clients have, but many of our clients need something a bit different. Some want to group campaigns by product or service offering, by value proposition, by target audience, by ad type (e.g. video vs image), content source (UGC vs brand), keyword type (e.g. competitor keywords vs brand keywords), etc. This level of customization is one of the most powerful aspects of Data Speaks.

There’s no right or wrong way to engineer features. The right way is the one that aligns with your decision-making process. If the campaigns in each category share some similarities, each category represents at least 2% of the total spend, and the model helps you make quality decisions with confidence, that’s the right way for you.
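
As an illustration, here’s a minimal sketch of rule-based campaign grouping, assuming campaign names follow an agreed-upon convention (the rules, names, and spend figures are hypothetical):

    import pandas as pd

    campaigns = pd.DataFrame({
        "campaign_name": ["FB_awareness_video_Q1", "YT_consideration_demo",
                          "FB_retargeting_carousel", "Google_brand_search"],
        "spend": [12000, 3000, 8000, 5000],
    })

    def funnel_stage(name: str) -> str:
        # Hypothetical naming rules agreed on during feature engineering
        name = name.lower()
        if "awareness" in name or "prospecting" in name:
            return "TOFU"
        if "retargeting" in name or "remarketing" in name or "brand" in name:
            return "BOFU"
        return "MOFU"

    campaigns["stage"] = campaigns["campaign_name"].map(funnel_stage)
    features = campaigns.groupby("stage")["spend"].sum()  # one model input per bucket
    print(features)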

For each media, in addition to the spend, we pull impression, reach and frequency data. There’s a big difference between reaching ten people once and reaching one person ten times, even if both cost about the same. You want to know not only how much to spend per media, but also how to optimally balance reach and frequency. Knowing the number of impressions allows the model to understand whether an increase in spend delivered more impressions, or whether it was driven by a rise in CPM around holidays with high demand for ad inventory.
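
A quick sketch of the kind of decomposition this enables, with hypothetical numbers:

    # Decomposing a spend increase into volume (impressions) vs price (CPM)
    spend, impressions, reach = 30000.0, 2_000_000, 400_000
    cpm = spend / impressions * 1000        # cost per 1,000 impressions
    frequency = impressions / reach          # average exposures per person
    print(f"CPM: ${cpm:.2f}, frequency: {frequency:.1f}")
    # If spend rises but impressions don't, the increase was absorbed by a higher CPM
    # (common around holidays), not by additional exposure.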

 

How We Determine Lead Value for B2B / Lead-Gen Clients

After designing the input variables, we need to design the output variable. That’s very easy for ecommerce because we already have revenue data. For B2B/lead-gen, we need to figure out the value of each conversion, because not all conversions are created equal: maybe a whitepaper download is worth $100 and a demo request $300. This is an exercise we do with our clients. If they don’t already know the value per conversion, we normally explore their CRM data to answer two questions:

  1. What’s the lead-to-deal conversion rate?
  2. What’s the average deal value?

If 1 in 10 demos result in a new account and each account is worth $3,000, then each demo booked is worth $300.

We also look at the overlap among conversions because it’s common for people to complete more than one. If half of the people who download the whitepaper also book a demo, we need to subtract that overlap to avoid double-counting.
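
Here’s a minimal sketch of that calculation, using the illustrative values above and the assumption that an overlapping conversion is credited once at its higher-value event:

    # Illustrative lead-value calculation with overlap de-duplication
    demo_value = 0.10 * 3000          # 1 in 10 demos closes, $3,000 per account -> $300
    whitepaper_value = 100            # assumed standalone value of a whitepaper download

    whitepapers, demos = 500, 200
    overlap = 100                     # people who downloaded the whitepaper AND booked a demo

    # Credit the overlap once (at the higher-value conversion) to avoid double-counting
    total_value = demos * demo_value + (whitepapers - overlap) * whitepaper_value
    print(f"Modeled conversion value: ${total_value:,.0f}")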

Once the input and output variables are ready, we include all the other features: organic channel data (e.g. email/SMS/affiliates/influencers/etc.), promotions, seasonality, and other external factors that may influence the output variable.

 

Data Transformation Phase

 

Pre-Processing

Quality data is imperative for producing quality models, so we run a lot of quality checks to ensure data integrity. We look for missing data and outliers, and choose the best method to address those issues. In machine learning, data is usually scaled to ensure that all features are on comparable scales, which is why during this stage we identify the best scaling method for the data we’re working with (log, min/max, z-score, IQR, etc.).
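
For illustration, here’s a minimal sketch of a few of those scaling options applied to a hypothetical array of daily spend; the right choice depends on the data’s distribution:

    import numpy as np

    spend = np.array([100.0, 250.0, 900.0, 5000.0, 0.0])

    # Min/max scaling: squeeze values into [0, 1]
    minmax = (spend - spend.min()) / (spend.max() - spend.min())

    # Log scaling: compress heavy right tails (log1p handles zero-spend days)
    logged = np.log1p(spend)

    # Z-score: center on the mean, express values in standard deviations
    zscore = (spend - spend.mean()) / spend.std()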

 

Adstock Effect

If 100 people see your ad today, some will convert today, some next week and some next month. This is known as the adstock effect, and it’s measured by the number of days it takes for an ad unit to produce 90% of its effect. This is different for each media: TOFU ads may take 15 days to reach this threshold, whereas BOFU ads usually do it in one or two days. Once we determine this for every media, we transform each channel’s data accordingly and move on to the next transformation: the saturation effect.
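
One common way to model this carryover is a geometric adstock transformation; here’s a minimal sketch with an illustrative decay rate (in practice the model learns a decay per channel):

    import numpy as np

    def geometric_adstock(spend: np.ndarray, decay: float) -> np.ndarray:
        """Carry a fraction of each day's effect into the following days."""
        adstocked = np.zeros_like(spend, dtype=float)
        carryover = 0.0
        for t, x in enumerate(spend):
            carryover = x + decay * carryover
            adstocked[t] = carryover
        return adstocked

    daily_spend = np.array([1000.0, 0.0, 0.0, 0.0, 0.0])
    print(geometric_adstock(daily_spend, decay=0.6))
    # [1000., 600., 360., 216., 129.6] -- the ad's effect decays over several days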

 

Saturation Effect

As you spend more on a given media, you start getting diminishing returns. ROAS is usually lower at $10,000/day in spend than at $1,000/day. It is therefore essential to understand the response curve for each media: how ROAS changes as spend increases.
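
A Hill-type curve is one common functional form for these response curves; here’s a minimal sketch with illustrative parameters (the model estimates them per channel):

    import numpy as np

    def hill_saturation(spend: np.ndarray, half_saturation: float, shape: float) -> np.ndarray:
        """Response rises with spend but flattens as the channel saturates."""
        return spend**shape / (spend**shape + half_saturation**shape)

    spend = np.array([1_000.0, 5_000.0, 10_000.0, 20_000.0])
    response = hill_saturation(spend, half_saturation=8_000.0, shape=1.2)
    print(np.round(response, 2))
    # Each additional dollar buys less incremental response as spend grows.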

 

Model Configuration Phase

 

Organic vs Paid Channels

The first step in the model configuration process is determining what portion of the conversions can be attributed to organic and paid efforts. We do some exploratory data analysis and talk to our clients to identify good priors, and then we run our model to find the best parameters based on the data. Only after identifying what percentage of the conversions are organic or paid do we start identifying what percentage of paid conversions should be attributed to each paid channel, and what percentage of organic conversions should be attributed to each organic channel. This hierarchical approach to modeling (i.e. a model within a model) is crucial to produce the most accurate model possible.

 

Paid Channels ROAS

If we determined that $4M out of $10M is driven by organic channels, then we need to figure out what paid channels drove the other $6M. This can be represented by the following equation:

Revenue = Organic revenue + channel A spend * channel A ROAS + channel B spend * channel B ROAS

Here are two possible scenarios:

$10M = $4M + $1M spend * 4 ROAS + $2M spend * 1 ROAS

$10M = $4M + $1M spend * 2 ROAS + $2M spend * 2 ROAS

We know the spend and the revenue, but we don’t know the ROAS yet. These are only two of infinitely many possible combinations. Although we won’t know the actual ROAS for each channel before training our model, we need to identify the parameter boundaries to explore. In this case, channel A’s ROAS couldn’t be higher than 6 or lower than 0 (assuming the worst-case scenario is that the channel doesn’t increase sales but doesn’t decrease them either). That range between 0 and 6 is the search space the model will explore to find the most likely ROAS for the channel and the confidence intervals.
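
Here’s a minimal sketch of how those boundaries follow from the numbers above (channel names and figures are illustrative):

    # Paid revenue left to explain after subtracting organic revenue
    total_revenue, organic_revenue = 10_000_000, 4_000_000
    paid_revenue = total_revenue - organic_revenue   # $6M

    spend = {"channel_A": 1_000_000, "channel_B": 2_000_000}

    # Upper bound: a channel could, at most, explain all the paid revenue by itself;
    # lower bound: we assume a channel can't reduce sales, so ROAS >= 0.
    roas_bounds = {ch: (0.0, paid_revenue / s) for ch, s in spend.items()}
    print(roas_bounds)   # channel_A: (0, 6.0), channel_B: (0, 3.0)
    # The model searches within these ranges for the most likely ROAS and its interval.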

 

Channel Interactions

Channels often influence other channels. The more you spend on TikTok awareness ads, the more people search for your brand on Google or are shown retargeting ads on Instagram. In other words, some media types generate demand while others capture it.

For this reason, the impact of demand generation (TOFU) channels should include both their direct impact on conversions and their indirect impact (i.e. the lift they drive in the demand capture (BOFU) media). The model also needs to understand the time lag between the TOFU and BOFU interactions (e.g. the number of days between the TikTok awareness ad click and the consumer searching for your brand on Google).

 

Promotions

Media performance is very different during promotions, so it’s vital for the model to capture how much of the impact was driven by a promotion as opposed to the media itself.

During promotional periods, companies tend to spend more as algorithms are encouraged to drive more traffic when conversion rates are high. But although there’s a correlation between spend and revenue during promotions, it’s key for the model to understand whether there’s a causal relationship and in which direction it goes.

If your spend in a channel increases during your New Year promo and our model doesn’t know that, it may conclude that the channel drove the revenue increase, and that you can create another “New Year promo” situation just by increasing the channel spend.

There are two main kinds of promotions that we model. When customers expect a promotion (e.g. Black Friday), conversions tend to be a bit lower than usual beforehand as customers wait for a deal, peak during the promotion, and drop below usual afterward. When a promotion isn’t expected (e.g. a last-minute flash sale), there’s no drop beforehand because customers don’t know a better deal is coming.
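
As a sketch of how an expected promotion might be encoded, the snippet below builds three indicator features: the pre-promo dip, the promo itself, and the post-promo dip (the dates and window lengths are illustrative):

    import pandas as pd

    dates = pd.date_range("2024-11-15", "2024-12-10", freq="D")
    df = pd.DataFrame({"date": dates})

    promo_start, promo_end = pd.Timestamp("2024-11-29"), pd.Timestamp("2024-12-02")

    # Three indicator features the model can learn separate effects for
    df["pre_promo"] = df["date"].between(promo_start - pd.Timedelta(days=7),
                                         promo_start - pd.Timedelta(days=1)).astype(int)
    df["promo"] = df["date"].between(promo_start, promo_end).astype(int)
    df["post_promo"] = df["date"].between(promo_end + pd.Timedelta(days=1),
                                          promo_end + pd.Timedelta(days=7)).astype(int)
    # An unexpected flash sale would only get the "promo" indicator, with no pre-promo dip.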

 

Model Training Phase

Once the data has been cleaned and transformed, the features designed, and the model’s parameters defined, we start training the model.

Data Speaks is essentially a very sophisticated simulation engine. It explores a space of infinite universes to find the most plausible ones, and then summarizes that by providing the most likely ROAS for each media along with a range of possible values.

For example, we start knowing that the ROAS for channel A could be between 0 and 6, it could decrease between 10% and 40% for every 10% increase in spend, and 90% of its impact is delivered in as little as 4 days or as many as 14 days. That’s the entire universe of possible values for each parameter, our search space.

After running the model, we may see, for example, that the most likely ROAS for that channel is 3.4 and it has a 95% probability of being between 2.9 and 3.6.

To determine which of all the possible universes is most likely, we use a powerful sampling algorithm called Markov Chain Monte Carlo (MCMC). MCMC works by guessing, checking, and then improving the guesses step by step. It starts with a random guess about the contribution of each channel, then tweaks it slightly and checks whether the new guess fits the data better. Over time, it keeps adjusting the guesses, narrowing in on the most likely answer.
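
To illustrate that guess-tweak-check loop, here’s a toy Metropolis-Hastings sampler for a single ROAS parameter. A production MMM samples many parameters at once with more sophisticated samplers, so treat this purely as a sketch of the idea:

    import numpy as np

    rng = np.random.default_rng(0)

    # Fake observations: daily revenue generated by spend with a true ROAS of 4
    spend = rng.uniform(500, 1500, size=60)
    revenue = 4.0 * spend + rng.normal(0, 300, size=60)

    def log_posterior(roas: float) -> float:
        if roas < 0 or roas > 6:                       # prior: ROAS bounded between 0 and 6
            return -np.inf
        resid = revenue - roas * spend
        return -0.5 * np.sum((resid / 300.0) ** 2)     # Gaussian likelihood (known noise, for simplicity)

    samples, current = [], 1.0                          # start from an arbitrary guess
    for _ in range(10_000):
        proposal = current + rng.normal(0, 0.05)        # tweak the guess slightly
        # accept the tweak with probability given by how much better it fits the data
        if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(current):
            current = proposal
        samples.append(current)

    posterior = np.array(samples[2_000:])               # drop burn-in
    print(f"Most likely ROAS ~ {posterior.mean():.2f}, 95% interval "
          f"[{np.percentile(posterior, 2.5):.2f}, {np.percentile(posterior, 97.5):.2f}]")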

 

Model Validation Phase

Because our clients make big decisions using our models, we always validate and evaluate them before delivery. This is a comprehensive series of checks we run to ensure that models perform up to our standards in every possible scenario.

 

Convergence Diagnostics

Convergence diagnostics are critical for verifying that the Markov Chain Monte Carlo (MCMC) sampler has properly converged to the target posterior distribution. Key metrics to assess include:

  • Number of Divergences: Should ideally be zero.
  • R-hat Statistic: Should be close to 1, indicating consistency across chains.
  • Effective Sample Size (ESS): Should exceed 1,000, ensuring sufficient independent samples.
  • Trace Plots: Visual inspection can help detect poor mixing or non-convergence, and identify potential problems with certain parameters.

These diagnostics confirm the computational correctness of the sampling process but do not evaluate the quality of the underlying model. They should be viewed as essential checks rather than performance metrics to optimize.

Our models work with several MCMC chains so we can verify that although they all start from different places in the search space, they all converge to similar distributions (we run a Gelman-Rubin diagnostic to ensure that chains have similar means and variances).
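
For reference, here’s a minimal sketch of the classic Gelman-Rubin R-hat computation on simulated chains (in practice these diagnostics come from the sampler’s output):

    import numpy as np

    def gelman_rubin(chains: np.ndarray) -> float:
        """Classic R-hat for an array of shape (n_chains, n_samples)."""
        m, n = chains.shape
        chain_means = chains.mean(axis=1)
        W = chains.var(axis=1, ddof=1).mean()      # within-chain variance
        B = n * chain_means.var(ddof=1)            # between-chain variance
        var_hat = (n - 1) / n * W + B / n          # pooled variance estimate
        return np.sqrt(var_hat / W)

    # Example: 4 chains sampling the same ROAS parameter
    rng = np.random.default_rng(1)
    chains = rng.normal(3.4, 0.2, size=(4, 2000))
    print(f"R-hat: {gelman_rubin(chains):.3f}")    # ~1.00 indicates the chains agree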

 

Model Fit Assessment

Model fit assessment evaluates how well the model explains the observed data. In Bayesian models, metrics such as Mean Absolute Percentage Error (MAPE) can be used to assess accuracy. However, MAPE has limitations, particularly when the data includes zeros or very small values, which can distort the results.

Metrics like MAPE are useful as sanity checks rather than optimization targets in Bayesian models. If MAPE is above 0.5 (50%), it usually indicates a poor fit and warrants a closer look for potential model misspecifications.
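
A minimal sketch of the MAPE calculation used as that sanity check (illustrative numbers):

    import numpy as np

    def mape(actual: np.ndarray, predicted: np.ndarray) -> float:
        """Mean absolute percentage error; unstable when actuals are zero or tiny."""
        actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
        return np.mean(np.abs((actual - predicted) / actual))

    actual = np.array([100.0, 120.0, 90.0, 110.0])
    predicted = np.array([95.0, 130.0, 85.0, 105.0])
    print(f"MAPE: {mape(actual, predicted):.2%}")   # a sanity check, not a target to optimize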

 

Simulation-Based Calibration (SBC)

Simulation-Based Calibration (SBC) ensures the model can accurately recover the “true” parameters when the ground truth is known. This process helps identify issues such as model misconfiguration or challenges posed by the data structure, like multicollinearity in marketing spend.

The process:

  • We randomly sample parameter values from the priors.
  • We then use these parameters to generate dependent variables, treating them as the true values.
  • We run the model using the generated dependent variables.
  • Finally, we compare the posterior parameters (model results) to the original sampled parameters to verify alignment.

For example, when calibrating ROAS, the model should replicate major trends in a channel’s ROAS over time, even if not perfectly. SBC validates the model’s ability to recover underlying truths and highlights areas needing refinement.
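
Here’s a minimal sketch of the recovery check at the heart of SBC, with a deliberately simplified one-channel model and a least-squares fit standing in for the full Bayesian model; a real SBC run repeats this many times and checks the rank statistics of the true values within the posteriors:

    import numpy as np

    rng = np.random.default_rng(2)
    spend = rng.uniform(500, 1500, size=90)

    # 1. Sample a "true" parameter from the prior
    true_roas = rng.uniform(0, 6)

    # 2. Generate synthetic conversions from that parameter
    revenue = true_roas * spend + rng.normal(0, 300, size=90)

    # 3. Fit the (simplified) model to the synthetic data
    estimated_roas = np.sum(spend * revenue) / np.sum(spend**2)

    # 4. Compare the recovered parameter to the true one
    print(f"true ROAS {true_roas:.2f} vs recovered ROAS {estimated_roas:.2f}")
    # Systematic gaps across many repetitions would point to model misconfiguration
    # or data issues such as multicollinearity between channels.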

 

Predictive Performance

Predictive performance evaluates the model’s ability to generalize and make accurate predictions. In Bayesian MMM, this is best assessed using the Watanabe-Akaike Information Criterion (WAIC), a metric that accounts for model complexity and penalizes overfitting. Lower WAIC values indicate better predictive performance. Calculating WAIC provides a clear and efficient measure of the model’s generalizability.
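
For reference, here’s a minimal sketch of the WAIC computation from a matrix of pointwise log-likelihoods (posterior samples by observations), assuming that matrix has already been extracted from a fitted model:

    import numpy as np
    from scipy.special import logsumexp

    def waic(loglik: np.ndarray) -> float:
        """loglik has shape (n_posterior_samples, n_observations)."""
        n_samples = loglik.shape[0]
        # log pointwise predictive density, averaged over posterior samples
        lppd = np.sum(logsumexp(loglik, axis=0) - np.log(n_samples))
        # effective number of parameters: variance of log-likelihood across samples
        p_waic = np.sum(np.var(loglik, axis=0, ddof=1))
        return -2 * (lppd - p_waic)   # lower is better

    # Toy example: 1,000 posterior samples, 50 observations
    fake_loglik = np.random.default_rng(3).normal(-2.0, 0.1, size=(1000, 50))
    print(f"WAIC: {waic(fake_loglik):.1f}")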

 

Posterior Predictive Checks (PPCs)

Posterior Predictive Checks (PPCs) evaluate how well the model reproduces the observed data using all available data. This involves simulating data from the posterior predictive distribution and comparing it to the actual data, either visually or through statistical tests and similarity measures. The results can be summarized as a single number that captures the model’s ability to replicate the observed data.
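
Here’s a minimal sketch of a posterior predictive check on one summary statistic, assuming posterior samples for ROAS and the noise level are already available (all numbers are illustrative):

    import numpy as np

    rng = np.random.default_rng(4)
    spend = rng.uniform(500, 1500, size=90)
    observed_revenue = 4.0 * spend + rng.normal(0, 300, size=90)

    # Posterior samples (illustrative; in practice these come from the fitted model)
    roas_samples = rng.normal(4.0, 0.2, size=500)
    sigma_samples = rng.normal(300, 20, size=500)

    # Simulate replicated datasets from the posterior predictive distribution
    simulated_means = np.array([
        np.mean(r * spend + rng.normal(0, s, size=90))
        for r, s in zip(roas_samples, sigma_samples)
    ])

    # Bayesian p-value for the mean: should be far from 0 or 1 if the model reproduces the data
    p_value = np.mean(simulated_means >= observed_revenue.mean())
    print(f"PPC p-value for mean revenue: {p_value:.2f}")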

 

Sensitivity Analysis

Sensitivity analysis tests how changes in a model’s inputs or assumptions affect outputs, helping identify fragile or non-causal models. It ensures robustness, validates causal relationships, and highlights weaknesses such as overfitting or reliance on spurious correlations.

We use a few different methods for sensitivity analysis:

  • Input variation: we incrementally adjust variables like media spend and observe output consistency.
  • Specification testing: we modify your media mix, such as removing Facebook Ads, to make sure that the performance for the rest of the channels doesn’t change too much.
  • Scenario analysis: we run “what-if” cases, like doubling spend, to ensure predictions remain accurate in multiple different conditions.
  • Residual analysis: we look at the differences between observed and predicted values to detect misspecifications.
  • Stability check: we hold out data from the model and try to predict it using only past data. For example, we normally train models with data from the last 120 weeks. During this process, however, we’d give the model the first 100 weeks and ask it to predict the ROAS for each channel in week 101. Because we know the actual spend for week 101, we can compare the predicted conversions/revenue for that week against the actuals and confirm the prediction is as close as possible. We then repeat that process for each week (e.g. give it 101 weeks of data to predict week 102, and so on), as sketched below.
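
Here’s a minimal sketch of that rolling, expanding-window holdout, with a simple least-squares fit standing in for the full Bayesian model (the data is simulated):

    import numpy as np

    rng = np.random.default_rng(5)
    weeks = 120
    spend = rng.uniform(5_000, 15_000, size=weeks)
    revenue = 3.5 * spend + rng.normal(0, 5_000, size=weeks)

    errors = []
    for cutoff in range(100, weeks):
        # Fit on all weeks up to the cutoff (stand-in for retraining the model)
        roas_hat = np.sum(spend[:cutoff] * revenue[:cutoff]) / np.sum(spend[:cutoff] ** 2)
        # Predict the next, held-out week using its known spend
        predicted = roas_hat * spend[cutoff]
        errors.append(abs(predicted - revenue[cutoff]) / revenue[cutoff])

    print(f"Average out-of-sample error over the last {len(errors)} weeks: {np.mean(errors):.1%}")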

 

Validating Models at Scale

Testing hundreds of models to find the best one is very computationally intensive and requires a framework to do it successfully at scale. We use a platform called MLflow to keep a log of every model version, its performance, and detailed notes about any improvements made over time.

 

Implementing the Model’s Recommendations

Setting up your data pipelines, and building and validating your model can take anywhere from three to six weeks, depending on its complexity. After that, the model will be ready to be used!

I’ll share some screenshots so I can walk you through how you can interpret your model’s insights and how to take action based on them.

 

Media Performance

The first thing you’re going to see is the key metrics for each media: spend, revenue, ROAS and profit. You’ll see them summarized and the trends over time.

This is a great way to see, at a glance, your best and worst performers.

 

Platform Reporting Accuracy

As we discussed, platforms can’t accurately track revenue or conversions. This report helps you see which platforms are over-reporting and which are under-reporting. In this example, the model determined that Facebook Prospecting’s ROAS was 5.09, much higher than the 1.03 Facebook reported. Conversely, Google Branded’s ROAS was 1.20, much lower than the 4.39 Google claims.

This is why it’s important to work with accurate data. Based on platform data, Google seems to be 4X better than Facebook. However, based on their actual impact on revenue, Facebook is 4X better than Google.

 

Attribution

This report shows you what percentage of revenue and conversions is attributed to each media. It’s useful to understand the overall contribution of individual channels.

If you’re an ecommerce business with multiple revenue sources, you can select a given store to optimize your media for, or all your stores combined. This applies to all reports. 

 

Diminishing Returns

As we established, media efficiency tends to decline as spend increases. On the left side, you can see a spend vs profit plot, which shows the spend level at which profit is maximized. You want to be at the top of the hill: if you’re to the left, you should invest more; if you’re to the right, you should scale back. In this example, we’re spending $1,500 a day when we should be spending $2,000 a day.

On the right side you see how ROAS declines as spend increases. You can see that the ROAS is around 3 if you spend $1,000/day, 2.8 at $1,500/day, and 2.6 at $2,000/day.

 

Optimal Allocation

Our objective is to allocate our budget optimally to maximize profit. This report shows you what to scale up and what to scale down.

 

Recommendations

In this report, you can see not only how much to adjust your spend for each media, but also what you can expect if you do so. Seeing how your optimization will impact revenue before taking action is extremely valuable for minimizing risk and maximizing growth. You’ll see three recommendation types: scale up, scale down, and bring back (if a media you used in the past is worth reintroducing). If your spend is already optimal for a given media, you’ll see “no change” so you know it’s where it should be.

 

What-If Scenarios

In some cases, you want to play around with different scenarios. This tool lets you compare performance at different spend levels, and you can even factor cost of goods sold (COGS) into the equation.

 

Campaigns, Ad Sets and Creatives

Consider two video ads running on YouTube. One has a ROAS of 10 and the other a ROAS of 2. If you’ve spent the same on both, then the average ROAS for YouTube would be 6. However, one element in that bucket is doing 5X better than the other. For that reason, it’s not just about allocating our budgets optimally to each bucket, but also about optimizing what’s in each bucket. We want to eliminate the worst performers so our dollars go towards the best performers. And we also want to understand what the top performers have in common so we can get inspiration for new creatives and audiences to target.

In this first screenshot, we see the “losers” – the unprofitable ad sets and creatives.

In this second screenshot, we see the winners – the ad sets and creatives with the highest ROAS.

 

Optimization Feedback

Using data to make decisions isn’t enough. We also need to use data to verify that the decisions we made delivered the outcomes we expected. We believe that no data should be blindly trusted, which is why we made it very easy to track the impact of your optimizations.

This media had very good performance, so the model recommended scaling it up. The area chart shows the spend trend, and the blue/red shading indicates whether profit increased or decreased during that time. As you can see, when spend increased, so did profit. And when spend dropped, so did profit. Essentially, if you see blue, keep doing what you’re doing; if you see red, adjust your budget in the opposite direction.

 

Best Practices

Here are some recommendations to get the most out of Data Speaks.

 

Optimize Channels Before Optimizing Budgets

In our YouTube example, the ROAS was 10 for one video and 2 for the other. The model will look at the average performance and make recommendations based on that. However, you want to work with homogeneous buckets that share common traits and have similar performance levels. When there’s a significant gap between the best and worst performers within a bucket, optimize the bucket first and give your model a couple of weeks to learn how the media performs after the optimization.

 

Isolate Changes to Help Your Model Learn

Data Speaks works by analyzing how variance in spend impacts a business outcome. If you’ve never changed your budgets, your model will be unable to understand the impact of each media. Likewise, if you scale up all your channels at once, the model won’t be able to identify what portion of the increase in conversions was caused by each media.

There are two ways to help the model learn as you adjust budgets. The first way is to do so sequentially. For example, you could start growing Facebook in January, Google in February, and so on. This is effective assuming each period is comparable to the rest. If you sell sunblock and your sales peak in June, scaling a channel in June wouldn’t be ideal because the model wouldn’t be able to tell as effectively what portion of the increase was due to channel spend as opposed to seasonality. Same goes for scaling a channel during a promotional period. Find a time that is as uneventful as possible.

The second way to adjust budgets is through incrementality tests, which you can easily do from our platform. When you adjust budgets for specific geographic regions, you can isolate the advertising effect because all other things are equal (dates, exposure to other marketing channels, etc.)

There are two different types of incrementality tests. Holdout tests, aka going dark, are when you shut off a channel to measure how much your conversions drop.

Then, there are scale-up experiments, where you increase spend to measure how much your conversions increase.

Data Speaks supports up to three lift experiments, so you could be simultaneously doing a Facebook holdout in Colorado and Florida, a Google 10% scale-up in Oregon and Wisconsin, and a 25% TikTok scale-up in New York and California.

 

Scale Gradually

Let’s suppose that the optimal daily spend for a media is $2,000. If you’ve been spending $1,000 and all of a sudden you increase it to $5,000, our model will show you that it’s now unprofitable, but it won’t be certain about the optimal spend. When scaling up budgets, we recommend doing so no more than 10% per week. When scaling down budgets, especially if Data Speaks shows very bad performance, it’s okay to cut up to 20% per week. Either way, check performance weekly and adjust as necessary. Maybe growing 10% per week made sense for a while, but now it’s best to do 5% per week. 

 

Maintain Consistent Naming Conventions

In the feature engineering phase, we collaborate with you to determine what belongs in each bucket. For example, how do we know what campaigns or ad sets should be TOFU, MOFU or BOFU? We define a set of rules, such as “if the campaign name contains ‘awareness’, then categorize it as TOFU.” It’s critical to make sure that your media buying team or agency maintains the same naming convention as they launch new campaigns.

 

Keep Us in the Loop

Our models measure the influence of each known factor on a given outcome. If there’s a new factor the model doesn’t know about, it can affect model performance. If something major happens (a whole channel was turned off, you ran out of inventory for a bestselling SKU, or you were featured on a morning TV show), let us know so we can factor it into the model.

 

Final Thoughts

By now, you can see how much work and complexity goes into building, testing and validating an attribution model. You can also see why it isn’t accessible for most companies. Even with a basic team—one data scientist, one data engineer, and one machine learning engineer—and an entry-level tech stack, doing this in-house would cost approximately $300,000 to $500,000 annually. And, it’d take at least a year to build version 1.0.

That’s why I founded Data Speaks. I didn’t like that this level of sophistication and performance was only available to the big guys, and I wanted to bring it to the rest of us. Better insights lead to better decisions and better results. We’re here to help you eliminate advertising waste, and invest in the right channels and campaigns.

Interested in learning more? Book your call or email us at hello@dataspeaks.ai