Read time: 9 minutes
You might run dozens of Google AdWords campaigns, advertise on Facebook, promote a list of Facebook and Instagram Pages, run SEO campaigns, send email campaigns AND track your offline sales with a CRM.
Most likely you want to track KPIs, and here’s where the pain starts.
Reports are scattered across multiple platforms and dashboards. And you need to see correlations between metrics, channels and results: namely marketing investment vs leads & sales.
In this 6-step technical guide, we map out how we use Google Data Studio and Google BigQuery to build our clients' real-time dashboards that link advertising and promotional campaigns right through to offline sales, giving executives a clear view of cross-channel digital marketing ROI and effectiveness.
bmID-topic1
There are stacks of great reporting tools out there for marketers to bring multiple digital channels into one report.
But (and it's a big but) when it's time to get serious about designing a marketing dashboard the company directors can confidently use to make serious business decisions, you need to bring in sales data. This is where you'll find out-of-the-box solutions fall short on the connectors and data points.
Often you won’t find the straw that breaks the camel's back until you are neck deep and already heavily invested. Then that sinking feeling enters when you realise you paddled up the wrong software creek.
Most tools can give you the typical metrics in spades, but reporting on enquiries and cost per enquiry will only get you so far. To paint the full picture of ROI you inevitably need to tap into sales CRM data and start bringing real dollar values into the equation so the bosses will sit up and take notice.
Part of the problem stems from the CRM system that captures the sale. There are some great small business CRMs available, such as Active Campaign, but their reporting is *limited*, to say it nicely.
Coupled with that, the history of moving a user through a sales pipeline is not captured, so if you want to report on how many people were in various stages of your sales pipeline last month, and what marketing channels drove those leads, you're out of luck.
This is where you need to think BIG as in Google BigQuery, but we’re getting ahead of ourselves.
We’re about to go start to finish on bringing all of your data sources together under one roof and wrapping a beautiful interactive dashboard layer over the top to deliver reporting nirvana that your bosses will love.
This article describes how to set up an aggregated report using well-known (mostly free) tools such as Google Data Studio, Google BigQuery, Stitch and Google Cloud Functions.
But before you jump in, take a hard look at yourself in the mirror - then one more - and ask yourself if you've had your Weetbix this morning, because this isn't your mama's point-n-click dashboard.
bmID-topic2
Here's an example dashboard we built for a client in the childcare space, using the techniques we're outlining in this article:
Ok now that you know what we’re building, let’s get into it.
bmID-topic3
The core idea of this reporting setup is to bring metrics from various data sources (e.g. Google Analytics, AdWords, Facebook, CRM) together in one place where your visualisation tool (Google Data Studio, Tableau, etc.) can pull the combined data.
“Bringing data together” here means continuous replication of your metrics / KPIs to the core storage (or data warehouse). Data replication is especially useful for CRM reporting, giving you insight into deals moving from stage to stage in the sales pipeline.
BigQuery is our go-to data warehouse choice. It’s designed to store and process big data and is relatively easy to integrate with. Plus, it has transparent quota limits / pricing.
Here's where you'd need a data pipeline: a set of tools or services which pull from the original data sources (Analytics, AdWords, etc.) and store the responses in BigQuery. Replication runs periodically, and all new or updated data is stored in BigQuery.
Once data is replicated, it’s joined and becomes available for front-end reporting.
We need to join data because our visualisation tool of choice, Google Data Studio, has a significant limitation - it cannot combine metrics from different data sources and report on them as a whole.
By joining data within the warehouse we provide the dashboard with the entire dataset. This enables powerful reporting options, e.g. filtering by date, channel, product, location, pipeline, etc.
bmID-topic4
This is the cornerstone of the whole project. Develop a wireframe of your report and arrange the metrics you'll need. Add a date range picker and filters (such as office location or product name / line).
Here’s a basic version of how we started with a mockup of the ideal dashboard:
bmID-topic5
Once the wireframe is done it's time to map out your metrics and attributes list. Attributes are needed to split your data by product, location or whatever filter you'd like to implement in your dashboard. In a nutshell, an attribute is the dimension that slices your metrics.
The easiest way to plan data sources is to build a table listing all the metrics and attributes you're going to pull.
Here’s an example:
Define what conversions you’re going to report on.
If you’re using Google Analytics goals, make sure to group them by channel. That way you’ll be able to track goal completions coming from different sources (for example, Adwords vs organic search).
When replicating conversion metrics, keep in mind that AdWords conversions and Analytics goals are 2 different animals - you can't simply sum them up.
Define what particular metrics you’re going to report on. Those might be:
bmID-topic6
All of your data will be replicated into a single destination database - Google BigQuery.
Most of your data will be replicated with an ETL tool called Stitch (stitchdata.com). Stitch takes care of exporting your data from a data source (for example, Google Analytics) and uploading it to the selected destination (BigQuery).
All you need to do is tell Stitch what data to take and from where, point it at the destination, and authorise it with both the sources and the destination.
Each connection between a source and the destination made with Stitch is called an “integration”. Stitch offers a bunch of ready-to-go integrations and covers most of this project's needs.
For some data sources, like Search Console or Facebook Insights, Stitch currently doesn't offer off-the-shelf integrations. We replicate that data using direct API calls.
bmID-topic7
Select your Google Cloud project name (remember, the one with billing enabled), then save your settings. Stitch will run a quick test to ensure it can access your data and, if all is fine, will take you to the main page.
3. Select metrics you’re going to sync. More on that in notes below.
4. Save your integration
You can’t change Analytics metrics after you’ve saved your pipeline.
For AdWords integration, you need to pick the AdWords report or table, and metrics you need within the chosen report.
Keep your list of metrics short - if you don't need to go down to the level of each ad, select just the Campaign or Ad group report.
Stitch automatically disables incompatible metrics:
As soon as the integration is saved, Stitch attempts to run a sync job and replicate your data to the destination.
Give it 10-30 minutes to export the data and then check your BigQuery project. You'll see datasets created by Stitch (one per integration) where it stores your data.
bmID-topic8
CRM (in this case Active Campaign) data is a different animal. Stitch doesn't have a native Active Campaign connector, but it does support webhooks - and so does Active Campaign.
To capture deal and contact updates in separate tables, set up 2 webhook integrations in Stitch and copy the URL provided for each integration.
Then head back to Active Campaign and create 2 webhooks: one on deal add/update and one on contact add/update. Paste Stitch's URLs into each webhook's URL field.
Voila - each time Active Campaign adds or updates a deal or contact, you get another record in your BigQuery tables.
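Because every webhook call lands as a new row, the deal table builds up a full history of stage changes - exactly what's missing from the CRM's own reporting. Here's a sketch of counting current deals per pipeline stage; the table and column names (activecampaign.deals, deal_id, stage) are assumptions, so match them to your actual webhook payload:

c1st-code-start-sql
-- Sketch only: table and column names are assumptions, adjust to your schema.
SELECT stage, COUNT(*) AS deals
FROM (
  SELECT
    deal_id,
    stage,
    -- keep only the latest webhook record per deal
    ROW_NUMBER() OVER (PARTITION BY deal_id ORDER BY _sdc_sequence DESC) AS rn
  FROM activecampaign.deals
)
WHERE rn = 1
GROUP BY stage
c1st-code-end-sql

Run the same query with a date filter on the webhook timestamp and you can reconstruct what the pipeline looked like last month - the reporting gap we called out earlier.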
bmID-topic9
This section will require some software development skills (Node.js).
Some of your data sources might not be currently supported by Stitch. Among these are Search Console and Facebook Insights.
So you'd need to query the platforms directly for the data - you can do that with API calls. The general principles:
These steps should be performed with a Cloud Function on a recurring basis.
When the call is set up and you're able to request data, you can wrap it in a Cloud Function to automate the request and response handling. The Cloud Function lives within the same Google Cloud project as BigQuery and does the following:
The Cloud Function needs to run periodically, just like your Stitch replication jobs do.
While Stitch takes care of scheduling its replication jobs, with Cloud Functions you need to turn to a cron service to kick off your replication. You'll have to use one of the third-party cron job services (Google Cloud doesn't have a native scheduled trigger).
Another option is to deploy a workaround described here.
A good practice is to query 2-3 days' worth of data each time - if your cron job fails one day, data won't be lost. Just make sure to store a replication timestamp, similar to Stitch's _sdc_sequence column, so you can get the latest version of each record.
bmID-topic10
Most likely you'll want to report on the historical data that existed in your data sources before you started replicating to BigQuery.
bmID-topic11
Here's how your BigQuery data structure might look: a number of datasets and tables.
As soon as all data sources are replicated to your BigQuery warehouse, you need to bring the data together.
This has to be done because Google Data Studio doesn't have an option to combine data from several data sources (different BigQuery tables, Google Sheets, or platforms like Analytics and AdWords).
The easiest way to combine the data is to create a BigQuery view - a dynamic table that reflects updates to the source tables as soon as they happen.
The view is built with a SQL query. There are several possible query structures, but to keep things simple you might want to combine your tables with the UNION ALL clause. Keep in mind that the view should have a structure digestible by Data Studio, i.e. a flat table with no nested records - basically first normal form.
Here’s sample query for your view:
c1st-code-start-sql
SELECT 'adwords' AS data_source, date, clicks, impressions, cost, campaign, NULL AS sessions, NULL AS landing_page FROM adwords.AD_PERFORMANCE_REPORT
UNION ALL
SELECT 'facebook_ads' AS data_source, date, clicks, impressions, cost, campaign, NULL AS sessions, NULL AS landing_page FROM facebook.ads_insights
UNION ALL
SELECT 'analytics' AS data_source, date, NULL AS clicks, NULL AS impressions, NULL AS cost, NULL AS campaign, sessions, landing_page FROM analytics.report
UNION ALL
…
c1st-code-end-sql
Note: Facebook returns nested records - use the UNNEST operator to flatten them into regular rows.
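For instance, assuming the Facebook table stores conversions in a repeated actions field (the field names here are illustrative - check your actual table schema), flattening could look like:

c1st-code-start-sql
-- Illustrative only: field names follow a typical Facebook Insights payload.
SELECT
  t.date_start,
  t.campaign_name,
  a.action_type,
  a.value
FROM facebook.ads_insights AS t,
UNNEST(t.actions) AS a
c1st-code-end-sql

Each element of the repeated field becomes its own row, which keeps the view flat the way Data Studio needs it.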
To add an explicit split by, say, product, you can introduce an additional column that derives the product name from the campaign or landing page:
c1st-code-start-sql
CASE
  WHEN REGEXP_CONTAINS(campaign, r'(?i).*product_a.*') OR REGEXP_CONTAINS(landing_page, r'(?i).*product_a.*') THEN 'Product A'
  WHEN REGEXP_CONTAINS(campaign, r'(?i).*product_b.*') OR REGEXP_CONTAINS(landing_page, r'(?i).*product_b.*') THEN 'Product B'
  ELSE 'Brand'
END AS Product
c1st-code-end-sql
And the whole view query might look like this:
c1st-code-start-sql
WITH data AS (
  SELECT 'adwords' AS data_source, date, clicks, impressions, cost, campaign, NULL AS sessions, NULL AS landing_page FROM adwords_data
  UNION ALL
  SELECT 'facebook_ads' AS data_source, date, clicks, impressions, cost, campaign, NULL AS sessions, NULL AS landing_page FROM facebook_data
  UNION ALL
  SELECT 'analytics' AS data_source, date, NULL AS clicks, NULL AS impressions, NULL AS cost, NULL AS campaign, sessions, landing_page FROM analytics_data
  UNION ALL
  …
)
SELECT
  *,
  CASE
    WHEN REGEXP_CONTAINS(campaign, r'(?i).*product_a.*') OR REGEXP_CONTAINS(landing_page, r'(?i).*product_a.*') THEN 'Product A'
    WHEN REGEXP_CONTAINS(campaign, r'(?i).*product_b.*') OR REGEXP_CONTAINS(landing_page, r'(?i).*product_b.*') THEN 'Product B'
    ELSE 'Brand'
  END AS Product
FROM data
c1st-code-end-sql
Since Stitch replicates your data in “append-only” mode, you will most likely get 2, 3 or even more rows per date in your BigQuery table. For example, AdWords data:
In this example, Stitch had run 2 replication jobs: May 3 and May 4. Records dated May 3 (rows 1, 3, 5, 7) are the first version of each record, records dated May 4 (rows 2, 4, 6, 8) - are the second version of each record.
What you need to do is grab the latest version of each row in your query using the _sdc_sequence field - a column Stitch adds automatically to store the replication job timestamp. Find a further explanation of this method here.
Append-only replication is helpful: it doesn't overwrite your older records, giving you the entire picture as at any selected day. The downside: you need slightly more complicated queries to get the current data.
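As an illustration, one way to keep only the latest version of each AdWords row inside the view is a ROW_NUMBER() window ordered by _sdc_sequence. The partition columns below are an assumption - partition by whatever uniquely identifies a record in your report:

c1st-code-start-sql
SELECT * EXCEPT(rn)
FROM (
  SELECT
    *,
    ROW_NUMBER() OVER (
      PARTITION BY date, campaign_id   -- assumed record key, adjust to your report
      ORDER BY _sdc_sequence DESC      -- the latest replication job wins
    ) AS rn
  FROM adwords.AD_PERFORMANCE_REPORT
)
WHERE rn = 1
c1st-code-end-sql

Wrap each source table in a subquery like this before the UNION ALL, and the view will always serve a single, current row per record.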
Save your query as a view to make it accessible to Google Data Studio. To keep things organised, it might be a good idea to keep your view in a separate dataset.
Side note: the tables and queries above are just an example. You can add as many metrics as you need - just keep the list as minimal as possible for the sake of easier support.
bmID-topic12
Now that your view is set up and working, it's time to get the front end done. Dashboard development splits into 2 major steps:
To connect the view, select BigQuery as a data source and then pick your project, dataset and view. Click the Connect button.
We won’t dive deep into Data Studio UI secrets - a lot of articles covering this subject are already out there.
bmID-topic13
When your report is done and running, it's a good idea to set up some logging and error notifications.
You’d want to report on at least the following errors:
Stitch sends email notifications if something is wrong with your replication and attempts to re-run the replication job.
To handle API replication errors, use your cron job service's email notification feature to alert you when a job fails. For a more sophisticated option, set up a Google Cloud notification on function failure.
The final touch: set up an email filter to give these emails higher priority, so you stay on top of your system's health.
bmID-topic14
If you’d like to create an at-a-glance cross-channel digital marketing & sales dashboard for your executive leadership team, the Content First team can help.
Simply contact us to get the conversation going.