Why and how you should connect your Google Analytics 4 Property to BigQuery
on 08.12.2020 by Eric Böhme, Thomas Symann
With the announcement of Google Analytics 4 (GA4), Google takes the next step for its analytics platform. This step involves some of the biggest feature changes in the history of Google Analytics. The well-known data model has been changed and reduced to a single hit type (“events”) that can be sent to GA4. New features for analysing data collected from web and apps natively inside one property also open up new possibilities.
Google’s announcement of a native connector for sending data to BigQuery was one of the most exciting pieces of news. This feature, formerly exclusive to paying GA360 customers, is now available to all GA4 users.
Why send your Google Analytics 4 data to BigQuery?
Access to hit level data
The free version of Universal Analytics (the third generation of Google Analytics) did not provide access to hit level data out of the box, which made analysis on hit level hard. More specific questions in particular, e.g. questions regarding data quality, were hard to answer or debug without access to hit level data.
Unsampled data in any case
Sampling within Google Analytics has always been a problem, especially when a large data set contained many different values for one dimension. Within BigQuery, no sampling takes place.
Not bound to event parameter and custom dimension limits of GA4
Data that contains more event parameters than the interface can handle is still sent directly to BigQuery and is therefore available there, even though it is missing in the interface. Respecting Google’s limits is always recommended, but here we get a small additional safety layer against data loss.
A wide range of possible analysis cases
With hit level data available, many use cases become possible that previously required a 360 license. Most of the analyses that Google provides, from descriptive analysis up to forecasting, are based mainly on your historical GA data. With this data in your hands, you can theoretically do all of these analyses yourself.
Usage of the cloud infrastructure
Google Cloud provides many tools and functionalities. Since BigQuery is part of Google Cloud, you can natively integrate other Google Cloud tools. One example would be “Dataprep”, which helps you to clean and prepare your data for analysis.
Source: https://cloud.google.com/dataprep
Very cheap compared to an enterprise licence
Connecting GA4 to BigQuery does not cost anything, but using BigQuery is not free. You pay for storing and processing the data. Google offers different pricing models, which are very reasonably priced, especially compared to the enterprise version. Storage costs $0.02 per GB per month, and analysis costs $5 per TB processed in the EU or US region. See the BigQuery Pricing Page for all the details. There is also a free tier of 10 GB of storage and 1 TB of queries per month.
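To get a feeling for what these prices mean in practice, here is a minimal sketch in Python that estimates a monthly bill from the figures quoted above. The function name and the example workload are our own illustration and not part of any Google tool, and the constants are simply the prices from this article, so please check the pricing page for current values.

```python
# Prices as quoted in this article (EU/US region); they may have changed since.
STORAGE_USD_PER_GB_MONTH = 0.02
QUERY_USD_PER_TB = 5.0
FREE_STORAGE_GB = 10.0   # free tier: storage per month
FREE_QUERY_TB = 1.0      # free tier: query volume per month


def estimate_monthly_cost(stored_gb: float, queried_tb: float) -> float:
    """Estimate the monthly BigQuery cost in USD after deducting the free tier."""
    billable_gb = max(stored_gb - FREE_STORAGE_GB, 0.0)
    billable_tb = max(queried_tb - FREE_QUERY_TB, 0.0)
    cost = billable_gb * STORAGE_USD_PER_GB_MONTH + billable_tb * QUERY_USD_PER_TB
    return round(cost, 2)


if __name__ == "__main__":
    # Hypothetical workload: 110 GB stored and 3 TB queried in one month.
    print(estimate_monthly_cost(stored_gb=110, queried_tb=3))
```

In this hypothetical case, 100 GB of storage and 2 TB of queries are billable. A property that stays below 10 GB of storage and 1 TB of queries would, under these assumptions, not generate any costs at all.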
How to connect data to BigQuery
Considering the benefits of using BigQuery, one question remains: how to set up the connection. Luckily, Google makes it very easy to connect Google Analytics 4 with BigQuery.
1. Create a Google Analytics 4 Property and add all relevant iOS App, Android App or web data streams
2. Go to your Admin Settings and click on “BigQuery Linking”
If you have not created a link yet, this window will be empty, and you can click on “Link” to create a connection.
3. Create a link with BigQuery
As a first step, you need to choose the BigQuery Project. If you have already created one, you can choose it from the list that opens after clicking on “Choose a BigQuery project”.
If not, you will see an empty list and need to create a BigQuery Project.
Create a Cloud Project (You can skip this part if you already have a project that you want to use)
Go to https://console.cloud.google.com and create a new or select an existing project. For this guide, we will create a new project.
Select a project name, an organisation and click on “Create”.
After creation, select the project you just created.
In the left navigation you will find “APIs & Services”, where you select “Library”.
Here, search for the BigQuery API and open it to check whether the API is enabled. Sandbox mode is enabled automatically and can be used without any costs. Be aware of its additional limits!
Once you have checked this, you can go back to Google Analytics 4.
4. Create a Link to your Cloud Project
Click once again on “Choose a BigQuery project”, select the one you just created and click on “Confirm”.
Choose your location and click “Next”.
Here you can select the data centre location where your BigQuery data will be stored. We recommend using the EU region since it has the lowest prices. Because the data will then only be stored in the European Union, choosing this location can also help you to comply with European privacy laws.
5. Select data streams and frequency of exports
You can choose to export only some of your data streams to BigQuery. One use case for that could be a dedicated data stream for a test environment whose data you do not want to send to BigQuery.
Regarding the frequency, you can choose how often the data should be exported to BigQuery. “Daily” is included in the free tier and sandbox mode. “Streaming” gives you access to near real time data in BigQuery, but it comes with a price tag of $0.01 per 200 MB of streamed data.
Select the frequency and click “Next”.
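If you are unsure whether “Streaming” is worth it, a quick calculation helps. The following Python sketch only applies the price quoted above to a data volume. The function name and the example volume are hypothetical, and the actual size of your streamed data depends on your events and parameters.

```python
STREAMING_USD_PER_200_MB = 0.01  # streaming price as quoted in this article


def estimate_streaming_cost(streamed_mb: float) -> float:
    """Estimate the streaming export cost in USD for a given volume in MB."""
    return round(streamed_mb / 200.0 * STREAMING_USD_PER_200_MB, 2)


if __name__ == "__main__":
    # Hypothetical volume: 20,000 MB streamed in one month.
    print(estimate_streaming_cost(20_000))
```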
6. Review and submit your connection
As a last step, you just need to review and submit the connection. You should see a “link created” success message.
7. Go to BigQuery
After one day (since we only chose the daily export), the data has arrived in BigQuery and we can look at the new schema.
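A first look at the exported data could be a simple count of events per event name. The sketch below only builds the table name and the SQL text in Python, which you can then paste into the BigQuery console. It assumes that the export writes one table per day named events_YYYYMMDD into a dataset named analytics_ followed by your property ID, which is not shown in this article, so please verify the names in your own project. The project ID and property ID used here are placeholders.

```python
from datetime import date


def daily_events_table(project_id: str, property_id: str, day: date) -> str:
    """Build the full table name of one daily export (naming is an assumption)."""
    return f"{project_id}.analytics_{property_id}.events_{day:%Y%m%d}"


def count_events_sql(table: str) -> str:
    """Return a SQL query that counts the events per event name in one table."""
    return (
        "SELECT event_name, COUNT(*) AS events\n"
        f"FROM `{table}`\n"
        "GROUP BY event_name\n"
        "ORDER BY events DESC"
    )


if __name__ == "__main__":
    # Placeholder project and property IDs; replace them with your own.
    table = daily_events_table("my-project", "123456789", date(2020, 12, 8))
    print(count_events_sql(table))
```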
The new data model in BigQuery
As mentioned at the beginning of this article, Google changed the data model quite significantly from a session-focused model to an event-based model. This change is also visible in BigQuery. While in Universal Analytics every row was one session, in GA4 each row represents one event.
Each hit contains an identifier per user (field: “user_pseudo_id”), the ID of the session (field: “ga_session_id”) and even the session number (field: “ga_session_number”). So, despite the change, all session-based analysis is still possible.
Figure 3 GA4 – one event per row
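To illustrate how sessions can be rebuilt from event rows, here is a small Python sketch that groups a few hand-written sample events by user and session. It assumes that “ga_session_id” is stored as an integer inside a nested “event_params” list of key/value pairs, as we understand the export schema. The sample rows and function names are made up for illustration.

```python
from collections import defaultdict


def get_int_param(event: dict, key: str):
    """Return the int_value of the named event parameter, or None if absent."""
    for param in event.get("event_params", []):
        if param["key"] == key:
            return param["value"].get("int_value")
    return None


def group_events_by_session(events: list) -> dict:
    """Map (user_pseudo_id, ga_session_id) to the event names of that session."""
    sessions = defaultdict(list)
    for event in events:
        key = (event["user_pseudo_id"], get_int_param(event, "ga_session_id"))
        sessions[key].append(event["event_name"])
    return dict(sessions)


def make_event(user: str, session_id: int, name: str) -> dict:
    """Build one made-up sample row in the assumed export layout."""
    return {
        "user_pseudo_id": user,
        "event_name": name,
        "event_params": [{"key": "ga_session_id", "value": {"int_value": session_id}}],
    }


sample_events = [
    make_event("u1", 111, "session_start"),
    make_event("u1", 111, "page_view"),
    make_event("u2", 222, "page_view"),
]

if __name__ == "__main__":
    for session, names in group_events_by_session(sample_events).items():
        print(session, names)
```

The same grouping can of course be done directly in SQL inside BigQuery. The Python version is only meant to show that the user and session identifiers on each row are enough to reassemble sessions.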
Conclusion
By bringing a native BigQuery connection to users without an enterprise plan, Google opens up many exciting possibilities for analytics use cases. There are many reasons to go for that option: you are not bound to all GA4 limits, you have unsampled data at all times, and you can use the cloud infrastructure with all its powerful tools at a small price point.
For users, this is a clear step forward in terms of flexibility and data usage, and Google creates revenue streams from clients, even if they would never move to a paid licence.
Read our last post about how to use the dual strategy to transition smoothly to the new GA4 in your company.
How can FELD M support?
As an agency specialized in analytics, we are experts in using Google Analytics (and other web analytics tools). Through our daily business with our customers and the analytics scene, we are always up to date on what is new with GA4. In a long-term dual strategy, i.e. the transition from GA Universal Analytics to GA4, we can help to develop the new connections, design improved tracking and gain more insights from the data.
You can find out more about our Digital Analytics Services here: https://www.feld-m.de/service/digital-analytics/
(c) Image by GraphicsSC from Pixabay