BigQuery is Google Cloud’s serverless data warehouse, and for most analytics teams it earns its place because of one feature: the free GA4 BigQuery export. Every GA4 events table β events_YYYYMMDD with the full event payload, params, items, user properties, and device breakdowns β lands in a BigQuery dataset daily, with optional intraday streaming. This guide covers what BigQuery is, how to enable the GA4 export, the events table schema, common SQL queries (active users, conversions, funnels), free-tier limits, and when BigQuery beats the GA4 UI or Looker Studio.
What Is BigQuery?
BigQuery is a fully managed, serverless cloud data warehouse from Google Cloud designed for petabyte-scale analytical SQL. There are no clusters to provision, no nodes to patch, and no storage engine to tune β you write ANSI SQL, BigQuery parallelizes the query across thousands of slots, and you pay only for the data scanned (or for a flat-rate slot reservation). For analytics teams, it has become the standard “single source of truth” where raw event streams from GA4, ad platforms, CRM, and product databases land and join.
Three architectural choices make BigQuery a good fit for web and product analytics:
- Columnar storage. Tables are stored in a column-oriented format (Capacitor), so a query that touches three columns of a 10TB table only scans those columns β billing follows.
- Computeβstorage separation. Storage scales cheaply ($0.02/GB/mo); query slots scale independently. You don’t pay for idle capacity.
- SQL with arrays and structs. The GA4 export uses nested
RECORDfields forevent_paramsanditems.UNNEST()flattens them on the fly without a separate ETL step.
The GA4 β BigQuery Export β Free and Powerful
Universal Analytics 360 charged enterprise customers for raw-data export to BigQuery. GA4 made it free for everyone, including the standard (non-360) tier. Two export modes ship out of the box:
- Daily export. Once per day Google writes the previous day’s full event payload into
events_YYYYMMDDtables. Free standard properties are capped at 1 million events per day; properties exceeding the cap stop receiving daily exports until the next billing period or a 360 upgrade. - Streaming export. Events arrive in
events_intraday_YYYYMMDDwithin seconds of being collected. Streaming is billed at $0.05 per GB of data ingested. The intraday table is replaced by the daily table once the day finalises.
The export is unsampled and unaggregated β every event row is preserved with its full parameter set, the engagement flag, the session ID derivation, and the device/geo enrichments GA4 adds at collection time. That is the headline value: the GA4 UI applies sampling thresholds, cardinality limits, and “(other)” bucketing for high-cardinality dimensions. The BigQuery export does none of that.
Setting Up GA4 BigQuery Export β Step-by-Step
Linking a GA4 property to BigQuery takes about ten minutes if you have the right Google Cloud roles. You need Editor on the GA4 property and BigQuery Admin (or Project Editor) on the destination Cloud project.
- Create a Google Cloud project at console.cloud.google.com and enable billing on it. The export itself is free, but BigQuery requires a billing account to be attached even when you stay within the free tier.
- Enable the BigQuery API for that project (APIs & Services β Library β BigQuery API β Enable).
- In GA4, open Admin β Product Links β BigQuery Links β Link.
- Pick the Cloud project, choose a region (US, EU, or any single region β once set it cannot be changed without recreating the link), and accept the data-location terms.
- Choose Daily and/or Streaming. Most teams enable both; daily is free and authoritative, streaming is billed but useful for real-time debugging.
- Optionally include advertising identifiers (IDFA / AAID) β only if your privacy posture allows.
- Click Submit. The first daily export arrives ~24 hours later. Streaming starts within a few hours.
Find the dataset in BigQuery under analytics_<property_id>. Tables are named events_YYYYMMDD for finalised days and events_intraday_YYYYMMDD for the current day. See the official BigQuery export setup docs for region pricing details.
The GA4 Events Table Schema
Every row in events_YYYYMMDD represents a single GA4 event. The schema is wide (~40 top-level fields) and uses repeated RECORD types for the things that vary per event. Knowing the key columns saves hours of guessing:
| Column | Type | Purpose |
|---|---|---|
event_name | STRING | page_view, session_start, purchase, custom names |
event_timestamp | INT64 (microseconds) | Event time in UTC; divide by 1e6 for seconds |
event_date | STRING (YYYYMMDD) | Used as a partition filter for cost control |
user_pseudo_id | STRING | GA4 first-party cookie ID; one per browser/device |
user_id | STRING | Cross-device login ID if you populate it |
event_params | REPEATED RECORD | Array of key/value pairs (page_location, ga_session_id, engagement_time_msecβ¦) |
items | REPEATED RECORD | E-commerce items (item_id, item_name, price, quantity) |
device | RECORD | category, operating_system, browser, language |
geo | RECORD | country, region, city, continent |
traffic_source | RECORD | name, medium, source for first-touch attribution |
ecommerce | RECORD | transaction_id, total_item_quantity, purchase_revenue |
Two columns trip up almost every newcomer. The session ID isn’t a top-level column β pull it from event_params with (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id'). And page_location sits inside event_params too; it isn’t a flat URL field on each row. The data layer values you push from GTM custom events also land inside event_params as additional key/value pairs.
Common GA4 BigQuery SQL Queries
The four queries below cover ~80% of day-to-day analyst work. Replace your_project.analytics_123456789 with your dataset and always include a partition filter on _TABLE_SUFFIX or event_date β without one, the query scans every events table you have ever exported.
1. Active users per day (last 28 days).
SELECT
event_date,
COUNT(DISTINCT user_pseudo_id) AS active_users
FROM `your_project.analytics_123456789.events_*`
WHERE _TABLE_SUFFIX BETWEEN
FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 28 DAY))
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
GROUP BY event_date
ORDER BY event_date;
2. Top 10 pages by views.
SELECT
(SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location') AS page,
COUNT(*) AS views
FROM `your_project.analytics_123456789.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260401' AND '20260428'
AND event_name = 'page_view'
GROUP BY page
ORDER BY views DESC
LIMIT 10;
3. Conversion events with revenue.
SELECT
event_date,
COUNT(*) AS purchases,
SUM(ecommerce.purchase_revenue) AS revenue
FROM `your_project.analytics_123456789.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260401' AND '20260428'
AND event_name = 'purchase'
GROUP BY event_date
ORDER BY event_date;
4. Funnel β view_item β add_to_cart β purchase.
SELECT
event_name,
COUNT(DISTINCT user_pseudo_id) AS users
FROM `your_project.analytics_123456789.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260401' AND '20260428'
AND event_name IN ('view_item','add_to_cart','purchase')
GROUP BY event_name
ORDER BY CASE event_name
WHEN 'view_item' THEN 1
WHEN 'add_to_cart' THEN 2
WHEN 'purchase' THEN 3
END;
For server-side events sent via the Measurement Protocol, those rows appear in the same events_* tables β there is no separate dataset. Filter on a custom event_param like {key:'data_source', value:{string_value:'server'}} if you need to split server-side traffic.
Pricing β Free Tier vs Paid
BigQuery has one of the most generous free tiers in cloud data warehousing. Most small and mid-size GA4 implementations stay within it indefinitely:
| Resource | Free Tier (Sandbox + Always-Free) | Beyond Free |
|---|---|---|
| Storage | 10 GB free per month | $0.02 per GB-month (active) Β· $0.01 (long-term, >90d unmodified) |
| Query analysis | 1 TB scanned free per month | $6.25 per TB scanned (on-demand) |
| Streaming inserts | Not in free tier | $0.05 per GB ingested (streaming export) |
| Loading + exports | Free (batch loads) | Free |
| BI Engine reservations | 1 GB free | $30 per GB-month reserved |
A typical GA4 export of 500K events/day stores ~5β10GB/month and runs ~50β100GB of analytical scans β well under the 1TB free query allowance. The cost spikes happen in two scenarios: streaming ingest on a high-traffic site (millions of events/day adds up at $0.05/GB), and unfiltered queries that scan every events_* table by accident. Always use _TABLE_SUFFIX BETWEEN ... and the BigQuery query validator’s “this query will process X bytes” estimate before running.
BigQuery vs Looker Studio vs GA4 UI β When to Use Which
The three tools sit at different points on the flexibility/skill-needed curve. The right answer depends on what question you are answering and who is asking:
| Dimension | GA4 UI | Looker Studio | BigQuery |
|---|---|---|---|
| Flexibility | Low β preset reports + Explorations | Medium β drag-and-drop blends | High β full SQL, joins, custom logic |
| Latency | ~24β48 h for finalised data | 12-h connector cache | Seconds (streaming) or daily |
| Cost | Free | Free | Free up to 1 TB/mo, then $6.25/TB |
| Skill needed | None β clicks only | Low β chart configuration | SQL fluency required |
| Sampling | Yes, on high-cardinality reports | Inherits GA4 sampling via connector | None β raw events |
| Cardinality limits | “(other)” bucket above 50K rows | Same as GA4 connector | None |
| Best for | Quick lookups, data stream debugging | Stakeholder dashboards | Attribution, retention, custom KPIs |
Most teams use all three. The GA4 UI is for ad-hoc questions and real-time event debugging. Looker Studio is for the weekly stakeholder dashboard. BigQuery is for the questions the UI cannot answer: deduplicated cross-platform funnels, custom multi-touch attribution, cohort retention curves, and joins with CRM or ad-spend data.
Limitations and Gotchas
BigQuery is a solid foundation, but the GA4 export ships with quirks every analyst eventually trips over:
- 1M events/day cap on free tier. Standard GA4 properties exporting more than 1M events/day stop receiving daily exports for the rest of the day. Streaming continues, but you lose the canonical daily table.
- Up to 72-hour late data. GA4 backfills late-arriving hits into the daily table for ~72h after event_date. Today’s metrics from yesterday may not match next-week’s metrics from yesterday.
- Numbers won’t match the GA4 UI. The UI applies different attribution windows, session timeouts, and bot filtering than the raw export. A 1β5% delta is normal; a 20%+ delta means you missed a filter (bot traffic, internal IPs, default channel grouping).
- Cardinality of custom dimensions. User-scoped custom dimensions are stored as
user_properties; event-scoped ones insideevent_params. Both can hit cardinality limits in BigQuery exports for free properties. - Region lock-in. The dataset region is set at link creation and immutable. Choose carefully β moving to a different region requires creating a new link and migrating historical data.
- Streaming-only first 24 h. Until the daily export lands, your only data is in the intraday table, which is replaced (not merged) when the daily table arrives. Production queries should always read from
events_*with a date filter that excludes today.
Frequently Asked Questions
Is BigQuery free?
BigQuery has a permanent free tier covering 10GB of storage and 1TB of query scans per month. The GA4 daily export is also free for any GA4 property up to 1M events/day. You only pay when you exceed those limits or enable the streaming export ($0.05/GB ingested).
What is BigQuery used for?
BigQuery is used for analytical SQL on large datasets β typically as a destination for raw event streams, ad-platform exports, and CRM data that need to be joined and aggregated for reporting, attribution, and ML. For web analytics specifically, it’s the canonical raw-data store for the GA4 export.
How do I export GA4 to BigQuery?
In GA4, go to Admin β BigQuery Links β Link β pick a Cloud project with billing enabled β choose region, daily and/or streaming β Submit. The first daily export arrives ~24h later. The full step-by-step is covered in the setup section above.
What is the GA4 BigQuery export schema?
Each row in events_YYYYMMDD is one GA4 event with ~40 top-level columns: event_name, event_timestamp, user_pseudo_id, user_id, plus repeated RECORD fields event_params and items, and nested RECORDs for device, geo, traffic_source, and ecommerce.
How long until BigQuery data is available?
Daily exports arrive ~24β48 hours after the GA4 event date and may receive late-arriving backfills for up to 72 hours. Streaming export rows appear in events_intraday_YYYYMMDD within seconds of being collected.
Can I write to BigQuery from GTM?
Not directly β the GA4 β BigQuery link is one-way (read-only from GTM’s perspective). To send custom server-side data, use Google Tag Manager Server-Side Container with a custom HTTP request, or push to a Cloud Function that writes to BigQuery via the streaming insert API.
How does BigQuery compare to Looker Studio?
Looker Studio is a free dashboard tool β it visualizes data but cannot transform it beyond simple calculated fields. BigQuery is the warehouse β it stores, transforms, and queries raw data. They are complementary: BigQuery as the source of truth, Looker Studio as the presentation layer.
Bottom Line
BigQuery is the analyst’s escape hatch from the GA4 UI. The free GA4 BigQuery export ships unsampled events into a serverless warehouse where SQL solves problems no preset report can β multi-touch attribution, custom funnels, cohort retention, and joins with ad spend or CRM. The free tier (10GB storage + 1TB queries/month) covers most small and mid-size sites. Use BigQuery when the GA4 UI runs out, layer Looker Studio on top for stakeholder dashboards, and treat the daily events_YYYYMMDD table as your authoritative log.
Related Terms
- Looker Studio β free dashboard tool that connects to BigQuery and GA4
- GA4 events β the unit of measurement that fills the events_YYYYMMDD table
- Data stream β the GA4 collection endpoint feeding the export
- Measurement Protocol β server-side hits that also land in BigQuery
- Data layer β source of custom event_params written to BigQuery
- Conversion β how to define and query revenue events in BigQuery
- First-party cookies β origin of the user_pseudo_id column