Fastly's Real-Time Log Streaming feature can send log files to BigQuery, Google's managed enterprise data warehouse.
NOTE: Fastly does not provide direct support for third-party services. See Fastly's Terms of Service for more information.
Prerequisites
Before adding BigQuery as a logging endpoint for Fastly services, you will need to:
- Register for a Google Cloud Platform (GCP) account.
- Create a service account on Google's website.
- Obtain the private_key and client_email from the JSON file associated with the service account.
- Enable the BigQuery API.
- Create a BigQuery dataset.
- Add a BigQuery table.
Creating a service account
BigQuery uses service accounts for third-party application authentication. To create a new service account, follow the instructions in the Google Cloud documentation. Keep the following in mind when creating the service account:
- The service account must be assigned the BigQuery Data Editor role to write to the table you use for Fastly logging. See BigQuery Roles for details about the default permissions assigned to the BigQuery Data Editor role.
- Set the key type to JSON when creating the service account's private key pair.
Obtaining the private key and client email
When you create the BigQuery service account, a JSON file automatically downloads to your computer. This file contains the credentials for your BigQuery service account. Open the file and make a note of the values of the private_key and client_email fields.
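If you want to pull those two values out of the file from a script, a minimal sketch like the one below works. The filename service-account-key.json is a placeholder for whatever name the downloaded key file actually has.

import json

# Read the downloaded service account key file and print the two values
# Fastly needs. "client_email" and "private_key" are standard fields in a
# Google Cloud service account JSON key.
with open("service-account-key.json") as f:
    creds = json.load(f)

print(creds["client_email"])  # goes in the Email field of the Fastly endpoint
print(creds["private_key"])   # goes in the Secret key field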
Enabling the BigQuery API
To send your Fastly logs to your BigQuery table, you'll need to enable the BigQuery API in the Google Cloud Platform API Manager.
Creating the BigQuery dataset
After you've enabled the BigQuery API, follow these instructions to create a BigQuery dataset:
- Log in to BigQuery.
- Click the arrow next to your account name on the sidebar and select Create new dataset. The Create Dataset window appears.
- In the Dataset ID field, enter a name for the dataset (e.g., fastly_bigquery).
- Click the OK button.
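If you prefer to create the dataset programmatically rather than through the web interface, a rough sketch using the google-cloud-bigquery Python client looks like this. The key filename, project ID, and location are placeholder assumptions; the dataset ID matches the example above.

from google.cloud import bigquery

# Authenticate with the service account key downloaded earlier.
client = bigquery.Client.from_service_account_json("service-account-key.json")

# Fully qualified dataset ID: <project ID>.<dataset ID>
dataset = bigquery.Dataset("my-gcp-project.fastly_bigquery")
dataset.location = "US"  # placeholder; choose the location that suits your project

client.create_dataset(dataset)  # raises if the dataset already exists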
Adding a BigQuery table
After you've created the BigQuery dataset, you'll need to add a BigQuery table. There are four ways of creating the schema for the table:
- Edit the schema using the BigQuery web interface.
- Edit the schema using the text field in the BigQuery web interface.
- Use an existing table.
- Set the table to automatically detect the schema.
NOTE: Setting the table to automatically detect the schema may give unpredictable results.
Follow these instructions to add a BigQuery table:
- On the BigQuery website, click the arrow next to the dataset name on the sidebar and select Create new table. The Create Table page appears.
- In the Source Data section, select Create empty table.
- In the Table name field, enter a name for the table (e.g., logs).
- In the Schema section of the BigQuery website, use the interface to add fields and complete the schema. See the example schema section for details.
- Click the Create Table button.
Adding BigQuery as a logging endpoint
Follow these instructions to add BigQuery as a logging endpoint:
- Review the information in our Setting Up Remote Log Streaming guide.
- Click the Google BigQuery Create endpoint button. The Create a BigQuery endpoint page appears.
- Fill out the Create a BigQuery endpoint fields as follows:
- In the Name field, enter a human-readable name for the endpoint.
- In the Log format field, enter the data to send to BigQuery. See the example format section for details.
- In the Email field, enter the client_email address associated with the BigQuery service account.
- In the Secret key field, enter the value of the private_key associated with your BigQuery service account.
- In the Project ID field, enter the ID of your Google Cloud Platform project.
- In the Dataset field, enter the name of your BigQuery dataset.
- In the Table field, enter the name of your BigQuery table.
- In the Template field, optionally enter an strftime-compatible string to use as the template suffix for your table.
- Click the Advanced options link on the Create a BigQuery endpoint page. The Advanced options appear.
- In the Placement area, select where the logging call should be placed in the generated VCL. Valid values are Format Version Default, None, and waf_debug (waf_debug_log). Selecting None creates a logging object that can only be used in custom VCL. See our guide on WAF logging for more information about waf_debug_log.
- Click the Create button to create the new logging endpoint.
- Click the Activate button to deploy your configuration changes.
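The same endpoint can also be created through the Fastly API instead of the web interface. The sketch below is illustrative only: the token, service ID, and version are placeholders, the log format is shortened, and the parameter names (user, secret_key, project_id, dataset, table) are assumptions you should confirm against the Fastly API reference for BigQuery logging.

import json
import requests

API_TOKEN = "YOUR_FASTLY_API_TOKEN"  # placeholder
SERVICE_ID = "YOUR_SERVICE_ID"       # placeholder
VERSION = 1                          # an editable (not yet activated) service version

# Reuse the credentials from the service account key file.
with open("service-account-key.json") as f:
    creds = json.load(f)

resp = requests.post(
    f"https://api.fastly.com/service/{SERVICE_ID}/version/{VERSION}/logging/bigquery",
    headers={"Fastly-Key": API_TOKEN},
    data={
        "name": "bigquery-logs",
        "format": '{"timestamp":"%{begin:%Y-%m-%dT%H:%M:%S}t"}',  # shortened; see the example format below
        "user": creds["client_email"],
        "secret_key": creds["private_key"],
        "project_id": "my-gcp-project",
        "dataset": "fastly_bigquery",
        "table": "logs",
    },
)
resp.raise_for_status()
print(resp.json())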
Example format
Data sent to BigQuery must be serialized as a JSON object, and every field in the JSON object must map to a string in your table's schema. The JSON can have nested data in it (e.g. the value of a key in your object can be another object). Here's an example format string for sending data to BigQuery:
{
"timestamp":"%{begin:%Y-%m-%dT%H:%M:%S}t",
"time_elapsed":%{time.elapsed.usec}V,
"is_tls":%{if(req.is_ssl, "true", "false")}V,
"client_ip":"%{req.http.Fastly-Client-IP}V",
"geo_city":"%{client.geo.city}V",
"geo_country_code":"%{client.geo.country_code}V",
"request":"%{req.method}V",
"host":"%{req.http.Fastly-Orig-Host}V",
"url":"%{json.escape(req.url)}V",
"request_referer":"%{json.escape(req.http.Referer)}V",
"request_user_agent":"%{json.escape(req.http.User-Agent)}V",
"request_accept_language":"%{json.escape(req.http.Accept-Language)}V",
"request_accept_charset":"%{json.escape(req.http.Accept-Charset)}V",
"cache_status":"%{regsub(fastly_info.state, "^(HIT-(SYNTH)|(HITPASS|HIT|MISS|PASS|ERROR|PIPE)).*", "\\2\\3") }V"
}
Example schema
The BigQuery schema for the example format shown above would look something like this:
timestamp:TIMESTAMP,time_elapsed:FLOAT,is_tls:BOOLEAN,client_ip:STRING,geo_city:STRING,geo_country_code:STRING,request:STRING,host:STRING,url:STRING,request_referer:STRING,request_user_agent:STRING,request_accept_language:STRING,request_accept_charset:STRING,cache_status:STRING
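If you create the table programmatically, the schema above translates to a google-cloud-bigquery call roughly like the sketch below. The key filename and project ID are placeholders; the dataset and table names are the example values used earlier in this guide.

from google.cloud import bigquery

client = bigquery.Client.from_service_account_json("service-account-key.json")

# The same fields and types as the example schema string above.
schema = [
    bigquery.SchemaField("timestamp", "TIMESTAMP"),
    bigquery.SchemaField("time_elapsed", "FLOAT"),
    bigquery.SchemaField("is_tls", "BOOLEAN"),
    bigquery.SchemaField("client_ip", "STRING"),
    bigquery.SchemaField("geo_city", "STRING"),
    bigquery.SchemaField("geo_country_code", "STRING"),
    bigquery.SchemaField("request", "STRING"),
    bigquery.SchemaField("host", "STRING"),
    bigquery.SchemaField("url", "STRING"),
    bigquery.SchemaField("request_referer", "STRING"),
    bigquery.SchemaField("request_user_agent", "STRING"),
    bigquery.SchemaField("request_accept_language", "STRING"),
    bigquery.SchemaField("request_accept_charset", "STRING"),
    bigquery.SchemaField("cache_status", "STRING"),
]

table = bigquery.Table("my-gcp-project.fastly_bigquery.logs", schema=schema)
client.create_table(table)  # raises if the table already exists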