Connect Pigment with Google Cloud Storage

  • 26 August 2022

This article explains how to connect a Google Cloud Storage bucket to your Pigment Organization so that you can load any CSV file stored in that bucket into a Pigment block.

Overview

 

This process starts in Google Cloud Platform (GCP) with the creation of a service account. Next, you generate a Service Account Key and grant the service account access to the bucket containing the data you want to load into Pigment. Then, on the Pigment side, you create a Connection by adding the Key credentials and the bucket name. Once set up, the Google Cloud Storage connector lets you import either a fixed file or the most recent file.

 

Prerequisites

You must have a Google Cloud Storage bucket containing the data you want to load in Pigment.

You must be an owner of the Google Cloud Project containing the bucket you want to connect to.

 

Setting up access 

 

GCP Step 1 - Create a Service Account

 

Pigment uses a service account to authenticate to your GCP project. For security reasons, Pigment recommends creating a service account dedicated to Pigment.

 

Follow these steps to create a new Service Account.

 

 

You only need to give the service account a name (for example, Pigment).

You don’t need to bind any policy to the service account here (on GCP, access to buckets is configured directly on the Google Cloud Storage side).

 

GCP Step 2 - Generate a Service Account Key 

 

Open the service account you just created, go to the Keys section, click Add Key > Create new key, and select JSON. This downloads a JSON file, which you will need to configure the Connection on the Pigment side.
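
Before pasting the key into Pigment, it can help to check that the downloaded file is intact. The following is a minimal sketch, not part of Pigment: the function name check_service_account_key and the list of required fields are illustrative choices, based on the fields a Google service account JSON key normally contains.

```python
import json

# Fields normally present in a Google service account JSON key file.
# This list is an illustrative subset, not an official schema.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw_json: str) -> list[str]:
    """Return the problems found in the key; an empty list means it looks usable."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]
    # Report every required field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    # A service account key declares its type explicitly.
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    return problems
```

You could read the downloaded file with open(path).read() and pass its content to this function. An empty result only means the file has the expected shape; it does not prove the key is still active in GCP.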

 

 

GCP Step 3 - Configure bucket access

 

This step grants the Pigment service account access to the bucket containing the data you want to share with Pigment. The Pigment connector is compatible with buckets using either Uniform or Fine-grained access control.

Pigment recommends using a dedicated bucket for Pigment. Upload to this bucket only the data that is meant to be loaded by Pigment.
If you use a dedicated bucket, use Uniform access control to simplify the access configuration.

 

Follow these steps to add a principal to the bucket configuration.

Grant the Storage Object Viewer role to the service account you just created.
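
For reference, the grant described above corresponds to a single binding in the bucket's IAM policy. The sketch below only builds that binding as a Python dictionary so you can see its shape. The helper name viewer_binding and the example email address are hypothetical. The role ID roles/storage.objectViewer is the identifier of the Storage Object Viewer role.

```python
def viewer_binding(service_account_email: str) -> dict:
    """Build the IAM binding that grants Storage Object Viewer to one service account."""
    return {
        # Role ID of "Storage Object Viewer".
        "role": "roles/storage.objectViewer",
        # Service accounts are referenced as principals with the "serviceAccount:" prefix.
        "members": [f"serviceAccount:{service_account_email}"],
    }


# Hypothetical service account email, following the NAME@PROJECT_ID.iam.gserviceaccount.com pattern.
binding = viewer_binding("pigment@my-project.iam.gserviceaccount.com")
```

This does not call GCP. It only illustrates what the console records when you add the principal with that role.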

 

Establishing connection in Pigment 

 

Go to the integration page from the Settings tab.

Next, click + Add next to the Google Cloud Storage integration.

 

Fill out the form with the following information. You will need to establish a connection for each individual GCS bucket. 

 

  • Name: a name for your connection
  • Bucket name: the name of the bucket that this connection can access (for example, pigment-export)
  • Credentials: the content of the JSON file generated when you created the service account key (GCP Step 2)
  • Application access: the applications that will be allowed to use this connection

 

 

Use the connection in Pigment 

 

Once the GCS connection is configured, open an Application for which the connection is available, then open the “Import Data” interface for the object you want to import data into. For example, to import data into a Transaction list, open the list, select Import and Download, and then select Import.

Instead of “Upload file”, select the Integration option.  From here, select your GCS connection.

Select the Import mode you want.

 

Fixed file

This import mode imports one specific file, identified by its name.

Enter the file name and location (for example, folder1/folder2/file.csv) and click Import.

The file should load in Pigment within a few seconds! If this configuration is saved, it will only import the file with that name.

 

Most recent file

 

This import mode lets you define a path, and Pigment imports the most recent file based on a defined naming convention, which must include an ISO-8601 compatible date.

 

Naming convention in Google Cloud Storage 

When naming your files in Google Cloud Storage, follow a consistent naming convention that includes the date. Here are some example file names for rev.csv, with the ISO-8601 compatible date placed at the start.

 

ISO-8601 compatible date format    Example file name ({{date}}-rev.csv naming convention)
yyyy-MM-dd                         2022-09-30-rev.csv
yyyy-MM-ddThh:mm:ss                2022-09-30T00:00:00-rev.csv
yyyy-MM-ddThh:mm:ssZ               2022-09-30T00:00:00Z-rev.csv
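
To check that your own file names carry a date in one of these three formats, you can parse them locally. The following is a minimal sketch using Python's standard library. The function name parse_leading_date and the mapping of the formats above to strptime patterns are assumptions made for illustration, and Pigment's own parser may accept or reject other variants.

```python
from datetime import datetime
from typing import Optional

# strptime equivalents of the three formats listed above (assumed mapping).
DATE_FORMATS = ("%Y-%m-%d", "%Y-%m-%dT%H:%M:%S", "%Y-%m-%dT%H:%M:%SZ")


def parse_leading_date(file_name: str, suffix: str = "-rev.csv") -> Optional[datetime]:
    """Return the date at the start of file_name, or None if it does not fit the convention."""
    if not file_name.endswith(suffix):
        return None
    # Everything before the fixed suffix is expected to be the date.
    stamp = file_name[: len(file_name) - len(suffix)]
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(stamp, fmt)
        except ValueError:
            continue  # try the next format
    return None
```

For example, parse_leading_date("2022-09-30-rev.csv") returns the date 30 September 2022, while a name such as 30-09-2022-rev.csv returns None because the date is not in year-month-day order.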

 

When writing the file path in Pigment

 

As with Fixed file, first define the path to the location (for example, folder1/folder2/), then define the file naming convention, with the date included in the file name. The path is case sensitive. When entering it in Pigment, use {{date}} where the date is located.

It should look something like this: folder1/folder2/file-{{date}}.csv

 

Here are some examples of file names and the results depending on what path you entered.

If the bucket contains the following files:

  • /myfolderA/2022-09-15-salaries.csv
  • /myfolderA/2022-10-15-salaries.csv
  • /myfolderA/2022-10-16-revenue.csv
  • /myfolderB/2022-11-15-salaries.csv

Here are the results you would get depending on the file path:

  • path "myfolderA/{{date}}-salaries.csv" will import /myfolderA/2022-10-15-salaries.csv because this is the most recent file in myfolderA.
  • path "myfolderB/{{date}}-salaries.csv" will import /myfolderB/2022-11-15-salaries.csv because this is the most recent file in myfolderB.
  • path "/{{date}}-salaries.csv" will import nothing because the folder location is missing.
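
The selection behaviour in these examples can be reproduced with a short script, which is useful for testing a path before saving it in Pigment. This is a sketch under stated assumptions, not Pigment's actual implementation: the function most_recent, the date regular expression, and the choice to ignore a leading "/" on both the path and the file names are all illustrative.

```python
import re
from datetime import datetime
from typing import Iterable, Optional

# Matches yyyy-MM-dd, optionally followed by Thh:mm:ss and an optional Z (assumed pattern).
DATE_RE = r"(?P<date>\d{4}-\d{2}-\d{2}(?:T\d{2}:\d{2}:\d{2}Z?)?)"
DATE_FORMATS = ("%Y-%m-%d", "%Y-%m-%dT%H:%M:%S", "%Y-%m-%dT%H:%M:%SZ")


def _parse(stamp: str) -> Optional[datetime]:
    """Parse a date string in one of the supported formats, or return None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(stamp, fmt)
        except ValueError:
            pass
    return None


def most_recent(template: str, object_names: Iterable[str]) -> Optional[str]:
    """Return the name matching the template that carries the latest date, or None."""
    head, marker, tail = template.lstrip("/").partition("{{date}}")
    if not marker:
        raise ValueError("template must contain {{date}}")
    # Matching is case sensitive and must cover the whole name.
    pattern = re.compile(re.escape(head) + DATE_RE + re.escape(tail))
    best = None  # (date, name) of the latest match seen so far
    for name in object_names:
        match = pattern.fullmatch(name.lstrip("/"))
        if not match:
            continue
        when = _parse(match.group("date"))
        if when is not None and (best is None or when > best[0]):
            best = (when, name)
    return best[1] if best else None


files = [
    "/myfolderA/2022-09-15-salaries.csv",
    "/myfolderA/2022-10-15-salaries.csv",
    "/myfolderA/2022-10-16-revenue.csv",
    "/myfolderB/2022-11-15-salaries.csv",
]
print(most_recent("myfolderA/{{date}}-salaries.csv", files))
print(most_recent("/{{date}}-salaries.csv", files))
```

With the four files listed above, the first call selects /myfolderA/2022-10-15-salaries.csv and the second returns None, because no salaries file sits at the root of the bucket.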

 

