Connect Pigment with Amazon Redshift

  • 17 November 2022
  • 0 replies
  • 571 views
Connect Pigment with Amazon Redshift
Userlevel 3
Badge +3
  • Community Manager
  • 3 replies

This article explains how to connect Amazon Redshift with Pigment to retrieve any query result into a Pigment block.
 

Table of contents

 

As with all other native Pigment integrations, you’ll first need to configure a Connection in the Integration library page. Once configured, this connection can be used by any Application that is authorized as a data source. These authorized Apps are defined during the ‘Import Data’ configuration step. Now, let’s see how to configure and use this native integration!

 

Overview

 

This process starts in AWS Redshift - you will need to create (or assign) two dedicated users for the purposes of the connection with Pigment: one for Redshift (the Database User) and one for AWS (the IAM User).

The Database User is used for accessing the tables where your data is stored.

 

Meanwhile, the IAM User is for the purposes of generating an Access Key and Secret Key to authenticate the connection from the database to Pigment.

Then, in Pigment, you will create a Connection and add the Key credentials. Once set up, you can input a SQL query and the Amazon Redshift connector will load the result into your block. 

 

Prerequisites

You must have the permissions to create and manage access for the Redshift Database User (e.g. database owner / superuser).

You must have the permissions to create and manage the permissions for the AWS IAM User (e.g. administrator).

 

Preliminary steps on Amazon Redshift

 

AWS Step 1: Create a Redshift Database User and grant the user access to the specific tables

 

First, create a Redshift Database User in the cluster and database containing the data you would like to load in Pigment. We recommend setting an easily identifiable user name (e.g. pigment_db_user) so that you can identify the purpose of this user.  

You can use the Query Editor to do so. Use the SQL Command below to create the user (refer to the docs for guidance if needed) - make sure to replace the {placeholders}  with your details:

 

CREATE USER {pigment_db_user} WITH PASSWORD '{user_password}'

 

This Redshift Database User password will not be used by Pigment (because we will use the AWS IAM User credentials), hence feel free to use a randomly generated password, and to disregard it later. 

Then, grant the user SELECT permissions on the specific tables containing the data you would like to import into Pigment.

You can use the SQL statement below to do so (refer to the docs for guidance if needed) - make sure to replace the {placeholders}  with your details:

 

GRANT SELECT ON TABLE {schema}.{table_name} TO {pigment_db_user}

 

Note: If the Database User does not have access to the required tables, and you write a query to select from the restricted table, Pigment will throw an error message during the import. The error will simply state that there is an Unexpected Error: query execution unsuccessful coming from Redshift. We recommend that if you receive this error, as a first step, please check that the Database User has been granted access to the required tables in Redshift. 

 

AWS Step 2: Create an IAM user and set up the minimal set of required permissions

 

Next, you need to create a dedicated IAM User in your AWS Account. We recommend setting an easily identifiable user name (e.g. pigment_aws_user) so that you can identify the purpose of this user. 

 

To authenticate the connection to your Amazon Redshift cluster, Pigment uses Key-based Access Control.  This IAM User will be used for generating an Access Key and a Secret Key to configure the connection (make sure to store these keys in a safe place). Thus, this user only needs the Access key - Programmatic access AWS credential type - they don’t require a Console password (unless you intend to sign in as this user). Refer to the AWS docs for an overview on AWS credentials.

 

 

You will need create a policy containing the minimal set of permissions required for the Pigment connection, and assign it to the user. The user will need permissions for both Redshift and Redshift Data API Services. 

 

Redshift Service permissions required:

GetClusterCredentials

  • You can assign the specific resources that this user has access to by assigning the specific db name in the region and cluster
  • You can assign the Database User you created in Step 1 as the db user 

Redshift Data API permissions required:

ExecuteStatement

ListDatabases

GetStatementResult

DescribeStatement​

You can assign the specific resources that this user has access to by assigning the specific region and cluster​​​​​​

 

For your convenience, you can copy and paste the json below as the minimal set of permissions required to configure the Pigment connection (make sure to replace the {placeholders} with your account details.

 

{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "PigmentGetRedshiftClusterCredentials",
"Effect": "Allow",
"Action": "redshift:GetClusterCredentials",
"Resource": [
"arn:aws:redshift:{region}:{AWS Account ID}:dbname:{cluster name}/{database name}",
"arn:aws:redshift:{region}:{AWS Account ID}:dbuser:{cluster name}/{database user name}"
]
},
{
"Sid": "PigmentExecuteStatementsOnRedshiftCluster",
"Effect": "Allow",
"Action": [
"redshift-data:ExecuteStatement",
"redshift-data:ListDatabases"
],
"Resource": "arn:aws:redshift:{region}:{AWS Account ID}:cluster:{cluster name}"
},
{
"Sid": "PigmentGetStatementResultOnRedshiftCluster",
"Effect": "Allow",
"Action": [
"redshift-data:GetStatementResult",
"redshift-data:DescribeStatement"
],
"Resource": "*"
}
]
}

 

After creating the minimal set of permissions and assigning it to the IAM User you have just created, create an Access Key and Secret Key for the user. (For guidance on managing Access Keys, please refer to the Redshift docs.)

 

Recommended: Never use your AWS account (root) credentials to create the connection. Always create a separate IAM user and provide that user's access key ID and secret access key to configure your connection. 

 

Limitation:

The maximum query result size of a given query is 100MB. This limitation comes from Amazon Redshift Data API (source).
If you are reaching this limit we recommend you to filter the adjust the query to only select the necessay fields or split the load into multiple query.

As an alternative solution you can also explore loading the data through the S3 connector.

 

Setting up the Amazon Redshift Connector in Pigment

 

Within Pigment, navigate to the Integration page from your Workspace homepage, and click on + Add on the Amazon Redshift integration.

 

Important Note 

You need to be a Pigment Workspace Admin to be allowed to configure the connector.

 

Click on Setup a Connection and then fill in the following 4 fields: 

  • Name: give a name to your connection (ex: “Redshift connection”)
  • Application Access: List the application which will be allowed to use this connection
  • Access key ID: input the Access Key that you generated for the IAM User in AWS Step 2
  • Secret access key: input the Secret Access Key that you generated for the IAM User in AWS Step 2
  • Region: select the Region that your cluster is based in
  • Redshift cluster name: input the name of the cluster where your database is
  • Database user: input the database username for the Database User that you created in AWS Step 1
  • Default database name: input the default database name that stores your data
  • Click on Setup to finish the setup! 

 

 

Note

Once created, you won’t be able to edit the connection details except “Name” and “Application Access”.  

 

 

Load Amazon Redshift data into Pigment

 

Once the Amazon Redshift Connection is configured, open an Application that the connection is available on. From here click “Import Data” for the object you want to import data into. 

For example, if you were importing data into a Transaction List, you would open up list and then select Import and Download and then Import.

Instead of “Upload file”, select the Integration option. After this, select the Amazon Redshift Connection you have just configured.

 

Note

If you do not see the Redshift connection you have configured, please check that the current Application is part of the Authorized Applications for this connection. If you do not see the connection, navigate back to the Integrations page, and verify that the Application is on the Application Access list.

 

 

Now, you just need to input a Query and click on Start Import.

 

Note:

You need to ensure that the database selected in the Database field is the same one where you are querying from. 

 

The result of the query should load in Pigment within a few seconds! 

 


This topic has been closed for comments