redshift_agent

To integrate with a Redshift database, you must first obtain the Redshift JDBC driver from AWS. Perspectium currently supports version 4.1 of the driver. Once you have the driver, place it in the Agent's Jars directory, which is a subdirectory of the Replicator Agent's application folder.

Starting with v3.22.0 of the Replicator Agent, support for the AWS Redshift database has been enhanced. Taking advantage of the enhanced processing requires the additional configuration steps outlined below. You can, however, continue to use the previous Redshift support by skipping this additional configuration.

The enhanced version of the agent uses an S3 bucket as a staging area: replicated data is first stored in the bucket and then used to populate a Redshift table.

The agent uses AWS tools to efficiently load replicated data from the S3 bucket into the database table. You must create the S3 bucket yourself, using AWS tools, as part of setup. When you configure the agent, you then provide the bucket name and the security credentials that grant access to the bucket.

The following entries must be included in each <task> element for which Redshift support is being configured.

The <handler> directive must specify 'RedshiftSQLSubscriber' as the handler. For example:

<handler>com.perspectium.replicator.sql.RedshiftSQLSubscriber</handler>

The following directives must also be placed within the <task> definition with the proper values. They specify the AWS region that hosts your Redshift database and S3 bucket, the credentials used to access the bucket, and the bucket name.

The <region> directive specifies the AWS region. Example:

<region>us-west-1</region>

The <access_key> and <secret_access_key> directives must contain valid AWS access credentials so that the agent can securely access your S3 bucket.

<access_key>PLACE_YOUR_KEY_HERE</access_key>
<secret_access_key>PLACE_YOUR_SECRET_KEY_HERE</secret_access_key>

The <s3_bucket> directive defines the name of the S3 bucket you created to hold replicated data.

<s3_bucket>your_bucket_name</s3_bucket>

To optimize throughput, you should also enable batch processing of row deletes by including the following directive in your <task> definition:

<batch_delete/>
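Taken together, the directives described above might be combined in a single <task> definition as in the sketch below. This is an illustration only: the region, keys, and bucket name are the placeholder values used on this page, and any other settings your existing <task> definition already contains (omitted here) are assumed to stay unchanged.

```xml
<task>
    <!-- Other existing task settings are omitted from this sketch -->

    <!-- Selects the enhanced Redshift handler -->
    <handler>com.perspectium.replicator.sql.RedshiftSQLSubscriber</handler>

    <!-- AWS region hosting the Redshift database and S3 bucket (placeholder value) -->
    <region>us-west-1</region>

    <!-- AWS credentials used to access the S3 bucket (placeholders) -->
    <access_key>PLACE_YOUR_KEY_HERE</access_key>
    <secret_access_key>PLACE_YOUR_SECRET_KEY_HERE</secret_access_key>

    <!-- Name of the S3 bucket created to stage replicated data (placeholder) -->
    <s3_bucket>your_bucket_name</s3_bucket>

    <!-- Enables batch processing of row deletes -->
    <batch_delete/>
</task>
```

The order of the directives shown here is arbitrary; this page does not state that any particular order is required.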

Redshift support is offered as a premium service. Please contact our support team at support@perspectium.com to inquire about this service.

redshift_agent.txt · Last modified: 2017/10/11 20:07 by billy