Moving Data from PostgreSQL to Snowflake
Are you tired of manual data migration processes? Look no further! In this quick guide, we'll walk you through a simple shell script that extracts data from PostgreSQL and loads it into Snowflake using the psql and snowsql command-line utilities.
Pre-requisite:
Before we dive in, make sure you have:
psql
commandline utility installedsnowsql
commandline utility installed- PGPASSWORD environment variable set with your PostgreSQL user password
- SNOWSQL_PWD environment variable set with your Snowflake user password
- Snowflake Target tables already exists
Here's the script that will do all the heavy lifting for you:
#!/bin/bash
# Functionality: Extract Data from Postgres & Load to Snowflake
set -e
# List of Tables
tables=("employee" "department" "salary")
# Postgres Parameters
pgHost="<postgresql-host-name>"
pgUser="<postgresql-user-name>"
pgDatabase="<postgresql-database-name>"
pgSchema="<postgresql-schema-name>"
export PGPASSWORD='<postgresql-user-password>'
# Snowflake Parameters
sfAccount="<snowflake-account-name>"
sfUser="<snowflake-user-name>"
sfRole="<snowflake-role>"
sfDatabase="<snowflake-database-name>"
sfSchema="<snowflake-schema-name>"
export SNOWSQL_PWD='<snowflake-user-password>'
# Process Each Tables
for table in "${tables[@]}"
do
# Table Parameters
workingDir="<local-working-directory-absolute-path>"
tableName=${table}
filePath="${workingDir}/${tableName}.txt"
# Extract
query="\COPY (SELECT * FROM ${pgSchema}.${tableName}) TO '${filePath}' WITH DELIMITER '^' CSV;"
echo "${query}"
psql -h ${pgHost} -U ${pgUser} -d ${pgDatabase} -c "${query}"
# Load
query="PUT file://${filePath} @%${tableName};"
echo "${query}"
snowsql --noup -a ${sfAccount} -u ${sfUser} -r ${sfRole} -d ${sfDatabase} -s ${sfSchema} -q "${query}" -o exit_on_error=true -o timing=false -o log_level=DEBUG
query="LIST @%${tableName};"
echo "${query}"
snowsql --noup -a ${sfAccount} -u ${sfUser} -r ${sfRole} -d ${sfDatabase} -s ${sfSchema} -q "${query}" -o exit_on_error=true -o timing=false -o log_level=DEBUG
query="COPY INTO ${sfSchema}.${tableName} FROM @%${tableName} file_format = (type = csv field_delimiter = '^' field_optionally_enclosed_by = '\"');"
echo "${query}"
snowsql --noup -a ${sfAccount} -u ${sfUser} -r ${sfRole} -d ${sfDatabase} -s ${sfSchema} -q "${query}" -o exit_on_error=true -o timing=false -o log_level=DEBUG
query="REMOVE @%${tableName} pattern='.*.txt.gz';"
echo "${query}"
snowsql --noup -a ${sfAccount} -u ${sfUser} -r ${sfRole} -d ${sfDatabase} -s ${sfSchema} -q "${query}" -o exit_on_error=true -o timing=false -o log_level=DEBUG
done
The script iterates through a list of tables, extracts the data from PostgreSQL using psql, and then loads the data into Snowflake using snowsql.
- Make sure to replace the placeholders with your actual PostgreSQL and Snowflake credentials.
- You can modify the script to handle more complex sql select queries for incremental data extraction or add additional error handling as needed.
- Consider using a more robust logging mechanism instead of echo statements for better debugging and auditing.
With this simple script, you'll be able to effortlessly move data from PostgreSQL to Snowflake. Happy migrating!