Moving data into Snowflake using Alteryx
Connecting and configuring Alteryx to talk to Snowflake
On a recent project, I needed to move some data into Snowflake using Alteryx. With some trial and error and some online research I was able to get this going. In this blog post, I will cover my learnings on the process for connecting Alteryx to Snowflake and building a basic data loader.
I covered the following scenarios in this blog post.
- Data Movement and transformation between Snowflake databases using Alteryx using the In-Database tools
- Local file into Snowflake data load using Alteryx In/Out tools
Snowflake
Snowflake is a cloud-based data platform that can run on a multi-cloud model. The platform works on a pay as you use model, and the customer is only charged when either the storage or compute associated with the data warehouse is used. Snowflake can be configured to grow or shrink based on user or system workloads. There are several other features that come with the Snowflake Data Platform including Data Sharing, Zero-copy cloning, Snowpipe (for data streaming) and support for semi-structured data to name a few. For more details on the full capabilities of Snowflake please browse to this link.
Alteryx
Alteryx combines powerful data transformation capabilities with an intuitive, workflow-based user-interface
Key features of Alteryx include:
- Create repeatable analytic workflows to help automate manual data tasks
- Easy-to-use, code-free, and programming-free user interface with descriptive error and warning messages.
- Flexible and diversified platform supports nearly every type of data source and visualization tool.
- Provides insight into flow of data and data health at every point of workflow.
- Created with the goal of having deep functionality for data scientists while being intuitive and user-friendly enough for a business user.
You can find more information about Alteryx here:https://www.alteryx.com/
High Level Architecture
Here’s a common high level architecture pattern for using Alteryx on Snowflake Data Warehouse.
Connecting to Snowflake from Alteryx
- Download and Install the Trial version of Alteryx
If you don’t already have the Alteryx application running, download and install the Alteryx application.
Once downloaded, you can install the application on your local laptop/desktop.
2. Launch the Alteryx Designer Application
In this example, we will load a single table from one database in Snowflake to another database.
3. Setup the Snowflake Data Source and Target
- Use “Manage In-DB Connections”
- Create the Connection File
- Create the connection strings
Provide the ODBC connect string for the “Read” option:
odbc:DRIVER={SnowflakeDSIIDriver};UID=<Userid>;PWD=__EncPwd1__;WAREHOUSE=<snowflake warehouse>;DATABASE=<snowflake_sample_data>;SERVER=<xxxx.snowflakecomputing.com..>
Provide the Connect string for the “Write” in the same format using the appropriate destination database details (DB name, schema, warehouse, user id, password etc.).
- Apply and Close.
4. To test the connections, we will now create a basic workflow to move the data from a source table into a target table with a simple formula transform.
The recommended best practice for working with Snowflake in Alteryx is by using the In-Database option. Using this option allows you to use the full capabilities of Snowflake while in Alteryx. You can also use the In/Out tools to move data into Snowflake.
- Drag the Source connector into the designer canvas:
- Import the source data query
- Add the target database/table. For this you should use the Write Data In-DB component.
- Provide the specific table name for the target
There are many options for this data target:
I chose the “Create new table” option. This option will run a Create Table statement on Snowflake.
Once all the components are in the workflow, connect them up. In this workflow, I added a couple of calculated fields using the “Formula” transform to capture the “Create_Date_Time” and “Create User”, and used snowflake SQL functions to populate these at runtime.
Save the workflow and run it.
If the workflow ran successfully, you should see something like this.
You can check in Snowflake. You will see the NATIONS_2 table that was created by this workflow.
You can also look at the history in Snowflake to track the specific data load activity that was done through Alteryx.
You can go into the details of the query to look at additional statistics.
I also attempted a basic file load into Snowflake using Alteryx, using the In/Out transform tools.
In this scenario, is used an already configured Snowflake ODBC connection to connect to Snowflake and load the data from the local file.
In this review we established a connection between Alteryx and Snowflake using a couple of different mechanisms, added some basic transformations to the input data and loaded the data into Snowflake using Alteryx. We also used the In-Database and the In/Out transforms to load data into Snowflake. We also looked at the capabilities of the In-Database option to pushdown data processing to the Snowflake Database.
Alteryx also has a number of tools/transforms in its toolset to use data in a Snowflake database and build and deploy a variety of Data Science models.
Both Alteryx and Snowflake are very powerful tools in the Modern Data And Analytics landscape. These two technologies can make a formidable combo for data platform and analytics development.
Hope this was helpful. Please let me know if you have any feedback. If you have any questions, please reach out at venu.chakrapani@gmail.com.