SnowSQL – Unload Snowflake Table to CSV file - Spark By {Examples} (2023)

Snowflake is a cloud data warehouse, so we often need to unload/download a Snowflake table to the local file system in CSV format. You can use the SnowSQL COPY INTO statement to unload/download/export the data to the file system on Windows, Linux, or macOS.

It doesn’t matter whether you are downloading the flat file to Windows, Linux, or macOS; the process is the same except for the download path you use. Just make sure you install the SnowSQL CLI appropriate for your OS.

Prerequisite

Install Snowflake CLI to run SnowSQL commands.

The URL below takes you to the Snowflake download index page; navigate to the OS you are using, download the binary, and install it.

https://sfc-repo.snowflakecomputing.com/snowsql/bootstrap/index.html

Downloading a Snowflake table to the local file system is a two-step process:

  • First, use the “COPY INTO” statement, which copies the table into a Snowflake internal stage, external stage, or external location.
  • Then, use the “GET” statement to download the file from the staged folder.

Connect to Snowflake database and schema

From the command prompt, go to the location of the SnowSQL CLI install and enter snowsql; this displays the usage syntax. Below is a simple syntax to connect to a Snowflake database. When you run this statement, it prompts for a password; after you enter it correctly, it provides a shell to interact with the Snowflake cloud.

Syntax

user>snowsql -a accountname -u user -d databasename -s schemaname

The SnowSQL statement below connects to account “oea82.us-east-1”, user “nkmr”, database “employee”, and schema “public”.

user>snowsql -a oea82.us-east-1 -u nkmr -d employee -s public
Password:
* SnowSQL * v1.2.5
Type SQL statements or !help
nkmr#[emailprotected]>

Unload Snowflake table in CSV format using internal stage

Though you can unload to different internal stages, here we use the internal table stage to unload the table.

Using the SnowSQL COPY INTO statement, you can unload a Snowflake table to the internal table stage in CSV format. In the example below, we export table EMP and store it at the internal table stage @%EMP/result/. It creates a file with the name data_0_0_0.csv.gz.

By default, COPY INTO unloads the data into a CSV file and compresses it with gzip. You can change this behavior, as we will see in the sections below.

nkmr#[emailprotected]>COPY INTO '@%EMP/result/' from EMP;
+---------------+-------------+--------------+
| rows_unloaded | input_bytes | output_bytes |
|---------------+-------------+--------------|
|             5 |         122 |          115 |
+---------------+-------------+--------------+
1 Row(s) produced. Time Elapsed: 1.300s

I have 5 records in the EMP table, and all of them have been unloaded. Now use GET to download the file.

GET – Download from the internal stage

Use GET SnowSQL command to download the unloaded file from the internal stage to the local file system. By default, all unloaded data files are compressed using gzip.

GET @%EMP/result/ file:///tmp/;
data_0_0_0.csv.gz(0.00MB): [##########] 100.00% Done (0.204s, 0.00MB/s).
+-------------------+------+------------+---------+
| file              | size | status     | message |
|-------------------+------+------------+---------|
| data_0_0_0.csv.gz |  115 | DOWNLOADED |         |
+-------------------+------+------------+---------+
1 Row(s) produced. Time Elapsed: 2.013s

This downloads the file data_0_0_0.csv.gz to the /tmp directory.
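Once the file is on disk, you do not need to gunzip it manually; for instance, Python’s standard library can read the compressed CSV directly. A minimal sketch (the sample rows below are made up for illustration; the real file comes from the GET above):

```python
import csv
import gzip

# Hypothetical sample standing in for the unloaded data_0_0_0.csv.gz;
# the real file is produced by the COPY INTO + GET commands above.
path = "/tmp/data_0_0_0.csv.gz"
with gzip.open(path, "wt", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["EMPNO", "ENAME", "SAL"])  # header row
    writer.writerow(["7369", "SMITH", "800"])

# gzip.open in text mode feeds csv.reader directly -- no manual gunzip needed.
with gzip.open(path, "rt", newline="") as f:
    rows = list(csv.reader(f))

print(rows)  # [['EMPNO', 'ENAME', 'SAL'], ['7369', 'SMITH', '800']]
```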

Unload Options to Use

COPY INTO provides several options; in this section, let’s look at some of the most commonly used ones.

HEADER – Include or skip column names in the header

Use the optional HEADER parameter to specify whether to include the table column headings in the output files. Per the Snowflake documentation, the default is FALSE for data unloading; set it explicitly to control the behavior. For example:

COPY INTO '@%EMP/result/' from EMP HEADER = FALSE;
1 Row(s) produced. Time Elapsed: 1.300s

COMPRESSION – Changes compression

Use COMPRESSION to change the default compression. By default, the file is unloaded in GZIP format; however, you can change it to any of the following: AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE.

COPY INTO '@%EMP/result/' from EMP file_format = (type=CSV COMPRESSION = BZ2);
1 Row(s) produced. Time Elapsed: 1.300s

The above example unloads the EMP table to data_0_0_0.csv.bz2.

Use NONE if you want the file unloaded as a plain raw CSV file without compression.
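The compression you choose matters when reading the file back; for example, Python’s bz2 module handles the BZ2 output the same way gzip handles the default. A small sketch with made-up sample data:

```python
import bz2
import csv

# Hypothetical sample standing in for data_0_0_0.csv.bz2 (COMPRESSION = BZ2);
# the real file is produced by the COPY INTO command above.
path = "/tmp/data_0_0_0.csv.bz2"
with bz2.open(path, "wt", newline="") as f:
    csv.writer(f).writerow(["7369", "SMITH", "800"])

# bz2.open in text mode decompresses on the fly, just like gzip.open.
with bz2.open(path, "rt", newline="") as f:
    rows = list(csv.reader(f))

print(rows)  # [['7369', 'SMITH', '800']]
```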

FIELD_DELIMITER – Changes delimiter

Use FIELD_DELIMITER to change the default delimiter. By default it uses the ‘,’ character, but you can change it to any character. For example, let’s see how to use a pipe delimiter:

COPY INTO '@%EMP/result/' from EMP file_format = (type=CSV COMPRESSION = NONE FIELD_DELIMITER='|');
1 Row(s) produced. Time Elapsed: 1.300s
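Whatever delimiter you unload with must also be passed to whatever consumes the file. For instance, Python’s csv.reader accepts a delimiter argument (the sample rows here are hypothetical):

```python
import csv
import io

# Sample pipe-delimited content as produced by FIELD_DELIMITER='|'
# (rows are made up for illustration).
raw = "7369|SMITH|800\n7499|ALLEN|1600\n"
rows = list(csv.reader(io.StringIO(raw), delimiter="|"))

print(rows)  # [['7369', 'SMITH', '800'], ['7499', 'ALLEN', '1600']]
```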

RECORD_DELIMITER – Changes delimiter for record

If you want to separate records with something other than ‘\n’, use RECORD_DELIMITER. For example, the query below separates records with ‘\n\r’.

COPY INTO '@%EMP/result/' from EMP file_format = (type=CSV COMPRESSION = NONE RECORD_DELIMITER='\n\r');
1 Row(s) produced. Time Elapsed: 1.300s

FILE_EXTENSION – Changes file extension

You can change the file extension using the FILE_EXTENSION option. The example below unloads the file in raw CSV format without compression, but the file name will have a TXT extension.

COPY INTO '@%EMP/result/data_extension' from EMP file_format = (type=CSV COMPRESSION = NONE FILE_EXTENSION='TXT');
+---------------+-------------+--------------+
| rows_unloaded | input_bytes | output_bytes |
|---------------+-------------+--------------|
|             5 |         122 |          115 |
+---------------+-------------+--------------+
1 Row(s) produced. Time Elapsed: 0.804s

nkmr#[emailprotected]>GET @%EMP/result/data_extension file://c:/tmp/;
data_extension_0_0_0.txt(0.00MB): [##########] 100.00% Done (0.201s, 0.00MB/s).
+--------------------------+------+------------+---------+
| file                     | size | status     | message |
|--------------------------+------+------------+---------|
| data_extension_0_0_0.txt |  220 | DOWNLOADED |         |
+--------------------------+------+------------+---------+
1 Row(s) produced. Time Elapsed: 1.998s

NULL_IF – Converts a value to NULL

When unloading, SQL NULL values are written as the first string in the NULL_IF list; here, NULLs are written as empty strings.

COPY INTO '@%EMP/result/' from EMP file_format = (type=CSV COMPRESSION = NONE NULL_IF=(''));
1 Row(s) produced. Time Elapsed: 1.300s

EMPTY_FIELD_AS_NULL – Changes empty field to null

When unloading, EMPTY_FIELD_AS_NULL is a TRUE/FALSE option (used together with FIELD_OPTIONALLY_ENCLOSED_BY) that controls whether empty strings and NULLs are distinguished in the output.

COPY INTO '@%EMP/result/' from EMP file_format = (type=CSV COMPRESSION = NONE EMPTY_FIELD_AS_NULL=FALSE FIELD_OPTIONALLY_ENCLOSED_BY='"');
1 Row(s) produced. Time Elapsed: 1.300s

DATE_FORMAT – Change Date to a specified format

By default, date columns are output as ‘YYYY-MM-DD’; you can change this behavior using the DATE_FORMAT option.

COPY INTO '@%EMP/result/' from EMP file_format = (type=CSV COMPRESSION = NONE DATE_FORMAT='MM-DD-YYYY');
1 Row(s) produced. Time Elapsed: 1.300s

TIME_FORMAT – Change Time to a specified format

By default, time columns are output as ‘HH24:MI:SS’; you can change this behavior using the TIME_FORMAT option.

COPY INTO '@%EMP/result/' from EMP file_format = (type=CSV COMPRESSION = NONE TIME_FORMAT='HH24:MI:SS');
1 Row(s) produced. Time Elapsed: 1.300s

TIMESTAMP_FORMAT – Change Timestamp to a specified format

By default, timestamp columns are output as ‘YYYY-MM-DD HH24:MI:SS.FF3 TZHTZM’; you can change this behavior using the TIMESTAMP_FORMAT option.

COPY INTO '@%EMP/result/' from EMP file_format = (type=CSV COMPRESSION = NONE TIMESTAMP_FORMAT='MM-DD-YYYY HH24:MI:SS.FF3 TZHTZM');
1 Row(s) produced. Time Elapsed: 1.300s
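Note that Snowflake’s format elements differ from the codes most consumers use, so whatever reads the unloaded file must translate them. For instance, ‘MM-DD-YYYY HH24:MI:SS.FF3’ roughly corresponds to Python’s ‘%m-%d-%Y %H:%M:%S.%f’ (timezone elements omitted; the sample value is made up):

```python
from datetime import datetime

# Snowflake 'MM-DD-YYYY' ~ Python '%m-%d-%Y'; 'HH24:MI:SS.FF3' ~ '%H:%M:%S.%f'
# (%f accepts the 3-digit fraction and stores it as 123000 microseconds).
# The sample value is hypothetical.
ts = datetime.strptime("01-31-2023 10:20:30.123", "%m-%d-%Y %H:%M:%S.%f")

print(ts.isoformat())  # 2023-01-31T10:20:30.123000
```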

OVERWRITE – Overwrites existing file

If a file is already present in the specified internal stage, it is overwritten when you set the OVERWRITE property to TRUE.

COPY INTO '@%EMP/result/' from EMP file_format = (type=CSV COMPRESSION = NONE) OVERWRITE=TRUE;
1 Row(s) produced. Time Elapsed: 1.300s

If you do not use this property and try to write to the same file, you get the error below.

001030 (22000): Files already existing at the unload destination: @%EMP/result/data_extension. Use overwrite option to force unloading.

Unload Snowflake table in CSV file to Amazon S3 bucket

Using the SnowSQL COPY INTO statement, you can unload/download the Snowflake table directly to an Amazon S3 bucket (external location) in CSV format. In the example below, we export from table EMP.

To connect to AWS, you need to provide the AWS key, secret key, and token, use credentials property to define credentials = (aws_key_id='xxxx' aws_secret_key='xxxxx' aws_token='xxxxxx').

COPY INTO 's3://mybucket/unload/' from EMP credentials = (aws_key_id='xxxx' aws_secret_key='xxxxx' aws_token='xxxxxx');
1 Row(s) produced. Time Elapsed: 1.300s

It unloads all records from the EMP table to the specified S3 bucket with the file name data_0_0_0.csv.gz. Now use AWS utilities to download from the S3 bucket to the local file system.

You can also use all the options explained above when unloading to Amazon S3, Microsoft Azure, or GCP using COPY INTO.

Conclusion

In this SnowSQL article, you learned how to connect to SnowSQL from the CLI, how to unload/download/export a Snowflake table to a CSV file using COPY INTO with different options, and finally how to download the file to the local system using the GET command.

Related Articles

  • Download Snowflake table to local WINDOWS | Linux | MAC
  • SnowSQL – Unload Snowflake table to Amazon S3
  • SnowSQL – Unload Snowflake table to Parquet file
  • Snowflake Data Warehouse Database Tutorials
  • Load file from Amazon S3 into Snowflake table
  • Snowflake Create, Clone, Drop Database
  • How to Load Parquet file into Snowflake table

Reference

Snowflake reference

Happy Learning !!
