This topic describes how to configure secure access to data files stored in a Microsoft Azure container.
Option 1: Configuring a Snowflake Storage Integration
This section describes how to use storage integrations to allow Snowflake to read data from and write data to an Azure container referenced in an external (Azure) stage. Integrations are named, first-class Snowflake objects that avoid the need for passing explicit cloud provider credentials such as secret keys or access tokens. Integration objects store an Azure identity and access management (IAM) user ID called the app registration. An administrator in your organization grants this app the necessary permissions in the Azure account.
An integration must also specify containers (and optional paths) that limit the locations users can specify when creating external stages that use the integration.
Completing the instructions in this section requires permissions in Azure to manage storage accounts. If you are not an Azure administrator, ask your Azure administrator to perform these tasks.
In this Section:
Step 1: Create a Cloud Storage Integration in Snowflake
Step 2: Grant Snowflake Access to the Storage Locations
Step 3: Create an External Stage
Step 1: Create a Cloud Storage Integration in Snowflake
Create a storage integration using the CREATE STORAGE INTEGRATION command. A storage integration is a Snowflake object that stores a generated service principal for your Azure cloud storage, along with an optional set of allowed or blocked storage locations (i.e. containers). Cloud provider administrators in your organization grant permissions on the storage locations to the generated service principal. This option allows users to avoid supplying credentials when creating stages or loading data.
A single storage integration can support multiple external (i.e. Azure) stages. The URL in the stage definition must align with the Azure containers (and optional paths) specified for the STORAGE_ALLOWED_LOCATIONS parameter.
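The alignment requirement can be pictured as a simple prefix check: the stage URL must fall under one of the allowed locations and must not fall under a blocked one. The sketch below is illustrative only; the actual validation logic is internal to Snowflake.

```python
def is_location_permitted(stage_url, allowed, blocked=()):
    """Illustrative prefix check on a stage URL against allowed and
    blocked storage locations. Not Snowflake's actual implementation."""
    # Normalize to a trailing slash so 'path1' cannot match 'path10'.
    url = stage_url if stage_url.endswith('/') else stage_url + '/'
    if any(url.startswith(b) for b in blocked):
        return False
    return any(url.startswith(a) for a in allowed)

allowed = ['azure://myaccount.blob.core.windows.net/mycontainer1/mypath1/']
blocked = ['azure://myaccount.blob.core.windows.net/mycontainer1/mypath1/sensitivedata/']

print(is_location_permitted(
    'azure://myaccount.blob.core.windows.net/mycontainer1/mypath1/daily',
    allowed, blocked))  # True
```

A URL under the blocked `sensitivedata/` path would fail the check even though it also sits under an allowed location, which matches the precedence described above.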
Only account administrators (users with the ACCOUNTADMIN role) or a role with the global CREATE INTEGRATION privilege can execute this SQL command.
CREATE STORAGE INTEGRATION <integration_name>
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'AZURE'
  ENABLED = TRUE
  AZURE_TENANT_ID = '<tenant_id>'
  STORAGE_ALLOWED_LOCATIONS = ('azure://<account>.blob.core.windows.net/<container>/<path>/', 'azure://<account>.blob.core.windows.net/<container>/<path>/')
  [ STORAGE_BLOCKED_LOCATIONS = ('azure://<account>.blob.core.windows.net/<container>/<path>/', 'azure://<account>.blob.core.windows.net/<container>/<path>/') ]
integration_name is the name of the new integration.
tenant_id is the ID of the Office 365 tenant that the allowed and blocked storage accounts belong to. A storage integration can authenticate to only one tenant, so the allowed and blocked storage locations must all refer to storage accounts that belong to this tenant.
To find your tenant ID, log into the Azure portal and click Azure Active Directory » Properties. The tenant ID is displayed in the Tenant ID field.
container is the name of an Azure container that stores your data files (e.g. mycontainer). The STORAGE_ALLOWED_LOCATIONS and STORAGE_BLOCKED_LOCATIONS parameters allow or block access to these containers, respectively, when stages that reference this integration are created or modified.
path is an optional path that can be used to provide granular control over logical directories in the container.
The following example creates an integration that explicitly limits external stages that use the integration to reference either of two containers and paths. In a later step, we will create an external stage that references one of these containers and paths. Multiple external stages that use this integration can reference the allowed containers and paths:
CREATE STORAGE INTEGRATION azure_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'AZURE'
  ENABLED = TRUE
  AZURE_TENANT_ID = 'a123b4c5-1234-123a-a12b-1a23b45678c9'
  STORAGE_ALLOWED_LOCATIONS = ('azure://myaccount.blob.core.windows.net/mycontainer1/mypath1/', 'azure://myaccount.blob.core.windows.net/mycontainer2/mypath2/')
  STORAGE_BLOCKED_LOCATIONS = ('azure://myaccount.blob.core.windows.net/mycontainer1/mypath1/sensitivedata/', 'azure://myaccount.blob.core.windows.net/mycontainer2/mypath2/sensitivedata/');
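The statement above is plain SQL. If you generate such statements programmatically (for example, across many environments), a small helper like the following can assemble them. This is a hypothetical convenience function, not part of any Snowflake client library:

```python
def build_storage_integration_sql(name, tenant_id, allowed, blocked=None):
    """Assemble a CREATE STORAGE INTEGRATION statement for Azure.
    Hypothetical helper for illustration; validate output before use."""
    quote = lambda locs: ', '.join(f"'{loc}'" for loc in locs)
    lines = [
        f"CREATE STORAGE INTEGRATION {name}",
        "  TYPE = EXTERNAL_STAGE",
        "  STORAGE_PROVIDER = 'AZURE'",
        "  ENABLED = TRUE",
        f"  AZURE_TENANT_ID = '{tenant_id}'",
        f"  STORAGE_ALLOWED_LOCATIONS = ({quote(allowed)})",
    ]
    if blocked:
        lines.append(f"  STORAGE_BLOCKED_LOCATIONS = ({quote(blocked)})")
    return '\n'.join(lines) + ';'

sql = build_storage_integration_sql(
    'azure_int',
    'a123b4c5-1234-123a-a12b-1a23b45678c9',
    ['azure://myaccount.blob.core.windows.net/mycontainer1/mypath1/'])
print(sql)
```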
Step 2: Grant Snowflake Access to the Storage Locations
Execute the DESCRIBE INTEGRATION command to retrieve the consent URL:
DESC STORAGE INTEGRATION <integration_name>;
integration_nameis the name of the integration you created in Step 1: Create a Cloud Storage Integration in Snowflake.
Note the values in the following columns:
AZURE_CONSENT_URL: URL to the Microsoft permissions request page.
AZURE_MULTI_TENANT_APP_NAME: Name of the Snowflake client application created for your account. In a later step in this section, you will need to grant this application the permissions necessary to obtain an access token on your allowed storage locations.
In a web browser, navigate to the URL in the AZURE_CONSENT_URL column. The page displays a Microsoft permissions request page.
Click the Accept button. This action allows the Azure service principal created for your Snowflake account to obtain an access token on any resource inside your tenant. Obtaining an access token succeeds only if you grant the service principal the appropriate permissions on the container (see the next step).
The Microsoft permissions request page redirects to the Snowflake corporate site (snowflake.com).
Log into the Microsoft Azure portal.
Navigate to Azure Services » Storage Accounts. Click on the name of the storage account you are granting the Snowflake service principal access to.
Click Access Control (IAM) » Add role assignment.
Select the desired role to grant to the Snowflake service principal:
Storage Blob Data Reader grants read access only. This allows loading data from files staged in the storage account.
Storage Blob Data Contributor grants read and write access. This allows loading data from or unloading data to files staged in the storage account. The role also allows executing the REMOVE command to remove files staged in the storage account.
Search for the Snowflake service principal. This is the identity in the AZURE_MULTI_TENANT_APP_NAME property in the DESC STORAGE INTEGRATION output (in Step 1). Search for the string before the underscore in the AZURE_MULTI_TENANT_APP_NAME property.
It can take an hour or longer for Azure to create the Snowflake service principal requested through the Microsoft request page in this section. If the service principal is not available immediately, we recommend waiting an hour or two and then searching again.
If you delete the service principal, the storage integration stops working.
Click the Review + assign button.
According to the Microsoft Azure documentation, role assignments may take up to five minutes to propagate.
Snowflake caches the temporary credentials for a period that cannot exceed the 60-minute expiration time. If you revoke access from Snowflake, users might be able to list files and load data from the cloud storage location until the cache expires.
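The service principal search string described in this step is simply the portion of the AZURE_MULTI_TENANT_APP_NAME value before the first underscore. The example value below is hypothetical:

```python
# Value of AZURE_MULTI_TENANT_APP_NAME from DESC STORAGE INTEGRATION
# (this example value is hypothetical, not a real app name).
app_name = 'snowflakepacint_1234567890123'

# Search the Azure portal for the portion before the first underscore.
search_string = app_name.split('_', 1)[0]
print(search_string)  # snowflakepacint
```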
Step 3: Create an External Stage
Create an external (Azure) stage that references the storage integration you created in Step 1: Create a Cloud Storage Integration in Snowflake (in this topic).
Creating a stage that uses a storage integration requires a role that has the CREATE STAGE privilege for the schema as well as the USAGE privilege on the integration. For example:
GRANT CREATE STAGE ON SCHEMA public TO ROLE myrole;
GRANT USAGE ON INTEGRATION azure_int TO ROLE myrole;
To reference a storage integration in the CREATE STAGE statement, the role must have the USAGE privilege on the storage integration object.
Append a forward slash (/) to the URL value to filter to the specified folder path. If the forward slash is omitted, all files and folders starting with the prefix for the specified path are included.
Note that the forward slash is required to access and retrieve unstructured data files in the stage.
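The prefix behavior described above can be illustrated with a simple filter over blob names. Without the trailing slash, a sibling folder that happens to share the prefix is also matched:

```python
# Hypothetical blob names under a container, for illustration only.
blobs = [
    'path1/file1.csv',
    'path1/file2.csv',
    'path10/file3.csv',  # different folder that shares the 'path1' prefix
]

def matching(prefix):
    """Return blobs whose names start with the given prefix."""
    return [b for b in blobs if b.startswith(prefix)]

print(matching('path1/'))  # only the two files under path1/
print(matching('path1'))   # also matches path10/, since 'path1' is a bare prefix
```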
Create the stage using the CREATE STAGE command.
For example, set mydb.public as the current database and schema for the user session, and then create a stage named my_azure_stage. In this example, the stage references the Azure container and path mycontainer1/mypath1, which are supported by the integration. The stage also references a named file format object called my_csv_format:

USE SCHEMA mydb.public;

CREATE STAGE my_azure_stage
  STORAGE_INTEGRATION = azure_int
  URL = 'azure://myaccount.blob.core.windows.net/mycontainer1/mypath1'
  FILE_FORMAT = my_csv_format;
The stage owner (i.e. the role with the OWNERSHIP privilege on the stage) must have the USAGE privilege on the storage integration.
To load or unload data from or to a stage that uses an integration, a role must have the USAGE privilege on the stage. It is not necessary to also have the USAGE privilege on the storage integration.
Use the blob.core.windows.net endpoint for all supported types of Azure blob storage accounts, including Data Lake Storage Gen2.
The STORAGE_INTEGRATION parameter is handled separately from other stage parameters, such as FILE_FORMAT. Support for these other parameters is the same regardless of the integration used to access your Azure container.
Option 2: Generating a SAS Token
Step 1: Generate the SAS Token
The following step-by-step instructions describe how to generate a SAS token to grant Snowflake limited access to objects in your storage account:
Log into the Azure portal.
From the home dashboard, choose Storage Accounts » <storage_account>. Under Security + networking, choose Shared access signature.
Select the following Allowed services: Blob.
Select the following Allowed resource types:
Container (required to list objects in the storage account)
Object (required to read/write objects from/to the storage account)
Select the following allowed permissions to load data files from Azure resources: Read, List.
Create permissions are also required if you plan to unload files to a container. In addition, to use the PURGE = TRUE option, the Permanent Delete permission is required.
Specify start and expiry dates/times for the SAS token. As part of a general security plan, you could generate a different SAS token periodically.
Leave the Allowed IP addresses field blank, and specify either HTTPS only or HTTPS and HTTP under Allowed protocols.
Click the Generate SAS and connection string button. Record the full value in the SAS token field, starting with and including the ?. This is your SAS token. You will specify this token when you create an external stage.
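A SAS token is an ordinary URL query string, so its fields can be inspected with standard tooling. The sketch below parses an illustrative token (the values, including the truncated signature, are not a working credential):

```python
from urllib.parse import parse_qs

# Example SAS token; values are illustrative only, not a working signature.
sas_token = ('?sv=2016-05-31&ss=b&srt=sco&sp=rwdl'
             '&se=2018-06-27T10:05:50Z&st=2017-06-27T02:05:50Z'
             '&spr=https,http&sig=bgqQ')

params = parse_qs(sas_token.lstrip('?'))
print(params['sp'][0])  # permissions: r(ead), w(rite), d(elete), l(ist)
print(params['se'][0])  # expiry time of the token
```

Checking the `sp` (permissions) and `se` (expiry) fields this way is a quick sanity check before pasting the token into a CREATE STAGE statement.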
Step 2: Create an External Stage
Create an external (Azure) stage that references the SAS token you generated in Step 1: Generate the SAS Token (in this topic).
The following example uses SQL to create an external stage named my_azure_stage that includes Azure credentials and a master encryption key. The stage URL references the Azure account myaccount. The data files are stored in the mycontainer container and /load/files path. The stage references a named file format object called my_csv_format. Note that the example truncates the MASTER_KEY value:

CREATE OR REPLACE STAGE my_azure_stage
  URL='azure://myaccount.blob.core.windows.net/mycontainer/load/files'
  CREDENTIALS=(AZURE_SAS_TOKEN='?sv=2016-05-31&ss=b&srt=sco&sp=rwdl&se=2018-06-27T10:05:50Z&st=2017-06-27T02:05:50Z&spr=https,http&sig=bgqQwoXwxzuD2GJfagRg7VOS8hzNr3QLT7rhS8OFRLQ%3D')
  ENCRYPTION=(TYPE='AZURE_CSE' MASTER_KEY = 'kPx...')
  FILE_FORMAT = my_csv_format;
Note that the AZURE_SAS_TOKEN and MASTER_KEY values used in this example are for illustration purposes only.
By specifying a named file format object (or individual file format options) for the stage, it is not necessary to later specify the same file format options in the COPY command used to load data from the stage. For more information about file format objects and options, see CREATE FILE FORMAT.
Data File Encryption
Enable Azure Storage Service Encryption (SSE) for Data at Rest on your storage account directly, and Snowflake will handle it correctly. For more information, see the Azure documentation on SSE.
In addition, Snowflake supports client-side encryption to decrypt files staged in Azure containers.
AZURE_CSE: Requires a MASTER_KEY value. For information, see the Client-side encryption information in the Microsoft Azure documentation.
Block blobs and append blobs support client-side encryption but page blobs do not.
Next: Creating an Azure Stage