I. Purpose of the document

When using TiDB Cloud, one of the first challenges you often face is how to import data from an existing cluster into a TiDB Cloud cluster. Fortunately, there are several ways to do this on the TiDB Cloud website. This document describes how to import AWS S3 data to a TiDB Cloud cluster.

Note: Currently, in the free TiDB Cloud Developer Tier cluster, the web interface can import data only from AWS S3. Four data formats are supported: TiDB Dumpling, Aurora Backup Snapshot, CSV, and Parquet. This test imports Dumpling data stored in AWS S3 into a TiDB Cloud cluster; the same steps should apply to the other data formats.

Create a test cluster on TiDB Cloud

1. Select the free “Developer Tier”

2. Create a test cluster

3. After the cluster is created, set up a standard connection from your local machine

(1) Choose Overview > Connect > Standard Connection

(2) Add a local IP address

(3) Test the local connection
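
The connection test can also be scripted. The following is a minimal sketch, assuming the stock `mysql` command-line client is installed and that the cluster listens on TiDB's default port 4000. The host name is a placeholder; copy the real one from the Standard Connection dialog.

```python
import shlex


def build_mysql_command(host: str, user: str, port: int = 4000) -> list[str]:
    """Return the argv for the stock `mysql` client.

    The trailing -p makes the client prompt for the password,
    so the password never appears in the shell history.
    """
    return ["mysql", "--host", host, "--port", str(port), "--user", user, "-p"]


if __name__ == "__main__":
    # Placeholder host (assumption): replace with the value shown in TiDB Cloud.
    argv = build_mysql_command("tidb.example.tidbcloud.com", "root")
    print(shlex.join(argv))
```

Paste the printed command into a terminal; if the local IP address has been added in step (2), the client should reach the cluster.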

Create a bucket and set permissions on AWS S3

1. Create an S3 bucket in AWS and upload the files exported by Dumpling

(1) Obtain the ARN of the S3 bucket (arn:aws:s3:::dumplingtest), which is needed to create the policy in AWS IAM (the bucket creation process is omitted).

(2) Upload the data files exported by the Dumpling tool to the S3 bucket, into the subdirectory testData (the upload process is omitted).

Note: The test data consists of SQL files exported from a local cluster with the Dumpling tool.
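
Since the upload itself is omitted above, here is a small sketch of how the local export directory maps onto S3 object locations. `plan_upload` is a hypothetical helper, not part of any AWS or TiDB tool; it only computes the target URIs, and the actual transfer can be done with the AWS console or any S3 client.

```python
from pathlib import Path


def plan_upload(export_dir: str, bucket: str, prefix: str) -> list[tuple[str, str]]:
    """Pair every file under export_dir with its destination S3 URI."""
    root = Path(export_dir)
    prefix = prefix.strip("/")
    pairs = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Keep the path relative to the export directory as the object key.
            key = f"{prefix}/{path.relative_to(root).as_posix()}"
            pairs.append((str(path), f"s3://{bucket}/{key}"))
    return pairs
```

Note that S3 object keys are case-sensitive, so the subdirectory name used here must match the one referenced later in the IAM policy and in the import form exactly.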

2. Create a policy for accessing S3 buckets in AWS IAM

(1) Create a policy and write its rules

  • Fill in the S3 bucket ARN: arn:aws:s3:::dumplingtest
  • Permissions s3:GetObject and s3:GetObjectVersion apply to the objects in the S3 bucket subdirectory
  • Permissions s3:ListBucket and s3:GetBucketLocation apply to the S3 bucket itself
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:GetObjectVersion"
            ],
            "Resource": "arn:aws:s3:::dumplingtest/testdata/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketLocation"
            ],
            "Resource": "arn:aws:s3:::dumplingtest"
        }
    ]
}

(2) The policy is created successfully
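
To avoid hand-editing the JSON for a different bucket, the policy document can be generated. This is a minimal sketch; `build_s3_read_policy` is a hypothetical helper name, and the output mirrors the two statements shown above.

```python
import json


def build_s3_read_policy(bucket: str, prefix: str) -> dict:
    """Build a read-only policy: object reads on the prefix, listing on the bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level permissions apply to keys under the prefix.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:GetObjectVersion"],
                "Resource": f"{bucket_arn}/{prefix}/*",
            },
            {
                # Bucket-level permissions apply to the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": bucket_arn,
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_s3_read_policy("dumplingtest", "testdata"), indent=4))
```

The prefix passed in must match the uploaded subdirectory name character for character, because S3 object keys are case-sensitive.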

3. Obtain the Account ID and External ID of the TiDB Cloud cluster

Obtain the values from Overview > Import > Show AWS IAM policy Settings of the TiDB Cloud cluster

TiDB Cloud Account ID: 380838443567

TiDB Cloud External ID: 696e6672612d6170698cf65cc99da4bea3da7cd6717dd5bbbe

Both values are required when creating the role in AWS IAM

4. Create a role in AWS IAM

(1) Select “AWS account” > “Another AWS account”, enter the TiDB Cloud Account ID, select “Require external ID”, and enter the TiDB Cloud External ID

(2) Select the policy created earlier and go to the next step

(3) The role is created successfully

(4) Obtain the role’s ARN (arn:aws:iam::255548669385:role/Role_TiDBCloud)
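
Behind the console choices in step (1), AWS stores a trust policy on the role. The source does not show it, so the following is only a sketch of the shape such a policy typically has for a cross-account role with an external ID; `build_trust_policy` is a hypothetical helper name.

```python
import json


def build_trust_policy(account_id: str, external_id: str) -> dict:
    """Trust policy letting another AWS account assume the role with an external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The trusted principal is the other account (here, TiDB Cloud's).
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
                # The caller must present the matching external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    trust = build_trust_policy(
        "380838443567",
        "696e6672612d6170698cf65cc99da4bea3da7cd6717dd5bbbe",
    )
    print(json.dumps(trust, indent=4))
```

Comparing this output with the role's "Trust relationships" tab is a quick way to check that the Account ID and External ID were entered correctly.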

Import AWS S3 data into TiDB Cloud cluster

1. Import AWS S3 data to TiDB Cloud

(1) Enter the URL of the actual S3 bucket subdirectory

(2) Enter the ARN of the role

(3) Choose “TiDB Dumpling” as the Data Format

(4) Enter the user and password of the TiDB Cloud cluster and click “Import”

(5) The data import is in progress

(6) The data is imported successfully
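
A typo in the bucket URL or the role ARN is an easy way to make the import fail, so the two form values can be checked before pasting them in. This is a sketch under assumptions: the `s3://bucket/prefix/` URL form is assumed rather than taken from the text above, and both helper names are hypothetical.

```python
import re

# Twelve-digit account ID followed by a role name (optionally with a path).
ROLE_ARN_PATTERN = re.compile(r"arn:aws:iam::\d{12}:role/[\w+=,.@/-]+")


def build_bucket_url(bucket: str, prefix: str) -> str:
    """Return the S3 URL of the subdirectory, with exactly one trailing slash."""
    return f"s3://{bucket}/{prefix.strip('/')}/"


def is_role_arn(value: str) -> bool:
    """Check that the value has the general shape of an IAM role ARN."""
    return ROLE_ARN_PATTERN.fullmatch(value) is not None
```

For the values in this walkthrough, `build_bucket_url("dumplingtest", "testdata")` gives `s3://dumplingtest/testdata/`, and `is_role_arn` accepts the role ARN from the previous section while rejecting the bucket ARN.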

2. Validate data

(1) Method 1: Use the local client

(2) Method 2: Use the Web SQL Shell

Log in to the TiDB Cloud cluster via Overview > Connect > Web SQL Shell
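
With either method, one simple check is to compare per-table row counts between the source cluster and TiDB Cloud. The sketch below assumes the counts were already collected by running SELECT COUNT(*) on each table in both clusters; `diff_row_counts` and the table names are hypothetical.

```python
from typing import Optional


def diff_row_counts(
    source: dict[str, int], target: dict[str, int]
) -> dict[str, tuple[Optional[int], Optional[int]]]:
    """Return tables whose row counts differ, as {table: (source, target)}.

    A table missing on one side is reported with None for that side.
    """
    mismatches = {}
    for table in sorted(set(source) | set(target)):
        src, dst = source.get(table), target.get(table)
        if src != dst:
            mismatches[table] = (src, dst)
    return mismatches


if __name__ == "__main__":
    # Illustrative numbers only, not results from the test in this document.
    local_counts = {"test.t1": 1000, "test.t2": 250}
    cloud_counts = {"test.t1": 1000, "test.t2": 250}
    print(diff_row_counts(local_counts, cloud_counts) or "row counts match")
```

Matching counts do not prove the data is identical, but a mismatch is a quick signal that some files were not imported.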