How to import CSV data into a new DynamoDB table

Let us convert CSV to DynamoDB JSON to keep type information when importing into a new DynamoDB table

Reo Togashi
Gecogeco


Overview

DynamoDB can now import Amazon S3 data into a new table. With this feature, you can import data in CSV, DynamoDB JSON, or Amazon Ion format.

Ref: https://aws.amazon.com/jp/blogs/database/amazon-dynamodb-can-now-import-amazon-s3-data-into-a-new-table/

Among these formats, CSV is probably the one we use most often. However, there is a limitation: the CSV data is imported as DynamoDB strings. As a result, you cannot keep type information for number or boolean data in the CSV, which can be a hassle.

When importing from CSV files, all columns other than the hash and range keys of your base table and secondary indexes are imported as DynamoDB strings.
Ref: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/S3DataImport.Format.html

To work around this limitation, we studied how to convert a CSV file to DynamoDB JSON and created a simple conversion script.

This article mainly consists of the following parts.

* Demonstration - How to import a CSV file into a new DynamoDB table
* Demonstration - How to convert the CSV file to DynamoDB JSON and import it into a new DynamoDB table
* Summary
* Reference

Demonstration — How to import a CSV file into a new DynamoDB table

To illustrate the issue, this section demonstrates how to import a CSV file into a new DynamoDB table. Here are the topics in this section.

* Preparation: CSV file
* Preparation: S3
* Preparation: DynamoDB
* Verification

Preparation: CSV file

Let us prepare a simple CSV file. The following file is used for the demonstration. As you can see, it contains data with string, number, and boolean types.

col1_str,col2_num,col3_bool
"str1",100,true
"str2",200,false
"str3",300,true

Preparation: S3

Let us create an S3 bucket and upload the CSV file prepared in the previous step.
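
If you prefer to script this step, here is a minimal boto3 sketch. The bucket name and object key are placeholders for illustration, and it assumes your AWS credentials and region are already configured.

import boto3

s3 = boto3.client("s3")
bucket = "your-import-bucket"  # placeholder bucket name

# Create the bucket (outside us-east-1, also pass
# CreateBucketConfiguration={"LocationConstraint": "<your-region>"})
s3.create_bucket(Bucket=bucket)

# Upload the CSV file prepared in the previous step
s3.upload_file("data.csv", bucket, "import/data.csv")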

Preparation: DynamoDB

Next, let us use the fully managed feature to import the S3 data into a new DynamoDB table (a programmatic alternative is sketched after these steps).

  • STEP 1: Go to the DynamoDB management console and choose Imports from S3
  • STEP 2: Choose Import from S3
  • STEP 3: Select the import options as follows and choose Next
Source S3 URL: the S3 URL of the CSV file
Import file format: CSV
  • STEP 4: Enter the Table name and Partition key based on your CSV file, then choose Next
  • STEP 5: Keep the Default settings, since they are not important for this article
  • STEP 6: Review your settings and choose Import
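
The same import can also be started with the boto3 import_table API. The following is a minimal sketch; the bucket, key prefix, and table settings are placeholders matching this demonstration, not values confirmed by the console walkthrough above.

import boto3

dynamodb = boto3.client("dynamodb")

# Start a server-side import from S3 into a brand-new table
response = dynamodb.import_table(
    S3BucketSource={"S3Bucket": "your-import-bucket", "S3KeyPrefix": "import/data.csv"},
    InputFormat="CSV",
    TableCreationParameters={
        "TableName": "test-demo-1",
        "AttributeDefinitions": [{"AttributeName": "col1_str", "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": "col1_str", "KeyType": "HASH"}],
        "BillingMode": "PAY_PER_REQUEST",
    },
)
print(response["ImportTableDescription"]["ImportStatus"])  # e.g. IN_PROGRESS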

Verification

Run the following AWS CLI command to scan the data.

$ aws dynamodb scan --table-name test-demo-1 --profile your-profile
{
    "Items": [
        {
            "col1_str": {
                "S": "str2"
            },
            "col3_bool": {
                "S": "false"
            },
            "col2_num": {
                "S": "200"
            }
        },
        {
            "col1_str": {
                "S": "str1"
            },
            "col3_bool": {
                "S": "true"
            },
            "col2_num": {
                "S": "100"
            }
        },
        {
            "col1_str": {
                "S": "str3"
            },
            "col3_bool": {
                "S": "true"
            },
            "col2_num": {
                "S": "300"
            }
        }
    ],
    "Count": 3,
    "ScannedCount": 3,
    "ConsumedCapacity": null
}

As you can see from the result, all the data in columns col2_num and col3_bool were imported as DynamoDB strings (note the "S" type descriptor on every attribute above). This is the limitation of importing a CSV file from S3 into a new DynamoDB table.

Demonstration — How to convert the CSV file to DynamoDB JSON and import it into a new DynamoDB table

To avoid the above limitation, we will convert the CSV file to DynamoDB JSON and then import the JSON into a new DynamoDB table. Here are the topics in this section.

* Preparation: Convert CSV file to DynamoDB JSON
* Preparation: S3
* Preparation: DynamoDB
* Verification

Preparation: Convert CSV file to DynamoDB JSON

A conversion script from CSV to DynamoDB JSON is prepared in the repository below. You can use this script to convert your CSV file and generate DynamoDB JSON.
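
To make the idea concrete, here is a minimal sketch of the conversion logic. The file names and the type-inference rules are illustrative assumptions and are not necessarily identical to the script in the repository.

import csv
import json

def to_attr(value: str) -> dict:
    # Map a raw CSV string to a typed DynamoDB attribute value
    if value.lower() in ("true", "false"):
        return {"BOOL": value.lower() == "true"}
    try:
        float(value)
        return {"N": value}  # DynamoDB JSON carries numbers as strings
    except ValueError:
        return {"S": value}

with open("data.csv", newline="") as src, open("output.json", "w") as dst:
    for row in csv.DictReader(src):
        item = {col: to_attr(val) for col, val in row.items()}
        # The import format is a sequence of {"Item": ...} objects
        dst.write(json.dumps({"Item": item}) + "\n")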

For example, generating DynamoDB JSON from the sample CSV data above yields the following content.

output.json

{
    "Item": {
        "col1_str": {
            "S": "str1"
        },
        "col2_num": {
            "N": "100"
        },
        "col3_bool": {
            "BOOL": true
        }
    }
}
{
    "Item": {
        "col1_str": {
            "S": "str2"
        },
        "col2_num": {
            "N": "200"
        },
        "col3_bool": {
            "BOOL": false
        }
    }
}
{
    "Item": {
        "col1_str": {
            "S": "str3"
        },
        "col2_num": {
            "N": "300"
        },
        "col3_bool": {
            "BOOL": true
        }
    }
}

Preparation: S3

As in the first demonstration, let us upload the DynamoDB JSON file to your S3 bucket.

Preparation: DynamoDB

All steps are the same as in the first demonstration except for STEP 3, where you should use the following settings.

Source S3 URL: the S3 URL of the DynamoDB JSON file
Import file format: DynamoDB JSON
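
If you started the first import with the boto3 sketch shown earlier, the only changes here would be the input format and the source key (again, the names are placeholders).

response = dynamodb.import_table(
    S3BucketSource={"S3Bucket": "your-import-bucket", "S3KeyPrefix": "import/output.json"},
    InputFormat="DYNAMODB_JSON",
    TableCreationParameters={
        "TableName": "test-demo-2",
        "AttributeDefinitions": [{"AttributeName": "col1_str", "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": "col1_str", "KeyType": "HASH"}],
        "BillingMode": "PAY_PER_REQUEST",
    },
)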

Verification

As in the first demonstration, run the following AWS CLI command to scan the data.

$ aws dynamodb scan --table-name test-demo-2 --profile your-profile
{
    "Items": [
        {
            "col3_bool": {
                "BOOL": false
            },
            "col1_str": {
                "S": "str2"
            },
            "col2_num": {
                "N": "200"
            }
        },
        {
            "col3_bool": {
                "BOOL": true
            },
            "col1_str": {
                "S": "str1"
            },
            "col2_num": {
                "N": "100"
            }
        },
        {
            "col3_bool": {
                "BOOL": true
            },
            "col1_str": {
                "S": "str3"
            },
            "col2_num": {
                "N": "300"
            }
        }
    ],
    "Count": 3,
    "ScannedCount": 3,
    "ConsumedCapacity": null
}

As you can see from the result, all the data in columns col2_num and col3_bool were imported with the same type information as the original CSV data.

Summary

This article introduced the built-in functionality AWS announced for importing S3 data into a new DynamoDB table and demonstrated its limitation when importing CSV data. It then showed a way to get around the limitation: convert the CSV to DynamoDB JSON and import the JSON into the new table. With this approach, the limitation is avoided and the original data is imported with its types intact. We hope this article will be useful when you have the opportunity to import data into DynamoDB. Thank you for reading 🙇

Reference

* Amazon DynamoDB can now import Amazon S3 data into a new table: https://aws.amazon.com/jp/blogs/database/amazon-dynamodb-can-now-import-amazon-s3-data-into-a-new-table/
* Amazon S3 import formats for DynamoDB: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/S3DataImport.Format.html