How to import CSV data into a new DynamoDB table
Let us convert CSV to DynamoDB JSON to keep the same type information when importing into a new DynamoDB table
Overview
DynamoDB can now import Amazon S3 data into a new table. With this feature, you can import data in CSV, DynamoDB JSON, or Amazon Ion format.
Among these formats, CSV is probably the one we use most often. However, there is a limitation: CSV data is imported as DynamoDB strings. This means you cannot keep type information for number or boolean values in the CSV, which can be a hassle.
When importing from CSV files, all columns other than the hash and range keys of your base table and secondary indexes are imported as DynamoDB strings.
Ref: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/S3DataImport.Format.html
To work around this limitation, we studied how to convert a CSV file to DynamoDB JSON and created a simple conversion script.
This article mainly consists of the following parts.
* Demonstration - How to import a CSV file into a new DynamoDB table
* Demonstration - How to convert the CSV file to DynamoDB JSON and import the DynamoDB JSON into a new DynamoDB table
* Summary
* Reference
Demonstration — How to import a CSV file into a new DynamoDB table
To illustrate the issue, this section demonstrates how to import a CSV file into a new DynamoDB table. Here are the topics in this section.
* Preparation: CSV file
* Preparation: S3
* Preparation: DynamoDB
* Verification
Preparation: CSV file
Let us prepare a simple CSV file. The following file is used for the demonstration. As you can see, it contains string, number, and boolean data.
col1_str,col2_num,col3_bool
"str1",100,true
"str2",200,false
"str3",300,true
Preparation: S3
Let us create an S3 bucket and upload the CSV file prepared in the previous step.
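If you prefer the CLI to the console, the bucket creation and upload can be done as follows. The bucket name, region, and file name are placeholders, so replace them with your own.

```shell
# Create a bucket (the name and region are placeholders).
aws s3 mb s3://your-import-bucket --region ap-northeast-1 --profile your-profile

# Upload the sample CSV file prepared above.
aws s3 cp sample.csv s3://your-import-bucket/sample.csv --profile your-profile
```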
Preparation: DynamoDB
Next, let us use the fully managed feature to import the S3 data into a new DynamoDB table.
- STEP 1: Go to the DynamoDB management console and tap "Imports from S3".
- STEP 2: Tap "Import from S3".
- STEP 3: Select the import options as follows and tap "Next".
Source S3 URL: the URL of the CSV file
Import file format: CSV
- STEP 4: Input the table name and partition key based on your CSV file, then tap "Next".
- STEP 5: Use "Default settings", since these settings are not important for this article.
- STEP 6: Review your settings and tap "Import".
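The console steps above can also be started from the AWS CLI with the aws dynamodb import-table command. The following is a minimal sketch, assuming the bucket and table names used in this demonstration; the key schema must match a column in your CSV file.

```shell
# Start an S3-to-DynamoDB import (bucket, key, and table names are placeholders).
aws dynamodb import-table \
  --s3-bucket-source S3Bucket=your-import-bucket,S3KeyPrefix=sample.csv \
  --input-format CSV \
  --table-creation-parameters '{
    "TableName": "test-demo-1",
    "KeySchema": [{"AttributeName": "col1_str", "KeyType": "HASH"}],
    "AttributeDefinitions": [{"AttributeName": "col1_str", "AttributeType": "S"}],
    "BillingMode": "PAY_PER_REQUEST"
  }' \
  --profile your-profile
```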
Verification
Run the following AWS CLI command to scan the data.
$ aws dynamodb scan --table-name test-demo-1 --profile your-profile
{
"Items": [
{
"col1_str": {
"S": "str2"
},
"col3_bool": {
"S": "false"
},
"col2_num": {
"S": "200"
}
},
{
"col1_str": {
"S": "str1"
},
"col3_bool": {
"S": "true"
},
"col2_num": {
"S": "100"
}
},
{
"col1_str": {
"S": "str3"
},
"col3_bool": {
"S": "true"
},
"col2_num": {
"S": "300"
}
}
],
"Count": 3,
"ScannedCount": 3,
"ConsumedCapacity": null
}
As you can see from the result, all the data in columns col2_num and col3_bool were imported as DynamoDB strings (note the "S" type descriptor on every attribute above). This is a limitation of importing a CSV file from S3 into a new DynamoDB table.
Demonstration — How to convert the CSV file to DynamoDB JSON and import the DynamoDB JSON into a new DynamoDB table
To avoid the above limitation, we will convert the CSV file to DynamoDB JSON and then import the JSON into a new DynamoDB table. Here are the topics in this section.
* Preparation: Convert CSV file to DynamoDB JSON
* Preparation: S3
* Preparation: DynamoDB
* Verification
Preparation: Convert CSV file to DynamoDB JSON
The script that converts a CSV file to DynamoDB JSON is available in the following repository: https://github.com/reotogashi/csv-to-dynamodbjson-converter. You can use this script to generate DynamoDB JSON.
For example, converting the sample CSV data above produces the following content.
output.json
{
"Item": {
"col1_str": {
"S": "str1"
},
"col2_num": {
"N": "100"
},
"col3_bool": {
"BOOL": true
}
}
}
{
"Item": {
"col1_str": {
"S": "str2"
},
"col2_num": {
"N": "200"
},
"col3_bool": {
"BOOL": false
}
}
}
{
"Item": {
"col1_str": {
"S": "str3"
},
"col2_num": {
"N": "300"
},
"col3_bool": {
"BOOL": true
}
}
}
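For reference, the core of such a conversion can be sketched in a few lines of Python. This is a minimal sketch with a simple type-inference rule (true/false becomes BOOL, numeric values become N, everything else becomes S); it is not necessarily identical to the repository's logic.

```python
import csv
import io
import json

def csv_to_dynamodb_json(csv_text):
    """Convert CSV text into newline-delimited DynamoDB JSON items.

    Type inference (an assumption of this sketch): "true"/"false" map
    to BOOL, values that parse as numbers map to N, all else maps to S.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        item = {}
        for key, value in row.items():
            if value in ("true", "false"):
                item[key] = {"BOOL": value == "true"}
            else:
                try:
                    float(value)  # numeric check only
                    item[key] = {"N": value}  # DynamoDB JSON keeps numbers as strings
                except ValueError:
                    item[key] = {"S": value}
        lines.append(json.dumps({"Item": item}))
    return "\n".join(lines)
```

Running this on the sample CSV above yields one DynamoDB JSON item per line, equivalent to the output.json shown earlier.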
Preparation: S3
As in the first demonstration, let us upload the DynamoDB JSON file to your S3 bucket.
Preparation: DynamoDB
All steps are the same as in the first demonstration except STEP 3. Use the following settings in STEP 3.
Source S3 URL: the URL of the DynamoDB JSON file
Import file format: DynamoDB JSON
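With the AWS CLI, the corresponding aws dynamodb import-table invocation only changes the source key and input format. The bucket, key, and table names below are placeholders.

```shell
# Import the DynamoDB JSON file into a new table (names are placeholders).
aws dynamodb import-table \
  --s3-bucket-source S3Bucket=your-import-bucket,S3KeyPrefix=output.json \
  --input-format DYNAMODB_JSON \
  --table-creation-parameters '{
    "TableName": "test-demo-2",
    "KeySchema": [{"AttributeName": "col1_str", "KeyType": "HASH"}],
    "AttributeDefinitions": [{"AttributeName": "col1_str", "AttributeType": "S"}],
    "BillingMode": "PAY_PER_REQUEST"
  }' \
  --profile your-profile
```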
Verification
As in the first demonstration, run the following AWS CLI command to scan the data.
$ aws dynamodb scan --table-name test-demo-2 --profile your-profile
{
"Items": [
{
"col3_num": {
"BOOL": true
},
"col1_str": {
"S": "str2"
},
"col2_num": {
"N": "200"
}
},
{
"col3_num": {
"BOOL": true
},
"col1_str": {
"S": "str1"
},
"col2_num": {
"N": "100"
}
},
{
"col3_num": {
"BOOL": true
},
"col1_str": {
"S": "str3"
},
"col2_num": {
"N": "300"
}
}
],
"Count": 3,
"ScannedCount": 3,
"ConsumedCapacity": null
}
As you can see from the result, all the data in columns col2_num and col3_bool were imported with the same type information as the original CSV data.
Summary
This article introduced the managed feature AWS announced for importing S3 data into a new DynamoDB table and demonstrated its limitation when importing CSV data. It then showed a way around the limitation: convert the CSV to DynamoDB JSON and import the JSON into the new table. With this approach, the limitation is avoided and the original data is imported as expected. We hope this article is helpful when you have the opportunity to import data into DynamoDB. Thank you for reading 🙇
Reference
- https://aws.amazon.com/jp/blogs/database/amazon-dynamodb-can-now-import-amazon-s3-data-into-a-new-table/
- https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/S3DataImport.Format.html
- https://github.com/reotogashi/csv-to-dynamodbjson-converter
- https://dev.classmethod.jp/articles/dynamo-import-s3-csv-string/