Lambda Python Upload Json File to S3

Ultimate AWS Lambda Python Tutorial with Boto3

AWS Lambda is a Function-as-a-Service offering from Amazon Web Services. AWS essentially created the category in 2014 with the launch of Lambda. AWS Lambda provides on-demand execution of code without the need for an always-available server to respond to the appropriate request.

AWS Lambda functions are run in containers that are dynamically created and deleted as needed based upon your application's execution characteristics, allowing you to pay for only the compute you use rather than the constant availability of services like EC2.

Why Lambda with Python?

The 2020 Stack Overflow Developer Survey named Python one of the most loved languages. Python has a strong foothold in a number of different industries, from web development to artificial intelligence and data science. It's only natural, then, that many developers would like to rely on Python when working with their serverless functions. In this article, we'll discuss using Python with AWS Lambda, exploring the process of testing and deploying serverless Python functions.

Given that the power of serverless technology is in the uniformity of its containers, there are some restrictions on the runtime and environment when working with Python in Lambda. The first is related to how Lambda deploys the code – you'll encounter extensive duplication of the deployed code itself. Additionally, Lambda only supports specific Python runtimes – see the list here.

In terms of the execution environment, there are a couple of things to be aware of:

  • You are only given limited access to the operating system layer, meaning that you cannot rely upon operating-system-level packages.
  • Lambda containers are extremely short-lived and are regularly recycled for use elsewhere in the system. This means you cannot rely on operating-system aspects of your application, such as local file storage, persisting between invocations (see the sketch after this list).
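
To make the second point concrete, here is a minimal sketch (the handler body and file name are hypothetical) of the one safe way to use local storage in a Lambda function: write to /tmp and treat it as scratch space for a single invocation only, never as persistent state.

import os
import tempfile

def lambda_handler(event, context):
    # /tmp is the only writable path in the Lambda environment, and its
    # contents may disappear whenever the container is recycled, so use it
    # only as per-invocation scratch space.
    scratch_path = os.path.join(tempfile.gettempdir(), "scratch.json")
    with open(scratch_path, "w") as f:
        f.write('{"note": "temporary working data"}')
    return {"scratch_file": scratch_path}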

What is Boto3?

To make integration with AWS services easy for the Python language, AWS has come up with an SDK called boto3. It enables a Python application to integrate with S3, DynamoDB, SQS, and many more services. In Lambda functions, it has become very popular for talking to those services to store, retrieve, and delete data.

In this article, we will try to understand boto3's key features and how to use them to build a Lambda function.

Boto3 Key Features

Resource APIs

Resource APIs provide resource objects and collections to access attributes and perform actions, hiding the low-level network calls. Resources represent an object-oriented interface to AWS. To get one, call the resource() method of a default session and pass in an AWS service name. For example:

import boto3

sqs = boto3.resource('sqs')
s3 = boto3.resource('s3')

Every resource instance has attributes and methods that are split up into identifiers, attributes, actions, references, sub-resources, and collections.

Resources can also be split into service resources (like sqs, s3, ec2, etc.) and individual resources (like sqs.Queue or s3.Bucket). Service resources do not have identifiers or attributes; otherwise, the two share the same components.

Identifiers

An identifier is a unique value used by a resource instance to call actions.

Resources must have at least one identifier, except for service resources (e.g. sqs or s3).

For example:

# S3 Object (bucket_name and key are identifiers)
obj = s3.Object(bucket_name='boto3', key='test.py')

Action

An action is a method that makes a service call. Actions may return a low-level response, a list of new resource instances, or a new resource instance. For example:

messages = queue.receive_messages()

References

A reference is just like an attribute. It may be None or a related resource instance. The resource instance does not share identifiers with its referenced resource; it is not a strict parent-to-child relationship. For instance:

instance.subnet
instance.vpc

Sub-resources

A sub-resource is similar to a reference. The only difference is that it is a related class rather than an instance. When we instantiate a sub-resource, it shares identifiers with its parent; it is a strict parent-child relationship.

queue = sqs.Queue(url='...')
message = queue.Message(receipt_handle='...')

Collections

A collection provides an iterable interface to a group of resources, and helps iterate over all items of a resource type. For example:

sqs = boto3.resource('sqs')
for queue in sqs.queues.all():
    print(queue.url)

Waiters

A waiter is similar to an action. A waiter polls the status of a resource to check whether it has reached a particular state. If the resource reaches the polled state, execution continues; otherwise the waiter keeps polling until a failure occurs. For example, we can create a bucket and use a waiter to wait until it is ready before retrieving objects:

bucket.wait_until_exists()
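
Putting the pieces together, a minimal sketch might look like the following (the bucket name is hypothetical, and regions other than us-east-1 would also need a CreateBucketConfiguration argument):

import boto3

s3 = boto3.resource('s3')
# create_bucket returns an s3.Bucket resource instance
bucket = s3.create_bucket(Bucket='boto3-waiter-demo')
# Poll until S3 reports the bucket exists before reading or writing objects
bucket.wait_until_exists()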

Service-specific High-level Features

Boto3 comes with several other service-specific features, such as automated multipart transfers for Amazon S3 and simplified query conditions for DynamoDB.
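
For instance, S3's managed transfer functions split large uploads into parts for you; here is a minimal sketch (the file name, bucket, and threshold are assumptions):

import boto3
from boto3.s3.transfer import TransferConfig

s3_client = boto3.client('s3')
# Files larger than the threshold are uploaded as parallel multipart chunks
config = TransferConfig(multipart_threshold=8 * 1024 * 1024)
s3_client.upload_file('large_backup.tar.gz', 'my-example-bucket',
                      'backups/large_backup.tar.gz', Config=config)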

Building an AWS Lambda Application with Python Using Boto3

Now, we have an idea of what Boto3 is and what features it provides. Let's build a simple Python serverless application with Lambda and Boto3.

The use case: when a file gets uploaded to an S3 bucket, a Lambda function is triggered to read this file and store its contents in a DynamoDB table. The architecture is simple: an upload to the S3 bucket triggers the Lambda function, which writes the data to the DynamoDB table.

We want to use the Python language for this use case, so we will take advantage of the boto3 SDK to speed up our development work.

Step 1: Create a JSON File

Let's first create a small JSON file with some sample customer data. This is the file we will upload to the S3 bucket. Let's name it data.json.

#data.json
{
  "customerId": "xy100",
  "firstName": "Tom",
  "lastName": "Alter",
  "status": "active",
  "isPremium": true
}

Step 2: Create S3 Bucket

Now, let's create an S3 bucket where the JSON file will be uploaded. Let's name it boto3customer. We have created the bucket with all the default settings for this example.
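
If you prefer to script this step rather than use the console, a minimal sketch looks like this (bucket names are globally unique, and this call assumes the default us-east-1 region; other regions need a CreateBucketConfiguration argument):

import boto3

s3_client = boto3.client('s3')
# Create the bucket with default settings, as in the console walkthrough
s3_client.create_bucket(Bucket='boto3customer')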

Step 3: Create DynamoDB Table

Let's create a DynamoDB table (customer) where we will store the JSON data. Mark customerId as the partition key. We need to ensure that our data.json file has this field when inserting into the table, or else the write will complain about the missing key.
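
The table can also be created from code; here is a minimal sketch, assuming on-demand billing rather than the console defaults:

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.create_table(
    TableName='customer',
    KeySchema=[{'AttributeName': 'customerId', 'KeyType': 'HASH'}],
    AttributeDefinitions=[{'AttributeName': 'customerId', 'AttributeType': 'S'}],
    BillingMode='PAY_PER_REQUEST',  # assumption: on-demand capacity
)
# Wait until the table is active before writing to it
table.wait_until_exists()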

Step 4: Create Lambda Function

Here, we need to first create an IAM role that has access to CloudWatch Logs, S3, and DynamoDB so the function can interact with these services. Then, we will write code using boto3 to download the data, parse it, and save it into the customer DynamoDB table. Finally, we create a trigger that integrates the S3 bucket with Lambda, so that once we push the file to the bucket, it is picked up by the Lambda function.

Let's first create an IAM role. The IAM role needs to have at least read access to S3, write access to DynamoDB, and full access to the CloudWatch Logs service to log every event transaction.
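
If you script the role setup, attaching AWS managed policies is the quickest way to grant these permissions. A minimal sketch (it assumes the LambdaS3DyanamoDB role already exists with the Lambda trust policy, and AmazonDynamoDBFullAccess is broader than the write-only access described above):

import boto3

iam = boto3.client('iam')
for policy_arn in [
    'arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess',
    'arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess',   # broader than strictly needed
    'arn:aws:iam::aws:policy/CloudWatchLogsFullAccess',
]:
    iam.attach_role_policy(RoleName='LambdaS3DyanamoDB', PolicyArn=policy_arn)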

Now, create a function. Give it a unique name and select Python 3.7 as the runtime language.

Now, select the role LambdaS3DyanamoDB we created in the earlier step and hit the Create function button.

Now follow the steps below for the Lambda function:

  • Write the Python code using the boto3 resource API to load the service instance objects.
  • An event object is used to pass the metadata of the file (S3 bucket, filename).
  • Then, using action methods of s3_client, load the S3 file data into a JSON object.
  • Parse the JSON data and save it into the DynamoDB table (customer).
# Code snippet
import json
import boto3

dynamodb = boto3.resource('dynamodb')
s3_client = boto3.client('s3')
table = dynamodb.Table('customer')

def lambda_handler(event, context):
    # Retrieve file information from the S3 event notification
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    s3_file_name = event['Records'][0]['s3']['object']['key']

    # Load the object's content from S3
    json_object = s3_client.get_object(Bucket=bucket_name, Key=s3_file_name)
    jsonFileReader = json_object['Body'].read()
    jsonDict = json.loads(jsonFileReader)

    # Save the data in the DynamoDB table
    table.put_item(Item=jsonDict)

Next, create an S3 trigger (a scripted equivalent is sketched after the steps below):

  • Select the bucket name created in the earlier step.
  • Select the event type as "All object create events".
  • Enter the suffix as .json
  • Click on the Add button
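
The console wires this up for you. If you script it instead, the same configuration takes two calls; in the sketch below the function ARN, account ID, and statement ID are placeholders:

import boto3

lambda_client = boto3.client('lambda')
s3_client = boto3.client('s3')

function_arn = 'arn:aws:lambda:us-east-1:123456789012:function:customerupdate'  # placeholder ARN

# Allow the S3 bucket to invoke the Lambda function
lambda_client.add_permission(
    FunctionName=function_arn,
    StatementId='s3-invoke-customerupdate',
    Action='lambda:InvokeFunction',
    Principal='s3.amazonaws.com',
    SourceArn='arn:aws:s3:::boto3customer',
)

# Send all object-create events for keys ending in .json to the function
s3_client.put_bucket_notification_configuration(
    Bucket='boto3customer',
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [{
            'LambdaFunctionArn': function_arn,
            'Events': ['s3:ObjectCreated:*'],
            'Filter': {'Key': {'FilterRules': [{'Name': 'suffix', 'Value': '.json'}]}},
        }]
    },
)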

The Lambda function is now ready with all the configuration and setup.

Test the Lambda Function

Let's test this customer update Lambda function.

  • First, we need to upload a JSON file to the S3 bucket boto3customer (a scripted upload is sketched after this list).
  • As soon as the file gets uploaded to the S3 bucket, it triggers the customer update Lambda function.
  • It executes the code that receives the metadata of the file through the event object and loads the file content using boto3 APIs.
  • Then, it saves the content to the customer table in DynamoDB.
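
A minimal sketch of the upload step from code (the local path and object key are assumptions):

import boto3

s3_client = boto3.client('s3')
# Any key ending in .json matches the suffix filter and triggers the function
s3_client.upload_file('data.json', 'boto3customer', 'data.json')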

We can see the file content got saved in the DynamoDB table.
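
To confirm the write from code rather than the console, a quick lookup by partition key will do (a minimal sketch):

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('customer')
response = table.get_item(Key={'customerId': 'xy100'})
print(response.get('Item'))  # the customer record saved by the Lambda function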

This completes our use case implementation of a serverless application with the Python runtime using the boto3 library.

Tips and Tricks for Working with Python in AWS Lambda

AWS Lambda functions are a powerful tool for developing data flow pipelines and application request handlers without the need for dedicated server resources, but these functions are not without their downsides. The following tips and tricks may help:

  • Make extensive use of unit testing – given the distributed nature of serverless functions, having verification in place is important for peace of mind when deploying functions that run infrequently. While unit and functional testing cannot fully protect against all problems, it gives you the confidence you need to deploy your changes.
  • Beware of time limits – AWS Lambda functions have inherent time limits on execution. These can go as high as 900 seconds but default to 3. If your function is likely to require long run times, ensure you configure this value properly (see the sketch after this list).
  • Leverage the pattern – It's easy enough to take a simple Flask app and translate it into a monoservice Lambda function, but there are a lot of potential efficiency gains to be made from moving a step or two beyond the basics. Take the time to rebuild your application to take advantage of the event-driven nature of Lambda processing, and you can improve user experience with minimal effort.
  • Make use of third parties – It's powerful to have everything under your control, but it can also be distracting and time-consuming. Tools like Lumigo can provide you with automated tracing and online debugging, removing the need to develop these tools yourself.
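
As noted in the time-limits tip above, the timeout is a per-function setting. A minimal sketch of raising it from code (the function name is a placeholder):

import boto3

lambda_client = boto3.client('lambda')
# Raise the timeout from the 3-second default; the hard ceiling is 900 seconds
lambda_client.update_function_configuration(
    FunctionName='customerupdate',
    Timeout=120,
)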

Summary

AWS Lambda is a powerful tool for event-driven and intermittent workloads. The dynamic nature of Lambda functions means that you can create and iterate on your functionality right away, instead of having to spend cycles getting basic infrastructure running correctly and scaling properly.

Python brings a powerful language with a robust ecosystem into the serverless realm and can be a powerful tool with some careful application of best practices. In addition to general development practices focused on maintainability, tools like Lumigo can expand your serverless metrics reporting and give you the power you need to drive user value as quickly as possible.

Learn how easy AWS Lambda monitoring can be with Lumigo


Source: https://lumigo.io/learn/aws-lambda-python/
