Streamline Data Transfer with AWS DataSync: A Comprehensive Guide

Data Science

by Sunny Srinidhi - March 9, 2024

Discover the power of AWS DataSync for seamless, secure, and accelerated data transfers. Learn how to optimise workflows with ease!

Enhancing Data Security and Privacy in the Cloud with AWS Clean Rooms

Data Science

by Sunny Srinidhi - May 26, 2023 (updated January 17, 2024)

Data security and privacy in the cloud are becoming crucial as more organisations embrace cloud computing and cloud storage. In this post, we’ll see how AWS Clean Rooms can help maintain both.

Use Amazon CloudSearch to quickly search through data

Tech

by Sunny Srinidhi - March 29, 2023 (updated January 17, 2024)

Amazon CloudSearch provides a number of powerful search capabilities, including full-text search, faceted search, and customizable relevance ranking. In this post, we’ll see what CloudSearch is

Cleaning and Normalizing Data Using AWS Glue DataBrew

Data Science

by Sunny Srinidhi - January 17, 2022

In this post, we’ll see what AWS Glue DataBrew is and how to use it to clean and transform data in a data pipeline.

Invoke an AWS Lambda Function from another Lambda Function

by Sunny Srinidhi - November 4, 2019

I recently discovered that you can't invoke more than one Lambda function in AWS for an S3 event with the same prefix and suffix (or just with the same suffix, which was the issue in my case). So I wanted a way to invoke one Lambda function from another. If you're feeling lost, check out the problem statement in my GitHub project, which should add some context. If you don't want to go there, I'll explain it again here.

The Problem and the Requirement

In one of our projects, we have a Lambda function which is invoked whenever a text file is uploaded to a particular S3 bucket. The Lambda function takes
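One way to chain two functions like this is for the first function to call the Lambda Invoke API for the second. The sketch below is only an illustration in Python, not necessarily the approach the full post takes. The helper `invoke_next`, the function name `second-function`, and the payload shape are all hypothetical, and the client passed in is assumed to be a boto3 Lambda client.

```python
import json


def invoke_next(lambda_client, function_name, payload):
    """Asynchronously invoke another Lambda function.

    `lambda_client` is assumed to be boto3.client("lambda").
    `function_name` and `payload` are hypothetical values supplied
    by the caller; `payload` must be JSON-serialisable.
    """
    response = lambda_client.invoke(
        FunctionName=function_name,
        # "Event" queues the call and returns immediately;
        # "RequestResponse" would wait for the second function to finish.
        InvocationType="Event",
        Payload=json.dumps(payload).encode("utf-8"),
    )
    # The service reports the outcome of the request in StatusCode.
    return response.get("StatusCode")


# Hypothetical wiring inside the first function's handler:
#   import boto3
#   invoke_next(boto3.client("lambda"), "second-function", {"key": "uploads/a.txt"})
```

The first function's execution role would also need permission to call `lambda:InvokeFunction` on the second function.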

How to automatically trigger AWS Lambda functions using CloudWatch

Tech

by Sunny Srinidhi - November 2, 2019

If you have AWS Lambda functions which need to be triggered periodically, like cron jobs, there are many ways to achieve this. But I recently discovered a very easy, AWS-native way of doing it, which makes life a lot easier. One of the most common approaches I've seen is adding an API Gateway to the Lambda function and then calling that API periodically from a cron job on one of the machines in the setup. I actually thought this is how you're supposed to do it. Okay, let me make this clear. I'm not a DevOps guy. I just learn these things as and when
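The CloudWatch route named in the title boils down to a scheduled rule that targets the function. The following is a rough Python sketch under stated assumptions, not the steps from the full post: `rate_expression` and `schedule_lambda` are hypothetical helpers, the rule name and function ARN are placeholders, and the client is assumed to be a boto3 CloudWatch Events client.

```python
def rate_expression(value, unit):
    """Build a schedule expression such as "rate(5 minutes)".

    The unit is singular when the value is 1 and plural otherwise.
    """
    if unit not in ("minute", "hour", "day"):
        raise ValueError("unit must be 'minute', 'hour', or 'day'")
    if not isinstance(value, int) or value < 1:
        raise ValueError("value must be a positive integer")
    suffix = "" if value == 1 else "s"
    return f"rate({value} {unit}{suffix})"


def schedule_lambda(events_client, rule_name, function_arn, value, unit):
    """Create or update a scheduled rule and point it at a Lambda function.

    `events_client` is assumed to be boto3.client("events"); `rule_name`
    and `function_arn` are hypothetical placeholders.
    """
    events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=rate_expression(value, unit),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],
    )
```

When a rule is created through the API rather than the console, the function also needs a resource-based permission that allows the events service to invoke it.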

Put data to Amazon Kinesis Firehose delivery stream using Spring Boot

by Sunny Srinidhi - September 26, 2019 (updated February 12, 2020)

If you work with streams of big data which have to be collected, transformed, and analysed, you have almost certainly heard of Amazon Kinesis Firehose. It is an AWS service used to load streams of data into data lakes or analytical tools, optionally compressing, transforming, or encrypting the data along the way. You can use Firehose to load streaming data into something like S3 or Redshift. From there, you can use a SQL query engine such as Amazon Athena to query this data. You can even connect this data to your BI tool and get real-time analytics. This could be very useful in applications where real-time analysis of data is necessary. In this post, we'll see

Query data from S3 files using Amazon Athena

by Sunny Srinidhi - September 24, 2019 (updated March 7, 2020)

Amazon Athena is defined as "an interactive query service that makes it easy to analyze data directly in Amazon Simple Storage Service (Amazon S3) using standard SQL." So, it's another SQL query engine for large data sets stored in S3. This is very similar to other SQL query engines, such as Apache Drill. But unlike Apache Drill, Athena is limited to data from Amazon's own S3 storage service. However, Athena is able to query a variety of file formats, including CSV, Parquet, and JSON. In this post, we'll see how we can set up a table in Athena using a sample data set stored in S3 as a .csv file. But for this, we first need
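Setting up a table over a CSV file in S3 comes down to a `CREATE EXTERNAL TABLE` statement that names the columns, the delimiter, and the S3 folder. The Python sketch below only builds such a statement as a string. The helper `csv_table_ddl`, the database, table, and column names, and the bucket are all hypothetical, and the statement is a plausible starting point rather than the exact DDL used in the full post.

```python
def csv_table_ddl(database, table, columns, s3_location, skip_header=True):
    """Return an Athena DDL statement for a comma-separated data set in S3.

    `columns` is a list of (name, type) pairs, for example
    [("id", "int"), ("name", "string")]. `s3_location` must point at
    a folder, so it has to end with a slash.
    """
    if not s3_location.startswith("s3://") or not s3_location.endswith("/"):
        raise ValueError("s3_location must look like 's3://bucket/folder/'")
    column_sql = ",\n  ".join(f"`{name}` {sql_type}" for name, sql_type in columns)
    ddl = (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"  {column_sql}\n"
        ")\n"
        "ROW FORMAT DELIMITED\n"
        "  FIELDS TERMINATED BY ','\n"
        f"LOCATION '{s3_location}'"
    )
    if skip_header:
        # Ignore the first line of each file when it holds column names.
        ddl += "\nTBLPROPERTIES ('skip.header.line.count' = '1')"
    return ddl


# Hypothetical usage; paste the printed statement into the Athena query editor.
print(csv_table_ddl(
    "sampledb",
    "orders",
    [("id", "int"), ("name", "string")],
    "s3://example-bucket/orders/",
))
```

This plain delimited format does not handle quoted fields that contain commas; data like that would need a CSV-aware SerDe instead.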

Integrate AWS DynamoDB with Spring Boot

Tech

by Sunny Srinidhi - June 26, 2019 (updated March 12, 2020)

Here is another POC to add to the growing list of POCs on my GitHub profile. Today, we’ll see how to integrate AWS DynamoDB with a Spring Boot application. This is going to be super simple, thanks to the AWS Java SDK and the Spring Data DynamoDB package. Let’s get started then.

Dependencies

First, as usual, we need to create a Spring Boot project, the dependencies of which look like:

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter</artifactId>
    </dependency>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-dynamodb</artifactId>
        <version>1.11.573</version>