Streamline Data Transfer with AWS DataSync: A Comprehensive Guide
Data Science | by Sunny Srinidhi - March 9, 2024
Discover the power of AWS DataSync for seamless, secure, and accelerated data transfers. Learn how to optimise workflows with ease!
Enhancing Data Security and Privacy in the Cloud with AWS Clean Rooms
Data Science | by Sunny Srinidhi - May 26, 2023 (updated January 17, 2024)
Data security and privacy in the cloud are becoming crucial as more organisations embrace cloud computing and cloud storage. In this post, we’ll see how AWS Clean Rooms can help maintain data security and privacy.
Cleaning and Normalizing Data Using AWS Glue DataBrew
Data Science | by Sunny Srinidhi - January 17, 2022
In this post, we’ll see what AWS Glue DataBrew is and how to use it to clean and transform our data in a data pipeline.
Getting started with Chalice to create AWS Lambdas in Python – Step by Step Tutorial
Tech | by Sunny Srinidhi - November 14, 2019
Using Chalice, you can write a Lambda function, test it locally, and even deploy the Lambda function to your development, test, or production environments. In this post, we’ll see how we can install Chalice on our local machines, write a simple REST API to return the famous “Hello, world!” response, and deploy it to a dev stage on AWS Lambda.
Invoke an AWS Lambda Function from another Lambda Function
Data Science, Tech | by Sunny Srinidhi - November 4, 2019
I recently discovered that you can't invoke more than one Lambda function in AWS for an S3 event with the same prefix and suffix (or just with the same suffix, which was the issue in my case). So I wanted a way to invoke one Lambda function from another Lambda function. If you're feeling kind of lost, check out the problem statement in my GitHub project. That could possibly add some context to the problem. If you don't want to go there, I'll try to explain it here again.
The Problem and the Requirement
In one of our projects, we have a Lambda function which is invoked whenever a text file is uploaded to a particular S3 bucket. The Lambda function takes
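One way to chain the two functions is for the first Lambda to call the second through the AWS SDK. The sketch below is a minimal illustration under stated assumptions, not the code from the GitHub project: the target name `second-function`, the helper `invoke_next`, and the payload shape are hypothetical. The client is passed in so the handler can be exercised without AWS credentials; in a deployed function it would typically be `boto3.client("lambda")`, whose `invoke` call accepts `FunctionName`, `InvocationType`, and `Payload`.

```python
import json


def invoke_next(lambda_client, function_name, payload):
    """Asynchronously invoke another Lambda function and return the HTTP status code."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="Event",  # fire-and-forget; "RequestResponse" waits for a result
        Payload=json.dumps(payload).encode("utf-8"),
    )
    return response["StatusCode"]


def handler(event, context, lambda_client=None):
    """Entry point for the first Lambda, triggered by an S3 upload event."""
    if lambda_client is None:
        import boto3  # imported lazily so the sketch can be tested without boto3

        lambda_client = boto3.client("lambda")

    # Forward the bucket and key of every uploaded object to the second function.
    statuses = []
    for record in event.get("Records", []):
        payload = {
            "bucket": record["s3"]["bucket"]["name"],
            "key": record["s3"]["object"]["key"],
        }
        # "second-function" is a hypothetical name for the downstream Lambda.
        statuses.append(invoke_next(lambda_client, "second-function", payload))
    return statuses
```

The first function's execution role would also need permission to invoke the second function (`lambda:InvokeFunction`), otherwise the call is rejected.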
How to automatically trigger AWS Lambda functions using CloudWatch
Tech | by Sunny Srinidhi - November 2, 2019
If you have AWS Lambda functions which need to be triggered periodically, like CRON jobs, there are many ways to achieve this. But I recently discovered a very easy, AWS-native way of doing this, which makes life a lot easier. So, there are a lot of ways you can trigger Lambda functions periodically. One of the most common ways I've seen people do this is adding an API Gateway to the Lambda function, and then calling that API periodically as a CRON job from one of the machines in the setup. I actually thought this is how you're supposed to do that. Okay, let me make this clear. I'm not a DevOps guy. I just learn these things as and when
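On the Lambda side, a scheduled invocation arrives as an ordinary event. The sketch below assumes the function is attached to a CloudWatch Events rule with a schedule expression such as `rate(5 minutes)`; such events carry `"source": "aws.events"` and `"detail-type": "Scheduled Event"`. The helper names and the `run_periodic_job` placeholder are hypothetical and stand in for whatever work needs to run periodically.

```python
def is_scheduled_event(event):
    """Return True if the event looks like a CloudWatch Events scheduled invocation."""
    return (
        event.get("source") == "aws.events"
        and event.get("detail-type") == "Scheduled Event"
    )


def run_periodic_job():
    """Hypothetical placeholder for the work that should run on a schedule."""
    return "job finished"


def handler(event, context):
    """Lambda entry point: run the job only for scheduled invocations."""
    if not is_scheduled_event(event):
        return {"ran": False, "reason": "not a scheduled event"}
    return {
        "ran": True,
        "result": run_periodic_job(),
        "scheduled_time": event.get("time"),  # timestamp set by the rule
    }
```

With this setup no API Gateway or external CRON machine is involved; the rule itself invokes the function.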
Integrate AWS DynamoDB with Spring Boot
Tech | by Sunny Srinidhi - June 26, 2019 (updated March 12, 2020)
Here is another POC to add to the growing list of POCs on my GitHub profile. Today, we’ll see how to integrate AWS DynamoDB with a Spring Boot application. This is going to be super simple, thanks to the AWS Java SDK and the Spring Data DynamoDB package. Let’s get started then.
Dependencies
First, as usual, we need to create a Spring Boot project, the dependencies of which look like:
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter</artifactId>
    </dependency>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-dynamodb</artifactId>
        <version>1.11.573</version>
Real-Time Data Processing: Understanding the What, Why, Where, Who, and How
Data Science, Tech | by Sunny Srinidhi - October 22, 2024
In today’s data-driven world, businesses and organizations are continuously generating massive amounts of data. While processing data in batch mode remains useful, the need for instant decision-making has led to an increasing focus on real-time data processing. This article delves into what real-time data processing is, why it's essential, its various applications, the tools used to achieve it, trends shaping its evolution, and real-world use cases.
What is Real-Time Data Processing?
Real-time data processing refers to the capability to continuously ingest, process, and output data as soon as it is generated, with minimal latency. Unlike batch processing, which collects and processes data in large groups at set intervals (e.g., daily or hourly), real-time processing works with data immediately as it becomes available,
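The contrast between the two modes can be illustrated with a small, hedged sketch. The function names and the sample readings are hypothetical; the point is only that the batch version needs the whole collection before it produces an answer, while the streaming version yields an updated answer after every event.

```python
from statistics import mean


def batch_average(readings):
    """Batch style: wait until all readings are collected, then process them together."""
    return mean(readings)


def streaming_average(readings):
    """Streaming style: update the result as each reading arrives."""
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        yield total / count  # an up-to-date answer is available after every event


if __name__ == "__main__":
    samples = [10, 20, 30]  # hypothetical sensor readings
    print(batch_average(samples))
    for running in streaming_average(samples):
        print(running)
```

In a real pipeline the `readings` iterable would be fed by a message queue or stream rather than an in-memory list.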
The Trend of Cloud Repatriation: Moving Back to On-Premises Infrastructure
Data Science, Tech | by Sunny Srinidhi - October 16, 2024
In recent years, a shift in IT infrastructure strategies has seen many companies moving workloads away from public cloud services and back to on-premises setups or private cloud environments. This movement, known as "cloud repatriation," is driven by various factors that range from cost management to performance, security, and compliance concerns. While public cloud adoption surged over the past decade, the limitations of this model have led organizations to reconsider their approach, resulting in a hybrid IT strategy combining both on-premises and cloud resources.
Why Companies Are Moving Back to On-Premises
1. Cost Considerations
One of the most prominent factors driving cloud repatriation is the realization of the high costs associated with public cloud services. While the cloud offers scalability and flexibility, many companies
Exploring the Inner Workings of Google BigQuery: A Deep Dive into Design, Competitors, Use Cases, and Pros/Cons
Data Science | by Sunny Srinidhi - March 13, 2024
Discover the inner workings of Google BigQuery, a game-changer in big data analytics. Unravel its architecture, including the prowess of its distributed query engine, Dremel, and the innovative Capacitor technology. Compare it with competitors, explore diverse use cases from real-time analytics to healthcare, and weigh its pros and cons. Join us on a journey into the heart of data analytics excellence.