You are here
Home > Search Results for "s3"

How to build a simple data lake using Amazon Kinesis Data Firehose and Amazon S3

data lake

In this post, we’ll see how we can create a very simple, yet highly scalable data lake using Amazon’s Kinesis Data Firehose and Amazon’s S3.

Query data from S3 files using Amazon Athena

amazon athena

Amazon Athena is defined as "an interactive query service that makes it easy to analyze data directly in Amazon Simple Storage Service (Amazon S3) using standard SQL." So, it's another SQL query engine for large data sets stored in S3. This is very similar to other SQL query engines, such as Apache Drill. But unlike Apache Drill, Athena is limited to data only from Amazon's own S3 storage service. However, Athena is able to query a variety of file formats, including, but not limited to CSV, Parquet, JSON, etc. In this post, we'll see how we can setup a table in Athena using a sample data set stored in S3 as a .csv file. But for this, we first need

Use Amazon CloudSearch to quickly search through data

was-cloudsearch

Amazon CloudSearch provides a number of powerful search capabilities, including full-text search, faceted search, and customizable relevance ranking. In this post, we’ll see what CloudSearch is

Cleaning and Normalizing Data Using AWS Glue DataBrew

stephen-dawson-qwtCeJ5cLYs-unsplash

In this post, we’ll see what is AWS Glue DataBrew and how to use it to clean and transform our data in a data pipeline.

Kinesis Data Streams vs. Kinesis Firehose Delivery Streams

sheep

I have talked about Kinesis before, and I'm sure you've been using Kinesis for longer than me. But according to what I've seen, not all teams or companies use all parts of Kinesis. And, there are four parts in Kinesis: Ingest and process streaming data with Kinesis streams - Kinesis Data Streams Deliver streaming data with Kinesis Firehose delivery streams - Kinesis Firehose Delivery Streams Analyse streaming data with Kinesis analytics applications - Kinesis Analytics Ingest and process media streams with Kinesis video streams - Kinesis Video Streams All these four parts offer something different. Well, the last two are definitely different than the first two. But it's the first two that I see a lot of people getting confused with. So I thought I'll

How To Generate Parquet Files in Java

parquet logo

The Parquet file format has become very popular lately. In this post, we’ll see what it is, and how to create Parquet files in Java using Spring Boot.

Getting started with Chalice to create AWS Lambdas in Python – Step by Step Tutorial

Using Chalice, you can write a Lambda function, test it locally, and even deploy the Lambda function to your development, test, or production environments. In this post, we’ll see how we can install Chalice on our local machines, write a simple REST API to return the famous “Hello, world!” response, and deploy it to a dev stage on AWS Lambda.

Top