Cleaning and Normalizing Data Using AWS Glue DataBrew (Data Science) by Sunny Srinidhi - January 17, 2022. In this post, we’ll see what AWS Glue DataBrew is and how to use it to clean and transform our data in a data pipeline.
The Dunning-Kruger Effect In Tech (Tech) by Sunny Srinidhi - November 28, 2021 (updated December 18, 2021). The Dunning-Kruger effect is very real in the tech industry. In this post, I talk about my experience with it in the industry.
Understanding Apache Hive LLAP (Data Science) by Sunny Srinidhi - November 18, 2021. In this post, I try to explain what LLAP is for Apache Hive and how it can help us in reducing query latency.
Installing Hadoop on the new M1 Pro and M1 Max MacBook Pro (Data Science) by Sunny Srinidhi - November 5, 2021. We’ll see how to install and configure Hadoop and its components on macOS running on the new M1 Pro and M1 Max chips by Apple.
Installing Hadoop on Windows 11 with WSL2 (Data Science) by Sunny Srinidhi - November 1, 2021. We’ll see how to install and configure Hadoop and its components on Windows 11 running a Linux distro using WSL 1 or 2.
Installing Zsh and Oh-my-zsh on Windows 11 with WSL2 (Tech) by Sunny Srinidhi - October 27, 2021. In this post, which is part of a series on setting up Windows 11 and WSL2 for big data work, I install Zsh and Oh-my-zsh and set up aliases.
Getting Started With Apache Airflow (Data Science) by Sunny Srinidhi - October 11, 2021. I recently started working with Apache Airflow. And as is tradition, I’m telling you everything about it here.
Fake (almost) everything with Faker (Data Science) by Sunny Srinidhi - September 30, 2021. Generating customer and address data for testing has never been easier. We’ll see how to do that using the Faker Python library.
Querying Hive Tables From a Spring Boot App (Data Science, Tech) by Sunny Srinidhi - June 30, 2021. In this post, we’ll see how to connect to a Hive database and run queries on that database from a Spring Boot application.
out() vs. outE() – JanusGraph and Gremlin (Data Science) by Sunny Srinidhi - March 3, 2021. JanusGraph and Gremlin have the out() and outE() functions which help with traversals. But what’s the difference between the two? Let’s see.