Installing Hadoop on the new M1 Pro and M1 Max MacBook Pro
Data Science | by Sunny Srinidhi | November 5, 2021
We'll see how to install and configure Hadoop and its components on macOS running on Apple's new M1 Pro and M1 Max chips.
Installing Hadoop on Windows 11 with WSL2
Data Science | by Sunny Srinidhi | November 1, 2021
We'll see how to install and configure Hadoop and its components on Windows 11 running a Linux distro using WSL 1 or 2.
Understanding Apache Hive LLAP
Data Science | by Sunny Srinidhi | November 18, 2021
In this post, I try to explain what LLAP is for Apache Hive and how it can help reduce query latency.
Installing Zsh and Oh-my-zsh on Windows 11 with WSL2
Tech | by Sunny Srinidhi | October 27, 2021
In this post, which is part of a series on setting up Windows 11 and WSL2 for big data work, I install Zsh and Oh-my-zsh and set up aliases
Querying Hive Tables From a Spring Boot App
Data Science, Tech | by Sunny Srinidhi | June 30, 2021
In this post, we'll see how to connect to a Hive database and run queries on that database from a Spring Boot application.
How To Generate Parquet Files in Java
Data Science | by Sunny Srinidhi | April 7, 2020
The Parquet file format has become very popular lately. In this post, we'll see what it is, and how to create Parquet files in Java using Spring Boot.
Getting Started with Apache Drill and MongoDB
Data Science, Tech | by Sunny Srinidhi | September 23, 2019 / February 28, 2020
Not a lot of people have heard of Apache Drill. That is because Drill caters to very specific, niche use cases. But when used, it can make a significant difference to the way you interact with data. First, let's see what Apache Drill is, and then how we can connect our MongoDB data source to Drill and easily query data.
What is Apache Drill?
According to their website, Apache Drill is a "Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud Storage." That's pretty much self-explanatory. So, Drill is a tool to query Hadoop, MongoDB, and other NoSQL databases. You can write simple SQL queries that run on the data stored in other databases, and you get the result in a row-column format. The
Connect Apache Spark to your HBase database (Spark-HBase Connector)
Data Science, Tech | by Sunny Srinidhi | April 1, 2019 / January 31, 2020
There will be times when you'll need the data in your HBase database to be brought into Apache Spark for processing. Usually, you'll query the database, get the data in whatever format you fancy, and then load that into Spark, maybe using the `parallelize()` function. This works just fine. But depending on the size of the data, this could cause delays. At least it did for our application. So after some research, we stumbled upon a Spark-HBase connector in the Hortonworks repository. Now, what is this connector and why should you consider it?
The Spark-HBase Connector (shc-core)
The SHC is a tool provided by Hortonworks to connect your HBase database to Apache Spark so that you can tell your Spark context to pick up the