Data Science, Tech

Getting Started with Apache Drill and MongoDB

Not many people have heard of Apache Drill, because it caters to a narrow, niche set of use cases. Where it fits, though, it can significantly change the way you interact with data. First, let's see what Apache Drill is, and then how we can connect a MongoDB data source to Drill and query it easily. What is Apache Drill? According to its website, Apache Drill is a "Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud Storage." In other words, Drill is a tool for querying Hadoop, MongoDB, and other NoSQL data stores. You write ordinary SQL queries that run against the data stored in those systems, and you get the results back in row-and-column format. The best part is that you can even query Apache Kafka and AWS S3 data with it. ...

Data Science, Tech

Connect Apache Spark to your MongoDB database using the mongo-spark-connector

A couple of days ago, we saw how to connect Apache Spark to an Apache HBase database and query the data from a table using a catalog. Today, we’ll see how to connect Apache Spark to a MongoDB database and load data from it directly into Spark. MongoDB provides a plugin called the mongo-spark-connector, which connects MongoDB and Spark without any drama at all. We just need to provide the MongoDB connection URI in the SparkConf object and create a ReadConfig object specifying the collection name. It might sound complicated right now, but once you look at the code, you’ll see how easy it is. So, let’s look at an example. The dataset: before we look at the code, we need to make sure we have some data in our ...

Tech

Hide properties of Mongoose objects in Node.js JSON responses

Many times, we'll need to hide certain properties of Mongoose objects, especially when we're sending those objects in responses. For example, suppose you have an API endpoint like /user/:id. You will, obviously, send a user object as the response to this request. But there will be certain properties of the User schema (such as password) that you'd want to remove before sending the object in the response. Laravel developers can relate this to the $hidden array in Eloquent models, which automatically hides the given list of properties before the object is sent in the response. There is no out-of-the-box solution for this in Mongoose. But it's pretty easy to achieve, even though it's a bit verbose. The solution is to define a custom .toJSON() metho...
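
As a rough illustration of the custom toJSON() idea, here is a minimal TypeScript sketch. It uses a plain class as a stand-in for a Mongoose model, so User, HIDDEN, and the sample values are assumptions made for illustration; this is not Mongoose's API or the code from the full post. It relies only on the standard behavior that JSON.stringify() calls an object's toJSON() method when one exists.

```typescript
// Hypothetical list of properties to hide, loosely mirroring Laravel's $hidden array.
const HIDDEN: readonly string[] = ["password"];

// Hypothetical stand-in for a Mongoose document (not a real Mongoose model).
class User {
  constructor(
    public id: string,
    public name: string,
    public password: string,
  ) {}

  // JSON.stringify() calls toJSON() if it exists and serializes the returned value.
  toJSON(): Record<string, unknown> {
    const source = this as unknown as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      // Copy every own property except the hidden ones.
      if (HIDDEN.indexOf(key) === -1) {
        out[key] = source[key];
      }
    }
    return out;
  }
}

const user = new User("42", "Ada", "s3cret");
const body = JSON.stringify(user);
console.log(body); // {"id":"42","name":"Ada"}
```

The password stays on the in-memory object, so server-side code can still read it; it is only dropped at serialization time, which is the point at which a response body is usually built.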
