Big Data Interview Questions and Answers (Part 2) - Big Data In Real World


This is the second installment of our Big Data Interview Questions and Answers webinar. Click for the first one. It’s always fun to host these webinars, and this one was especially fun because the questions came from the Hadoop In Real World community. These were real interview questions asked in real interviews.

Spark stole the webinar in 2018

We held this webinar in November 2018; our first webinar on interview questions was in November 2017. Back in 2017, our community sent us a lot of Hadoop-related questions to answer. In 2018, the focus was more on Spark.

We host webinars like these quite often. Sign up below to get invitations to join one of our webinars.

List of Big Data interview questions that we answered in the webinar

How do you handle scenarios when Spark runs out of memory? (12:40)

How does Spark perform operations and generate results when the dataset doesn’t fit in memory? (12:40)

What do you do when one of your Spark jobs fails with an OOM error? (12:40)

How do you handle slow running jobs in Spark? (28:40)

What do you do when one task in your Spark job takes a lot of time while the others complete on time? (28:40)

Tell us some of the Spark optimization techniques you used in your current project. (28:40)

How do you handle Spark streaming failures? (40:30)

What happens to Spark streaming when there is network failure during processing? (40:30)

How do you recover from Spark streaming failures? (40:33)

What is the difference between DataFrame and Dataset? (47:10)

When do you use DataFrame and when do you use Dataset? (47:10)

How do you properly remove Datanodes from your cluster? (52:30)

How do you secure Hive? (52:30)

How do you authorize users in Hive? (52:30)

Provide a use case for Zookeeper (1:00:00)

What is the role of Zookeeper in a Big Data cluster? (1:00:00)

How do you limit the number of files created under a directory in HDFS? (1:10:40)

How do you limit the space allocation in HDFS? (1:10:40)

Live questions from webinar attendees (1:15:30)

Full webinar

 

Big Data In Real World
We are a group of Big Data engineers who are passionate about Big Data and related technologies. We have designed, developed, deployed and maintained Big Data applications ranging from batch to real-time streaming platforms. We have seen a wide range of real-world big data problems and implemented some innovative and complex (or simple, depending on how you look at it) solutions.
