How to calculate median in Hive? - Big Data In Real World

How to calculate median in Hive?


You are trying to calculate a median, but you couldn't find a median function in Hive. No worries, you have come to the right place.

Solution

The median is the 50th percentile. Hive has a built-in percentile function that computes exact percentiles of integer columns, so passing 0.5 as the second argument returns the median.

select percentile(numerical_column, 0.5) from table_name;
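
The same idea extends to per-group medians and to floating-point columns. The sketch below assumes a hypothetical table employees(department STRING, age INT, salary DOUBLE); the table and column names are for illustration only. Since percentile does not work with floating-point types, the second query uses Hive's percentile_approx, which returns an approximate value.

```sql
-- Hypothetical table: employees(department STRING, age INT, salary DOUBLE)

-- Exact median of an integer column, one row per department.
SELECT department,
       percentile(CAST(age AS BIGINT), 0.5) AS median_age
FROM employees
GROUP BY department;

-- Approximate median of a floating-point column, one row per department.
-- percentile_approx accepts DOUBLE input; the result is an estimate.
SELECT department,
       percentile_approx(salary, 0.5) AS approx_median_salary
FROM employees
GROUP BY department;
```

If the approximation is too coarse for your data, percentile_approx takes an optional third argument that trades memory for accuracy.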

 

Big Data In Real World
We are a group of Big Data engineers who are passionate about Big Data and related technologies. We have designed, developed, deployed and maintained Big Data applications ranging from batch to real-time streaming platforms. We have seen a wide range of real-world big data problems and implemented some innovative and complex (or simple, depending on how you look at it) solutions.
