Change field termination value in Hive
January 23, 2017How to find directories in HDFS which are older than N days?
January 30, 2017How to use multi character delimiter in a Hive table?
Sometimes your data is slightly complex to delimit the individual columns with a single character like delimiter comma, pipe symbol etc.
Problem
What if you have multi character delimiter like below ?
In the below sample record the delimiter is @#
NYSE@#B7J@#2009-12-16@#8.80@#8.84@#8.77@#8.83@#69700@#8.83
Solution
Well if you are using Hive version 0.14.0 then you are in luck. You could use MultiDelimitSerDe and specify the multi character delimiter for field.delim property.
hive> CREATE EXTERNAL TABLE IF NOT EXISTS stocks_multi (
exch STRING,
symbol STRING,
ymd STRING,
price_open FLOAT,
price_high FLOAT,
price_low FLOAT,
price_close FLOAT,
volume INT,
price_adj_close FLOAT)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"="@#")
LOCATION '/user/hirw/stocks-multi';

