When you search for or look up a document, Elasticsearch by default returns all the fields in the document.
$ curl -X GET "localhost:9200/account/_doc/954?pretty"
{
"_index" : "account",
"_type" : "_doc",
"_id" : "954",
"_version" : 1,
"_seq_no" : 790,
"_primary_term" : 1,
"found" : true,
"_source" : {
"account_number" : 954,
"balance" : 49404,
"firstname" : "Jenna",
"lastname" : "Martin",
"age" : 22,
"gender" : "M",
"address" : "688 Hart Street",
"employer" : "Zinca",
"email" : "jennamartin@zinca.com",
"city" : "Oasis",
"state" : "MD"
}
}
But what if you want to fetch just a few fields from the document?
Solution
To fetch only the required fields, list them in the _source attribute of the search request. Here is an example.
curl -X GET "localhost:9200/account/_search?pretty" -H 'Content-Type: application/json' -d'
{
"_source": ["firstname", "lastname", "age"],
"query" : {
"term" : { "age" : 22 }
}
}
'
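As a minimal sketch, the same request could be wrapped in a small shell helper so that the field list is passed as an argument instead of being typed into the JSON body each time. The function names build_source_filter_body and search_fields are hypothetical (they are not part of Elasticsearch), and the sketch assumes the same localhost:9200 endpoint used in this post, with the age term query hard-coded purely for illustration.

```shell
# Sketch only: these helper names are hypothetical, not part of Elasticsearch.

# Build a search body that restricts _source to a comma-separated field list
# and matches documents whose age equals the second argument.
build_source_filter_body() {
  # Turn firstname,lastname,age into firstname", "lastname", "age
  fields=$(printf '%s' "$1" | sed 's/,/", "/g')
  printf '{ "_source": ["%s"], "query": { "term": { "age": %s } } }\n' "$fields" "$2"
}

# Send the body to the _search endpoint of the given index.
# Assumes Elasticsearch is listening on localhost:9200.
search_fields() {
  curl -s -X GET "localhost:9200/$1/_search?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(build_source_filter_body "$2" "$3")"
}

# Usage (requires a running cluster):
#   search_fields account firstname,lastname,age 22
```

Keeping the body construction separate from the curl call makes it easy to print and inspect the JSON before sending it. The simple sed substitution assumes field names contain no commas or double quotes.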
Here is sample output for the curl search query above. Each hit's _source contains only the three requested fields (firstname, lastname and age), and only documents with age 22 are returned.
{
"took" : 23,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 51,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "account",
"_type" : "_doc",
"_id" : "75",
"_score" : 1.0,
"_source" : {
"firstname" : "Sandoval",
"age" : 22,
"lastname" : "Kramer"
}
},
{
"_index" : "account",
"_type" : "_doc",
"_id" : "87",
"_score" : 1.0,
"_source" : {
"firstname" : "Hewitt",
"age" : 22,
"lastname" : "Kidd"
}
},
{
"_index" : "account",
"_type" : "_doc",
"_id" : "227",
"_score" : 1.0,
"_source" : {
"firstname" : "Coleman",
"age" : 22,
"lastname" : "Berg"
}
},

