Hive bucket map join with different bucket size


In Hive, can we perform a bucket map join of two tables that have different numbers of buckets (but are bucketed on the same key)? Can you please share your thoughts and an explanation?

For example, table-a is bucketed on col-1 into 48 buckets, while table-b is bucketed on col-1 into 64 buckets.

Note: the bucket count of one table is not a multiple of the bucket count of the other (64 is not divisible by 48).

Thanks in advance!

According to the Hive documentation: if the tables being joined are bucketized on the join columns, and the number of buckets in one table is a multiple of the number of buckets in the other table, the buckets can be joined with each other.

Explanation: suppose table A and table B need to be joined. A has 2 buckets and B has 4 buckets.

    SELECT /*+ MAPJOIN(b) */ a.key, a.value FROM a JOIN b ON a.key = b.key

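For the query above, the mapper processing bucket 1 of A will fetch only 2 buckets of B instead of the whole table. But if the bucket counts are not exact multiples of each other, the buckets of one table do not line up with whole buckets of the other, so an exact set of buckets to fetch cannot be determined.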

So, in your case, it won't work: a bucket map join requires the number of buckets in one table to be a multiple of the number of buckets in the other, and 64 is not a multiple of 48.
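
The effect of the "multiple" requirement can be illustrated with a small Python sketch. It is not Hive code: it assumes non-negative integer keys and that a key lands in bucket `key % bucket_count` (bucket ids start at 0 here), and the function names are made up for this illustration. It shows that with 2 and 4 buckets every bucket of B belongs to exactly one bucket of A, while with 48 and 64 buckets each bucket of B is shared by several buckets of A.

```python
def b_buckets_for(a_bucket, a_count, b_count, key_range=10_000):
    """B bucket ids that hold at least one key from the given A bucket.

    Assumption for this sketch: bucket id = key % bucket_count.
    """
    return sorted({key % b_count
                   for key in range(key_range)
                   if key % a_count == a_bucket})


def a_buckets_for(b_bucket, a_count, b_count, key_range=10_000):
    """A bucket ids whose keys also appear in the given B bucket."""
    return sorted({key % a_count
                   for key in range(key_range)
                   if key % b_count == b_bucket})


# 2 and 4 buckets: 4 is a multiple of 2, so the buckets line up cleanly.
print(b_buckets_for(0, 2, 4))    # [0, 2]
print(b_buckets_for(1, 2, 4))    # [1, 3]
print(a_buckets_for(0, 2, 4))    # [0]  (one A bucket owns this B bucket)

# 48 and 64 buckets: neither is a multiple of the other.
print(b_buckets_for(0, 48, 64))  # [0, 16, 32, 48]
print(a_buckets_for(0, 48, 64))  # [0, 16, 32]  (B bucket shared by 3 A buckets)
```

In the 2/4 case each mapper for a bucket of A needs two buckets of B, and no bucket of B is needed by any other mapper. In the 48/64 case the buckets overlap, which is the situation the multiple-of rule excludes.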

