Engineering at DueDil

Efficient broadcast joins in Spark, using Bloom filters

22 November 2018 Mohamed Abdelbary

Broadcast joins are a nice way to avoid a shuffle operation in Spark. However, Spark’s collect operation for the broadcast set can introduce memory pressure on the driver. Bloom filters can provide a neat solution to this problem.
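
To make the idea concrete, here is a minimal sketch in plain Python rather than Spark. It is not the implementation from the post; it only illustrates the general technique under a few assumptions. A compact Bloom filter is built from the join keys of the smaller side, that filter (instead of the full dataset) is what would be shipped to every worker, and the larger side is pre-filtered with it before an exact join discards any false positives. The names `BloomFilter` and `bloom_prefilter_join` are hypothetical, and the in-memory lists stand in for distributed datasets.

```python
import hashlib
import math


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, tunable false-positive rate."""

    def __init__(self, expected_items: int, fpp: float = 0.01):
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        n = max(1, expected_items)
        self.num_bits = max(8, math.ceil(-n * math.log(fpp) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.num_bits / n * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key: str):
        # Double hashing: derive all k positions from two halves of one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def bloom_prefilter_join(small, large, fpp=0.01):
    """Inner-join two lists of (key, value) pairs, pre-filtering `large` with a Bloom filter."""
    small = list(small)

    # Step 1: summarise the small side's keys in a compact bit array.
    # In a Spark job, this filter is what would be broadcast.
    bloom = BloomFilter(len(small), fpp)
    for key, _ in small:
        bloom.add(key)

    # Step 2: drop rows from the large side whose key is definitely absent.
    # A few non-matching rows may survive as false positives.
    candidates = [(k, v) for k, v in large if bloom.might_contain(k)]

    # Step 3: exact join on the reduced data; this removes the false positives.
    # (Assumption: in Spark this would be a regular join on far fewer rows.)
    lookup = {}
    for k, v in small:
        lookup.setdefault(k, []).append(v)
    return [(k, sv, lv) for k, lv in candidates for sv in lookup.get(k, [])]


small_side = [("a", 1), ("b", 2)]
large_side = [("a", "x"), ("c", "y"), ("b", "z"), ("d", "w")]
joined = bloom_prefilter_join(small_side, large_side)
```

The design point is that the filter's size depends on the number of keys and the chosen false-positive rate, not on the width of the rows, so it can stay small even when the dataset it summarises would be too large to collect comfortably.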