A Bloom filter is a space-efficient probabilistic set. It uses a bit array of m bits and k independent hash functions.
m
k
Insert: Hash the element with each of the k functions; set those k bit positions to 1.
Query: Hash with all k functions. If all those positions are 1 → "possibly in set". If any is 0 → "definitely NOT in set".
False positives are possible — all k bits may already be set by prior inserts. False negatives are impossible.
No deletions in a standard Bloom filter — clearing a bit could break membership for other elements. Counting Bloom filters solve this at higher cost.
■ Bit = 1 (set)
■ Bit = 0 (clear)
● H₁ hash position (purple)
● H₂ hash position (blue)
● H₃ hash position (green)
● H₄ hash position (yellow)
■ False positive — all bits set, not inserted
• Chrome Safe Browsing — malicious URL pre-filter
• Databases — skip disk reads for missing keys (Cassandra, RocksDB, LevelDB)
• CDN caches — avoid backend hits for definitely-absent content
• Distributed systems — weak membership in gossip protocols