[Ruby] eql? and hash on Sets

Scott Windsor swindsor at gmail.com
Sat Nov 24 14:22:49 PST 2007


A Set is intended to act like Array, but provide fast access for  
unordered values without duplicates.  The documentation states that  
it uses a Hash for storage.

You're incorrect that Set#eql? is implemented by Set.  When you call  
Set#eql?, you're actually calling the inherited Object#eql? - which  
if you compare two different objects, they will not match.  You can,  
however, use Set#== to determine if two sets are equal...


irb(main):012:0> Set.new([1,2]).eql? Set.new([1, 2])
=> false
irb(main):013:0> Set.new([1,2]) == Set.new([1, 2])
=> true

Set#hash is also not implemented, so the default of Object#hash is  
used.  This will vary from object to object

irb(main):048:0> Set[1, 2].hash
=> 149380
irb(main):049:0> Set[1, 2].hash
=> 144060

But Array#hash has been overriden so it does match for arrays with  
the same contents.

irb(main):046:0> [1, 2].hash
=> 11
irb(main):047:0> [1, 2].hash
=> 11

If you want Set#eql? and Set#hash to work correctly, you should  
consider how Array#hash and Array#eql? are implemented.  Calculating  
a hash for each array, then comparing the result is probably going to  
be far faster for the comparison than comparing all of the objects in  
each array to one another.  But, here is where there is a problem for  
Sets vs Arrays...

Arrays are only equal if the order matches...

irb(main):067:0> [1,2] == [1,2]
=> true
irb(main):068:0> [1,2] == [2,1]
=> false

Whereas, sets can be equal when created in a different order...

irb(main):070:0> Set.new([1,2]) == Set.new([2,1])
=> true

Therefore, you'll always have to perform extra operations to  
determine Set equality.  You can enhance this by sorting each Set  
first, then comparing, rather than comparing each without sorting,  
but this will still only get you to a worst case of nlogn rather than n.

- scott

On Nov 24, 2007, at 10:35 AM, Aaron Patterson wrote:

> Given the behavior of Array#hash and Array#eql?, what would you expect
> Set#hash and Set#eql? to do?  I expected Set to behave the same way as
> Array, but it does not.  That kind of made sense since you cannot
> guarantee order in a Set.  However, I was surprised to find that Set
> implements eql? and hash.  That leads me to believe that the original
> intent was for Set to behave like Array in that department, and that
> this is a bug.  As far as I can tell, no two Sets can eql? each other.
>
> Does anyone have thoughts on this?  I was able to monkey patch Set so
> that those methods work the way I expect, although the methods get
> slower as the set gets larger.
>
> -- 
> Aaron Patterson
> http://tenderlovemaking.com/
> _______________________________________________
> Ruby at zenspider.com - Seattle.rb non-commercial list
> http://www.zenspider.com/seattle.rb
> http://www.zenspider.com/mailman/listinfo/ruby



More information about the Ruby mailing list