[Ruby] eql? and hash on Sets
Scott Windsor
swindsor at gmail.com
Sat Nov 24 14:22:49 PST 2007
A Set is intended to act like Array, but provide fast access for
unordered values without duplicates. The documentation states that
it uses a Hash for storage.
You're incorrect that Set#eql? is implemented by Set. When you call
Set#eql?, you're actually calling the inherited Object#eql? - which
if you compare two different objects, they will not match. You can,
however, use Set#== to determine if two sets are equal...
irb(main):012:0> Set.new([1,2]).eql? Set.new([1, 2])
=> false
irb(main):013:0> Set.new([1,2]) == Set.new([1, 2])
=> true
Set#hash is also not implemented, so the default of Object#hash is
used. This will vary from object to object
irb(main):048:0> Set[1, 2].hash
=> 149380
irb(main):049:0> Set[1, 2].hash
=> 144060
But Array#hash has been overriden so it does match for arrays with
the same contents.
irb(main):046:0> [1, 2].hash
=> 11
irb(main):047:0> [1, 2].hash
=> 11
If you want Set#eql? and Set#hash to work correctly, you should
consider how Array#hash and Array#eql? are implemented. Calculating
a hash for each array, then comparing the result is probably going to
be far faster for the comparison than comparing all of the objects in
each array to one another. But, here is where there is a problem for
Sets vs Arrays...
Arrays are only equal if the order matches...
irb(main):067:0> [1,2] == [1,2]
=> true
irb(main):068:0> [1,2] == [2,1]
=> false
Whereas, sets can be equal when created in a different order...
irb(main):070:0> Set.new([1,2]) == Set.new([2,1])
=> true
Therefore, you'll always have to perform extra operations to
determine Set equality. You can enhance this by sorting each Set
first, then comparing, rather than comparing each without sorting,
but this will still only get you to a worst case of nlogn rather than n.
- scott
On Nov 24, 2007, at 10:35 AM, Aaron Patterson wrote:
> Given the behavior of Array#hash and Array#eql?, what would you expect
> Set#hash and Set#eql? to do? I expected Set to behave the same way as
> Array, but it does not. That kind of made sense since you cannot
> guarantee order in a Set. However, I was surprised to find that Set
> implements eql? and hash. That leads me to believe that the original
> intent was for Set to behave like Array in that department, and that
> this is a bug. As far as I can tell, no two Sets can eql? each other.
>
> Does anyone have thoughts on this? I was able to monkey patch Set so
> that those methods work the way I expect, although the methods get
> slower as the set gets larger.
>
> --
> Aaron Patterson
> http://tenderlovemaking.com/
> _______________________________________________
> Ruby at zenspider.com - Seattle.rb non-commercial list
> http://www.zenspider.com/seattle.rb
> http://www.zenspider.com/mailman/listinfo/ruby
More information about the Ruby
mailing list