I have a Java program that stores a lot of mappings from Strings to various objects.
Right now, my options are either to rely on hashing (via HashMap) or on binary searches (via TreeMap). I am wondering if there is an efficient and standard trie-based map implementation in a popular and quality collections library?
I've written my own in the past, but I'd rather go with something standard, if available.
Quick clarification: While my question is general, in the current project I am dealing with a lot of data that is indexed by fully-qualified class name or method signature. Thus, there are many shared prefixes.
- 
                        Are HashMap and TreeMap that slow? We maintain some pretty big maps and speed has never seemed to be that much of an issue. Andrew Dashin : TreeMap and HashMap are interfaces.: In terms of speed, a hash should be faster than a trie. But I think he's more worried about memory efficiency.Esko Luontola : @Andrew Dashin: They are not interfaces. Their respective interfaces are Map and SortedMap.Uri : They're interfaces but are usually used with an implementation that conforms to their name.Uri : Tries are cheaper space wise since they supposedly don't store entire strings (a lot of my strings have common prefixes)Tom : @Uri/Andrew Dashin - they're classesmtruesdell : Concrete classes. Not abstract, not interfaces: http://java.sun.com/javase/6/docs/api/java/util/TreeMap.html http://java.sun.com/javase/6/docs/api/java/util/HashMap.htmlFortyrunner : @uri You now have me interested in Tries as well - thanks! @andrew, They're definitely classes and have been since Java 1.2 when I first started using em!Uri : @thaggie: Sorry, my bad, I had an IHashMap and ITreeMap years ago in a C++ project, don't ask. Yes, Map and Tree are the interfaces, but I am using either HashMap or TreeMap and wondering if a Trie wouldn't be better.Uri : My main concern with Tries is the cost of accesses. Since each cell may be at a different location in memory.Andrew Dashin : My appologies, I did a mistake.nos : TreeMap/HashMaps doesn't solve the same problem as tries. You can do prefix lookups in tries.
- 
                        What you need is org.apache.commons.collections.FastTreeMap, I think.
- 
                        You might look at this TopCoder one as well (registration required...). 
- 
                        You might want to look at the Trie implementation that Limewire is contributing to the Google collections library. 
 
0 comments:
Post a Comment