Talk:Hash table (C)

From LiteratePrograms
Jump to: navigation, search


[edit] Compilation warning

I get the "implicit declaration of function 'strdup'" warning in build.log. I have included <string.h>, which should, according to my man page, declare strdup. I also don't get this warning on my system (NetBSD). Anyone who knows what is wrong? Ahy1 10:03, 27 July 2006 (PDT)

Looks like strdup isn't part of the standard library. I have created mystrdup as a replacement.
bzero is also not part of the standard. Replaced with memset. All warnings are gone :-) Ahy1 05:04, 30 July 2006 (PDT)

[edit] Hash function

The default hash function used here is probably not very efficient. Can someone with more knowledge about these things make a better one? Ahy1 10:05, 27 July 2006 (PDT)

[edit] New type for hash table size

I've just made some changes to the code I'd like to explain. I didn't like the result from the hashing function was an int - it shouldn't ever become negative, yet it could (it is implementation defined whether "char" is signed or unsigned, if your compiler uses signed char and you hash a string that uses an extended 8 bit charset, you could get a negative result from the hashing function).

Also, when you calculated the hash value for a key, you calculated int modulo size_t and put the result in an int. Now size_t is always unsigned and implementation defined, so the number of "buckets" in your hash table may be far larger than the range of possible hash values and some thus may never be used, especially on 64 bit systems where size_t may well be unsigned int64 and int remain at 32 bit signed.

I introduced a new type for hash values now. I think it's neater than working which two distinct types that both have no fixed sizes.

One more note which is maybe personal taste, you return 0 on success and -1 on failure in some functions, which forces you to use success checks like "if (!hashtbl_insert(...))" which I do not find very intuitive to read. Unless it is desireable to have multiple error codes so the user can see what went wrong (if that makes sense), I'd personally favour a plain boolean return value. Or define a success constant that can be tested against so user code becomes more intuitive. Ruediger Hanke 02:45, 31 July 2006 (PDT)

Negative values are of course not good when using them as index into an array. Thaks for fixing that bug! Also thanks for fixing the rather embarassing memory leaks. I was probably too tired to think when I wrote this program.
Regarding return values, this is of course a matter of taste. The reason I use 0 for success and -1 for error, is that standard functions, like fclose(), fflush() and signal() do it that way. This is not important to me, so I would not object if you changed it. Ahy1 10:55, 31 July 2006 (PDT)

[edit] resize function of hash

valgrind throws an error when the resize function is used.

This is due to the fact that the element pointed by node is removed before node = node->next is executed. A simple hack would be to change the code to

for(node = hashTable->nodes[index]; node;){

     temp  = node->next;
     hashTableInsert(&newTable, node->key, node->data);
     hashTableRemove(hashTable, node->key);
     node = temp;

where temp transiently helds the node->next pointers.