Jump to content

Talk:Hash table

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by 62.145.195.57 (talk) at 15:26, 18 November 2004 (Unreadable). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

I'm not sure the intro sentance is correct in saying that an associative array is the same as a hash table. A hash table is only one of many kinds of assiciative data structures, and I think it might be more appropriate to make a seperate associative array page that hash table links too. Any problems with this?

Also, there was a statement saying that keys are usually strings, but don't have to be. I don't think this is generally true, and as it doesn't add much, I took it out. --BlckKnght (it is probably true of perl and awk programs --drj)

Changed to "an implementation of associative array". AVL trees, red black trees, 2 3 4 trees (and 2 3 trees), B star trees, splay trees, association lists are all implementations of associative array too. Expanded rest. --drj

Im acutally a coder, Ive used hash tables before through STL, I vaguely recognize them as being "fast" in some magic way. But, I dont UNDERSTAND them. This little text helped a little bit, now if I just draw this thingy on a paper, then maybe Ill get it. Anyway, what I wanted to say whas that, some kind of example of a simple, simple hash and the way it works would be great. If i get the time, maybe Ill do it myself and add to this page, or subpage /Example or so. sandos


> many key => value associations (this is a many to one relationship as the hash table is almost universally smaller than the number of keys).

I find this unclear; to what does "this" refer ? I believe it refers to the mapping of keys to slots in the hash array, but at this point (the introductory sentence), this mapping has not even been introduced.

Rehashing and O(1)

> Rehashing is ... by ... recomputing ... the O(1) property can be maintained.

I'm not familiar with the relevant math, but I would speculate that perhaps there are some restrictions as to how frequently the rehashing may be done, to preserve the O(1) ?

Reply: It is similar to re-indexing a database. It's an intermittent procedure that is not a part of the search algorithm. The rehashing is probably not O(1), but you can comfortably divorce the procedure from the actual search.

Rehashing (i.e. growing the table) is O(n). If you exponentially grow the table size as it fills you get O(nln(n))

Revision

I'm working on some revisions of this page. The working copy is at User:T Long/Hash Table. T Long 05:08, 2004 Aug 29 (UTC)

Revisions done. T Long 06:05, 2004 Sep 4 (UTC)

Simplification

I'm thinking the introduction is a bit terse when it says that hash tables use a hash function to perform index range reduction. This could probably be made friendlier with an example, but I'm not sure just how, and how much overlap there would be with hash function. Deco 20:21, 23 Sep 2004 (UTC)

Yes, it doesn't read well to me. Ok, I admit it; I don't know what range reduction is; even though I've implemented hash tables several times. Presumably they mean that the size of the key is bigger than the size of the hash. If so that's probably not even worth saying; or if it is, it should spell it out.
I don't like this piece in general. It's written to assume too much knowledge of specialist terms on behalf of the reader. It seems to be about MSc level, when undergraduate or lower is more appropriate. It just doesn't have to be this complicated.
-Wolfkeeper
You have my agreement there. The likely article viewers are college freshman. It needs to be dumbed down somewhat. I'll look at this pretty soon if no one else gets to it first. Deco 01:25, 24 Sep 2004 (UTC)
Looks much better now I think. Good work. - WolfKeeper.

Fixed hash value

I deleted the section on using a fixed hash value because this seems like a bad idea to me. The only conceivable reason one might want to do this is to simulate worst-case behaviour.

No, the main reason you use it is to make sure that the collision resolution code works correctly; and measure the worst case performance.

It certainly hides errors one might wish to find during debugging, especially in the hash table implementation.

The hash table code itself is generally very short. It is predominately the collision resolution that is more likely to have bugs in. You take it right back out again if the test passes. You can leave it in if the performance is adequate however :-)
Okay, this is fine, I really just wanted to hear your justification. I'm sorry for re-removing it, but I felt like you were ignoring me. I apologize again and have added it back with some explanation. Deco 16:35, 3 Oct 2004 (UTC)

Hash tables are deterministic, and the argument used with random number generators does not apply. Can anyone explain this paragraph? Deco 00:33, 3 Oct 2004 (UTC)

Please answer this question. I don't appreciate having my edits undone without justification, especially when I had a good reason. Deco 15:26, 3 Oct 2004 (UTC)
The wikipedia NPOV says that it is incorrect to remove other peoples stuff; not that it is incorrect to add back in stuff that other people have removed. Please act appropriately given the nature of the document you are contributing to. Your point of view is not necessarily better than other peoples, only different.
I'm glad we worked out this issue, but I think you're misinterpreting the NPOV policy. I did not remove the content because I wanted to espouse my own non-constant hash point of view; I deleted it because I thought it was incorrect (partly because I misunderstood the intention). Deleting content is perfectly okay and is part of being bold, although it understandably upsets the original editor, and that's why I asked for your explanation here. Sometimes editors who add things really don't mind them being later removed, if they can agree about why they were removed. Deco 16:35, 3 Oct 2004 (UTC)

Unreadable

This article is unreadable. Especially the first paragraph. For start dont use the word "key" in the first sentence.