
String-to-string correction problem

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by 130.209.241.223 (talk) at 13:19, 17 May 2005 (First version includes a general but accurate description.). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The string-to-string correction problem asks for the minimum number of edit operations needed to change one string into another. A single edit operation changes one symbol ("character") of the string into another, deletes a symbol, or inserts a symbol. The length of the shortest such edit sequence provides a measure of the difference (or "distance") between the two strings.
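As an illustration of the definition above, the distance can be computed with a standard dynamic-programming recurrence in which each change, deletion, or insertion costs 1. The following is a minimal sketch under that unit-cost assumption, not a reference implementation; the function name `edit_distance` is chosen here for illustration only.

```python
def edit_distance(source: str, target: str) -> int:
    """Minimum number of single-symbol changes, deletions, and insertions
    needed to turn `source` into `target` (unit cost per operation)."""
    # prev[j] holds the distance between the first i-1 symbols of source
    # and the first j symbols of target; row 0 is "insert j symbols".
    prev = list(range(len(target) + 1))
    for i, s in enumerate(source, start=1):
        # Turning i source symbols into the empty string takes i deletions.
        curr = [i]
        for j, t in enumerate(target, start=1):
            change_cost = 0 if s == t else 1
            curr.append(min(
                prev[j] + 1,                # delete s
                curr[j - 1] + 1,            # insert t
                prev[j - 1] + change_cost,  # keep or change s into t
            ))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))  # 3
```

The sketch keeps only two rows of the table, so it uses memory proportional to the length of the target string while taking time proportional to the product of the two lengths.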

Several algorithms exist that efficiently determine this string distance and the minimal sequence of transformation operations that achieves it. Such algorithms are particularly useful for creating "deltas", where an object is stored as a set of differences relative to a base version. This makes it possible to store several versions of a single object much more efficiently than storing each version separately. The same holds for single versions of several objects, provided they do not differ greatly, and for any combination of the two cases. Difference algorithms of this kind are also used in molecular biology to measure kinship between different kinds of organisms based on the similarity of their macromolecules (such as proteins or DNA).
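The "delta" idea can be sketched by recovering the edit operations themselves rather than only their count, and then replaying them against the base version. The example below is an illustrative sketch under a unit-cost assumption; the names `edit_script` and `apply_script` and the `(kind, position, symbol)` tuple format are hypothetical choices made here, not part of any particular tool.

```python
def edit_script(source: str, target: str) -> list:
    """Return a shortest list of (kind, position, symbol) operations that
    turns `source` into `target`. Positions index into `source`, and the
    operations are listed from the end of the string towards the start."""
    m, n = len(source), len(target)
    # dist[i][j] = distance between source[:i] and target[:j].
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i
    for j in range(n + 1):
        dist[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,
                             dist[i][j - 1] + 1,
                             dist[i - 1][j - 1] + cost)
    # Walk back from the bottom-right corner, recording each operation.
    ops = []
    i, j = m, n
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and source[i - 1] == target[j - 1]
                and dist[i][j] == dist[i - 1][j - 1]):
            i, j = i - 1, j - 1            # symbols match, nothing to record
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("change", i - 1, target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", i - 1, ""))
            i -= 1
        else:
            ops.append(("insert", i, target[j - 1]))
            j -= 1
    return ops


def apply_script(source: str, ops: list) -> str:
    """Rebuild the target by replaying `ops` in the order given. Because
    the operations run right to left, earlier positions stay valid."""
    chars = list(source)
    for kind, pos, symbol in ops:
        if kind == "change":
            chars[pos] = symbol
        elif kind == "delete":
            del chars[pos]
        else:
            chars.insert(pos, symbol)
    return "".join(chars)


delta = edit_script("kitten", "sitting")
print(len(delta))                     # 3
print(apply_script("kitten", delta))  # sitting
```

Storing only `delta` alongside the base string is enough to reconstruct the second version, which is the saving the paragraph above describes when the two versions are similar.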