Jump to content

Talk:String-searching algorithm: Difference between revisions

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia
Content deleted Content added
Line 40: Line 40:
== Two-way string search algorithm ==
== Two-way string search algorithm ==


This page seems to forget to mention the existence of the [http://monge.univ-mlv.fr/~mac/Articles-PDF/CP-1991-jacm.pdf|two-way string-matching] algorithm, which provides a worst-case running time of O(n+m). This is the algorithm used by glibc's strstr() function, for example. [[User:EdSchouten|EdSchouten]] ([[User talk:EdSchouten|talk]]) 10:18, 20 June 2016 (UTC)
This page seems to forget to mention the existence of the [http://monge.univ-mlv.fr/~mac/Articles-PDF/CP-1991-jacm.pdf two-way string-matching] algorithm, which provides a worst-case running time of O(n+m). This is the algorithm used by glibc's strstr() function, for example. [[User:EdSchouten|EdSchouten]] ([[User talk:EdSchouten|talk]]) 10:18, 20 June 2016 (UTC)

Revision as of 10:18, 20 June 2016

WikiProject iconComputer science Start‑class High‑importance
WikiProject iconThis article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
StartThis article has been rated as Start-class on Wikipedia's content assessment scale.
HighThis article has been rated as High-importance on the project's importance scale.
Things you can help WikiProject Computer science with:

Moved from article

These descriptions are insufficient!

Could somebody expand this entry ?

The "Finite Automata" link is entirely unhelpful, redirection to a page about finite state machines that has no information on string searching. --Furrykef 05:53, 29 Jun 2004 (UTC)

Nasty encodings

Some mention should be made of nasty character encodings that require special precautions when searching.......... Plugwash 11:35, 21 March 2006 (UTC)[reply]

Matching times

Are the matching times listed in the comparison table factual at all? I see most of them have been marked as Θ(n). Is it just a stub? References? --Bisqwit 11:46, 26 May 2006 (UTC)[reply]

Also, Θ() notation is used incorrectly, e.g. naive string searching is O(nm) -- when the two strings are the same, running time is Θ(n).

DFA search image

I've just fixed up the DFA image Image:DFA search mommy.svg so that it renders again. It should be useful here as a demonstration of the simple DFA-based algorithm, but I'm not sure where to put it. Please assist. Dcoetzee 03:57, 2 April 2007 (UTC)[reply]

Hi,

  I would like to include the following link which has more algorithms for exact string search EXACT STRING MATCHING ALGORITHMS.
  Any other thoughts also welcome.

Thanks czar — Preceding unsigned comment added by Crxz0193 (talkcontribs) 06:37, 23 April 2012 (UTC)[reply]

O( (N/M + M) logM ) for alphabet size 2 and up, optimal average case.

Alpha Skip Search was published at Combinatorial Pattern Matching 1998. The method appears to be optimal on average, for all alphabet sizes, and large M,N. Keys are sequences of logM letters. Preprocess the pattern into a lookup table with M-logM+1 keys of size logM (log base is alphabet size), encoding the key position in the pattern. Then skip through the text by length M-logM+1, building a logM sized key, and try to match the key in the lookup table. If a match is found, then try naively to match the pattern at the implied start position in the text. The expected run time of the algorithm is O((N/M + M)logM) for all alphabet sizes, including 2! Daniel Pehoushek 24.33.70.144 (talk) 18:45, 24 August 2014 (UTC)[reply]

Two-way string search algorithm

This page seems to forget to mention the existence of the two-way string-matching algorithm, which provides a worst-case running time of O(n+m). This is the algorithm used by glibc's strstr() function, for example. EdSchouten (talk) 10:18, 20 June 2016 (UTC)[reply]