21-09-2013, 04:09 PM
Strings and Pattern Matching
Strings and Pattern .ppt (Size: 551 KB / Downloads: 412)
String Searching
The previous slide is not a great example of what is meant by “String Searching.” Nor is it meant to ridicule people without eyes..
The object of string searching is to find the location of a specific text pattern within a larger body of text (e.g., a sentence, a paragraph, a book, etc.).
As with most algorithms, the main considerations for string searching are speed and efficiency.
There are a number of string searching algorithms in existence today, but the three we shall review are Brute Force,Rabin-Karp, and Knuth-Morris-Pratt.
Rabin-Karp
The Rabin-Karp string searching algorithm calculates a hash value for the pattern, and for each M-character subsequence of text to be compared.
If the hash values are unequal, the algorithm will calculate the hash value for next M-character sequence.
If the hash values are equal, the algorithm will do a Brute Force comparison between the pattern and the M-character sequence.
In this way, there is only one comparison per text subsequence, and Brute Force is only needed when hash values match.
Perhaps an example will clarify some things...
Rabin-Karp Complexity
If a sufficiently large prime number is used for the hash function, the hashed values of two different patterns will usually be distinct.
If this is the case, searching takes O(N) time, where N is the number of characters in the larger body of text.
It is always possible to construct a scenario with a worst case complexity of O(MN). This, however, is likely to happen only if the prime number used for hashing is small.