written 6.7 years ago by | • modified 6.7 years ago |
Rabin Karp Algorithm A string search algorithm which compares a string's hash values, rather than the strings themselves. For efficiency, the hash value of the next position in the text is easily computed from the hash value of the current position. he Rabin–Karp algorithm or Karp–Rabin algorithm is a string searching algorithm created by Richard M. Karp and Michael O. Rabin (1987) that uses hashing to find any one of a set of pattern strings in a text. For text of length n and p patterns of combined length m, its average and best case running time is O(n+m) in space O(p), but its worst-case time is O(nm).
he Naive String Matching algorithm slides the pattern one by one. After each slide, it one by one checks characters at the current shift and if all characters match then prints the match. Like the Naive Algorithm, Rabin-Karp algorithm also slides the pattern one by one. But unlike the Naive algorithm, Rabin Karp algorithm matches the hash value of the pattern with the hash value of current substring of text, and if the hash values match then only it starts matching individual characters. So Rabin Karp algorithm needs to calculate hash values for following strings.
1) Pattern itself. 2) All the substrings of text of length m.
We assume that the text is an array T [1..N] of length n and that the pattern is an array P [1..M] of length m, where m << n. We also assume that the elements of P and T are characters in the finite alphabet S. (e.g., S = {a,b} We want to find P = ‘aab’ in T = ‘abbaabaaaab’) 1. The idea of the string matching problem is that we want to find all occurrences of the pattern P in the given text T.
2.We could use the brute force method for string matching, which utilizes iteration over T. At each letter, we compare the sequence against P until all letters match of until the end of the alphabet is reached.
3.The worst case scenario can reach O(N*M)
Given T = 31415926535 and P = 26 We choose q = 11 P mod q = 26 mod 11 = 4
As we can see, when a match is found, further testing is done to insure that a match has indeed been found.