{"id":2437,"date":"2023-06-26T11:18:39","date_gmt":"2023-06-26T05:48:39","guid":{"rendered":"https:\/\/www.interviewbit.com\/blog\/?p=2437"},"modified":"2023-06-26T11:23:21","modified_gmt":"2023-06-26T05:53:21","slug":"kmp-algorithm-pattern-search","status":"publish","type":"post","link":"https:\/\/www.interviewbit.com\/blog\/kmp-algorithm-pattern-search\/","title":{"rendered":"Pattern Search &#8211; KMP Algorithm"},"content":{"rendered":"\n<p><\/p>\n\n\n\n<div class=\"gutentoc tocactive nostyle\"><div class=\"gutentoc-toc-wrap\"><div class=\"gutentoc-toc-title-wrap\"><div class=\"gutentoc-toc-title\">Table Of Contents<\/div><div id=\"open\" class=\"text_open\">show<\/div><\/div><div id=\"toclist\"><div class=\"gutentoc-toc__list-wrap\"><ul class=\"gutentoc-toc__list\"><li><a href=\"#problem-statement\">Problem Statement<\/a><\/li><li><a href=\"#naive-approach\">Naive Approach<\/a><\/li><ul class=\"gutentoc-toc__list\"><li><a href=\"#dry-run-of-naive-approach-for-pattern-search\">Dry Run of Naive Approach For Pattern Search<\/a><\/li><li><a href=\"#c-implementation\">C Implementation<\/a><\/li><li><a href=\"#java-implementation\">Java Implementation<\/a><\/li><li><a href=\"#python-implementation\">Python Implementation<\/a><\/li><\/ul><li><a href=\"#kmp-pattern-searching\">KMP Pattern Searching<\/a><\/li><ul class=\"gutentoc-toc__list\"><li><a href=\"#dry-run-of-the-above-approach\">Dry Run Of the Above Approach<\/a><\/li><li><a href=\"#c-implementation-of-kmp\">C++ Implementation of KMP<\/a><\/li><li><a href=\"#java-implementation-of-kmp\">Java Implementation of KMP<\/a><\/li><li><a href=\"#python-implementation-of-kmp\">Python Implementation of KMP<\/a><\/li><\/ul><li><a href=\"#practice-questions\">Practice Questions<\/a><\/li><li><a href=\"#frequently-asked-questions\">Frequently Asked Questions<\/a><\/li><ul class=\"gutentoc-toc__list\"><li><a href=\"#q1-which-of-the-above-approaches-is-better-with-respect-to-space-complexity\">Q.1: Which of the above approaches is better with respect to space complexity?<\/a><\/li><li><a href=\"#q2-are-there-any-other-pattern-matching-algorithms\">Q.2: Are there any other pattern-matching algorithms?<\/a><\/li><\/ul><\/ul><\/div><\/div><\/div><\/div>\n\n\n\n<h2 id=\"problem-statement\">Problem Statement<\/h2>\n\n\n\n<p>The Pattern Searching algorithms are also called String Matching Algorithms. These algorithms are very helpful in the case of searching a string within another string.<\/p>\n\n\n\n<p>Given a text str[0..n-1] and a pattern pat[0..m-1], write a program with a function PatternSearch(char pat[], char str[]) that prints all occurrences of pat[] in str[]. Given that n &gt; m.<\/p>\n\n\n\n<p><strong>Examples:<\/strong><\/p>\n\n\n\n<p><strong>Input:<\/strong> str[] = &#8220;THIS IS A TEXT&#8221;<br><strong>pat[] =<\/strong> &#8220;TEXT&#8221;<\/p>\n\n\n\n<p><strong>Output<\/strong>: Pattern found at index 10.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><img  loading=\"lazy\"  width=\"704\"  height=\"464\"  src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABAQMAAAAl21bKAAAAA1BMVEUAAP+KeNJXAAAAAXRSTlMAQObYZgAAAAlwSFlzAAAOxAAADsQBlSsOGwAAAApJREFUCNdjYAAAAAIAAeIhvDMAAAAASUVORK5CYII=\"  alt=\"\"  class=\"wp-image-2445 pk-lazyload\"  data-pk-sizes=\"auto\"  data-ls-sizes=\"(max-width: 704px) 100vw, 704px\"  data-pk-src=\"https:\/\/www.interviewbit.com\/blog\/wp-content\/uploads\/2021\/10\/Input-1.jpg\"  data-pk-srcset=\"https:\/\/www.interviewbit.com\/blog\/wp-content\/uploads\/2021\/10\/Input-1.jpg 704w, https:\/\/www.interviewbit.com\/blog\/wp-content\/uploads\/2021\/10\/Input-1-300x198.jpg 300w, https:\/\/www.interviewbit.com\/blog\/wp-content\/uploads\/2021\/10\/Input-1-380x250.jpg 380w, https:\/\/www.interviewbit.com\/blog\/wp-content\/uploads\/2021\/10\/Input-1-260x170.jpg 260w, https:\/\/www.interviewbit.com\/blog\/wp-content\/uploads\/2021\/10\/Input-1-550x363.jpg 550w\" ><\/figure><\/div>\n\n\n\n<hr class=\"wp-block-separator has-text-color has-background has-black-background-color has-black-color is-style-wide\"\/>\n\n\n\n<h2 id=\"naive-approach\">Naive Approach<\/h2>\n\n\n\n<p>In this approach, we will check each and every substring of the string. Below are the steps to follow.<\/p>\n\n\n\n<ul><li>Iterate over the string for i from 0 to n &#8211; 1 (n is the size of the string).<\/li><li>For every value of i slide the pattern over the text one by one and check for a match. If there is a match, then slides by 1 again to check for subsequent matches.<\/li><\/ul>\n\n\n\n<h3 id=\"dry-run-of-naive-approach-for-pattern-search\">Dry Run of Naive Approach For Pattern Search<\/h3>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><img  loading=\"lazy\"  width=\"554\"  height=\"281\"  src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABAQMAAAAl21bKAAAAA1BMVEUAAP+KeNJXAAAAAXRSTlMAQObYZgAAAAlwSFlzAAAOxAAADsQBlSsOGwAAAApJREFUCNdjYAAAAAIAAeIhvDMAAAAASUVORK5CYII=\"  alt=\"Dry Run Of Naive Approach\"  class=\"wp-image-2446 pk-lazyload\"  data-pk-sizes=\"auto\"  data-ls-sizes=\"(max-width: 554px) 100vw, 554px\"  data-pk-src=\"https:\/\/www.interviewbit.com\/blog\/wp-content\/uploads\/2021\/10\/Input.jpg\"  data-pk-srcset=\"https:\/\/www.interviewbit.com\/blog\/wp-content\/uploads\/2021\/10\/Input.jpg 554w, https:\/\/www.interviewbit.com\/blog\/wp-content\/uploads\/2021\/10\/Input-300x152.jpg 300w, https:\/\/www.interviewbit.com\/blog\/wp-content\/uploads\/2021\/10\/Input-380x193.jpg 380w, https:\/\/www.interviewbit.com\/blog\/wp-content\/uploads\/2021\/10\/Input-550x279.jpg 550w\" ><\/figure><\/div>\n\n\n\n<h3 id=\"c-implementation\">C Implementation<\/h3>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"c\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">void Patternsearch(char* pat, char* str) {\n    int M = strlen(pat);\n    int N = strlen(str);\n    for (int i = 0; i &lt;= N - M; i++) {\n        int j;\n \n        for (j = 0; j &lt; M; j++)\n            if (str[i + j] != pat[j])\n                break;\n \n        if (j == M)\n            printf(\"Pattern found at index %d \\n\", i);\n    }\n}<\/pre>\n\n\n\n<h3 id=\"java-implementation\">Java Implementation<\/h3>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"java\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">public static void Patternsearch(String str, String pat) {\n  int M = pat.length();\n  int N = str.length();\n\n  for (int i = 0; i &lt;= N - M; i++) {\n\n    int j;\n    for (j = 0; j &lt; M; j++)\n      if (str.charAt(i + j) != pat.charAt(j))\n        break;\n\n    if (j == M)\n      System.out.println(\"Pattern found at index \" + i);\n  }\n}<\/pre>\n\n\n\n<h3 id=\"python-implementation\">Python Implementation<\/h3>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">def Patternsearch(pat, str):\n    M = len(pat)\n    N = len(str) \n    for i in range(N - M + 1):\n        j = 0\n        while(j &lt; M):\n            if (str[i + j] != pat[j]):\n                break\n            j += 1\n \n        if (j == M):\n            print(\"Pattern found at index \", i)<\/pre>\n\n\n\n<ul><li><strong>Time Complexity:<\/strong> O(n) in the Best case and O(m*(n-m+1)) in the worst case.<\/li><li><strong>Space Complexity:<\/strong> O(1).<\/li><\/ul>\n\n\n\n<h2 id=\"kmp-pattern-searching\">KMP Pattern Searching<\/h2>\n\n\n\n<p>The KMP calculation utilizes the deteriorating property (design having the same sub-designs showing up more than once in the example) of the example and further develops the most pessimistic scenario running time complexity to O(n). The thought for the KMP calculation is: at whatever point the string gets mismatched, we definitely know a portion of the characters in the text of the following window. We will exploit this data to try not to coordinate with the characters that we realize will coordinate.<\/p>\n\n\n\n<ul><li>The KMP algorithm pre-computes pat[] and creates an array lps[] of size m (same as the size of pattern) which is used to jump characters while matching.<\/li><li>We search for lps in sub-patterns. More commonly we focus on sub-strings of patterns that are either prefixes and suffixes.<\/li><li>For every sub-pattern pat[0..i] where i range from 0 to m-1, lps[i] stores the size of the maximum matching proper prefix which is also a suffix of the sub-pattern pat[0..i].<\/li><\/ul>\n\n\n\n<p>How can we utilize lps[] to decide the next positions or to know the number of characters to be jumped?<\/p>\n\n\n\n<ul><li>We begin contrasting pat[j] with j = 0 with characters of the current window of text.<\/li><li>We keep checking characters str[i] and pat[j] and keep incrementing i and j while pat[j] and str[i] keep matching.<\/li><li>When we see there is a mismatch then,<ul><li>It is already known that characters pat[0..j-1] are the same as str[i-j\u2026i-1]<\/li><li>From the above points, we can conclude that lps[j-1] is the frequency of characters of pat[0\u2026j-1] that are both proper prefixes and proper suffixes.<\/li><li>In conclusion, we don\u2019t need to check these lps[j-1] characters with str[i-j\u2026i-1] because we know that these characters will always match.<\/li><\/ul><\/li><\/ul>\n\n\n\n<h3 id=\"dry-run-of-the-above-approach\">Dry Run Of the Above Approach<\/h3>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><img  loading=\"lazy\"  width=\"704\"  height=\"601\"  src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABAQMAAAAl21bKAAAAA1BMVEUAAP+KeNJXAAAAAXRSTlMAQObYZgAAAAlwSFlzAAAOxAAADsQBlSsOGwAAAApJREFUCNdjYAAAAAIAAeIhvDMAAAAASUVORK5CYII=\"  alt=\"Dry Run of KMP approach\"  class=\"wp-image-2444 pk-lazyload\"  data-pk-sizes=\"auto\"  data-ls-sizes=\"(max-width: 704px) 100vw, 704px\"  data-pk-src=\"https:\/\/www.interviewbit.com\/blog\/wp-content\/uploads\/2021\/10\/Image-2-5.jpg\"  data-pk-srcset=\"https:\/\/www.interviewbit.com\/blog\/wp-content\/uploads\/2021\/10\/Image-2-5.jpg 704w, https:\/\/www.interviewbit.com\/blog\/wp-content\/uploads\/2021\/10\/Image-2-5-300x256.jpg 300w, https:\/\/www.interviewbit.com\/blog\/wp-content\/uploads\/2021\/10\/Image-2-5-380x324.jpg 380w, https:\/\/www.interviewbit.com\/blog\/wp-content\/uploads\/2021\/10\/Image-2-5-550x470.jpg 550w\" ><\/figure><\/div>\n\n\n\n<h3 id=\"c-implementation-of-kmp\">C++ Implementation of KMP<\/h3>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"cpp\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">void KMPStringSearch(char* pat, char* str) {\n    int M = strlen(pat);\n    int N = strlen(str);\n  \n    int lps[M];\n    computeLPSArray(pat, M, lps);\n  \n    int i = 0; \n    int j = 0;\n    while (i &lt; N) {\n        if (pat[j] == str[i]) {\n            j++;\n            i++;\n        }\n  \n        if (j == M) {\n            printf(\"Found pattern at index %d \", i - j);\n            j = lps[j - 1];\n        }\n        else if (i &lt; N &amp;&amp; pat[j] != str[i]) {\n            if (j != 0)\n                j = lps[j - 1];\n            else\n                i = i + 1;\n        }\n    }\n}\n  \nvoid computeLPSArray(char* pat, int M, int* lps) {\n        int len = 0;\n  \n    lps[0] = 0;   \n       int i = 1;\n    while (i &lt; M) {\n        if (pat[i] == pat[len]) {\n            len++;\n            lps[i] = len;\n            i++;\n        }\n        else{\n           if (len != 0) {\n                len = lps[len - 1];\n           }\n            else             {\n                lps[i] = 0;\n                i++;\n            }\n        }\n    }\n}<\/pre>\n\n\n\n<h3 id=\"java-implementation-of-kmp\">Java Implementation of KMP<\/h3>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"java\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">void KMPStringSearch(String pat, String str) {\n  int M = pat.length();\n  int N = str.length();\n\n  int lps[] = new int[M];\n  int j = 0; \/\/ index for pat[]\n\n  computeLPSArray(pat, M, lps);\n\n  int i = 0;\n  while (i &lt; N) {\n    if (pat.charAt(j) == str.charAt(i)) {\n      j++;\n      i++;\n    }\n    if (j == M) {\n      System.out.println(\"Found pattern \" +\n        \"at index \" + (i - j));\n      j = lps[j - 1];\n    } else if (i &lt; N &amp;&amp; pat.charAt(j) != str.charAt(i)) {\n\n      if (j != 0)\n        j = lps[j - 1];\n      else\n        i = i + 1;\n    }\n  }\n}\n\nvoid computeLPSArray(String pat, int M, int lps[]) {\n  int len = 0;\n  int i = 1;\n  lps[0] = 0; \/\/ lps[0] is always 0\n\n  while (i &lt; M) {\n    if (pat.charAt(i) == pat.charAt(len)) {\n      len++;\n      lps[i] = len;\n      i++;\n    } else {\n\n      if (len != 0) {\n        len = lps[len - 1];\n\n      } else {\n        lps[i] = len;\n        i++;\n      }\n    }\n  }\n}<\/pre>\n\n\n\n<h3 id=\"python-implementation-of-kmp\">Python Implementation of KMP<\/h3>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">def KMPStringSearch(pat, st):\n    M = len(pat)\n    N = len(st)\n  \n   \n    lps = [0]*M\n    j = 0 \n  \n    computeLPSArray(pat, M, lps)\n  \n    i = 0     \n    while i &lt; N:\n        if pat[j] == st[i]:\n            i += 1\n            j += 1\n  \n        if j == M:\n            print (\"Found pattern at index \" + str(i-j))\n            j = lps[j-1]\n  \n        \n        elif i &lt; N and pat[j] != st[i]:\n            \n\n            if j != 0:\n                j = lps[j-1]\n            else:\n                i += 1\n  \ndef computeLPSArray(pat, M, lps):\n    len = 0 \n  \n    lps[0] \n    i = 1\n  \n    \n    while i &lt; M:\n        if pat[i]== pat[len]:\n            len += 1\n            lps[i] = len\n            i += 1\n        else:\n            \n            if len != 0:\n                len = lps[len-1]\n            else:\n                lps[i] = 0\n                i += 1<\/pre>\n\n\n\n<p><strong>Time Complexity:<\/strong> O(n) in the worst case where n is the length of text.<br><strong>Space Complexity:<\/strong> O(m) where m is the size of the pattern.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-text-color has-background has-black-background-color has-black-color is-style-wide\"\/>\n\n\n\n<h2 id=\"practice-questions\">Practice Questions<\/h2>\n\n\n\n<p><a rel=\"noreferrer noopener\" href=\"https:\/\/www.interviewbit.com\/problems\/regular-expression-match\/\" target=\"_blank\">Regular Expression Match<\/a><br><a href=\"https:\/\/www.interviewbit.com\/problems\/regular-expression-ii\/\" target=\"_blank\" rel=\"noreferrer noopener\">Regular Expression II<\/a><\/p>\n\n\n\n<hr class=\"wp-block-separator has-text-color has-background has-black-background-color has-black-color is-style-wide\"\/>\n\n\n\n<h2 id=\"frequently-asked-questions\">Frequently Asked Questions<\/h2>\n\n\n\n<h3 id=\"q1-which-of-the-above-approaches-is-better-with-respect-to-space-complexity\"><span id=\"q-1-which-of-the-above-approaches-is-better-with-respect-to-space-complexity\">Q.1: Which of the above approaches is better with respect to space complexity?<\/span><\/h3>\n\n\n\n<p>Ans: The brute or simple approach is better if we only consider the space complexity.<\/p>\n\n\n\n<h3 id=\"q2-are-there-any-other-pattern-matching-algorithms\"><span id=\"q-2-are-there-any-other-pattern-matching-algorithms\">Q.2: Are there any other pattern-matching algorithms?<\/span><\/h3>\n\n\n\n<p>Ans: Yes, Rabin Karp algorithm, Boyer Moore algorithm, etc.<\/p>\n","protected":false},"excerpt":{"rendered":"Problem Statement The Pattern Searching algorithms are also called String Matching Algorithms. These algorithms are very helpful in&hellip;\n","protected":false},"author":5,"featured_media":2443,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_daextam_enable_autolinks":"1","csco_singular_sidebar":"","csco_page_header_type":"","csco_appearance_grid":"","csco_page_load_nextpost":"","csco_post_video_location":[],"csco_post_video_location_hash":"","csco_post_video_url":"","csco_post_video_bg_start_time":0,"csco_post_video_bg_end_time":0},"categories":[145],"tags":[409,408],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.interviewbit.com\/blog\/wp-json\/wp\/v2\/posts\/2437"}],"collection":[{"href":"https:\/\/www.interviewbit.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.interviewbit.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.interviewbit.com\/blog\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.interviewbit.com\/blog\/wp-json\/wp\/v2\/comments?post=2437"}],"version-history":[{"count":5,"href":"https:\/\/www.interviewbit.com\/blog\/wp-json\/wp\/v2\/posts\/2437\/revisions"}],"predecessor-version":[{"id":20723,"href":"https:\/\/www.interviewbit.com\/blog\/wp-json\/wp\/v2\/posts\/2437\/revisions\/20723"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.interviewbit.com\/blog\/wp-json\/wp\/v2\/media\/2443"}],"wp:attachment":[{"href":"https:\/\/www.interviewbit.com\/blog\/wp-json\/wp\/v2\/media?parent=2437"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.interviewbit.com\/blog\/wp-json\/wp\/v2\/categories?post=2437"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.interviewbit.com\/blog\/wp-json\/wp\/v2\/tags?post=2437"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}