Minimum Edit distance between two strings str1 and str2 is defined as the minimum number of insert/delete/substitute operations required to transform str1 into str2. For example if str1 = “INTENTION”, str2 = “EXECUTION”, then the minimum edit distance between str1 and str2 turns out to be 5. Using dynamic programming, we use bottom up approach to compute the edit distance between str1 and str2. We start by computing edit distance for smaller sub-problems and use the results of these smaller sub-problems to compute results for sub-sequent larger problems.
In above matrix, cell (m,n) represents distance between str1 of length ‘m’ characters and str2 of length ‘n’ characters, if ‘m’th character of str1 and ‘n’th character of str2 are same, then we simply need to fill cell(m,n) using value of cell (m-1, n-1) which represents edit distance between first ‘m-1’ characters of str1 and first ‘n-1’ characters of str2.
If ‘m’th character of str1 is not equal to ‘n’th character of str2, then we choose minimum value from following three cases-
- Delete ‘m’th character of str1 and compute edit distance between ‘m-1’ characters of str1 and ‘n’ characters of str2. For this computation, we simply have to do – (1 + array[m-1][n]) where 1 is the cost of delete operation and array[m-1][n] is edit distance between ‘m-1’ characters of str1 and ‘n’ characters of str2. After performing delete operation on str1, we need to consider first ‘m-1’ characters of str1 and first ‘n’ characters of str2 for computing edit distance.
- Similarly, for the second case of inserting last character of str2 into str1, we have to do – (1 + array[m][n-1]). After performing insert operation on str1, we need to consider first ‘m’ characters of str1 and first ‘n-1’ characters of str2 for computing edit distance.
- And for the third case of substituting last character of str1 by last character of str2 we use – (1 + array[m-1][n-1]). After performing substitute operation on str1, we now need to consider first ‘m-1’ characters of str1 and first ‘n-1’ characters of str2 for computing edit distance.