Algorithm for Identifying Barely Legible or Embossed Text in an Image

By Ivan Ozhiganov on April 11, 2013

Imagine being able to take a picture of barely legible text, and then have your smartphone automatically identify whatever is written in the text. In fact, there are already many solutions available for scanning and decoding printed text in an image, but these solutions usually require the text to be clear and with good contrast. Now, what if you need to detect text that doesn’t have much contrast compared to the background, such as embossed credit card numbers?

Detecting embossed text in images is a task that poses a number of challenges. Embossed characters don't typically have a uniform color and may have low contrast with their background or intersect various surrounding irregularities. Traditional approaches to character segmentation designed for scanned text cannot be used in such conditions. Obviously, some kind of preprocessing is required here, but classical filters such as Gaussian and Median fail to produce good results. For all these reasons, we decided to search for a specialized algorithm and found one particularly suitable for our project[1]. In this article we'd like to present a slightly modified version of this algorithm to fit our task.

The stroke width algorithm is based on the assumption that textual characters generally have a nearly constant stroke width. The algorithm separates such strokes from other elements to recover the regions that contain text, reducing background noise such as lines and patterns.

The stroke width algorithm requires certain preprocessing of the original data to achieve the desired result. The preprocessing stage consists of the following steps:

Step 1. Convert the source image I to grayscale.

Step 2. Detect edges in the grayscale image using the Sobel or similar operator. The resulting image we denote by Ie.
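The Sobel operator in step 2 computes horizontal and vertical gradients with two 3×3 kernels and combines them into a gradient magnitude. A minimal sketch for a single interior pixel (`sobelMagnitude` is a name introduced here; border handling and output scaling are omitted):

```cpp
#include <cmath>
#include <vector>

// 3x3 Sobel gradient magnitude at interior pixel (x, y) of a
// row-major grayscale image of width w.
int sobelMagnitude(const std::vector<int>& img, int w, int x, int y) {
    auto px = [&](int i, int j) { return img[j * w + i]; };
    // horizontal gradient (responds to vertical edges)
    int gx = -px(x-1, y-1) - 2*px(x-1, y) - px(x-1, y+1)
           +  px(x+1, y-1) + 2*px(x+1, y) + px(x+1, y+1);
    // vertical gradient (responds to horizontal edges)
    int gy = -px(x-1, y-1) - 2*px(x, y-1) - px(x+1, y-1)
           +  px(x-1, y+1) + 2*px(x, y+1) + px(x+1, y+1);
    return static_cast<int>(std::lround(std::hypot(gx, gy)));
}
```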

Step 3. Perform binarization of Ie using Otsu's method. The resulting image we denote by Ib.
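Otsu's method picks the global threshold that maximizes the between-class variance of the two resulting pixel classes. A sketch of step 3 over a flat pixel buffer (`otsuThreshold` is a name introduced here):

```cpp
#include <cstdint>
#include <vector>

// Otsu's threshold over an 8-bit histogram: return the level t that
// maximizes the between-class variance wB * wF * (mB - mF)^2.
int otsuThreshold(const std::vector<uint8_t>& pixels) {
    double hist[256] = {0};
    for (uint8_t p : pixels) hist[p] += 1.0;
    double total = static_cast<double>(pixels.size()), sumAll = 0;
    for (int i = 0; i < 256; ++i) sumAll += i * hist[i];
    double sumB = 0, wB = 0, bestVar = -1;
    int best = 0;
    for (int t = 0; t < 256; ++t) {
        wB += hist[t];                              // background weight
        if (wB == 0 || wB == total) continue;
        sumB += t * hist[t];
        double mB = sumB / wB;                      // background mean
        double mF = (sumAll - sumB) / (total - wB); // foreground mean
        double var = wB * (total - wB) * (mB - mF) * (mB - mF);
        if (var > bestVar) { bestVar = var; best = t; }
    }
    return best;
}
```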

Both Ie and Ib are used as input for the stroke width algorithm. Next, we need to perform the local binarization and voting steps:

Step 4. Create a 2-dimensional array S with the same dimensions as I and fill it with zeroes.

Step 5. Create a binary mask Win and binary mask Wout. Their dimensions should be Nin×Nin and Nout×Nout respectively. Nin and Nout values depend on the stroke width in the image and Nin is always less than or equal to Nout.

Step 6. For every pixel Ie[i, j] that satisfies the condition Ib[i, j] = 1 we apply the Win mask centered on this pixel to the image and look for the minimum and maximum values (Pmin, Pmax) among the pixels found within this mask.

Step 7. The same pixel Ie[i, j] is then used to center the Wout mask and for every pixel falling into Wout we perform the transform: S[i+k, j+l] = S[i+k, j+l] + 1, if Ie[i+k, j+l] ≥ t(i, j), where |k|, |l| ≤ Nout / 2 and t(i, j) = (Pmax + Pmin) / 2.
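Steps 4-7 can be sketched over plain row-major arrays. This is a minimal illustration, not the article's Objective-C implementation; `strokeVote`, `nin`, and `nout` (assumed odd) are names introduced here, and image borders of half the mask width are skipped for brevity:

```cpp
#include <algorithm>
#include <vector>

// ie: edge image Ie, ib: its binarization Ib (values 0/1),
// w, h: image dimensions, nin/nout: odd sizes of Win and Wout.
std::vector<int> strokeVote(const std::vector<int>& ie,
                            const std::vector<int>& ib,
                            int w, int h, int nin, int nout) {
    std::vector<int> s(w * h, 0);                   // step 4: S = 0
    int rin = nin / 2, rout = nout / 2;
    for (int j = rout; j < h - rout; ++j) {
        for (int i = rout; i < w - rout; ++i) {
            if (ib[j * w + i] != 1) continue;       // step 6: edge pixels only
            int pmin = ie[j * w + i], pmax = pmin;
            for (int m = -rin; m <= rin; ++m)       // min/max inside Win
                for (int n = -rin; n <= rin; ++n) {
                    int v = ie[(j + m) * w + (i + n)];
                    pmin = std::min(pmin, v);
                    pmax = std::max(pmax, v);
                }
            int t = (pmin + pmax) / 2;              // step 7: local threshold
            for (int l = -rout; l <= rout; ++l)     // vote inside Wout
                for (int k = -rout; k <= rout; ++k)
                    if (ie[(j + l) * w + (i + k)] >= t)
                        ++s[(j + l) * w + (i + k)];
        }
    }
    return s;
}
```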

The resulting grayscale image stored in 2-dimensional array S will have suppressed background and intensified strokes that compose text. The image is suitable for additional binarization or further processing (segmentation of digits, etc.). Like Ie, S is a grayscale image but with decreased range of pixel brightness. The brightness range depends on the size of Win and Wout (smaller masks result in a smaller range).

After some experiments, we discovered that the results can be improved if the binarization level used to produce Ib (step 3) is calculated as follows:

  • Apply Gaussian blur to Ie after detecting edges with the Sobel operator. The resulting image we denote by Ig, and we'll continue to use Ie in step 6.
  • Calculate the binarization level by processing the difference matrix abs(Ig-Ie) using Otsu's method.

Several example images processed using the described algorithm are provided below:

[Example images: grayscale input, Sobel operator edges, Otsu binarization, and the resulting strokes image]

The presented algorithm is applicable for practical use and has several distinct features, namely:

  • The algorithm reduces noise and emphasizes text boundaries, resulting in better character segmentation.
  • Distinctive character shapes don't get lost during processing, except when the original image is preprocessed by binarization methods.
  • The algorithm works equally well with high-contrast and low-contrast images and does not require a separate normalization step.
  • The resulting image has a reduced brightness range compared to the original, which may be useful for character recognition and further binarization.

Implementing credit card number recognition in Objective-C: Hands-on example

- (void)processingByStrokesMethod:(cv::Mat)src dst:(cv::Mat *)dst
{
/*
   src - input grayscale image (CV_8UC1)
   dst - output grayscale image (CV_8UC1, pre-filled with zeroes)
*/
    cv::Mat tmp;
    cv::GaussianBlur(src, tmp, cv::Size(3, 3), 2.0);    // Gaussian blur
    tmp = cv::abs(src - tmp);                           // matrix of differences between source image and blurred image

    //Binarization:
    cv::threshold(tmp, tmp, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);

    //Using the method of strokes:
    int Wout = 12;
    int Win = Wout / 2;
    int startXY = Win;
    int endY = src.rows - Win;
    int endX = src.cols - Win;

    for (int j = startXY; j < endY; j++) {
        for (int i = startXY; i < endX; i++) {
            //Only edge pixels:
            if (tmp.at<unsigned char>(j, i) == 255) {
                //Calculating maxP and minP within the Win region:
                unsigned char minP = src.at<unsigned char>(j, i);
                unsigned char maxP = src.at<unsigned char>(j, i);
                int offsetInWin = Win / 2;

                for (int m = -offsetInWin; m < offsetInWin; m++) {
                    for (int n = -offsetInWin; n < offsetInWin; n++) {
                        unsigned char p = src.at<unsigned char>(j + m, i + n);
                        if (p < minP) {
                            minP = p;
                        } else if (p > maxP) {
                            maxP = p;
                        }
                    }
                }

                //Voting:
                unsigned char meanP = lroundf((minP + maxP) / 2.0);

                for (int l = -Win; l < Win; l++) {
                    for (int k = -Win; k < Win; k++) {
                        if (src.at<unsigned char>(j + l, i + k) >= meanP) {
                            dst->at<unsigned char>(j + l, i + k)++;
                        }
                    }
                }
            }
        }
    }

    //Normalization of the output image:
    unsigned char maxValue = dst->at<unsigned char>(0, 0);

    for (int j = 0; j < dst->rows; j++) {       //finding max value of the output
        for (int i = 0; i < dst->cols; i++) {
            if (dst->at<unsigned char>(j, i) > maxValue)
                maxValue = dst->at<unsigned char>(j, i);
        }
    }
    float knorm = 255.0f / maxValue;

    for (int j = 0; j < dst->rows; j++) {       //normalization of the output
        for (int i = 0; i < dst->cols; i++) {
            dst->at<unsigned char>(j, i) = lroundf(dst->at<unsigned char>(j, i) * knorm);
        }
    }
}

References

[1] Jeong-Hun Jang, Ki-Sang Hong. Binarization of noisy gray-scale character images by thin line modeling, 1998.





Content created by Anton Vitvitsky