Imagine being able to take a picture of barely legible text and have your smartphone automatically recognize what is written in it. In fact, there are already many solutions available for scanning and decoding printed text in an image, but these solutions usually require the text to be clear and have good contrast. Now, what if you need to detect text that doesn't have much contrast with the background, such as embossed credit card numbers?

Detecting embossed text in images is a task that poses a number of challenges. Embossed characters typically don't have a uniform color, and they may have low contrast with their background or intersect various surrounding irregularities. Traditional approaches to character segmentation designed for scanned text cannot be used under such conditions. Obviously, some kind of preprocessing is required here, but classical filters such as Gaussian and median fail to produce good results. For all these reasons, we decided to search for a specialized algorithm and found one particularly suitable for our project [1]. In this article we'd like to present a slightly modified version of this algorithm adapted to our task.

The stroke width algorithm is based on the assumption that textual characters generally have a nearly constant stroke width. The algorithm separates such strokes from other elements to recover the regions containing text, reducing background noise and filtering out lines and patterns.

The stroke width algorithm requires certain preprocessing of the original data to achieve the desired result. The preprocessing stage consists of the following steps:

**Step 1**. Convert the source image **I** to grayscale.

**Step 2**. Detect edges in the grayscale image using the Sobel or similar operator. The resulting image we denote by **Ie**.

**Step 3**. Perform binarization of **Ie** using Otsu's method. The resulting image we denote by **Ib**.
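Steps 2 and 3 can be sketched without OpenCV as follows. This is a minimal illustration on plain row-major byte arrays; the |Gx| + |Gy| gradient magnitude approximation, the border handling, and the function names are our assumptions for this sketch, not part of the original article:

```cpp
#include <array>
#include <cmath>
#include <cstdint>
#include <vector>

// Sobel magnitude |Gx| + |Gy| for interior pixels of a w×h grayscale
// image stored row-major; border pixels are left at 0.
std::vector<uint8_t> sobelMagnitude(const std::vector<uint8_t>& img, int w, int h) {
    std::vector<uint8_t> out(w * h, 0);
    for (int y = 1; y < h - 1; y++)
        for (int x = 1; x < w - 1; x++) {
            int gx = -img[(y-1)*w + x-1] + img[(y-1)*w + x+1]
                   - 2*img[y*w + x-1]    + 2*img[y*w + x+1]
                   -  img[(y+1)*w + x-1] +  img[(y+1)*w + x+1];
            int gy = -img[(y-1)*w + x-1] - 2*img[(y-1)*w + x] - img[(y-1)*w + x+1]
                   +  img[(y+1)*w + x-1] + 2*img[(y+1)*w + x] + img[(y+1)*w + x+1];
            int mag = std::abs(gx) + std::abs(gy);
            out[y*w + x] = (uint8_t)(mag > 255 ? 255 : mag);
        }
    return out;
}

// Otsu's method: choose the threshold that maximizes between-class variance.
int otsuThreshold(const std::vector<uint8_t>& img) {
    std::array<int, 256> hist{};
    for (uint8_t p : img) hist[p]++;
    double sumAll = 0;
    for (int i = 0; i < 256; i++) sumAll += i * (double)hist[i];
    int total = (int)img.size(), wB = 0, bestT = 0;
    double sumB = 0, best = -1;
    for (int t = 0; t < 256; t++) {
        wB += hist[t];                       // background pixel count
        if (wB == 0) continue;
        int wF = total - wB;                 // foreground pixel count
        if (wF == 0) break;
        sumB += t * (double)hist[t];
        double mB = sumB / wB, mF = (sumAll - sumB) / wF;
        double between = (double)wB * wF * (mB - mF) * (mB - mF);
        if (between > best) { best = between; bestT = t; }
    }
    return bestT;
}
```

On a synthetic image with a dark left half and a bright right half, `sobelMagnitude` produces strong responses only along the boundary, and `otsuThreshold` then separates those edge pixels from the flat background.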

Both **Ie** and **Ib** are used as input for the stroke width algorithm. Next, we need to perform the local binarization and voting steps:

**Step 4**. Create a 2-dimensional array **S** with the same dimensions as **I** and fill it with zeroes.

**Step 5**. Create a binary mask **W<sub>in</sub>** and a binary mask **W<sub>out</sub>**. Their dimensions should be **N<sub>in</sub>**×**N<sub>in</sub>** and **N<sub>out</sub>**×**N<sub>out</sub>** respectively. The **N<sub>in</sub>** and **N<sub>out</sub>** values depend on the stroke width in the image, and **N<sub>in</sub>** is always less than or equal to **N<sub>out</sub>**.

**Step 6**. For every pixel **Ie[i, j]** that satisfies the condition **Ib[i, j] = 1**, we apply the **W<sub>in</sub>** mask centered on this pixel to the image and look for the minimum and maximum values (**P<sub>min</sub>**, **P<sub>max</sub>**) among the pixels found within this mask.

**Step 7**. The same pixel **Ie[i, j]** is then used to center the **W<sub>out</sub>** mask, and for every pixel falling into **W<sub>out</sub>** we perform the transform: **S[i+k, j+l] = S[i+k, j+l] + 1** if **Ie[i+k, j+l] ≥ t(i, j)**, where **k, l ≤ N<sub>out</sub> / 2** and **t(i, j) = (P<sub>max</sub> + P<sub>min</sub>) / 2**.
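Steps 4 through 7 can be sketched on plain arrays as follows. This is a minimal illustration under our own assumptions (function name, square windows of odd size, and skipping a border of half the outer window), not the production implementation:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Local binarization and voting (steps 4-7): Ie is the edge image,
// Ib the binary edge mask, both w×h row-major. Nin and Nout are the
// inner and outer window sizes (Nin <= Nout, both assumed odd here).
std::vector<int> strokeVoting(const std::vector<uint8_t>& Ie,
                              const std::vector<uint8_t>& Ib,
                              int w, int h, int Nin, int Nout) {
    std::vector<int> S(w * h, 0);            // step 4: zeroed accumulator
    int rin = Nin / 2, rout = Nout / 2;
    for (int j = rout; j < h - rout; j++) {
        for (int i = rout; i < w - rout; i++) {
            if (!Ib[j*w + i]) continue;      // only edge pixels
            // Step 6: min/max of Ie inside the Win window.
            uint8_t pmin = 255, pmax = 0;
            for (int m = -rin; m <= rin; m++)
                for (int n = -rin; n <= rin; n++) {
                    uint8_t v = Ie[(j+m)*w + (i+n)];
                    pmin = std::min(pmin, v);
                    pmax = std::max(pmax, v);
                }
            int t = (pmin + pmax) / 2;       // local threshold t(i, j)
            // Step 7: vote for every Wout pixel at or above t.
            for (int l = -rout; l <= rout; l++)
                for (int k = -rout; k <= rout; k++)
                    if (Ie[(j+l)*w + (i+k)] >= t)
                        S[(j+l)*w + (i+k)]++;
        }
    }
    return S;
}
```

On a synthetic image containing a single bright vertical stroke, the accumulator **S** ends up with positive counts along the stroke and zeroes on the flat background, which is exactly the suppression effect described above.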

The resulting grayscale image stored in the 2-dimensional array **S** will have a suppressed background and intensified strokes that compose the text. The image is suitable for additional binarization or further processing (segmentation of digits, etc.). Like **Ie**, **S** is a grayscale image but with a decreased range of pixel brightness. The brightness range depends on the sizes of **W<sub>in</sub>** and **W<sub>out</sub>** (smaller masks result in a smaller range).

After some experiments, we discovered that the results can be improved if the binarization level used to produce **Ib** (step 3) is calculated as follows:

- Apply Gaussian blur to **Ie** after detecting edges with the Sobel operator. The resulting image we denote by **Ig**, and we'll continue to use **Ie** in step 6.
- Calculate the binarization level by processing the difference matrix **abs(Ig - Ie)** using Otsu's method.
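A sketch of this modified level computation on plain arrays might look like the following. The 3×3 kernel with integer 1-2-1 weights is our assumption, since the article does not fix a kernel size or sigma here; Otsu's method from step 3 is then applied to the returned difference matrix rather than to **Ie** itself:

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// 3x3 Gaussian blur (separable 1-2-1 kernel, normalized by 16, with
// rounding); border pixels are copied through unchanged.
std::vector<uint8_t> gaussianBlur3x3(const std::vector<uint8_t>& img, int w, int h) {
    static const int k[3][3] = {{1, 2, 1}, {2, 4, 2}, {1, 2, 1}};
    std::vector<uint8_t> out(img);
    for (int y = 1; y < h - 1; y++)
        for (int x = 1; x < w - 1; x++) {
            int s = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    s += k[dy+1][dx+1] * img[(y+dy)*w + (x+dx)];
            out[y*w + x] = (uint8_t)((s + 8) / 16);   // rounded division
        }
    return out;
}

// Per-pixel difference matrix abs(Ig - Ie); Otsu's binarization level
// is computed from this matrix instead of from Ie directly.
std::vector<uint8_t> absDiff(const std::vector<uint8_t>& a,
                             const std::vector<uint8_t>& b) {
    std::vector<uint8_t> out(a.size());
    for (size_t i = 0; i < a.size(); i++)
        out[i] = (uint8_t)std::abs((int)a[i] - (int)b[i]);
    return out;
}
```

The difference matrix is zero wherever the image is locally flat and large near sharp transitions, so Otsu's threshold computed from it adapts to the strength of the edges rather than to the overall brightness of **Ie**.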

Several example images processed using the described algorithm are provided below:

*(Example images: Grayscale Image, Sobel Operator, Otsu's Method, Strokes Method.)*

The presented algorithm is suitable for practical use and has several distinct features, namely:

- The algorithm reduces noise and emphasizes text boundaries, resulting in better character segmentation.
- Distinctive character shapes don't get lost during processing, except when the original image has already been preprocessed by binarization methods.
- The algorithm works equally well with high-contrast and low-contrast images and does not require a separate normalization step.
- The resulting image has a reduced brightness range compared to the original, which may prove useful for character recognition and further binarization.

## Implementing credit card number recognition in Objective-C: Hands-on example

```objectivec
- (void)processingByStrokesMethod:(cv::Mat)src dst:(cv::Mat *)dst {
    /* src - input grayscale image
       dst - output grayscale image */
    cv::Mat tmp;
    cv::GaussianBlur(src, tmp, cv::Size(3, 3), 2.0);  // Gaussian blur
    tmp = cv::abs(src - tmp);  // matrix of differences between source and blurred image

    // Binarization:
    cv::threshold(tmp, tmp, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);

    // Using the method of strokes:
    int Wout = 12;
    int Win = Wout / 2;
    int startXY = Win;
    int endY = src.rows - Win;
    int endX = src.cols - Win;

    for (int j = startXY; j < endY; j++) {
        for (int i = startXY; i < endX; i++) {
            // Only edge pixels:
            if (tmp.at<unsigned char>(j, i) == 255) {
                // Calculating maxP and minP within the Win region:
                unsigned char minP = src.at<unsigned char>(j, i);
                unsigned char maxP = src.at<unsigned char>(j, i);
                int offsetInWin = Win / 2;
                for (int m = -offsetInWin; m < offsetInWin; m++) {
                    for (int n = -offsetInWin; n < offsetInWin; n++) {
                        unsigned char p = src.at<unsigned char>(j + m, i + n);
                        if (p < minP) {
                            minP = p;
                        } else if (p > maxP) {
                            maxP = p;
                        }
                    }
                }
                // Voting:
                unsigned char meanP = lroundf((minP + maxP) / 2.0);
                for (int l = -Win; l < Win; l++) {
                    for (int k = -Win; k < Win; k++) {
                        if (src.at<unsigned char>(j + l, i + k) >= meanP) {
                            dst->at<unsigned char>(j + l, i + k)++;
                        }
                    }
                }
            }
        }
    }

    // Normalization of the output image:
    unsigned char maxValue = dst->at<unsigned char>(0, 0);
    for (int j = 0; j < dst->rows; j++) {  // finding max value of the output
        for (int i = 0; i < dst->cols; i++) {
            if (dst->at<unsigned char>(j, i) > maxValue)
                maxValue = dst->at<unsigned char>(j, i);
        }
    }
    float knorm = 255.0 / maxValue;
    for (int j = 0; j < dst->rows; j++) {  // normalization of the output
        for (int i = 0; i < dst->cols; i++) {
            dst->at<unsigned char>(j, i) = lroundf(dst->at<unsigned char>(j, i) * knorm);
        }
    }
}
```

## References

[1] Jeong-Hun Jang, Ki-Sang Hong. Binarization of noisy gray-scale character images by thin line modeling, 1998.
