The purpose of image thresholding is to partition the set of pixels by gray level so that each resulting subset forms a region corresponding to part of the real scene. Attributes are consistent within each region but differ between adjacent regions. Such a partition can be obtained by selecting one or more thresholds on the gray levels.
The basic principle is to sort the image pixels into several classes by setting different feature thresholds.
Commonly used features include gray or color features taken directly from the original image, and features derived by transforming the original gray or color values.
Let the original image be f(x, y). A feature value t is found in f(x, y) according to some criterion, and the image is divided into two parts. The segmented image is:
$$ g(x, y) = \begin{cases} b_0, & f(x, y) < t \\ b_1, & f(x, y) \ge t \end{cases} $$
If b0 = 0 (black) and b1 = 1 (white), this is what is usually called image binarization.
The threshold segmentation method is in effect the following transformation from the input image f to the output image g:
$$ g(i, j) = \begin{cases} 1, & f(i, j) \ge t \\ 0, & f(i, j) < t \end{cases} $$
where t is the threshold, g(i, j) = 1 for image elements of the object, and g(i, j) = 0 for image elements of the background.
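The transformation from f to g can be sketched in a few lines of Python. This is only an illustration: the function name binarize, the sample image, and the convention that pixels equal to t count as object are assumptions, not part of the original text.

```python
def binarize(image, t, b0=0, b1=1):
    """Map each gray value to b1 if it is >= t, otherwise to b0."""
    return [[b1 if pixel >= t else b0 for pixel in row] for row in image]


# A tiny 3x3 gray image (values chosen only for illustration).
image = [
    [12, 200, 35],
    [180, 90, 220],
    [15, 130, 40],
]

binary = binarize(image, t=128)
print(binary)  # [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
```

Each pixel is handled independently of the others, which is why the comparison can be carried out in parallel.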
It follows that the key to a threshold segmentation algorithm is determining the threshold. If an appropriate threshold can be found, the image can be segmented accurately. Once the threshold is fixed, it is compared with the gray value of each pixel; the pixels can be processed in parallel, and the segmentation result directly gives the image regions.
The advantages of threshold segmentation are simple computation, high efficiency, and high speed. There are various thresholding techniques, including global thresholding, adaptive thresholding, and optimal thresholding.
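The text does not say how a threshold is actually chosen. One widely used way to pick a single global threshold automatically is Otsu's method, which is not named in the original and is used here only as an illustration. The sketch below assumes integer gray levels in the range 0 to levels - 1; the function name is hypothetical.

```python
def otsu_threshold(image, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value < t form one class, pixels with value >= t the other.
    """
    hist = [0] * levels
    for row in image:
        for pixel in row:
            hist[pixel] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels below the candidate threshold
    sum0 = 0  # sum of gray values below the candidate threshold
    for t in range(1, levels):
        w0 += hist[t - 1]
        sum0 += (t - 1) * hist[t - 1]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the variance is undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


image = [
    [10, 12, 11],
    [200, 205, 210],
]
t = otsu_threshold(image)
print(t)  # a value that separates the dark row from the bright row
```

An adaptive threshold would apply the same idea to local windows rather than to the whole image.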
Region-based segmentation divides an image into different regions according to similarity criteria. It mainly includes region growing, region splitting and merging, and watershed segmentation.
Region growing is a serial region segmentation method. It starts from a pixel and progressively adds adjacent pixels according to some criterion, and it terminates when certain conditions are met. The quality of the result depends on three things: 1. the choice of the initial point (seed point); 2. the growth criterion; 3. the termination condition. Region growing starts from one or several pixels and ends with the whole region, thereby extracting the target.
The basic idea of region growing is to collect pixels with similar properties into regions. First, a seed pixel is chosen for each region to be segmented as the starting point of growth. Then the pixels in the neighborhood of the seed that have the same or similar properties as the seed (judged by a predetermined growth or similarity criterion) are merged into the seed's region. These newly added pixels serve as new seeds, and the process repeats until no more pixels satisfy the criterion. At that point the region is complete.
Region growing therefore requires three choices: a set of seed pixels that correctly represents the desired region, the similarity criterion used during growth, and the conditions or rules for stopping. The similarity criterion can be based on gray level, color, texture, gradient, or other characteristics. A seed can be a single pixel or a small area containing several pixels. Most growth criteria use local properties of the image. Criteria can be formulated on different principles, and the choice of criterion affects how the region grows.
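The growth process described above can be sketched as a breadth-first traversal. This is a minimal sketch under stated assumptions: a single seed pixel, a 4-neighborhood, and a similarity criterion that accepts a pixel when its gray value differs from the seed's by at most tol. The function name and the sample image are hypothetical.

```python
from collections import deque


def region_grow(image, seed, tol):
    """Grow a region from `seed` (row, col) over the 4-neighborhood.

    A neighbor joins the region when its gray value differs from the
    seed's gray value by at most `tol`. Growth stops when no neighbor
    of the region satisfies this criterion.
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    frontier = deque([seed])  # newly added pixels act as new seeds
    while frontier:
        r, c = frontier.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tol:
                    region.add((nr, nc))
                    frontier.append((nr, nc))
    return region


image = [
    [10, 11, 50],
    [12, 10, 52],
    [51, 49, 50],
]
print(sorted(region_grow(image, seed=(0, 0), tol=3)))
# [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Comparing against the seed value keeps the region from drifting; comparing against the current region mean instead is another possible growth criterion and would give different results on gradual gray ramps.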
Figure 1 is an example of regional growth.
Region growing is one of the oldest image segmentation methods; the earliest region growing segmentation method was proposed by Levine et al. It generally takes one of two forms. In the first, a small block or seed point is specified inside the target object, surrounding pixels are added to the seed region according to certain rules, and eventually all pixels representing the object are combined into one region. In the second, the image is first divided into many small, highly consistent regions (for example, regions whose pixels share the same gray value), and these small regions are then merged into larger ones according to certain rules to achieve the segmentation. Typical region growing methods, such as the facet-model-based method proposed by T. C. Pong, have an inherent disadvantage: they often lead to over-segmentation, that is, the image is divided into too many regions.
The steps to achieve regional growth are as follows:
The basic idea of the region splitting and merging algorithm is to fix a criterion for splitting and merging, that is, a measure of the consistency of a region's features. When the features of a region are inconsistent, the region is split into four equal sub-regions; when adjacent sub-regions together satisfy the consistency criterion, they are merged into a larger region. This continues until no region meets the conditions for splitting or merging. In other words, once no further splitting is possible, the algorithm checks whether adjacent regions have similar features and, if so, merges them, which completes the segmentation. Region growing and region splitting and merging are closely related and complement each other. Splitting taken to the extreme divides the image into single pixels, which are then merged according to some measure; to that extent it can be viewed as region growing from single pixels. Compared with splitting and merging, region growing skips the splitting stage; splitting and merging, on the other hand, can merge starting from larger homogeneous regions, whereas region growing can only start from single pixels.
Splitting and merging is an algorithm that repeatedly splits and merges regions until the constraints are satisfied.
Let R denote the whole image region and choose a predicate P. One way to segment R is to subdivide it repeatedly into four quadrants until P(Ri) = TRUE for every region Ri. The process starts with the whole image: if P(R) = FALSE, the image is divided into four quadrants. If P is FALSE for any quadrant, that quadrant is divided into four sub-quadrants, and so on. This splitting technique is most conveniently represented by a so-called quadtree (a tree in which every non-leaf node has exactly four children), like the tree shown in figure 10.42. Note that the root of the tree corresponds to the whole image and each node corresponds to a subdivision. In this case, only R4 is subdivided further.
If only splitting is used, the final partition may contain adjacent regions with identical properties. This defect can be corrected by allowing merging as well as splitting: two adjacent regions Rj and Rk are merged only if P(Rj ∪ Rk) = TRUE.
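The splitting half of the procedure can be sketched as a short recursive function. The predicate P is an assumption chosen for illustration: a region satisfies P when the difference between its largest and smallest gray value is at most tol. The merging stage is omitted, and all names are hypothetical.

```python
def is_uniform(image, r0, c0, h, w, tol):
    """Predicate P: TRUE when max - min gray value in the block is <= tol."""
    values = [image[r][c] for r in range(r0, r0 + h) for c in range(c0, c0 + w)]
    return max(values) - min(values) <= tol


def quadtree_split(image, r0, c0, h, w, tol):
    """Return the leaf blocks (row, col, height, width) of the quadtree."""
    if is_uniform(image, r0, c0, h, w, tol):
        return [(r0, c0, h, w)]
    h2, w2 = (h + 1) // 2, (w + 1) // 2  # midpoints, rounding up
    leaves = []
    for rr, hh in ((r0, h2), (r0 + h2, h - h2)):
        for cc, ww in ((c0, w2), (c0 + w2, w - w2)):
            if hh > 0 and ww > 0:  # skip empty quadrants of thin blocks
                leaves.extend(quadtree_split(image, rr, cc, hh, ww, tol))
    return leaves


image = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 9, 9],
    [1, 1, 9, 1],
]
for leaf in quadtree_split(image, 0, 0, 4, 4, tol=0):
    print(leaf)
# Three 2x2 quadrants stay whole; only the bottom-right one is split
# into four single pixels.
```

In this output the pixels with value 9 end up in separate leaves even though they are adjacent and identical, which is exactly the defect that the merging step is meant to correct.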
The foregoing discussion can be summarized as the following procedure. At each step of the iteration:
1. Split into four disjoint quadrants any region Ri for which P(Ri) = FALSE.
2. Merge any adjacent regions Rj and Rk for which P(Rj ∪ Rk) = TRUE.
3. Stop when no further splitting or merging is possible.
Several variations on this basic idea are possible. For example, the image can initially be divided into a set of blocks. Each block is then split further as described above, but merging is initially restricted to groups of four blocks that are descendants of the same node in the quadtree representation and that all satisfy the predicate P. When no more merges of this kind are possible, the procedure ends with one final merging of regions that satisfies step 2. At that point the merged regions may be of different sizes. The main advantage of this approach is that splitting and merging use the same quadtree until the final merging step.
Watershed segmentation is a mathematical morphology segmentation method based on topological theory. The basic idea is to regard the image as a topographic relief in which the gray value of each pixel represents the altitude of that point. Each local minimum together with its zone of influence is called a catchment basin, and the boundaries between catchment basins form the watersheds. The concept and formation of watersheds can be illustrated by simulating an immersion process: pierce a small hole at each local minimum of the surface, then slowly immerse the whole model in water. As the immersion deepens, the zone of influence of each local minimum slowly expands outward, and a dam is built where two catchment basins meet; these dams form the watersheds.
Computing the watershed is an iterative labeling process. The classical method was proposed by L. Vincent. In this algorithm the computation has two steps: a sorting step and a flooding step. First, the pixels are sorted by gray level from low to high. Then, during flooding from low to high, the zone of influence of each local minimum at height h is determined and labeled using a first-in, first-out (FIFO) structure.
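The sort-then-flood idea can be illustrated with a heavily simplified sketch. This is not Vincent's algorithm: it has no FIFO queue and does not treat plateaus of equal gray value correctly. It only shows how visiting pixels in order of increasing gray level lets basins grow from minima and meet at watershed pixels. The 4-neighborhood and all names are assumptions.

```python
def watershed_by_sorting(image):
    """Label catchment basins 1, 2, ... and watershed pixels 0."""
    rows, cols = len(image), len(image[0])
    UNVISITED, WATERSHED = -1, 0
    labels = [[UNVISITED] * cols for _ in range(rows)]

    # Sorting step: visit pixels from the lowest gray level to the highest.
    order = sorted((image[r][c], r, c) for r in range(rows) for c in range(cols))

    next_label = 1
    for _, r, c in order:  # flooding step
        neighbor_labels = set()
        touches_watershed = False
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                lab = labels[nr][nc]
                if lab > 0:
                    neighbor_labels.add(lab)
                elif lab == WATERSHED:
                    touches_watershed = True
        if len(neighbor_labels) == 1:
            labels[r][c] = neighbor_labels.pop()  # extend the one basin
        elif len(neighbor_labels) > 1:
            labels[r][c] = WATERSHED              # two basins meet: build a dam
        elif touches_watershed:
            labels[r][c] = WATERSHED              # only dams nearby: extend the dam
        else:
            labels[r][c] = next_label             # a new local minimum
            next_label += 1
    return labels


# A one-row "landscape" with two valleys separated by a peak of height 5.
print(watershed_by_sorting([[1, 2, 5, 2, 0]]))  # [[2, 2, 0, 1, 1]]
```

The basin numbers simply reflect the order in which the minima are reached, so the deepest valley gets label 1.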
The watershed transform yields the catchment basin image of the input image, and the boundary points between catchment basins are the watersheds. Clearly, the watersheds correspond to the maxima of the input image. Therefore, to obtain the edge information of an image, a gradient image is usually used as the input, that is,
$$ g(x, y) = \operatorname{grad}(f(x, y)) $$
where f(x, y) is the original image and grad(·) denotes the gradient operation.
The watershed algorithm responds well to weak edges, so noise in the image and subtle gray-level changes on object surfaces cause over-segmentation. At the same time, it is precisely this good response to weak edges that guarantees closed, continuous edges. In addition, the closed catchment basins obtained by the watershed algorithm make it possible to analyze the regional characteristics of the image.
There are usually two ways to eliminate the over-segmentation produced by the watershed algorithm. One is to use prior knowledge to remove irrelevant edge information. The other is to modify the gradient function so that the catchment basins respond only to the target to be detected.
To reduce the over-segmentation produced by the watershed algorithm, the gradient function usually needs to be modified. A simple method is to threshold the gradient image so as to eliminate the over-segmentation caused by slight changes in gray level, that is,
$$ g(x, y) = \max(\operatorname{grad}(f(x, y)), g_\theta) $$
where g_θ is the threshold.
A program can proceed as follows: compute the gradient image with the Sobel operator; limit the gradient image with a threshold to eliminate the over-segmentation caused by small changes in gray value and obtain a suitable number of regions; sort the gray levels of the edge points of these regions from low to high; and then carry out the flooding process from low to high. When thresholding the gradient image, the choice of threshold strongly affects the final segmentation, so it is a key factor in segmentation quality. The disadvantage is that a real image may contain weak edges whose gray-level differences are not pronounced; if the threshold is too large, these weak edges may be eliminated.
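The gradient thresholding step can be sketched as follows. The sketch assumes that thresholding means raising every gradient value below a threshold g_theta up to g_theta, so that small gray-level fluctuations no longer create separate minima. Border pixels are simply left at zero gradient, and all function names are hypothetical.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Gradient magnitude from the two Sobel templates (borders stay 0)."""
    rows, cols = len(image), len(image[0])
    grad = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = sum(SOBEL_X[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            grad[r][c] = math.hypot(gx, gy)
    return grad


def clamp_gradient(grad, g_theta):
    """Replace every gradient value below g_theta by g_theta."""
    return [[max(g, g_theta) for g in row] for row in grad]


# Slight texture on the left, a strong vertical edge on the right.
image = [
    [10, 11, 10, 100, 100],
    [10, 10, 11, 100, 101],
    [11, 10, 10, 101, 100],
    [10, 11, 10, 100, 100],
]
clamped = clamp_gradient(sobel_magnitude(image), g_theta=20)
print(clamped[1][1], clamped[1][2])  # texture is flattened, the edge survives
```

Choosing g_theta larger than the strong edge's response in this example would flatten that edge too, which is the weak-edge risk mentioned above.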
An important approach to image segmentation is edge detection, that is, detecting places where the gray level or structure changes abruptly, which indicate where one region ends and another begins. Such a discontinuity is called an edge. Different regions of an image have different gray levels, and there is generally a distinct edge at the boundary between them. This property can be used to segment images.
The gray values of pixels at an image edge are discontinuous, and this discontinuity can be detected by differentiation. For a step edge, its position corresponds to an extremum of the first derivative and a zero crossing of the second derivative. Differential operators are therefore often used for edge detection. Commonly used first-order differential operators include the Roberts, Prewitt, Sobel, and Kirsch operators, while the Laplacian is the common second-order differential operator. In practice, the differential operators are represented by small templates, and differentiation is carried out by convolving the templates with the image. These operators are sensitive to noise and are suitable only for images with little noise and low complexity.
Because edges and noise are both gray-level discontinuities, and both are high-frequency components in the frequency domain, it is difficult to overcome the influence of noise by differentiating directly. The image should therefore be smoothed before edges are detected with a differential operator. The LoG operator and the Canny operator are, respectively, second-order and first-order differential operators with built-in smoothing, and they detect edges well.
An edge detection algorithm is commonly described in four steps: filtering, enhancement, detection, and localization. The first three steps are used very widely, because in most cases the edge detector only needs to indicate that an edge lies near a given pixel, not its exact position or direction. Edge detection error usually refers to misclassification: false edges are identified as edges and kept, while true edges are identified as false edges and removed. Edge estimation error is described by a probabilistic statistical model. The two kinds of error are distinguished because they are computed in completely different ways and their error models are also different.
Roberts operator: localizes edges accurately but is sensitive to noise, so it is suited to images with distinct edges and little noise. The Roberts edge detection operator finds edges using local differences. The edges it produces are not very smooth: because the operator usually produces a fairly wide response in the area near an edge, the detected edge image often needs thinning, and this limits the localization accuracy obtained in practice.
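The local differences used by the Roberts operator are the two diagonal differences of each 2x2 block. The sketch below uses the sum of absolute values as the edge strength, which is one common choice; the function name is hypothetical.

```python
def roberts(image):
    """Roberts cross edge strength for every 2x2 block of the image."""
    rows, cols = len(image), len(image[0])
    out = [[0] * (cols - 1) for _ in range(rows - 1)]
    for r in range(rows - 1):
        for c in range(cols - 1):
            d1 = image[r][c] - image[r + 1][c + 1]  # main diagonal difference
            d2 = image[r][c + 1] - image[r + 1][c]  # anti-diagonal difference
            out[r][c] = abs(d1) + abs(d2)
    return out


image = [
    [0, 0, 9],
    [0, 0, 9],
    [0, 0, 9],
]
print(roberts(image))  # [[0, 18], [0, 18]]
```

Because each output value depends on only four pixels and involves no averaging, a single noisy pixel changes the response directly, which illustrates the noise sensitivity described above.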
Prewitt operator: suppresses noise. It does so by averaging pixels, but averaging is equivalent to low-pass filtering the image, so the Prewitt operator does not localize edges as well as the Roberts operator.
Sobel operator: like the Prewitt operator, it is a weighted average, but the Sobel operator assumes that neighboring pixels do not all influence the current pixel equally, so pixels at different distances are given different weights and affect the result differently. Generally, the farther the pixel, the smaller its influence.
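The relationship between the two operators can be made concrete by building each horizontal-gradient template as a smoothing column combined with a difference row. This is a small sketch with hypothetical helper names; the sample window is chosen only for illustration.

```python
def outer(column, row):
    """Outer product of a smoothing column and a difference row."""
    return [[a * b for b in row] for a in column]


def respond(kernel, window):
    """Sum of element-wise products of a 3x3 kernel and a 3x3 window."""
    return sum(kernel[r][c] * window[r][c] for r in range(3) for c in range(3))


DIFF = [-1, 0, 1]
PREWITT_X = outer([1, 1, 1], DIFF)  # all three rows weighted equally
SOBEL_X = outer([1, 2, 1], DIFF)    # the center row weighted twice as much

# A vertical edge whose center row is slightly brighter than the others.
window = [
    [0, 0, 10],
    [0, 0, 16],
    [0, 0, 10],
]
print(respond(PREWITT_X, window))  # 36
print(respond(SOBEL_X, window))    # 52
```

The extra 6 gray levels in the center row add 6 to the Prewitt response but 12 to the Sobel response, which reflects the larger weight given to the nearest neighbors.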
Isotropic Sobel operator: a weighted average operator whose weights are inversely proportional to the distance between the neighboring point and the center point. Its gradient magnitude is the same when detecting edges in different directions, which is what isotropy means here.
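The isotropy claim can be checked numerically on two ideal step edges. The sketch assumes the usual form of the isotropic templates, in which the center weight 2 of the ordinary Sobel templates is replaced by the square root of 2; the function name and test windows are hypothetical.

```python
import math


def gradient_magnitude(window, w):
    """Gradient magnitude of a 3x3 window for Sobel-like templates.

    w is the weight of the center row/column: 2 for the ordinary Sobel
    operator, sqrt(2) for the isotropic Sobel operator.
    """
    kx = [[-1, 0, 1], [-w, 0, w], [-1, 0, 1]]
    ky = [[-1, -w, -1], [0, 0, 0], [1, w, 1]]
    gx = sum(kx[r][c] * window[r][c] for r in range(3) for c in range(3))
    gy = sum(ky[r][c] * window[r][c] for r in range(3) for c in range(3))
    return math.hypot(gx, gy)


vertical_edge = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]
diagonal_edge = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]

for name, w in (("ordinary", 2), ("isotropic", math.sqrt(2))):
    v = gradient_magnitude(vertical_edge, w)
    d = gradient_magnitude(diagonal_edge, w)
    print(f"{name}: vertical={v:.4f} diagonal={d:.4f}")
```

With the ordinary weights the two edges give different magnitudes (4 versus 3 times the square root of 2), while with the isotropic weights both give 2 plus the square root of 2.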
The Sobel operator is a common template in edge detection. It comes as a pair of templates, one for detecting horizontal edges and the other for detecting vertical edges. The isotropic Sobel operator is another form of the Sobel operator and likewise has one template for horizontal edges and one for vertical edges. Compared with the ordinary Sobel operator, its position weighting coefficients are more accurate, and the gradient magnitude is consistent for edges in different directions. Because of the particular nature of building images, the gradient direction does not need to be computed when processing the contours of such images, so the program does not include a treatment of the isotropic Sobel operator.
In 1971, R. Kirsch [34] proposed the Kirsch operator, a method that can detect edge direction: eight templates are used to determine the gradient magnitude and gradient direction.
Every point in the image is convolved with the 8 masks, each of which responds most strongly to one particular edge direction. The maximum over all eight directions is taken as the output of the edge magnitude image, and the index of the mask with the maximum response encodes the edge direction.
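The eight-mask scheme can be sketched by rotating the outer ring of a single base mask in 45 degree steps. The base mask with weights 5 and -3 is the commonly quoted form of the Kirsch templates; the numbering of the directions (mask 0 has its 5s in the top row, and later masks rotate clockwise) is an assumption made for this sketch.

```python
# Outer ring of a 3x3 mask, listed clockwise from the top-left corner.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
BASE = [5, 5, 5, -3, -3, -3, -3, -3]


def kirsch_masks():
    """Build the 8 masks by rotating the ring values one position at a time."""
    masks = []
    for k in range(8):
        mask = [[0] * 3 for _ in range(3)]
        for i, (r, c) in enumerate(RING):
            mask[r][c] = BASE[(i - k) % 8]
        masks.append(mask)
    return masks


def kirsch_response(window):
    """Return (edge magnitude, index of the strongest mask) for a 3x3 window."""
    responses = [
        sum(m[r][c] * window[r][c] for r in range(3) for c in range(3))
        for m in kirsch_masks()
    ]
    magnitude = max(responses)
    return magnitude, responses.index(magnitude)


window = [
    [0, 0, 10],
    [0, 0, 10],
    [0, 0, 10],
]
print(kirsch_response(window))  # (150, 2): the mask with 5s in the right column
```

The returned index plays the role of the direction code: it identifies which of the eight masks, and therefore which edge orientation, produced the maximum response.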
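The gradient magnitude of the Kirsch operator is given by the following formula:
$$ g(x, y) = \max_{k = 0, \dots, 7} \, (m_k * f)(x, y) $$
where f is the input image and m_k is the k-th of the eight masks.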
Comparison of the different edge detection operators:
The article is quoted from "Wood Night Traces".
Edited by Lornatang
Proofread by Lornatang