Question 1: For strips of paper from a single printed page that has been shredded by vertical cuts only, we are asked to establish a model and algorithm for splicing and restoring the fragments, and to apply it to the one page of Chinese text and one page of English text given in Appendix 1 and Appendix 2. For conventional document fragments, computer splicing methods generally use geometric features of the fragment edges, such as sharp points, sharp corners and area features, to search for matching adjacent fragments and join them. This approach, based on boundary geometry, is suitable for fragments whose adjoining edges have distinctive, matching contours. From the images given in Appendix 1 and Appendix 2, we can see that they contain only black and white, that they were cut by machine, and that the size and shape of every fragment are essentially the same. The edges therefore carry no distinguishing geometric information, so feature matching is not suitable for this problem and only gray-level matching is. We use a gray-level matching model: all images in Appendix 1 and Appendix 2 are binarized and converted into numerical matrices, which turns the text signal into a digital signal. We then use a MATLAB program that, according to a chosen similarity measure, automatically splices the fragments by comparing the last column of one fragment's matrix with the first column of another's. The spliced image is then inspected by eye. If it is complete and correct, no manual intervention is needed; if it contains an error, the error is corrected manually.
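The gray-level matching step can be sketched as follows. This is a minimal illustration written in Python rather than MATLAB, and it is not the program used in this paper. The function names (`binarize`, `edge_mismatch`, `find_leftmost`, `order_strips`), the threshold of 128, the use of a simple count of differing pixels as the similarity measure, and the assumption that the leftmost strip is the one with the least ink in its first column (a blank page margin) are all illustrative assumptions.

```python
def binarize(gray, threshold=128):
    """Convert a grayscale image (list of rows, values 0-255) to 0/1.
    1 marks ink (dark pixel), 0 marks background. Threshold is an assumption."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def edge_mismatch(left, right):
    """Count the rows where the last column of `left` differs from
    the first column of `right`. Smaller means a better match."""
    return sum(a[-1] != b[0] for a, b in zip(left, right))


def find_leftmost(strips):
    """Assumption: the page has a blank left margin, so the leftmost strip
    is the one whose first column contains the least ink."""
    return min(range(len(strips)), key=lambda i: sum(row[0] for row in strips[i]))


def order_strips(strips):
    """Greedy left-to-right ordering: start from the leftmost strip and
    repeatedly append the unused strip with the smallest edge mismatch."""
    order = [find_leftmost(strips)]
    remaining = set(range(len(strips))) - set(order)
    while remaining:
        last = strips[order[-1]]
        nxt = min(sorted(remaining), key=lambda j: edge_mismatch(last, strips[j]))
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

A greedy ordering of this kind can choose a wrong neighbour when two candidate strips have equal or nearly equal mismatch, for example where a cut falls between characters, which is where the visual check and manual correction described above come in.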
Question 2: When the shredder cuts both vertically and horizontally, we are asked to design a model and algorithm for splicing and restoring the paper scraps. This can be obtained by modifying the model and algorithm of Question 1, and is applied to the fragment data for one page of Chinese and one page of English text given in Appendix 3 and Appendix 4. Our basic idea is to binarize each fragment image and take the first and last rows of each resulting matrix. A MATLAB program then matches the last row of one fragment's matrix against the first row of another's, which joins the fragments vertically into 19 spliced strips. These 19 strip images are binarized in turn, and the first and last columns of each binarized matrix are selected. A complete page can then be obtained by applying the MATLAB program from Question 1 to the strips. If an error occurs during the initial vertical splicing into strips, it is corrected manually before continuing. After the final splicing, any remaining errors in the resulting image are likewise corrected by manual intervention.
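The first stage, joining fragments vertically into strips, can be sketched as below. This is again an illustrative Python sketch and not the MATLAB program used in this paper. The names `row_mismatch`, `build_columns` and `stack`, the pixel-count similarity measure, and the `tops` argument (the indices of the fragments that begin each strip, assumed here to be known in advance, for instance from a blank top margin or from manual selection) are assumptions introduced for the example.

```python
def row_mismatch(upper, lower):
    """Count the pixels where the last row of `upper` differs from
    the first row of `lower`. Inputs are binarized 0/1 matrices."""
    return sum(a != b for a, b in zip(upper[-1], lower[0]))


def build_columns(fragments, tops, rows_per_column):
    """Greedily chain fragments from top to bottom.
    `tops` lists the index of the top fragment of each strip (assumed known).
    Returns one list of fragment indices per strip."""
    unused = set(range(len(fragments))) - set(tops)
    columns = []
    for top in tops:
        chain = [top]
        while len(chain) < rows_per_column and unused:
            last = fragments[chain[-1]]
            nxt = min(sorted(unused), key=lambda j: row_mismatch(last, fragments[j]))
            chain.append(nxt)
            unused.remove(nxt)
        columns.append(chain)
    return columns


def stack(fragments, chain):
    """Concatenate the fragments of one chain vertically into a strip matrix."""
    return [row for i in chain for row in fragments[i]]
```

Each stacked strip can then be ordered from left to right by comparing first and last columns, as in Question 1. Because horizontal cuts often fall in the blank space between lines of text, many fragments have nearly identical top and bottom rows, so this greedy stage is the one most likely to need manual correction.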
Question 3: When the given fragment data come from a double-sided printed document, a different model is needed to recover the fragments. Appendix 5 gives the fragment data of one double-sided page of printed English text, and we must design a corresponding model and algorithm to splice and restore the paper pieces. The data in Appendix 5 show both sides of each fragment, but do not indicate which side is the front and which is the back. Both sides are printed with the same kind of text in the same font, so the two sides cannot be told apart by appearance.