If you don’t already know what the scanner deskew function is, you probably haven’t worked much with scanners.
The deskew function straightens documents that went through the machine at an angle and, in some cases, even documents that were printed skewed.
You see, document scanners these days are really quick machines. By quick we mean that most of them will handle 60 to 70 pages per minute. While this is still an entry-level figure, it is faster than it sounds: roughly one page every second.
Once the scanner starts scanning, it seems hard to stop. But all this speed comes with some inherent paper-feeding issues. Add improper maintenance to that and you have a recipe for skewed images.
How does image skew occur
Before we go into deskew and skew correction, let’s focus on why this phenomenon occurs.
Difficult documents – Yes, no matter how good your scanner is, not all paper documents are built the same. Documents can be torn, crumpled or made of paper that is difficult to feed through the scanner. Glossy paper is notorious for this. All of these factors can increase the skew of the documents.
Improper scanner maintenance – Sometimes you simply forget to clean the equipment. Cleaning the paper path, the rollers and the other parts that help feed the paper is very important. Also check the remaining life of the transport rollers. They are consumables that you have to replace from time to time. If you don’t do this regularly, you may see increased skew in your documents.
Badly printed documents – When you do a fair bit of scanning, you learn just how many documents are badly printed. You often won’t notice it with the naked eye, but scanning makes it clear.
Special situations – There are also quite a few special cases in which image skew will appear.
- Problems with the scanner or the scanner driver are common reasons for documents to come out skewed. A corrupt driver, for example, can interfere with the deskew function of the scanner.
- Another case is when you have documents that are “cheating” the deskew algorithms. This can happen with magazines or any full-color document. The colored edges can confuse the deskew function and produce really bad results.
How does the deskew feature work in a scanner
While it might seem that the deskew function is built into the scanner, this is rarely the case. Instead, it relies on the communication between the scanner and the scanner driver.
Through this communication, the scanner feeds the driver information that it has collected through its sensors. The driver or the scanning software takes all the inputs and calculates what the correct deskew angle should be. It then applies the correction to the image.
While this sounds straightforward, just imagine how quickly it has to happen. That is why, in today’s scanning bureaus, the processing power of the computers gets higher and higher. This way, the software can keep up with the hardware scanning speed.
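To put a number on how quickly this has to happen, here is a minimal sketch of the time available per image. The function name per_image_budget_ms and the images_per_page parameter are our own, chosen for illustration. Real drivers overlap scanning and processing, so treat the result as a rough guide, not as how any particular driver schedules its work.

```python
def per_image_budget_ms(pages_per_minute, images_per_page=1):
    """Average time in milliseconds between two images arriving from the scanner.

    images_per_page is 1 for single-sided scanning and 2 for duplex
    (front and back of every sheet).
    """
    if pages_per_minute <= 0 or images_per_page <= 0:
        raise ValueError("pages_per_minute and images_per_page must be positive")
    return 60_000 / (pages_per_minute * images_per_page)


for ppm in (60, 70):
    for sides in (1, 2):
        budget = per_image_budget_ms(ppm, sides)
        print(f"{ppm} ppm, {sides} image(s) per page: {budget:.0f} ms per image")
```

At 60 pages per minute single-sided, the software has about one second per image. In duplex at 70 pages per minute, it has less than half a second to find the angle and correct the image.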
Deskew by the alignment of the page edges
This is the deskew feature that is most used and most popular with scanner manufacturers. The driver uses the difference in color between the document background (usually white) and the scanner background (usually black). It detects where the page edge is, takes a section of it and measures the skew angle.
This sounds simple, but it can prove quite difficult in practice. Anything that affects the sensors means that they feed wrong data to the software. Wrong data means the calculations that the software makes are incorrect. When the software applies them, you get bad results.
Most often, the results are pretty good, so you can usually rely on the values the scanner reports.
But when you have full-color pages you will start to see problems. In such cases, the operator’s experience comes into play. With time they learn how to process each type of document. Given my experience in this field, I have always recommended that in such cases the deskew feature be turned off. Then we would use a different solution or third-party software to straighten the images.
What I can also recommend for these cases is that you clean the rollers more often. This actually reduces the skew from one page to the next.
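The edge-based approach described in this section can be sketched in a few lines. This is our own simplified illustration, not the code of any scanner driver. It assumes a grayscale strip (a list of pixel rows) that has already been cut so that it contains only the top edge of a light page on a dark scanner background. The function names and the threshold of 128 are hypothetical.

```python
import math


def top_edge_points(strip, threshold=128):
    """For each column, find the first row from the top that is brighter
    than the threshold, i.e. where the dark background turns into page."""
    height = len(strip)
    width = len(strip[0]) if height else 0
    points = []
    for x in range(width):
        for y in range(height):
            if strip[y][x] > threshold:
                points.append((x, y))
                break  # columns with no bright pixel are simply skipped
    return points


def fit_slope(points):
    """Least-squares slope of a straight line through (x, y) points."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two edge points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("edge points must cover more than one column")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def estimate_skew_degrees(strip, threshold=128):
    """Angle of the page edge against the horizontal, in degrees.
    Positive means the edge drops towards the right of the image."""
    return math.degrees(math.atan(fit_slope(top_edge_points(strip, threshold))))


# Synthetic strip: black background, white page whose top edge drops
# 1 pixel for every 10 pixels travelled to the right.
width, height = 200, 40
strip = [[255 if 10 * y >= 100 + x else 0 for x in range(width)]
         for y in range(height)]
print(f"estimated skew: {estimate_skew_degrees(strip):.2f} degrees")
```

For the synthetic strip, the estimate comes out at about 5.7 degrees, which matches a slope of 1 in 10. The sketch also shows why full-color pages cause trouble: if the page edge is dark, the threshold test fires in the wrong place and the fitted line no longer follows the paper.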
Deskew by the content of the page
This feature is used less than the one that deskews based on the page edge. It works by converting each text block into lines.
Based on the direction of the lines, the software calculates a general angle by which they are skewed. Once it has calculated an average value, it applies this value to the entire page.
While this may seem like the perfect solution, it does have some limitations. First of all, it takes longer to process. It also requires more memory and decreases the scanning speed significantly.
Also, you have to make sure that the text blocks are quite homogeneous. For example, newspapers may have differently angled text blocks within the same page. In such cases, the results will vary from one scan to another.
We only recommend this option when you know the original documents are badly printed. Otherwise, try and stick with the page alignment deskew function.
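The content-based approach can be sketched as follows. This is a simplified illustration under our own assumptions: the text lines have already been found by some earlier layout step, and each line is given as a list of (x, y) points along its baseline. The function names are hypothetical. Besides the average angle, the sketch returns the spread of the angles, which gives a simple warning sign for pages whose text blocks are not homogeneous.

```python
import math
import statistics


def line_angle_degrees(points):
    """Angle of one text line, from a least-squares fit through its points."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points per line")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("points must have different x positions")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return math.degrees(math.atan(sxy / sxx))


def page_skew_degrees(lines):
    """Return (average angle, spread of angles) over all text lines.
    A large spread suggests the page mixes differently angled text blocks."""
    angles = [line_angle_degrees(pts) for pts in lines if len(pts) >= 2]
    if not angles:
        raise ValueError("no usable text lines")
    return statistics.fmean(angles), statistics.pstdev(angles)


def make_line(angle_deg, y0, length=500, step=50):
    """Synthetic baseline at a known angle, for demonstration only."""
    slope = math.tan(math.radians(angle_deg))
    return [(x, y0 + slope * x) for x in range(0, length, step)]


uniform_page = [make_line(2.0, y0) for y0 in (100, 140, 180)]
mixed_page = [make_line(2.0, 100), make_line(-2.0, 300)]

print(page_skew_degrees(uniform_page))  # average near 2, spread near 0
print(page_skew_degrees(mixed_page))    # average near 0, spread near 2
```

On the mixed page, the average is close to zero even though no line is actually straight. That is the newspaper problem in miniature: a single average angle cannot fix text blocks that are skewed in different directions.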
Why is the scanner deskew function so important
While document skew will not affect the overall readability of the document in most cases, you have to look at the bigger picture.
First of all, the deskew function is very important for the level of scanning quality. In our experience, a supplier who delivers documents with barely any skew usually achieves the same level of quality in other areas too, so low skew is a good sign that you can trust such a supplier.
Second, skewed documents are harder to OCR. When you OCR documents, there are three factors you should always take into account:
- Scanning resolution
- Text sharpness
- Accurate deskew. If you correctly deskew documents, you will have a higher level of OCR accuracy.
Last but not least, think about the users of the documents. If they constantly have to adjust their eyes to your scanned documents, they will get tired. Tired operators mean decreased productivity.
Over time, you will see a decrease in the general productivity of work. Even worse, if you’re a scanning service, customers will start avoiding you.