I created a basic document scanner using OpenCV that detects a 4-sided document in an image and straightens it out for a full frontal view. It is based on the excellent "How to Build a Kick-Ass Mobile Document Scanner in Just 5 Minutes" tutorial from pyimagesearch.
The basic steps are:
- load the image and resize it (apparently the edge detection does not work well on large images)
- detect edges
- find all contours and keep the 5 largest
- approximate each contour with a polygon (e.g. if a contour is a long list of points tracing out an n-sided polygon, reduce it to the n points of a similar polygon); this is done using the Ramer-Douglas-Peucker algorithm
- store the 4 points of the largest contour that can be approximated as a 4-sided polygon
- calculate the width and height of the output image based on the 4-sided polygon
- perspective-warp the image by pinning the 4 points of the polygon to the 4 corners of the output image, which straightens the sides into a frontal view
- apply thresholding for a photocopy look
This method seems to work well only with text documents on white paper. I tried several more complicated examples, and the script failed to pick out the document boundaries.