K2pdfopt optimizes PDF/DJVU files for mobile e-readers. It works well on multi-column PDF/DJVU files and can re-flow text even on scanned PDF files. It can also be used as a general PDF copying/cropping/re-sizing/OCR-ing manipulation tool. It can generate native or bitmapped PDF output, with an optional OCR layer. There are downloads for MS Windows, Mac OSX, and Linux.

Note to MS Word users: While I still suggest you try out k2pdfopt, if you truly want to convert your PDF to an e-book format like epub or mobi, you might want to check my most recommended solution for this on my PDF Conversion Tips page.

Unsolicited comments from k2pdfopt users:

"This is a terrific program and for regular 2-column PDFs it works like a charm." - September 7, 2012

"Thanks so much for your extremely useful software." - May 2, 2012

I did some experimenting with Tesseract (OCR) v5.1 today and benchmarked it with a standard test I have. It has identical accuracy to Tesseract v4.1, but uses a new 32-bit floating point calculation technique that, interestingly, is 30% faster than Tesseract v4.1 on the "best" English training file but 40% slower on the "fast" English training file (on a Core i9-9900 CPU). For both Tesseract v4.1 and v5.1, the best accuracy is obtained when a capital letter is between 25 and 35 pixels high.

I've added a Linux Aarch64 binary to my download page, which I cross-compiled on a Debian 10 virtual Linux box on my Windows PC. I had a user request a binary for their Pinephone, so I'm hoping this will work. I would appreciate feedback from anybody on whether it works or not.

I've re-worked my download page a bit to try to make it smarter about forcing a fresh load every time, as opposed to the browser pulling it up from an internal cache, which can cause problems with expired captcha values. You can also refresh the download page manually in your browser (click the refresh button).

The M1 chip in the latest Macs is a very impressive performer. See the table below comparing k2pdfopt performance on a Core i9-9900 vs. an Apple M1 with two different C compilers (I posted the clang v12 version). The "No OCR" row compares single-threaded performance since only the OCR processing in k2pdfopt is multithreaded. The difference with OCR is not as dramatic, probably because Tesseract has optimizations for the hardware extensions in … It is also interesting that clang v12 beats gcc v11 handily. The M1's performance is even more impressive when you consider that its thermal design power (TDP) is about 20 W compared to the i9-9900's 65 W.

This version improves OCR multithreading, adds better DJVU support (text layer extraction), adds CBZ support, and is compiled with the latest third-party libraries. See details in the k2pdfopt version history.

This is primarily a bug-fix release, fixing over 20 issues that have accumulated over time. There are also a few enhancements, including the ability to directly download Tesseract …

Wang's goal is to mimic the Windows GUI as closely as possible. I took the time to update my third-party contribution page, particularly the section on KOReader, which I had not looked at in a while.

The amount of time I can afford to work on k2pdfopt has dwindled, but I still hope to get around to some key bug fixes and updated builds at some point. In the meantime, I do still answer questions on the Mobileread forum, and I just did a long overdue update to the OCR help page.

My site now offers SSL/https connectivity. Apparently this happened without my being notified, at no charge to me, which is nice. Today I configured my site (and my backup site) to re-direct http requests to https requests.

… PDF file information in the MS Windows GUI. The MS Windows binaries will show v2.51a, but the Linux and OSX binaries will show v2.51.

This fixes an issue in v2.50 where the Tesseract OCR would not run on modern PCs and enhances the accuracy of the Tesseract v4.0.0 OCR.

K2pdfopt (Kindle 2 PDF Optimizer) is a stand-alone program which optimizes PDF/DJVU files for viewing on small (e.g. 6-inch) mobile reader and smartphone screens. K2pdfopt is meant for text-based files on a white background which may also have graphics or figures, and it works equally well on native and/or scanned or bitmapped PDF or DJVU files. It is fully automated and can batch-process PDF/DJVU files.
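The Tesseract note above says accuracy is best when a capital letter is between 25 and 35 pixels high. As a rough illustration only, the sketch below turns that guideline into a rendering resolution for a given font size. The function names are hypothetical, and the cap-height ratio of 0.7 (capital-letter height as a fraction of the point size) is an assumed typical value, not something taken from k2pdfopt or Tesseract.

```python
# Sketch only: the 25-35 px target comes from the Tesseract note on this page.
# The 0.7 cap-height ratio is an ASSUMED typical value for common text fonts;
# measure your own document if accuracy matters.
POINTS_PER_INCH = 72.0


def cap_height_px(font_size_pt, dpi, cap_ratio=0.7):
    """Approximate height in pixels of a capital letter rendered at `dpi`."""
    return font_size_pt * cap_ratio * dpi / POINTS_PER_INCH


def dpi_range_for_ocr(font_size_pt, min_px=25.0, max_px=35.0, cap_ratio=0.7):
    """Return (low, high) DPI keeping capital letters within min_px..max_px."""
    cap_height_in = font_size_pt * cap_ratio / POINTS_PER_INCH  # height in inches
    return min_px / cap_height_in, max_px / cap_height_in


if __name__ == "__main__":
    low, high = dpi_range_for_ocr(10.0)
    # prints: 10 pt text: render between 257 and 360 dpi
    print(f"10 pt text: render between {low:.0f} and {high:.0f} dpi")
```

Under these assumptions, 10 pt body text scanned at 300 dpi gives capital letters about 29 pixels high, which falls inside the recommended range. Smaller print would need a proportionally higher resolution.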