/r/bookscanning
For those who want to scan old books, magazines, and other paper documents to create and collect digital copies.
We welcome new and expert users alike, whether you have a DIY book scanning machine or an inexpensive flatbed.
Feel free to post your methods, questions, and resources. Also, don't hesitate to share your scans or scans done by others!
Now that we're both WFH for the time being I've got my husband's industrial office printer/scanner set up in our bedroom and I'm taking full advantage! My current project is scanning all my old notebooks from college and high school. Turns out a lot of them were written in pencil which is a little tricky. The writing keeps coming through super faint.
Does anyone have any standard setting suggestions for scanning pencil handwriting? I have a lot of different options I can adjust. DPI. Brightness. Contrast. Gamma. Text Enhancement. Unsharp Mask...?
Any tips would be much appreciated. Thanks!
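One way to think about the brightness, contrast, and gamma options is as a single "levels" curve: choose an input level that becomes pure black, one that becomes pure white, and a gamma that darkens everything in between. The sketch below is only an illustration of that idea on 8-bit grayscale values. The function name and the default numbers are assumptions to use as a starting point, not recommended scanner settings.

```python
def enhance_pencil(pixels, black_point=120, white_point=230, gamma=1.8):
    """Stretch the tonal range of 8-bit grayscale values and darken midtones.

    pixels: iterable of ints in 0..255 (0 = black, 255 = white).
    black_point / white_point: input levels mapped to pure black / white
        (assumed defaults; tune them on a sample page).
    gamma: values above 1 push midtones (faint graphite) toward black.
    """
    span = white_point - black_point
    out = []
    for p in pixels:
        # Clamp to the chosen window, then normalise to 0..1.
        t = min(max(p, black_point), white_point)
        t = (t - black_point) / span
        # Raising a 0..1 value to a power above 1 lowers it, i.e. darkens it.
        out.append(round(255 * t ** gamma))
    return out


# A 256-entry lookup table, one output level per possible input level.
lookup_table = enhance_pencil(range(256))
```

If the pages are scanned in grayscale and opened with Pillow (a third-party library), a table like `lookup_table` can be applied to a mode "L" image with `Image.point`. Scanning in grayscale rather than black and white keeps the faint strokes available for this kind of adjustment afterwards.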
Is there an alternative to Scan Tailor that’s a bit more automated?
I’ve used it on two books so far but find (due to my less than perfect DIY shooting setup) that I have to make a lot of manual adjustments in Scan Tailor.
Also, when books have many chapters of text in the same format (e.g. same font size, paragraph width and number of lines per page) it’d be great to have them intelligently lined up with one another based on content, but as far as I can tell Scan Tailor doesn’t do this. Is there some software that does?
I’m not knocking ST, it’s amazing for free software, but it’s also quite old now. I’m hoping to find something that perhaps takes advantage of object recognition to process pages.
I should have posted my initial query on this subreddit or on the piracy sub, since digital scanning is apparently considered piracy even when it is only for personal use and backup, and it warrants an instant permanent ban on another sub.
I have 35 books from my old collection that I'm considering digitally scanning for my own personal safekeeping and preservation, however, I prefer not to scan books that have already been scanned by the community.
My initial question was: "Does anyone have or know where I can obtain the catalog file or spreadsheet that lists all independent publisher comics that have and have not been scanned to date?"
It would look very similar to this:
Wanted to see if anyone had used the CZUR ET18 Pro with Mac OS X in the past year (the scanning software seems to have been updated for Mac OS X earlier in 2019). The knock seems to have been that it wasn't very well integrated back in the 2016-2017 timeframe. Wanted to see if that had changed.
What are the recommended book scanning services for those of us who don't want to build a setup just to scan a couple of books?
Here's what I've found so far:
I have a contract to scan around 100 books. I currently own the Fujitsu SV600.
Would one of these (https://youtu.be/JoU3Q4CUNog) paired with my SV600 be the better option, or would the CZUR ET16 Plus?
I scanned a document that is 8.5x5.5 (half page size), and was contained within a mini 3 ring binder. Because of that, the margins on each page are different, so as to make a uniform appearance within the 3 ring binder. This is great for a physical document, but makes for very odd shifting within the scanned pages.
Additionally, the 3 ring binder holes are visible within the scan. I'd like to remove them to make a clean document.
Any advice on how to correct these issues so I can have a clean image pdf, which can then be OCRed (the text itself is very clean and shouldn't have a problem), would be deeply appreciated!
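A rough sketch of one possible approach to both problems: paint a fixed strip at each side edge white (which removes the binder holes if they sit inside that strip), then measure where the text block sits and compute the shift that would centre it, so every page ends up with the same margins. It works on plain lists of 8-bit grayscale rows. The function names, the strip width, and the ink threshold are all assumptions for illustration, and any text that reaches into the strip would be erased too.

```python
def blank_edge_strips(page, strip, white=255):
    """Return a copy of the page with `strip` columns at the left and
    right edges painted white (assumed to contain only binder holes)."""
    width = len(page[0])
    return [
        [white if x < strip or x >= width - strip else v
         for x, v in enumerate(row)]
        for row in page
    ]


def content_bbox(page, ink_threshold=128):
    """Bounding box (left, top, right, bottom) of pixels darker than
    ink_threshold; right and bottom are exclusive. None if page is blank."""
    rows = [y for y, row in enumerate(page)
            if any(v < ink_threshold for v in row)]
    cols = [x for row in page for x, v in enumerate(row)
            if v < ink_threshold]
    if not rows:
        return None
    return (min(cols), rows[0], max(cols) + 1, rows[-1] + 1)


def centering_offset(page, ink_threshold=128):
    """(dx, dy) shift in pixels that would centre the text block."""
    box = content_bbox(page, ink_threshold)
    if box is None:
        return (0, 0)
    left, top, right, bottom = box
    height, width = len(page), len(page[0])
    dx = (width - (right - left)) // 2 - left
    dy = (height - (bottom - top)) // 2 - top
    return (dx, dy)
```

Applying the returned offset to every page (for example by pasting the cropped text block onto a fresh white canvas) should cancel the alternating left/right margin shift, though stray specks near the edges can throw the bounding box off.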
Any tips for a noob scanner?
Hey guys, I am scanning a book that was too big for the flatbed.
I used my iPhone to capture the pics, but the pages weren't flattened.
Now my OCR is lopsided and I can't get ABBYY or Acrobat to deskew, or correct the geometry.
Does anyone have any ideas how to deskew these photos and straighten the text?
The text is just a little skewed, but it makes for an imperfect PDF.
Edit: here's a few example pages
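For plain rotation (not page curl), a common technique is the projection profile: try a range of small angles and keep the one at which the ink collapses into the sharpest horizontal rows. The sketch below is a minimal pure-Python version that takes a binarised page as rows of 0/1 values (1 = ink). The function names, angle range, and step size are assumptions, and it will not fix the curvature of pages that were not flattened, which needs dewarping rather than deskewing.

```python
import math


def profile_score(points, angle_deg):
    """Sharpness of the row histogram after un-shearing by angle_deg.

    A small rotation is approximated by a vertical shear: each ink pixel
    is moved up by x * tan(angle). Well-aligned text lines pile into few
    rows, which makes the sum of squared row counts large.
    """
    slope = math.tan(math.radians(angle_deg))
    counts = {}
    for x, y in points:
        row = round(y - x * slope)
        counts[row] = counts.get(row, 0) + 1
    return sum(c * c for c in counts.values())


def estimate_skew(binary_rows, max_angle=5.0, step=0.25):
    """Angle in degrees by which text lines slope downward to the right
    (image coordinates, y increasing downward)."""
    points = [(x, y) for y, row in enumerate(binary_rows)
              for x, v in enumerate(row) if v]
    steps = int(max_angle / step)
    candidates = [i * step for i in range(-steps, steps + 1)]
    return max(candidates, key=lambda a: profile_score(points, a))
```

The estimated angle can then be fed to whatever rotate function your image tool offers. Rotation direction conventions differ between tools, so it is worth checking the sign on a single page before batch processing.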
Ok, so I've already purchased a CZUR ET-16.
And I'll just start off by saying that I'm a little disappointed. Operation wise it couldn't be smoother. Does everything it says - scanning is a breeze, the software works very well (most of the time). These aren't the problem.
The whole purpose of me getting the ET-16 was so I could scan images from books that wouldn't fit on a traditional scanner. This is where the problem is. The quality of the image scans leaves much to be desired. At the very best they are not worth putting on my websites. I've tried every setting in the program, with light, without light - it doesn't matter. Whatever I do, the images come out way too dark or way overexposed. I've even attempted to use a photo-editing program to fix them, and this has proved fruitless.
So has anyone else experienced this and been able to correct it? I would really like to get this to work.
I have a book which was written by my great-uncle. The book is about his experience in the 26th infantry during WWI in France. All I have of the book is a proof. The proof was done by a publisher in the late 20s; this publisher later went out of business.
I tried scanning it on a copy machine but the copy doesn't look great. It looks like a lot of the scanners discussed in this sub work well.
I really want to get a digital version to preserve this book and try to get it published. Is there anyone in the Houston or New Orleans area with a scanner who could help or recommend a resource in those areas? Thanks so much.
I recently scanned my entire family photo collection for digital records and so of course no good deed goes unpunished.
My aunt has dozens of cookbooks, magazines and random pages of recipes, and she asked if I could scan them and put them into something organized / searchable on her PC for her. She's older and having issues remembering what recipe is in what book etc.
I've got a Brother flatbed scanner and tried the Adobe app. It's a manageable process, but I'm sure there is a better option out there. I'm also having issues when it comes to saving them. I don't know if I should put them all in a PDF (can't seem to make one that's not an E pdf) or leave them in JPEG and convert them later. I did try a few methods on some of the smaller books but it's just a mess so far. Since I'm new to the entire process I'm still a bit lost on how to start this whole thing so I can have 1 process and power through it (if possible) like with the photo albums.
The most I've found is people who cut the binding and scan them that way, and I came across scanner apps, but they don't seem to show good quality when it comes to the text or amount of pages.
Side note- Some of the cookbooks are extremely old so care while scanning is appreciated
I know this won't be as simple as scanning photos, but any suggestions on the process, apps, tactics, helpful links to articles etc. would be greatly appreciated. Anything that saves time and eases organizing big batches is a top priority. Thanks in advance.
I can't OCR and break the text and images into elements. What is the best way to darken the text layer to improve the contrast in colour images?
As the final file will be a PDF with JPEG compression on both text and images, the file sizes are not too good. Ideally I would want my PDF file to contain a JPEG image of the background layer with high compression (and maybe a low DPI as well), with the foreground text layer overlaid on that.
For the foreground text layer the text could be 2-8 colours (1 to 3 bit) and have a lower compression scheme and higher DPI... it just now clicks to check out DjVu files, and it seems that format does something like that.
If it is not possible for a PDF 1.6 to have two image layers (1 transparent), what is the best way to target good compression and text quality when there are images and coloured backgrounds?
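The layering described here is usually called mixed raster content (MRC). As far as I know, PDF supports masked images (stencil masks and, since PDF 1.4, soft masks), so a 1-bit text layer over a heavily compressed background should be possible in PDF 1.6, given a tool that writes it. The sketch below only illustrates the separation step, on plain rows of 8-bit grayscale values. The function name, the fixed threshold, and the row-median fill are assumptions. Real MRC encoders choose these adaptively.

```python
def split_layers(gray, threshold=110):
    """Split a grayscale page into a 1-bit text mask and a text-free
    background layer.

    gray: list of rows of 8-bit values. Pixels darker than `threshold`
    are treated as text (an assumed, fixed cutoff).
    """
    mask, background = [], []
    for row in gray:
        bits = [1 if v < threshold else 0 for v in row]
        paper = sorted(v for v, b in zip(row, bits) if not b)
        # Fill text pixels with the row's median paper tone so the
        # background has no sharp edges and survives strong compression.
        fill = paper[len(paper) // 2] if paper else 255
        mask.append(bits)
        background.append([fill if b else v for v, b in zip(row, bits)])
    return mask, background
```

The mask would then be stored at full resolution with a lossless bilevel codec, and the background downsampled and saved as a low-quality JPEG, which is roughly the split DjVu makes as well.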
So...
I've used my work's large flatbed scanner to scan in a textbook, and I have 43 TIFF images, each showing 2 pages of the textbook.
How can I split these into single pages then get them into a PDF? Not bothered about OCR as I don't think the scans are of decent enough quality.
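A minimal sketch of one way to do this with Pillow (a third-party library, installed with `pip install Pillow`), assuming every scan is a spread with the book's spine roughly at the horizontal centre. The function names and the `gutter` parameter are made up for this example. If the spine is off-centre, the split point would need adjusting per image.

```python
def split_boxes(width, height, gutter=0):
    """Crop boxes (left, top, right, bottom) for the left and right page.

    gutter: pixels to trim on each side of the centre line, e.g. to drop
    the dark shadow at the spine (assumed to sit at width // 2).
    """
    mid = width // 2
    return [(0, 0, mid - gutter, height), (mid + gutter, 0, width, height)]


def spreads_to_pdf(tiff_paths, pdf_path, gutter=0):
    """Split each two-page scan and write all single pages to one PDF."""
    from PIL import Image  # third-party dependency, imported lazily

    pages = []
    for path in tiff_paths:
        with Image.open(path) as spread:
            # Convert to RGB so every page has a mode the PDF writer accepts.
            rgb = spread.convert("RGB")
        width, height = rgb.size
        for box in split_boxes(width, height, gutter):
            pages.append(rgb.crop(box))
    # save_all + append_images writes a multi-page PDF in list order.
    pages[0].save(pdf_path, save_all=True, append_images=pages[1:])
```

Called as `spreads_to_pdf(sorted_tiff_paths, "textbook.pdf")` with the 43 file paths in reading order, this would produce an 86-page image-only PDF. It assumes left-to-right page order within each spread.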
TL;DR: I am making an automatic page-turning open source book scanner and I create weekly videos about the project.
Sometimes I take on projects that are reaching far. This one is no exception. I wanted to share this project with all of you for a while now, and now as the project is well underway, I can do that. With the support of Wikimedia Deutschland and c-base I have created a project called Libreflip, a page turning book scanner! It is open source, because I want everybody to be able to recreate one of these machines. I tried to minimize the necessary tools as much as possible and I am creating a video series about every step of building the machine.
I believe that too much knowledge is bound to one place. Libraries are bound to opening hours and charge you to scan your books. Companies scan books but do not give you free access. By open sourcing this machine I want to enable makers and communities all over the world to create automatic book scanners. Information should be accessible for everyone and this is an important step towards it.
Current commercial book scanners usually cost well into six figures, while Libreflip is in the low four figures. This low price point keeps it accessible for communities, small libraries and book-rescue organisations.
Please support the project by telling others about it, on- and offline. I publish new content weekly on YouTube, which gives you regular helpful propaganda. And YouTube still offers this rare and free feature called “subscribe” which helps you never miss new content, so please subscribe now ;-) YouTube: https://www.youtube.com/channel/UC-kFCGw0FF8Jc06tcGcUrlg
Some more Links:
twitter.com/libreflip
facebook.com/libreflip
github.com/libreflip
PS. Actually, I just want to enable our future AI-Overlords to parse all of human knowledge, but please don't tell anyone.
Hello! I'm using FineReader 12 to automate the book scanning process. Clicking each time to scan a page would drive me crazy, so this really is a lifesaver. In the FineReader 12 scanning dialogue, you just tick "Pause for ... 0 ... seconds after each page" and it goes on for hundreds of pages until you stop it.
Is there any other software that can do this?
Hello guys! What are some of the fastest flatbed scanners under $500? Scanning speed is usually not indicated in product specs and it is difficult to tell how fast a scanner really is. My current office flatbed scanner does one page in 6 seconds with ABBYY FineReader 12, continuous multi-page scanning, 300 dpi, grayscale.
Hi everyone,
Just wondering if anyone knows of any free to use bookscanners in London. I know they have scanners in the British Library, but they charge for using them. Any libraries or other places where you're not asked to pay for scanning?
Best, x
Hi all,
Wondering if any of you combine ABBYY and Acrobat Pro in your scan-to-OCR-PDF workflow? ABBYY has absolutely flawless OCR, but Acrobat produces nicer looking text. If someone knows how to get the best of both worlds, I'd be very grateful for a reply.
I am interested in buying one to scan books. Most reviews I see online seem to be propaganda. Does anyone have experience with it for scanning books in the real world? Would you recommend it? Is there something else coming out soon? Or a different unit you'd recommend?
Hi
A couple of questions that actually converge when it comes to post processing.
1 - I downloaded a load of out of copyright books from Archive.org. See https://archive.org/details/receiptbook00rolf In the above example, the book has been photographed and its paper colour is included within the PDF. When I scan my own books, my Brother MFC scanner has a background removal feature that does a great job, but I can't find an equivalent for things already scanned.
2 - There have been a couple of items and books that I have destructively scanned through said Brother MFC printer scanner. The only options for its scans are either black white or colour. However, when you have a black white printed book and nearly every page has a colour image, the whole scan is colour, which over a couple of hundred pages increases the size of the PDF.
3 - The Brother MFC printer scanner scans A4 or A3. So if a book or document is a little smaller, there are edge markings. My smaller A4 Brother MFC crops in on the edge, but the large duplexing unit does not. I have spoken to Brother and it is a feature of the device, not a setting to change. How do I post process these?
I have Foxit Phantom PDF, which does a great job, but I cannot seem to convert books to black white, or get it to change the spec to black white & colour.
I am aware that Photoshop can do these tasks manually, but how do I automate a variable process?
Are there other processes or software that can be used? I have the above, but am unwilling to spend on Acrobat or other software until I know my issues 'will' be resolved, and of course free or cheaper solutions are preferred.
Ta
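For question 2, one automatable idea is to test each page for real colour and only keep the colour version where it is needed, converting the rest to grayscale or black and white. A hedged sketch follows: a pixel counts as "coloured" when its red, green and blue values differ by more than a tolerance, and a page counts as colour when enough pixels pass. The function name and both default values are assumptions to tune on your own scans.

```python
def is_colour_page(rgb_pixels, tolerance=24, min_fraction=0.01):
    """True if enough pixels are clearly coloured rather than gray.

    rgb_pixels: iterable of (r, g, b) tuples with 8-bit channels.
    tolerance: largest channel spread still treated as gray (assumed).
    min_fraction: share of coloured pixels needed to call the page
        colour (assumed).
    """
    total = 0
    coloured = 0
    for r, g, b in rgb_pixels:
        total += 1
        # Gray pixels have nearly equal channels; coloured ones do not.
        if max(r, g, b) - min(r, g, b) > tolerance:
            coloured += 1
    return total > 0 and coloured / total >= min_fraction
```

Yellowed paper has a noticeable channel spread of its own, so on old books the background would probably need removing first, or the tolerance raising, before this test gives sensible answers.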
Any way to convert PDF book scans into PDFs with vector text? I want to convert the IMAGES of text into VECTOR text.
What is the best way to do this? What software do I need?
Hi everyone! Apparently I am not the most capable OSX command line user, as I fail persistently trying to build Scantailor (the standard version, not the experimental one) from source. Would anyone in here be so kind as to provide me with a *.dmg file?
Good evening,
Has anyone succeeded in compiling scantailor experimental from here?
If so, I would be much indebted if it could be posted, as I'm having a hard time getting this new version to compile.
TIA