Posted

Any techie folks out there set up their own gaming archives with Paperless NGX? If so, what sort of resources is it consuming and how relevant are the search results you get?

Inquiring minds setting up their homelabs want to know! 🙂

SDLeary

Posted
1 hour ago, mfbrandi said:

Just so. 

Basically, it can build a database from various formats, including PDFs; it creates PDF/A files from those and runs OCR (though I'm not sure why it should if it can raw convert files), and then lets you search within each title.

Thus a search could be run simultaneously on all the titles in your library. I would assume it's also beneficial for versioning in personal projects.

SDLeary

Posted (edited)
3 hours ago, SDLeary said:

 ... and runs OCR (though not sure why it should if it can raw convert files) ...

Some PDF files are just photocopy scans per-page; a bunch of images-of-text, needing to be OCR'ed to make said text searchable.

I am frankly dubious, as a gaming thing...
Older (and thus non-pristine) original texts, sloppily-copied books (i.e. deformed imagery near the spine), and frankly-weird "fantasy name" spellings often result in ... um... less-than-correct OCR'ing.
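To illustrate the misspelling problem, here is a minimal sketch of a typo-tolerant lookup over OCR'ed words, using Python's standard `difflib`. The word list and the `fuzzy_find` helper are made up for illustration; this is not how Paperless NGX itself searches, just one way to see how OCR variants of a name might still be found.

```python
import difflib

def fuzzy_find(query, ocr_words, cutoff=0.7):
    """Return OCR'ed words that look similar to `query`, best match first.

    `cutoff` is a similarity threshold between 0 and 1 (an assumed value;
    tune it for your own scans).
    """
    # Compare case-insensitively, but return the words as they were OCR'ed.
    lowered = {w.lower(): w for w in ocr_words}
    hits = difflib.get_close_matches(query.lower(), list(lowered), n=5, cutoff=cutoff)
    return [lowered[h] for h in hits]

# Hypothetical OCR output: "0" read for "O", "b" for "h", "rn" for "m".
ocr_words = ["Orlanth", "0rlanth", "Orlantb", "Lunar", "Humakt", "Hurnakt"]

print(fuzzy_find("Orlanth", ocr_words))
print(fuzzy_find("Humakt", ocr_words))
```

An exact-match search would miss the mangled variants entirely; a fuzzy pass like this catches them at the cost of some false positives on short names.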

It's my understanding that the RQ2Classic KickStarter engendered   Passion: "Hate(OCR)"  within @Rick Meints, though I may have misunderstood.

Edited by g33k

C'es ne pas un .sig

Posted (edited)
3 hours ago, g33k said:

It's my understanding that the RQ2Classic KickStarter engendered   Passion: "Hate(OCR)"  within @Rick Meints, though I may have misunderstood.

After dealing with ABBYY FineReader I can empathize. The hit rate is in the high 90s percent, but with thousands of words, finding all the misses can be seriously frustrating. And there are ALWAYS error sequences that seem to make it past spell checkers.
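A rough back-of-the-envelope sketch of why that is so frustrating. The word count and accuracy figures below are assumptions picked for illustration, not measurements from any particular book or OCR package.

```python
def expected_misses(word_count, accuracy):
    """Expected number of misread words, given a per-word accuracy (0..1)."""
    return word_count * (1.0 - accuracy)

# Hypothetical book length and hit rates in the "high 90s".
word_count = 50_000
for accuracy in (0.97, 0.98, 0.99):
    misses = expected_misses(word_count, accuracy)
    print(f"{accuracy:.0%} accurate over {word_count} words: about {misses:.0f} misses")
```

Even a small miss rate adds up to hundreds of words to hunt down by hand once the text runs to tens of thousands of words.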

SDLeary

Edited by SDLeary
Posted

Such an OGX was what I was trying to create back in the '90s and noughties with my index, with the dominant OCR method at the time being "Jörg types it". Of course we had no PDF format back then.

The basic logical model behind my index is still sound, but I haven't found the time and energy to revive it and feed it the data available now. It also needs to be extended with a topographical model, maps generated from that, and imagery.

Telling how it is excessive verbis

 
