Hacker News: seedgou's comments

The site's logic is nearly identical to this gist: it uses ImageMagick for the rotation, noise, etc.



The GitHub link is prominent on their main page, but what wasn't obvious was an example PDF:

https://github.com/rwv/lookscanned.io/blob/main/public/examp...


Good idea! Drawing the rotation from a random distribution seems more user-friendly than setting 10 rotation values by hand.
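
A minimal sketch of that idea, assuming a normal distribution around a mean angle. The names `sampleRotation` and `rotationsForPages` and the degree-based parameters are hypothetical and not taken from the site's code; the `rng` argument only exists so that a seeded generator can be swapped in.

```typescript
// Hypothetical helper: draws one rotation angle (in degrees) from a normal
// distribution. `rng` must return uniform values in [0, 1), like Math.random.
function sampleRotation(
  meanDeg: number,
  stdDevDeg: number,
  rng: () => number = Math.random,
): number {
  // Box-Muller transform: two uniform samples become one standard normal sample.
  const u1 = 1 - rng(); // shift to (0, 1] so Math.log never receives 0
  const u2 = rng();
  const standardNormal =
    Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  return meanDeg + stdDevDeg * standardNormal;
}

// Hypothetical helper: one independently sampled angle per page.
function rotationsForPages(
  pageCount: number,
  meanDeg: number,
  stdDevDeg: number,
  rng: () => number = Math.random,
): number[] {
  const angles: number[] = [];
  for (let i = 0; i < pageCount; i++) {
    angles.push(sampleRotation(meanDeg, stdDevDeg, rng));
  }
  return angles;
}
```

With this shape the user would only pick two numbers (a typical angle and how much it varies), for example `rotationsForPages(10, 0.5, 0.3)`, rather than ten separate values.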


Obviously you need to randomly fold an edge and wrinkle a page too. Goddamn paper feeders.


I don't manipulate the data at the pixel level. Maybe it's because I render the PDF at 2x, which produces 4x as many pixels?
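
To make the 4x figure concrete, here is a small sketch of the arithmetic. The function name `pixelCount` is made up for illustration, and the page size is an assumption (an A4 page is roughly 595 x 842 PDF points).

```typescript
// Hypothetical helper: number of pixels in a page rendered at a given scale.
// Both dimensions are multiplied by `scale`, so the area grows by scale squared.
function pixelCount(widthPt: number, heightPt: number, scale: number): number {
  return Math.round(widthPt * scale) * Math.round(heightPt * scale);
}

const at1x = pixelCount(595, 842, 1); // 595 * 842   = 500,990 pixels
const at2x = pixelCount(595, 842, 2); // 1190 * 1684 = 2,003,960 pixels
const ratio = at2x / at1x; // 4
```

Any per-pixel pass (blur, noise) therefore does about four times the work at 2x.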


That could be the case, yes. Still, I feel it should be better; let me do a quick test, as I have some spare time.


Here's a very naive blur implementation (blur being your most expensive operation there):

https://codepen.io/almosnow/pen/abEXBZP?editors=0011

(at the end of the blur pass it prints the elapsed time to the console)
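
For readers who can't open the pen, a naive blur of this kind might look roughly like the sketch below. This is not the code from the pen: it assumes a plain box blur over RGBA data laid out like `ImageData.data`, and the names `boxBlur` and `timedBlur` are hypothetical.

```typescript
// Naive box blur: every output pixel is the average of the pixels in a
// (2 * radius + 1)-wide square around it, clipped at the image edges.
function boxBlur(
  src: Uint8ClampedArray,
  width: number,
  height: number,
  radius: number,
): Uint8ClampedArray {
  const dst = new Uint8ClampedArray(src.length);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let r = 0, g = 0, b = 0, a = 0, count = 0;
      for (let dy = -radius; dy <= radius; dy++) {
        const ny = y + dy;
        if (ny < 0 || ny >= height) continue; // skip rows outside the image
        for (let dx = -radius; dx <= radius; dx++) {
          const nx = x + dx;
          if (nx < 0 || nx >= width) continue; // skip columns outside the image
          const i = (ny * width + nx) * 4; // 4 bytes per pixel: R, G, B, A
          r += src[i];
          g += src[i + 1];
          b += src[i + 2];
          a += src[i + 3];
          count++;
        }
      }
      const o = (y * width + x) * 4;
      // Uint8ClampedArray rounds and clamps to 0..255 on assignment.
      dst[o] = r / count;
      dst[o + 1] = g / count;
      dst[o + 2] = b / count;
      dst[o + 3] = a / count;
    }
  }
  return dst;
}

// Runs one blur pass and reports how long it took in milliseconds.
function timedBlur(
  src: Uint8ClampedArray,
  width: number,
  height: number,
  radius: number,
): { pixels: Uint8ClampedArray; elapsedMs: number } {
  const start = Date.now();
  const pixels = boxBlur(src, width, height, radius);
  return { pixels, elapsedMs: Date.now() - start };
}
```

The cost grows with width x height x (2 * radius + 1)^2, which is why this version is only a baseline; a separable two-pass blur would be one common way to cut that down.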

You're right, it does get kind of slow at 2x, but not that slow: on my laptop it takes around 1 sec/page, while on your site it takes 20-30 secs/page. Also, my very naive code doesn't account for "warming up" or other optimizations that would make the blur much faster. You could easily get it down to 100ms/page, I'm sure!

Best of luck!


Oh! You mean the scanning speed. I thought you were talking about the original PDF preview. For now, scanning uses ImageMagick compiled to Wasm with Emscripten. Because of the translation from C++ to Wasm, scanning is very slow. Maybe rewriting the blur, rotate and noise algorithms will speed it up.
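
As a rough idea of what a hand-written noise pass could look like, here is a sketch operating directly on RGBA pixel data. It assumes simple uniform gray noise, which is not necessarily what ImageMagick does, and the name `addNoise` and its parameters are hypothetical.

```typescript
// Hypothetical noise pass: shifts each pixel's brightness by a random amount
// in [-amplitude, +amplitude]. `rng` must return values in [0, 1).
function addNoise(
  src: Uint8ClampedArray,
  amplitude: number,
  rng: () => number = Math.random,
): Uint8ClampedArray {
  const dst = new Uint8ClampedArray(src.length);
  for (let i = 0; i < src.length; i += 4) {
    // One offset shared by R, G and B gives gray speckle rather than color noise.
    const offset = (rng() * 2 - 1) * amplitude;
    dst[i] = src[i] + offset; // Uint8ClampedArray clamps to 0..255
    dst[i + 1] = src[i + 1] + offset;
    dst[i + 2] = src[i + 2] + offset;
    dst[i + 3] = src[i + 3]; // alpha is left unchanged
  }
  return dst;
}
```

This is a single linear pass over the pixels, so it should be cheap next to the blur.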


Ah, wasm... The site managed to almost kill my machine until the tab committed OOM suicide, I guess this explains why.


The rendering logic is in `src/utils/pdf/renderPage.ts` and has only 26 lines.


Yeah, even playing with the preview and using the sliders is super slow. Apart from that, it's amazing! Please do some work on the perf.


You are right about performance, but does it really matter?

It feels like this is the sort of tool one needs (very) infrequently, and those cases don’t seem like the sort of thing where seconds really matter. I think it’s plenty good enough.

I prefer to focus on how grateful I am that the author has made this and published it for free.


I believe it does matter.

When you first open the site and nothing happens for 30 secs, you assume that the PDF you're looking at is the actual result (that happened to me, at least). Then the other one pops up and you're like ... ooooh, I get it!

Most users wouldn't be as patient and just leave.


There's an example PDF after clicking the "START SCANNING" button. Adding more real-world examples might help.


This is inspired by baicunko/scanyourpdf and a previous HN thread: https://news.ycombinator.com/item?id=23157408


It uses PDF.js and magica for the rendering and processing. You can see the credits on the GitHub repo page.

