seedgou's comments

seedgou · on April 19, 2022

The site's logic is nearly identical to this gist: it uses ImageMagick to apply the rotation, noise, etc.

seedgou · on April 19, 2022

GitHub URL: https://github.com/rwv/lookscanned.io

hatmatrix · on April 20, 2022

The GitHub link is prominent on their main page, but what wasn't obvious was an example PDF:

https://github.com/rwv/lookscanned.io/blob/main/public/examp...

seedgou · on April 19, 2022

Good idea! Drawing the rotation from a random distribution seems more user-friendly than setting 10 rotation values.
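A minimal sketch of what per-page random rotation could look like, assuming a normal distribution centered on zero degrees and a clamp on the maximum tilt. The names `sampleNormal` and `pageRotations` are hypothetical and do not come from the lookscanned.io source; the distribution and the clamp are illustrative assumptions.

```typescript
// Hypothetical helpers: none of these names come from the lookscanned.io source.
type Rng = () => number; // returns a float in [0, 1), like Math.random

// Draw one sample from a normal distribution using the Box-Muller transform.
function sampleNormal(mean: number, stdDev: number, rng: Rng = Math.random): number {
  const u1 = 1 - rng(); // shift to (0, 1] so Math.log never sees 0
  const u2 = rng();
  const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  return mean + stdDev * z;
}

// One rotation angle (in degrees) per page, clamped to [-maxAbs, +maxAbs]
// so that a rare outlier does not tilt a page absurdly far.
function pageRotations(
  pageCount: number,
  stdDev: number,
  maxAbs: number,
  rng: Rng = Math.random
): number[] {
  const angles: number[] = [];
  for (let i = 0; i < pageCount; i++) {
    const angle = sampleNormal(0, stdDev, rng);
    angles.push(Math.max(-maxAbs, Math.min(maxAbs, angle)));
  }
  return angles;
}
```

With this shape the user would set a single "how crooked" value (`stdDev`) instead of one angle per page, e.g. `pageRotations(10, 0.5, 1.5)`.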

fnordpiglet · on April 19, 2022

Obviously you need to randomly fold an edge and wrinkle a page too. Goddamn paper feeders.

seedgou · on April 19, 2022

I didn't manipulate the data at the pixel level. Maybe it's because I render the PDF at 2x, which produces 4x as many pixels?
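The 2x to 4x relationship follows from scaling both dimensions. A quick sketch of the arithmetic, assuming a page of 595 x 842 (roughly A4 in PDF points; the page size is an assumption, not something stated in this thread):

```typescript
// Pixel count of a page rendered at a given scale factor.
// The 595 x 842 base size used below is an assumption (roughly A4 in PDF points).
function pixelCount(widthPt: number, heightPt: number, scale: number): number {
  return Math.round(widthPt * scale) * Math.round(heightPt * scale);
}

const at1x = pixelCount(595, 842, 1); // 595 * 842 = 500,990 pixels
const at2x = pixelCount(595, 842, 2); // 1190 * 1684 = 2,003,960 pixels
console.log(at2x / at1x); // 4
```

Any per-pixel operation (blur, noise) therefore does about four times the work at 2x.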

moralestapia · on April 19, 2022

That could be the case, yes. Still, I feel it should be better. Let me do a quick test, as I have some spare time.

moralestapia · on April 19, 2022

Here's a very naive blur implementation (blur is your most expensive operation there):

https://codepen.io/almosnow/pen/abEXBZP?editors=0011

(at the end of the blur pass it prints the elapsed time to the console)

You're right, it does get kind of slow at 2x, but not that slow: on my laptop it takes around 1 sec/page, while on your site it takes 20-30 secs/page. Also, my very naive code does not take into account "warming up" and some other code optimizations that would make the blur much faster. You could easily get it down to 100ms/page, I'm sure!
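For readers who can't open the pen, a naive blur of this general kind can be sketched as follows. This is not the code from the CodePen; it assumes a plain box filter on a single-channel (grayscale) buffer, and `boxBlur` and `timeBlur` are hypothetical names.

```typescript
// Naive box blur over a grayscale image stored row-major in a typed array.
// Each output pixel is the mean of the (2r+1) x (2r+1) window around it,
// with the window clipped at the image borders.
function boxBlur(
  src: Uint8ClampedArray,
  width: number,
  height: number,
  radius: number
): Uint8ClampedArray {
  const dst = new Uint8ClampedArray(src.length);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let sum = 0;
      let count = 0;
      for (let dy = -radius; dy <= radius; dy++) {
        const yy = y + dy;
        if (yy < 0 || yy >= height) continue;
        for (let dx = -radius; dx <= radius; dx++) {
          const xx = x + dx;
          if (xx < 0 || xx >= width) continue;
          sum += src[yy * width + xx];
          count++;
        }
      }
      dst[y * width + x] = sum / count; // the clamped array rounds on store
    }
  }
  return dst;
}

// Time one blur pass over a synthetic all-white page; returns elapsed ms.
function timeBlur(width: number, height: number, radius: number): number {
  const page = new Uint8ClampedArray(width * height).fill(255);
  const start = Date.now();
  boxBlur(page, width, height, radius);
  return Date.now() - start;
}
```

Calling `timeBlur(1190, 1684, 2)` gives a rough per-page figure for a 2x render of an A4-sized page (that page size is an assumption). The cost grows with the square of the radius; splitting the filter into a horizontal and a vertical pass is one of the usual optimizations.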

Best luck!

seedgou · on April 19, 2022

Oh! You mean the scanning speed. I thought you were talking about the original PDF preview. For now, scanning uses ImageMagick compiled to Wasm with Emscripten. Because of the translation from C++ to Wasm, scanning is very slow. Maybe rewriting the blur, rotate and noise algorithms would speed it up.
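As a rough idea of what a hand-written noise pass could look like without ImageMagick, here is a sketch that adds uniform random noise to a grayscale buffer. Uniform noise is swapped in as the simplest stand-in and is not ImageMagick's noise model; `addNoise` and its parameters are hypothetical.

```typescript
// Add uniform random noise to a grayscale buffer.
// Assumption: uniform noise in [-amplitude, +amplitude) as a simple stand-in;
// this is not how ImageMagick generates noise.
function addNoise(
  src: Uint8ClampedArray,
  amplitude: number,
  rng: () => number = Math.random
): Uint8ClampedArray {
  const dst = new Uint8ClampedArray(src.length);
  for (let i = 0; i < src.length; i++) {
    // The clamped array saturates the result at 0 and 255 on store.
    dst[i] = src[i] + (rng() * 2 - 1) * amplitude;
  }
  return dst;
}
```

A single linear pass like this touches each pixel once, so it should be cheap compared with the blur.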

hobo_mark · on April 19, 2022

Ah, wasm... The site managed to almost kill my machine until the tab committed OOM suicide, I guess this explains why.

seedgou · on April 19, 2022

The rendering logic is in `src/utils/pdf/renderPage.ts` and has only 26 lines.

zikohh · on April 19, 2022

Yeah, even playing with the preview and using the sliders, it's super slow. Apart from that it's amazing! Do some work on the perf pls.

obeattie · on April 19, 2022

You are right about performance, but does it really matter?

It feels like this is the sort of tool one needs (very) infrequently, and those cases don’t seem like the sort of thing where seconds really matter. I think it’s plenty good enough.

I prefer to focus on how grateful I am that the author has made this and published it for free.

moralestapia · on April 19, 2022

I believe it does matter.

When one first opens the site and nothing happens for 30 secs, you assume that the PDF you're looking at is the actual result (that happened to me, at least). Then the other one pops up and you're like ... ooooh I get it!

Most users wouldn't be as patient and just leave.

seedgou · on April 19, 2022

There's an example PDF after clicking the "START SCANNING" button. Maybe I should add more real-world examples.

seedgou · on April 19, 2022

This is inspired by baicunko/scanyourpdf and a previous HN thread: https://news.ycombinator.com/item?id=23157408

seedgou · on April 19, 2022

It uses PDF.js and magica to do the rendering and processing. You can see the credits on the GitHub repo page.