Hacker News

Papers are already difficult to process even when you read them carefully multiple times. What even is the point of turning them into an audio version? I am genuinely at a loss, unless we are talking about blind people.


The YouTube channel may shed some light. As I understand it, it is not reading the paper, but interpreting or summarizing it, with visual cues as to which section it is analyzing.


I still don't get the purpose. If you have a video to watch, it's not an audiobook anymore. Secondly, why not just read the abstract? The paper might contain formulas and data, both of which need to be read carefully to be understood. If you strip the paper of its scientific elements, only a series of badly justified steps remains, at which point you might as well just read the abstract and conclusions.


The choice of the word "audiobook" is really unfortunate. It's never mentioned on the GitHub project page. I find that LLMs do a decent job of summarizing text. Obviously, it depends on the audience: a subject-matter expert may not be happy with the result, but a layperson might be.


What if you want to hear about the latest arXiv updates while on your morning run?

This seems like a fantastic idea for that purpose.


I mean, couldn't you do it with a program that takes an RSS feed, parses the abstract from each paper, and puts it through a run-of-the-mill TTS engine?
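
For illustration, the feed-parsing half of that idea could look roughly like the sketch below. It assumes a plain RSS 2.0 layout where each `<item>` has a `<title>` and the abstract sits in `<description>`; a real feed may differ. The function names `abstracts_from_rss` and `to_script` are made up for this example, and the TTS step is left out: the resulting text would be handed to whatever engine you have.

```python
import re
import xml.etree.ElementTree as ET
from html import unescape


def abstracts_from_rss(rss_xml: str) -> list[tuple[str, str]]:
    """Return (title, abstract) pairs from an RSS 2.0 document.

    Assumption: the abstract is stored in each item's <description>.
    """
    root = ET.fromstring(rss_xml)
    results = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        description = item.findtext("description") or ""
        # Descriptions often carry HTML markup: drop tags, collapse whitespace.
        text = re.sub(r"<[^>]+>", " ", unescape(description))
        text = re.sub(r"\s+", " ", text).strip()
        results.append((title, text))
    return results


def to_script(entries: list[tuple[str, str]]) -> str:
    """Join entries into one plain-text script to feed to a TTS engine."""
    return "\n\n".join(f"{title}. {abstract}" for title, abstract in entries)
```

Fetching the feed (for example with `urllib.request`) and the speech synthesis itself are deliberately not shown.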


Sure, I've done something similar to that in a few hours.

The free TTS options still aren't great, and "just the abstract" is not the problem. I did full articles, and the hard...er part (it was a few hours) was extracting the relevant sections of full papers without the 'junk' info (page numbers, superscript citations, 7 pages of authors in any CERN paper, etc.).

So you get something, but it's often not good.
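
To give a flavor of the 'junk' removal, here is a minimal sketch, not the code I actually used. It assumes page numbers show up as lines containing only a number (optionally prefixed with "Page"), and that citations are either bracketed numbers or Unicode superscript digits. `clean_for_speech` is a hypothetical name, and long author lists are not handled at all.

```python
import re

# Unicode superscript digits, assumed here to be citation markers.
SUPERSCRIPT_DIGITS = "⁰¹²³⁴⁵⁶⁷⁸⁹"


def clean_for_speech(text: str) -> str:
    """Strip page-number lines and numeric citation markers from extracted text."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        # Drop lines that are only a page number, e.g. "12" or "Page 12".
        if re.fullmatch(r"(?:[Pp]age\s+)?\d{1,4}", stripped):
            continue
        kept.append(stripped)
    joined = " ".join(kept)
    # Remove bracketed numeric citations such as [3] or [3, 7-9].
    joined = re.sub(r"\s*\[\d+(?:\s*[,\-–]\s*\d+)*\]", "", joined)
    # Remove superscript digits. Caveat: this also eats exponents like x².
    joined = re.sub(f"[{SUPERSCRIPT_DIGITS}]+", "", joined)
    return re.sub(r"\s+", " ", joined).strip()
```

Heuristics like these break on formulas and tables, which is a large part of why the output is often not good.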



