Hacker News | sidnb13's comments

I don't think vector databases are intended to be secure, encrypted forms of data storage in the first place.


+1, immensely satisfying read for any aviation nut


Cool to see that this worked well for someone. Super hard to force the key insight in a problem to magically appear given more time sunk into it. Big weakness of mine honestly, and requires a lot of self-awareness to pull myself out of a problem-solving rut. I like the idea of hacking sleep - do you find yourself priming your mind with the problem before nodding off? Curious how a bedtime wind-down routine factors into how effective this is.


Over years of math undergrad and grad school I tried very hard and was never able to get this to work, so you're not alone. I was able to reliably reproduce hopeful feelings after sleep, but upon investigation the "new leads" were either things I had already tried (and forgotten why they didn't work) or they were the type of imprecise high-level vague direction ideas that were never difficult to generate and still had 99% of the true effort remaining to grind through the details.


Wow, sorry to hear that. Came across his blog in 2021 and was instantly hooked. RIP


Very cool. I used to dream of this stuff when I was younger. Reminds me of Atlantik Solar: https://www.atlantiksolar.ethz.ch/. Hasn't been updated in a while, but focused more on low-altitude autonomous survey missions.


Atlantik Solar was a very cool project, read a lot of their research - count me as a fan :)


I would assume the datacenter and infra would also contribute a sizeable chunk to the costs, considering the upkeep needed to run it 24/7.


> I also believe that within say 1-3 years there will be a different type of training approach that does not require such large datasets or manual human feedback.

I guess if we ignore pretraining, doesn't sample-efficient fine-tuning on carefully curated instruction datasets sort of achieve this? LIMA and OpenOrca show some really promising results to date.


DistilBERT was distilled from BERT. There might be an angle in using another model to train the model, especially if you're trying to get something to run locally.
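For reference, the core of that approach is a distillation loss: the small student model is trained to match the teacher's temperature-softened output distribution. A pure-NumPy toy sketch of that objective (real training would use PyTorch/transformers; all names here are illustrative):

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T gives a softer distribution
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures
    return float((p_t * (np.log(p_t) - np.log(p_s))).sum(axis=-1).mean() * T**2)

teacher = np.array([[4.0, 1.0, 0.5]])
student_close = np.array([[3.8, 1.1, 0.4]])
student_far = np.array([[0.1, 3.0, 2.0]])
# A student that mimics the teacher scores a lower loss
assert distillation_loss(student_close, teacher) < distillation_loss(student_far, teacher)
```

In practice this term is combined with the usual hard-label loss (and, for DistilBERT, a hidden-state cosine loss), but the soft-target KL term above is the distillation part.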


Yep, batching is a feature I really wish the OpenAI API had. That, and the ability to intelligently cache frequently used prompts. Both are much easier to achieve with a hosted open-source model, so I guess it's a speed vs. customizability/cost tradeoff for the time being.
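The caching half of that is straightforward to do client-side when sampling is deterministic. A toy sketch, where `cached_complete` and `_backend` are hypothetical names standing in for whatever client you use:

```python
import hashlib
import json

_cache = {}

def cached_complete(model, prompt, temperature=0.0, _backend=None):
    """Serve repeated deterministic (temperature=0) prompts from a local cache.

    Non-zero temperatures are never cached, since their outputs vary.
    """
    key = hashlib.sha256(json.dumps([model, prompt]).encode()).hexdigest()
    if temperature == 0.0 and key in _cache:
        return _cache[key]
    result = _backend(model, prompt, temperature)  # the actual API call
    if temperature == 0.0:
        _cache[key] = result
    return result
```

This only saves repeat calls for identical prompts; caching shared prompt *prefixes* (e.g. a long system prompt) requires server-side KV-cache reuse, which is exactly the kind of thing a hosted model gives you control over.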


IMO they don't have batching because they pack sequences before passing them through the model, so a single sequence in a batch on OpenAI's side might contain requests from multiple customers.
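The packing step itself can be sketched as a greedy bin-packing pass: short tokenized requests get concatenated (separator-delimited) until a fixed sequence length is full. A minimal illustration, assuming token lists and a hypothetical `pack_requests` helper:

```python
def pack_requests(requests, max_len):
    """Greedy first-fit-decreasing packing of token lists into slots of <= max_len.

    Each request costs len(request) + 1 tokens (one separator token).
    Returns a list of packs; each pack is a list of the original requests.
    """
    packs = []
    for req in sorted(requests, key=len, reverse=True):
        for pack in packs:
            if sum(len(r) + 1 for r in pack) + len(req) + 1 <= max_len:
                pack.append(req)
                break
        else:
            packs.append([req])  # no existing pack fits; open a new one
    return packs
```

A provider doing this would also need an attention mask that blocks tokens from attending across request boundaries, which is why the packed requests stay isolated even though they share a sequence.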


Ah, that would make sense. Similar to vLLM, which does continuous batching.



