Hacker News | hodgehog11's comments

Just a heads up in case you didn't know, taking the Hessian over batches is indeed referred to as Stochastic Newton, and methods of this kind have been studied for quite some time. Inverting the Hessian is often done with CG, which tends to work pretty well. The only problem is that the Hessian is often not invertible so you need a regularizer (same as here I believe). Newton methods work at scale, but no-one with the resources to try them at scale seems to be aware of them.

It's an interesting trick though, so I'd be curious to see how it compares to CG.

[1] https://arxiv.org/abs/2204.09266 [2] https://arxiv.org/abs/1601.04737 [3] https://pytorch-minimize.readthedocs.io/en/latest/api/minimi...
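
For anyone who hasn't seen the setup described above, here is a minimal sketch of a damped (regularized) stochastic Newton step where the linear solve is done with CG, using only Hessian-vector products. Everything in it is an illustrative assumption rather than anything from the linked papers or the post under discussion: the toy least-squares problem, the function names (`conjugate_gradient`, `batch_hvp`, `stochastic_newton`) and the damping value are all hypothetical. The batch is deliberately smaller than the parameter dimension, so the batch Hessian is singular and the damping term is what makes the solve well posed.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def conjugate_gradient(matvec, b, tol=1e-10, max_iter=50):
    """Solve A x = b for symmetric positive definite A, given only v -> A v."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with x = 0 initially
    p = list(r)            # first search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def batch_grad(w, batch):
    """Gradient of the mean of 0.5 * (a.w - y)^2 over the batch."""
    m = len(batch)
    g = [0.0] * len(w)
    for a, y in batch:
        err = dot(a, w) - y
        for j, aj in enumerate(a):
            g[j] += err * aj / m
    return g

def batch_hvp(v, batch, damping):
    """(H_batch + damping * I) v, without ever forming H_batch."""
    m = len(batch)
    out = [damping * vj for vj in v]
    for a, _ in batch:
        av = dot(a, v)
        for j, aj in enumerate(a):
            out[j] += av * aj / m
    return out

def stochastic_newton(data, dim, steps=200, batch_size=2, damping=0.1, seed=0):
    rng = random.Random(seed)
    w = [0.0] * dim
    for _ in range(steps):
        batch = rng.sample(data, batch_size)
        g = batch_grad(w, batch)
        # Newton step: solve (H + damping * I) s = g with CG, then w <- w - s.
        step = conjugate_gradient(lambda v: batch_hvp(v, batch, damping), g)
        w = [wi - si for wi, si in zip(w, step)]
    return w

# Toy usage: noise-free linear data with hypothetical "true" weights.
data_rng = random.Random(1)
w_true = [1.0, -2.0, 0.5]
data = []
for _ in range(20):
    a = [data_rng.uniform(-1, 1) for _ in range(3)]
    data.append((a, dot(a, w_true)))
w_hat = stochastic_newton(data, dim=3)
print("max abs error:", max(abs(wi - ti) for wi, ti in zip(w_hat, w_true)))
```

Setting `damping=0` here would make CG divide by zero or stall on most batches, which is the non-invertibility problem mentioned above.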


For solving physics equations there are also Jacobian-free Newton-Krylov methods.


Yes, the combination of Krylov and quasi-Newton methods is very successful for physics problems (https://en.wikipedia.org/wiki/Quasi-Newton_method).

IIRC, GMRES, for example, is a popular Krylov subspace method.


I recently used these methods, and BFGS worked better than CG for me.


Absolutely plausible (BFGS is awesome), but this is situation dependent (no free lunch and all that). In the context of training neural networks, it gets even more complicated when one takes implicit regularisation coming from the optimizer into account. It's often worthwhile to try an SGD-type optimizer, BFGS, and a Newton variant to see which type works best for a particular problem.


I also see the shrinking sentence length celebrated among my scientific colleagues who abhor the dreaded "run-on sentence". Maybe it is because I have no formal literacy or linguistic training but I mourn this loss; older, classical novels used to have a tremendous flavor in their sentence structure by prioritizing the longform. Some English translations of Russian literature can run into the absurd (sentences at half a page long), but even then there is a beauty to it.

I see this much less in modern novels and articles. Where is the flavor from pausing. all. the. time?


Yes. A long sentence can be thought of as a room, not a hallway.

I learned in high school lit that sentence length is an artistic choice as meaningful as word selection: long sentences can reflect stream of consciousness, recursive thought, associative or digressive exploration. Short sentences can reflect anxiety, urgency, vigilance, cognitive compression.

There are a lot of factors that have led to the decay of long sentences. Scientific writing norms, ubiquitous style guides like Strunk & White, modern distraction/multitasking/short(er)-form content, and my favorite, impoverished education - and the concomitant lack of trust in the reader on the part of the author.


> concomitant

Thanks for the new word! Native speaker but I’ve never seen/heard that one before. Might be more common in a commonwealth country though tbf.


> Yes. A long sentence can be thought of as a room, not a hallway.

The irony of this post having an initial sentence consisting of one word is either a sublime statement regarding the topic at hand or an unintentional affirmation of the subsequent factors enumerated.


I recently read "The Sense Of Style", which explained the actual principle behind making an understandable run-on. The trick was to allow the brain to mentally store away the earlier parts of the sentence, and take it out of the parsing context into the logical connections context. Not going to try and remake the point from scratch, if you're curious go read the book!

(as a sidenote, trying to make a point about grammar made me very self-conscious about mine, this is why I had to read a good book!)


Thanks for the reference! I think this very neatly puts into words some impressions that I've had about these long sentences. There is certainly nuance to it, as long sentences can feel exhausting if constructed inappropriately.


The vogue for artificially-short sentences removes not just shape and color, but also logical relationships. Writers and readers are unburdened of tracing chains of cause-and-effect or the dreaded wondering "why". It's part of the larger societal craving to shrug off reality and one's place in it.


I definitely agree there is a strong element of this, especially in the last few decades.

Perhaps it is also due to a widening of the audience that can provide literary criticism back to the author. Only the educated wealthy individuals with connections could offer critiques in the Victorian era of fiction; now it is anyone with a social media account. Judging by the failure of widespread peer review in "hype" research fields, I'm not sure this is a good thing.


Russian is much more conducive to long sentences because it's highly inflected. Adjectives have to agree with the nouns, and verbs can carry the grammatical gender and person markers. This all helps to keep the context clearer, the reader doesn't have to strain their brain to connect the clauses. So long-winded descriptions fit really well into the flow of the text.

It just feels more artificial and self-indulgent in English. As if the author wants to show off how well they can string together longer sentences, and it's up to you, the reader, to keep up with the magnanimousness of the author allowing their readers to glimpse upon their greatness.

Chinese novels are on the other side of the spectrum. The sentences simply can't be very long and often don't have any connecting words between sentences. The readers have to infer.


> Chinese novels are on the other side of the spectrum. The sentences simply can't be very long and often don't have any connecting words between sentences. The readers have to infer.

There is no grammatical ceiling on sentence length in Sinitic languages. Chinese languages (all of them) can form long sentences, and they all do possess a great many connecting words. Computational work on Chinese explicitly talks about «long Chinese sentences» and how to parse them[0].

However, many Chinese varieties and writing styles often rely more on parataxis[1] than English does, so relations between clauses are more often (but not always) conveyed by meaning, word order, aspect, punctuation, and discourse context, rather than by obligatory overt conjunctions. That is a tendency, not an inability.

[0] https://nlpr.ia.ac.cn/2005papers/gjhy/gh77.pdf

[1] https://hub.hku.hk/bitstream/10722/127800/1/Content.pdf


Sure. You can try to create arbitrarily long sentences with nested clauses in Chinese. Just like in English you can create arbitrarily long sentences like: "I live in a house which was built by the builders which were hired by the owner who came from England on a steamship which was built...".

But it feels unnatural. So most Chinese sentences are fairly short as a result. And it's also why commas, stops, and even spacing between words are a fairly recent invention. They are simply not needed when the text is formed of implicitly connected statements that don't need to be deeply nested.

To give an example, here's our favorite long-winded Ishmael: "Yes, here were a set of sea-dogs, many of whom without the slightest bashfulness had boarded great whales on the high seas—entire strangers to them—and duelled them dead without winking; and yet, here they sat at a social breakfast table—all of the same calling, all of kindred tastes—looking round as sheepishly at each other as though they had never been out of sight of some sheepfold among the Green Mountains." The Chinese translation is: "是的,这里坐着的是一群老水手,其中有很多人,在怒海中会毫不畏怯地登到巨鲸的背上——那可是他们一无所知的东西啊——眼都不眨地把鲸鱼斗死;然而,这时他们一起坐在公共的早餐桌上——同样的职业,同样的癖好——他们却互相羞怯地打量着对方,仿佛是绿山山从未出过羊圈的绵羊"

Or word-for-word: "Yes, here sitting [people] are the group of old sailors, among them there are many people, [who] in the middle of the raging sea can/will without fear on the whale's back climb. That whales were something they knew nothing about".

The subordinate clauses become almost stand-alone statements, and it's up to the reader to connect them.


I can see your point now, and we are in agreement that nested clauses are uncommon and at the very least sound unnatural in Sinitic languages, but it is distinct from «The sentences simply can't be very long and often don't have any connecting words between sentences».

Strictly speaking, complex nested clauses are slowly on the way out of English as well due to the analytical nature of its present form, which is what the cited article partially laments, and remain a distinctive feature of highly inflected languages (German, Scandinavian, Slavic, etc.).


When I was a kid, I learned a run-on sentence was a sentence without adequate conjunctions or punctuation to mark and separate the clauses. E.g.: "My wife and I went to a concert we saw The Cure they were terrific." I still have a tendency to write long sentences, but sometimes when I go overboard (e.g., a whole paragraph turns out to be one long sentence) I might break it in two, for clarity. But I don't go to grug-speak extremes.

I think the preference for short sentences in today's prose is a lot like vocal fry among North American women: a deliberate attempt to sound young.


> Some English translations of Russian literature can run into the absurd (sentences at half a page long), but even then there is a beauty to it.

C. K. Scott Moncrieff and Terence Kilmartin’s translation of Marcel Proust’s «In Search of Lost Time (Remembrance of Things Past)» contains nearly half-page long sentences.

Many modern readers complain about the substantial difficulty in following such sentences, although I personally find them delightful.


Likewise. They are staggeringly beautiful when your mind is in "the zone". It's like a kind of focused meditation with images just flooding the mind.


I started reading The Melancholy of Resistance after the author won the Nobel prize this year. The sentences are very long, though, and the book is really difficult to read imo.


Not the OP, but I have a 2015 MacBook Pro and a desktop PC both running Linux. I love Fedora, so that's on the desktop, but I followed online recommendations to put Mint on the MacBook and it seems to run very well. However, I did need to install mbpfan (https://github.com/linux-on-mac/mbpfan) to get more sane power options and this package (https://github.com/patjak/facetimehd) to get the camera working. It runs better than macOS, but you'll need to really tweak some power settings to get it to the efficiency of the older Mac versions.


I permanently switched from Windows to Linux about five years ago. I had the same issue as you with Dropbox, so I switched to using the Maestral client for Dropbox instead which has support for selective sync. Works like a charm for me.


+1 for Maestral, have been using it for about a year on my Linux install and it works seamlessly.


As always, it's the intent that matters.

For the sake of argument, what if Amazon decided tomorrow that they would secure exclusive contracts with all food suppliers and then hoard all the food to starve out the people they don't want to have it? Or at least, drive up the price of food so it becomes completely unaffordable? I know people can simply grow their own food so it's a bit different, but hopefully it gets the point across. It's anti-trust on an unprecedented level.


But OpenAI legitimately needs HBM. Amazon in this instance doesn't need food and is doing it purely to create artificial scarcity. If OpenAI were to actually not use the HBM then it could mean something.


That's the whole problem: it's unlikely that OpenAI will actually use all of that HBM. It seems probable that they are using it to create artificial scarcity for their competitors.


"needs" is doing a lot of heavy lifting in your argument...


"As always, it's the intent that matters."

That's certainly not a universal Legal Standard. If I'm harmed, but you didn't "intend" to harm me, does that nullify my Claim?

Hardly.


Voluntary manslaughter, involuntary manslaughter, degrees of murder, hate crimes.


Lack of intent doesn’t mean your claim is nullified. “Intent matters” means it’s taken into account when deciding what damages were wrought


IANAL, but yes, I believe it can nullify the claim. Bumping into someone on the sidewalk is only battery if the prosecution demonstrates intent to harm.


> I know people can simply grow their own food

Small thing, but this is not simple or realistic at all. How does someone in an apartment grow enough food for their family?


Yeah it would definitely still be a problem, but history shows that life finds a way. Even if everyone has to eat nothing but planted potatoes from any patch of grass that one can lay eyes on.


What history has taught us is that life finds a way by staying together and each person having their function within society, only some of which is growing or producing food.


Surprisingly, many do. When I mention to people (family, friends, etc.) that they should open new chat windows for new topics due to memory corruption, it's pretty clear they never even considered the possibility that the model can go off the rails if the chat is too long. Later I often get a comment like "wow, I kept thinking this AI stuff was rubbish, but it's really good now!".


I suspect that the "image-based" strategy taken here is unlikely to be appealing to many members of this community.

It could be very effective for bringing in those who are not particularly computer literate under the claimed guarantee that a random update is unlikely to break the machine. But you would also need significant financial backing and marketing with strong brand recognition to inspire that kind of confidence.


I think the right way to do this is with snapshots, the way openSUSE MicroOS is doing it, for example. You get the best of both worlds that way - you can still easily install packages into the OS to customise it, and you do get painless updates and rollbacks. There's a very narrow use case where you _do_ want images, but for that you'll want to control the complete secure boot chain for attestation, so I'd dismiss it here.

Fun fact, a bit over a decade ago we were probably the first ever to publish a distribution relying on btrfs snapshots by default, with the Jolla phone. Sadly that did bite us due to the reliability of btrfs at the time, and later phones switched to ext4, but with a stable filesystem it's a nice mechanism for handling updates and factory reset.


I would think of myself as at least computer literate, and I very much prefer atomic Linux to traditional distros, Arch, or NixOS. If you are in luck with hardware, you get a system that is hard to pollute with your own actions, and everything development-related I do in a separate distrobox. Rolling back versions, or even hopping to a completely different immutable distro, is just a restart away. I've never been so at peace with Linux.


That's a great comparison. The consequences are pretty universal too. History implies this won't end well for OpenAI.


I don't buy it. It's easier to make this argument for companies that are building their own hardware, since they know it can be immediately used. OpenAI's move is tantamount to hoarding for the sake of strangling competition. There was plenty of supply to allow for their plans without this move (especially since they will probably go bankrupt at this rate).


How exactly could they 'hoard' this? There's no place in the world to store that much undiced wafer. It will all go bad.


If stored in proper conditions, the shelf life of undiced wafers is pretty much limitless. But if moisture or dust get in, it’ll start to have corrosion and other damage. Diced wafers on the other hand are placed on UV tape and I recently learnt that the shelf life of that is like a few months at best.

Nevertheless, unless you’re storing them in a clean room of appropriate class (or in a vacuum pack), assume 18 months for undiced wafers.

In terms of size, I would think that stored in cassettes or carriers and vacuum sealed, it shouldn’t be too much. A 25 wafer pack is about half a cubic meter? A 40ft shipping container is about 68m^3. So one container can store about 140 packs of 25 wafers each. That’s 3500 wafers. They’re talking about 900k wafers. That’s about 260 containers. Call it 300 with extra space and stuff. Not exactly hard to provision.
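
The back-of-the-envelope estimate above can be redone without the intermediate rounding. This is only a sketch using the figures quoted in the comment; reading the pack size as 0.5 m^3 is an assumption, and the function name `containers_needed` is made up for illustration. It ignores packing inefficiency, pallets, and weight limits.

```python
import math

PACK_VOLUME_M3 = 0.5         # assumed reading of "about half a cubic meter"
CONTAINER_VOLUME_M3 = 68.0   # 40 ft shipping container, as quoted above
WAFERS_PER_PACK = 25
TOTAL_WAFERS = 900_000

def containers_needed(total_wafers, wafers_per_pack, pack_m3, container_m3):
    """Return (packs per container, wafers per container, containers)."""
    packs_per_container = math.floor(container_m3 / pack_m3)
    wafers_per_container = packs_per_container * wafers_per_pack
    containers = math.ceil(total_wafers / wafers_per_container)
    return packs_per_container, wafers_per_container, containers

packs, wafers, containers = containers_needed(
    TOTAL_WAFERS, WAFERS_PER_PACK, PACK_VOLUME_M3, CONTAINER_VOLUME_M3
)
print(packs, wafers, containers)  # 136 3400 265
```

Without rounding 136 packs up to 140, the count comes out at 265 containers instead of about 260, which is still comfortably inside the "call it 300" allowance.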


In what time will packaged wafers go bad? What is this based on?


Wonder if they'd rather have it go bad than let their competitors get it?


Granting this premise is true (I have no idea), that makes it even worse. They would deliberately be hoarding 40% of the global supply not to lock in for future growth but simply to make sure no one else gets to have it. It’s figuratively setting chips on fire.


As an AI researcher, I thought it was relatively well established (at least among my colleagues) that being pro-AI actually meant you were anti-Sam as well. He's the worst actor in the industry and has done an incredible amount of damage to its brand.


Care to elaborate on this?

