jcalvinowens's comments

You can do it on x86 too: just use jmp instead of call and invent your own arbitrary register scheme for it. This x86 program has no stack: https://github.com/jcalvinowens/asmhttpd

I don't think it's too hard to imagine a compiler that does that, although it would obviously be very limited in functionality (nesting would be disallowed as you note, or I guess limited to the number of registers you're willing to waste on it...).


> The Linux kernel learned this the hard way. Early 2.6 kernels used spinlocks everywhere, wasting 10-20% CPU on contended locks because preemption would stretch what should’ve been 100ns holds into milliseconds. Modern kernels use mutexes for most subsystems.

That's not accurate: the scalability improvements in Linux are a result of broadly eliminating serialization, not something as trivial as using a different locking primitive. The BKL didn't go away until 2.6.37! As much as "spinlock madness" might make a nice little story, it's simply not true.


Look at it a different way: if you'd invested that $10K/year you've been blowing on hardware, how much more money would you have today? How about that $800/month car payment too?

I don’t understand a world where spending $1k/mo on business equipment that is used to earn dozens of times more than that is crazy. It’s barely more than my minuscule office space costs.

My insurance is the vast majority of that $800, fwiw.


Having a 10% faster laptop does not enhance your ability to earn money in any meaningful way. Just like driving around in a luxury car doesn't enhance your ability to travel from point A to point B in any meaningful way.

It's okay to like spending money on nice things, it's your money and you get to decide what matters to you. What you're getting hate for here is claiming it's justified in some way.


  Location: Bay Area, CA, USA
  Remote: Yes
  Willing to relocate: No
  Technologies: C, C++, Linux, drivers, embedded, HPC, networking, video, radio, yocto
  Résumé/CV: https://github.com/jcalvinowens/misc/blob/main/resume/resume.pdf
  Email: [email protected]
I solve technical problems in exchange for monetary compensation. I do a little bit of everything: https://github.com/jcalvinowens

I currently have 20 hours/week available. I'm not considering full time roles at this time, only contract work. Thanks.


> Lazy RCU loading is good on a laptop

Do you mean RCU_LAZY? Most distros will already enable that: it doesn't do anything without rcu_nocbs, so there's no negative impact on server workloads.

    [calvin@debian-trixie ~] grep RCU_LAZY /boot/config-6.12.57+deb13-amd64
    CONFIG_RCU_LAZY=y
    # CONFIG_RCU_LAZY_DEFAULT_OFF is not set
    [calvin@debian-trixie ~] grep RCU_NOCB_CPU /boot/config-6.12.57+deb13-amd64 
    CONFIG_RCU_NOCB_CPU=y
    # CONFIG_RCU_NOCB_CPU_DEFAULT_ALL is not set
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

You just have to set rcu_nocbs on the kernel cmdline.
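
On a Debian-style system with GRUB, that's a one-line change (the cpulist below assumes a hypothetical 8-CPU laptop, adjust to taste):

    # /etc/default/grub
    GRUB_CMDLINE_LINUX_DEFAULT="quiet rcu_nocbs=0-7"

Then regenerate the config with update-grub (or grub2-mkconfig on other distros) and reboot.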


I've been running a homebuilt NAS for a decade. My advice is going to irritate the purists:

* Don't use raid5. Use btrfs-raid1 or use mdraid10 with >=2 far-copies.

* Don't use raid6. Use btrfs-raid1c3 or use mdraid10 with >=3 far-copies.

* Don't use ZFS on Linux. If you really want ZFS, run FreeBSD.

The multiple copy formats outperform the parity formats on reads by a healthy margin, both in btrfs and in mdraid. They're also remarkably quieter in operation and when scrubbing (night and day), which matters to me since mine sits in a corner of my living room. When I switched from raid6 to 3-far-copy mdraid10, the performance boost was nice, but I was completely flabbergasted by the difference in the noise level during scrubs.

Yes, they're a bit less space efficient, but modern storage is so cheap it doesn't matter; I only store about 10TB of data on it.
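
For reference, creating those layouts looks roughly like this (device names are made up, four disks assumed; raid1c3 needs kernel and btrfs-progs 5.5 or newer):

    # mdraid10 with 3 far copies spread across 4 disks
    mdadm --create /dev/md0 --level=10 --layout=f3 --raid-devices=4 \
        /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

    # btrfs keeping 3 copies of both data and metadata
    mkfs.btrfs -d raid1c3 -m raid1c3 /dev/sda /dev/sdb /dev/sdc /dev/sdd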

I use btrfs: it's the most actively tested and developed filesystem in Linux today, by a very wide margin. The "best" filesystem is the one which is the most widely tested and developed, IMHO. If btrfs pissed in your cheerios ten years ago and you can't figure out how to get over it, use ext4 with metadata_csum enabled, I guess.

I use external USB enclosures, which is something a lot of people will say not to do. I've managed to get away with it for a long time, but btrfs is catching some extremely rare corruption on my current NAS. I suspect a firmware bug is somehow corrupting USB3 transfer data, but I haven't gotten to the bottom of it yet: https://lore.kernel.org/linux-btrfs/20251111170142.635908-1-...


I use mergerfs + SnapRAID on my HDDs for “cold” storage for the same reason: noise. SnapRAID sync and scrub run at 4am when I am not in the same room as the NAS.

The drives stay spun down 99% of the time, because I also use a ZFS mirrored pool on SSDs for “hot” files, although Btrfs could also work if you're opposed to ZFS because it's out of tree.

Basically using this idea, but with straight Debian instead of ProxMox: https://perfectmediaserver.com/05-advanced/combine-zfs-and-o...

I also use the mergerfs 'ff' (first found) create policy, and put the SSDs first in the ordered branch list for the mergerfs mount point in fstab. This gives me tiered storage: newly created files and reads hit the SSDs first. I use a mover script that runs nightly with the SnapRAID sync/scrub to keep space free on the SSDs.

https://github.com/trapexit/mergerfs/blob/master/tools/merge...
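
The pool definition itself is a single fstab line; something like this (the paths and free-space threshold are made up):

    # SSD branches listed first, so 'ff' creates new files there
    /mnt/ssd*:/mnt/hdd* /srv/pool fuse.mergerfs category.create=ff,moveonenospc=true,minfreespace=50G,allow_other 0 0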


I have had zero issues running ZFS on Linux for the last 10 years. (Not saying there have been no issues that annoyed people or even caused data loss.)

I was wondering what the parent's beef was with ZFS on Linux. I have a box I might change over (B-to-L) and I haven't come across any significant discontent.

No beef: I just don't run out-of-tree kernel code; I've been burned too many times. ZFS on Linux is mostly used by hobbyists and tinkerers; it doesn't get anything close to the amount of real-world production testing and follow-up bugfixing on Linux that a real upstream filesystem like btrfs does today.

If ZFS ever goes upstream, I will certainly enjoy tinkering with it. But until it does, I just don't see the point: I build my own kernels, and dealing with the external code isn't worth the trouble. There's already more than enough to tinker with :)

All my FreeBSD machines run ZFS, FWIW.


I've even been using ZFS on Linux with USB enclosures for 5+ years with no issues.

This is the first time I've ever had a problem with the USB enclosures. And it's fantastically rare: roughly one corrupt 512b block per TB of data written. With a btrfs-raid1 it's self-correcting on reads; if I didn't look at dmesg I'd never know.

I've figured out it only happens if I'm streaming data over the NIC at the same time as writing to the disks (while copying from one local volume to another), but that's all I really know right now. I seriously doubt it's a software bug.
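
If you want to keep an eye on this sort of thing without watching dmesg, btrfs keeps per-device error counters (the mount point here is hypothetical):

    # cumulative read/write/corruption error counters per device
    btrfs device stats /srv/nas

    # force a full verification pass over every copy
    btrfs scrub start -B /srv/nas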


Btrfs raid was also the one that had data loss bugs.

Does the hardware only support the NIST curves? Or is that just the example that happens to be given?


Only supports NIST curves and ECDSA, yes.

I've heard people make the point before that EdDSA is not great for secure enclaves, due to being susceptible to fault attacks which could lead to (partial) key extraction.


I don't trust the NIST curves: they were generated in a dubious way which has been written about extensively elsewhere (the coefficients for P-256 were generated by hashing the unexplained seed c49d360886e704936a6678e1139d26b7819f7e90). I always avoid them unless I have to use them. It makes me sad when hardware forces me to use them.

> I've heard people make the point before that EdDSA is not great for secure enclaves, due to being susceptible to fault attacks which could lead to (partial) key extraction.

Huh, got a link? My understanding is that EdDSA is better with respect to side channels in every way; that was part of the intent of its design. I've worked with crypto hardware which supports it.


https://romailler.ch/project/eddsa-fault/

I think this can be solved by using hedged EdDSA (Signal does this).


Install powertop; the "tunables" tab has a list of system power-saving settings you can toggle through the UI. I've seen them make a pretty big difference, but YMMV of course.
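
Something like this on a Debian-ish system (--auto-tune flips everything at once, at your own risk):

    sudo apt install powertop
    sudo powertop              # Tab over to the "Tunables" screen
    sudo powertop --auto-tune  # or just set every tunable to "Good"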


It mostly just breaks things unfortunately. You can faff around for ages trying to figure out which devices work and which don’t but you end up with not much to show for it.


Yeah I tried that but it made no difference at all.


Is Arduino actually used for anything serious? While I certainly appreciate how their whole ecosystem has made working with microcontrollers more accessible... even the most casual hobbyists I know very quickly move on to something like an ESP32.


> The C++ aliasing rules map quite poorly into hardware.

But how much does aliasing matter on modern hardware? I know you're aware of Linus' position on this; I personally find it very compelling :)

As a silly little test a few months ago, I built whole Linux systems with -fno-strict-aliasing in CFLAGS; everything I've tried on it is within 1% of the original performance.
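
(For anyone who wants to repeat the experiment: a source-based distro makes it painless. On Gentoo, for example, it's one line in make.conf; the kernel itself already builds with -fno-strict-aliasing, so this mostly affects userspace.)

    # /etc/portage/make.conf
    CFLAGS="-O2 -pipe -fno-strict-aliasing"
    CXXFLAGS="${CFLAGS}"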


Even with strict aliasing, C and C++ often have to assume aliasing when none exists.


If they somehow magically didn't, how much could be gained?

I've never seen an attempt to answer that question. Maybe it's unanswerable in practice. But the examples of aliasing optimizations always seem to be eliminating a load, which in my experience is not an especially impactful thing in the average userspace widget written in C++.
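
For concreteness, here's the canonical load-elimination case as a throwaway shell check (assumes gcc is on the PATH):

    # int* and float* are assumed not to alias under strict aliasing, so at
    # -O2 the final "return *x" may be folded to "return 1"; with
    # -fno-strict-aliasing the value has to be reloaded after the float store.
    printf 'int f(int *x, float *y) { *x = 1; *y = 2.0f; return *x; }\n' > alias.c
    gcc -O2 -S -o strict.s alias.c
    gcc -O2 -fno-strict-aliasing -S -o nostrict.s alias.c
    diff strict.s nostrict.s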

The closest example of a more sophisticated aliasing optimization I've seen is example 18 in this paper: https://dl.acm.org/doi/pdf/10.1145/3735592

...but that specific example with a pointer passed to a function seems analogous to what is possible with 'restrict' in C. Maybe I misunderstood it.

This is an interesting viewpoint, but is unfortunately light on details: https://lobste.rs/s/yubalv/pointers_are_complicated_ii_we_ne...

Don't get me wrong, I'm not saying aliasing is a big conspiracy :) But it seems to have one of the higher hype-to-reality disconnects for compiler optimizations, in my limited experience.


Back in 2015, when the Rust project first had to disable use of LLVM's `noalias`, they found that performance dropped by up to 5% (depending on the program). The big caveat here is that it was miscompiling, so some of that apparent performance gain may have come from the miscompilation.

Of course, that was also 10 years ago, so things may be different now. There will have been interest from the Rust project in improving the optimisations `noalias` enables, as well as work in Clang on improving optimisations under C and C++'s aliasing model.


Thanks! I've heard a lot of anecdotes like this, but I've never found anyone presenting anything I can reproduce myself.

Strict aliasing is not the only kind of aliasing.


Yes, that's why I described it as "silly" :)

Is there a better way to test the contribution of aliasing optimizations? Obviously the compiler could be patched, but that sort of invalidates the test because you'd have to assume I didn't screw up patching it somehow.

What I'm specifically interested in is how much more or less of a difference the class of optimizations makes on different calibers of hardware.


Well, the issue is that "aliasing optimizations" means different things in different languages, because what you can and cannot do is semantically different. The argument against strict aliasing in C is that you give up a lot and don't get much, but that doesn't apply to Rust, which has a different model and uses these optimizations much more.

For Rust, you'd have to patch the compiler, as they don't generally provide options to tweak this sort of thing. For both Rust and C this should be pretty easy to patch, as you'd just disable the production of the noalias attribute when going to LLVM; gcc instead of clang may be harder, I don't know how things work over there.


Thanks!
