A Bit Overcomplicated (thedailywtf.com)
131 points by jjgreen on Aug 5, 2021 | 249 comments


At the risk of sounding stupid, I have no idea how bit-shifting and bitwise operators work, though I've worked in the field for over a decade and am now a Staff Engineer or whatever. There, I said it.

It's just something I haven't gotten around to reading yet. I suspect some code would be made clearer by it, but even when doing something similar I wound up using the math-textbook way to do it, even though I suspect the bit operators would be much faster.

https://github.com/sudhirj/shortuuid.rb/blob/master/lib/shor...

https://github.com/sudhirj/shortuuid.go/blob/master/shortuui...


> At the risk of sounding stupid, I have no idea how bit-shifting and bitwise operators work, though I've worked in the field for over a decade and am now a Staff Engineer or whatever. There, I said it.

I mean no offense, but honestly I'm a bit flabbergasted that anyone who has worked in programming computers for over a decade wouldn't have acquired this exceptionally basic bit of computing knowledge.


20 years of application development and I’ve used bit shifting a couple times, none of them were all that essential or much better than the alternatives. It’s a bit condescending and par for the course on HN to be flabbergasted at someone’s lack of knowledge of something, but honestly there are career paths where it truly does not matter.


Because it is something that naturally flows from having an understanding of boolean logic and that computers store data in binary digits.

Being a "software engineer" who doesn't understand this seems to me like being an architect who doesn't understand trigonometry or something.


Believe it or not, there are many different kinds of software engineering, many of which don't require you to know anything about binary or bitwise arithmetic. I don't know that I've ever used these as a full stack web developer, but I did learn it in my Computer Science program; the only time I think about the bit position of something is when setting/viewing file permissions, or reading source code for a library which does deal with this (there are legitimate reasons to use bitwise operators in JS for things like hashing, cryptography, secure string generation, etc.)

But really, there are people doing software engineering who have never thought about this, and while I agree that any blind spot in knowledge is detrimental (and everyone has some blind spots), it's condescending to suggest that this specific blind spot is noteworthy to the point of lacking foundational knowledge. Your software engineering isn't the only type of software engineering.


I must question the value of a "software engineer" who is working at a layer so abstracted from the reality of implementation that they don't have to understand binary.

> the only time I think about the bit position of something is when setting/viewing file permissions, or reading source code for a library which does deal with this (there are legitimate reasons to use bitwise operators in JS for things like hashing, cryptography, secure string generation, etc.)

People should understand how the libraries they use work. Not necessarily in detail, but they shouldn't be a magic box they're helpless to understand if it isn't working right.

> it's condescending to suggest that this specific blind spot is noteworthy to the point of lacking foundational knowledge.

Possibly. I'll admit that though it seems foundationally important to my own understanding of computing, it may not be foundational to an understanding of computing in a conceptual sense. However, it is so basic, and follows so naturally from boolean logic (which is foundational), that it is quite surprising it never came up for someone in this profession. I mean, if you already know how logic gates work it is trivial to leverage that into understanding bitwise operations.


> People should understand how the libraries they use work. Not necessarily in detail, but they shouldn't be a magic box they're helpless to understand if it isn't working right.

What I'm saying is that 'should' is unfairly prescriptive. There are things I don't understand that would be massively beneficial to know, such as drivers and driver code, C (we didn't learn it to any meaningful capacity in my program), and many other skills.

Fortunately, I've managed to take what I do know about computing, software, and programming, and find places I can apply that knowledge in a software engineering discipline that utilizes the knowledge I do have most effectively. Would I be able to write a driver if I was being paid to do it? Probably; there would be some learning required, maybe not even that much, but someone would be paying me to learn things outside of my general experience. I'd welcome it as well, but it just hasn't happened.

Similarly, there are people whose knowledge encompasses a subset of what I know (but possibly excluding binary and bitwise operations in totality) who are also very effective software engineers.

If you can think about browser state, reactivity, CSS specificity, the DOM, and git, you can be a very effective front-end engineer, and you never have to know much about how computers work. There are absolute wizards with 'front-of-front-end' who will be much more effective at animations, effects, device breakpoints, and visually pleasing layouts than I will ever be, who will never think about binary. And it will never be remotely relevant to them.


I think everyone, no matter where they are in the stack, should have at least a basic idea about how their programs eventually work. How they get turned into machine code, how that machine code gets fetched and executed, how registers work, memory accesses, RAM vs. ROM, what various caches are doing, how I/O devices work including disk, how interrupts work, how syscalls work, how memory allocation is done, what the stack is, typical segments of address space: code, data, stack, heap. I've been in the field for a few decades, but this is still kind of basic second-year compsci, right?

You don't need to be an expert at bitwise operations to grok these important concepts.


> How they get turned into machine code, how that machine code gets fetched and executed, how registers work, memory accesses, RAM vs. ROM, what various caches are doing, how I/O devices work including disk, how interrupts work, how syscalls work, how memory allocation is done, what the stack is, typical segments of address space: code, data, stack, heap

Yeah, I learned all of this stuff ~10 years ago (in compsci also) and have forgotten a lot of it due to never using it. Should I learn it again? To be honest, I'd like to, but I haven't taken the time either. I contemplated doing a deep dive the one time I've had to interview recently, but then didn't end up needing it. I'm sure if I wanted to move to a FAANG I'd need to brush up, but in the meantime I'm bombarded with plenty of things I need to learn on a regular basis to keep up with the actual technologies I work with.

> but this is still kind of basic second-year compsci, right?

To be honest, I find I have a very hard time really integrating knowledge of something until I've had ample opportunity to apply it in different ways over some extended period of time. I think this was why the knowledge never stuck when I learned it in CS. There were so many things I'd cram before the test and then never need to think about again.

If I don't have the opportunity to apply these concepts, trying to really 'learn' them is a Sisyphean task for me. With enough effort and repetition I'm sure I could, but I don't agree that I should. I think your attitude really overlooks the variance in learning styles and capabilities. Not everyone can remember things they don't apply for 10 years.


No, but you don't need to know them to do the front end work mentioned by OP.


> I must question the value of a "software engineer" who is working at a layer so abstracted from the reality of implementation that they don't have to understand binary.

To me it sounds like very productive work and a progression of technology. It's 2021, where 99.99% of us don't have to implement our own binary protocols, and work with computers rather than microcontrollers. If you're bit shifting, then it's most likely extremely low level work, which, as is the point of advancements in technology, is necessarily becoming more and more rare, or some case of NIH.

I do work with binary protocols, so I'm very familiar with bit shifting, but I also realize this is all grunt work that I'm having to implement because vendors ship unusable libraries or none at all.


"Understanding binary" on a basic level is different than being comfortable using bitshifting / bitwise operators.


...not really? Bitwise operators can be explained in a couple minutes, tops, and follow naturally from another concept everyone working in computing should be familiar with: boolean logic.


> Bitwise operators ... follow naturally from another concept everyone working in computing should be familiar with: boolean logic.

I'd say to most people raised on "ordinary" maths they follow more naturally from quite a different concept: Moving the decimal point.


That only works for shifting, not the other bitwise operations.


Uh yes, I’m an example of one. I’ve understood all of the operators at one point or another. I get the basic ideas. I’ve just barely used them so I would have to look them up before using them. I also don’t find the concept of bit-shifting to be that intuitive. Boolean logic is easy and a very necessary fundamental concept for all programming — I’m completely comfortable with that…

For what it’s worth, I think most people consider me a competent software engineer. I have a degree in computational mathematics from a good school.


> I also don’t find the concept of bit-shifting to be that intuitive.

Can you clarify what you mean by this? Because the concept of shifting bits to the left or right is so simple that I don't see how it could possibly strike you as unintuitive unless you are completely unfamiliar with the idea of data being represented by a string of contiguous binary digits.

If your problem is merely that knowing what bit-shifting is does not present any obvious use cases to you, that isn't the same as finding the underlying concept unintuitive.


Hmmm. I'll try. I guess the main issue is that I rarely ever deal with data in binary format. The times that I've dealt with binary strings has probably been most often as binary numbers in some sort of programming or math puzzle. Thinking of it as "multiplying" or "dividing" by a power of 2 feels more natural to me than "shifting" bits left or right. Maybe this is because I was a math major instead of CS :p

If I had more experience with lower level languages and / or dealing with raw binary data formats, I probably would have internalized it better by now.


Then why don't you spend that few minutes explaining the thing to the person you initially replied to, or link them to a resource instead of doing what you're currently doing?


Because several other people already gave perfectly adequate direct responses.

Mostly what I'm doing currently is defending my level of surprise.


That's fair. A point that I didn't see mentioned is that what seems really easy for you may not be for others. I know about boolean logic, and bits, but I'm not comfortable with bitwise operations and they certainly don't seem obvious to me. On the other hand, I have coded with other people, and was really surprised that they didn't understand a certain thing (going from for loops to map, for example).

When you look at the number of programming languages and how different "idiomatic code" can be for the same task, I think it's proof that different people think in different ways. For some, low-level stuff will be very intuitive; for some, it'll be functional programming; for others, OO, etc.

As some other people said, CS/SWE is very wide; it's hard to know everything at any point. Add to that the fact that, for most people working with high-level software, knowing more about the business gives more leverage than learning more CS fundamentals, and you end up with people knowing widely different things.

I think that in general, what's important is having a basic idea about most things (known unknowns instead of unknown unknowns) so that you can look into stuff deeper if you need it.


> That's fair. A point that I didn't see mentioned is that what seems really easy for you may not be for others. I know about boolean logic, and bits, but I'm not comfortable with bitwise operations and they certainly don't seem obvious to me.

....how in the world can that be? Bitwise operations are just the application of boolean logic to each pair of bits in the arguments. If you know what "exclusive or" means, XOR should be perfectly comfortable to you no matter how many bits are involved.
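
For instance, a minimal Go sketch of "the same boolean operation applied at each bit position":

  package main

  import "fmt"

  func main() {
      a, b := 0b1100, 0b1010
      fmt.Printf("%04b\n", a&b) // 1000 -- AND of each pair of bits
      fmt.Printf("%04b\n", a|b) // 1110 -- OR of each pair of bits
      fmt.Printf("%04b\n", a^b) // 0110 -- XOR of each pair of bits
  }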


XOR is fine, I understand how it works, though I often don't think about it in some leetcode-type exercise where a xor is the "clever" solution. Byteshift is not really intuitive to me, for example in the article, I wouldn't know if I had to shift left or right and would take a bit of time to look that up. As I said, different people think in a different way. I don't understand how people that can understand for loops don't understand map and filter, but I know that they exist.


That's just wild to me. If I held up a card with a decimal number on it and asked you to remove the X least significant digits, would you have to look up which way to shift them?


Yes, I would have to.


Would you have to consult a reference in order to be able to point to the least significant digit of a number written down in front of you? Maybe the "least significant/most significant" terminology is unfamiliar enough to you to present a stumbling block. Certainly the fact that you're communicating on a written forum means you're probably comfortable and experienced reading a string of digits and knowing how to understand it as representing a quantity, even if you've forgotten the formal terminology describing that process.


No, I wouldn't have to consult a reference. What's unfamiliar to me is how exactly byteshifting allows you to take the X most significant bits from a number, although now that I've talked a bit and thought a bit about this it starts to make more sense. The thing is, I've never spent much time thinking about this, so it's unfamiliar to me.


If you have a number base n, shift right/left k digits means, respectively, "divide/multiply by n^k".

That's the same whether it's decimal (n=10) or binary (n=2).
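
A quick sketch of that equivalence for n=2, in Go:

  package main

  import "fmt"

  func main() {
      x := 13
      fmt.Println(x<<2, x*4) // 52 52 -- shifting left 2 digits multiplies by 2^2
      fmt.Println(x>>1, x/2) // 6 6   -- shifting right 1 digit divides by 2^1
  }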


It's floor division with bit shifting though, right?


It is a clear sign that the person never had any genuine interest in computing, since that would have led to learning how a CPU works (it's the most important part of a computer!) and thus its instruction set, which would have of course led to learning about binary operators and shift operators.

Instead the person clearly was motivated only by some specific application of computing or by making money.


To me this reads like someone arguing that a champion race car driver must never have had a genuine interest in automobiles if they've never rebuilt an engine.


Clearly you do not have a genuine interest in computing because you do not have a complete theory of everything, unifying gravity and all the fundamental forces, so really you have no idea how computation even happens at a fundamental level.

Stop gatekeeping there’s going to be someone smarter than you.

People work at different levels.

The logic of how a huge program works can be abstracted above bit shifting. It is still computation.

Maybe try be more understanding.


Talk about gate keeping…

You don’t get to decree what genuine interest in computing looks like for all of humanity.


Everyone thinks their work/what they know is important, that's it. Why not study electrical engineering while we're at it? Even better, go even deeper and get us all some physics.

There's a reason Randall Munroe drew that butterflies xkcd comic.


> and that computers store data in binary digits

I know that computers store data in binary digits but it has absolutely no impact on how I do my job. I'm vaguely aware of how bitwise operators work but have never had to use them in my work, only in random hacky side projects.

> like being an architect who doesn't understand trigonometry or something

I'd equate it with an architect that doesn't know where the brick supplier they are working with sources their bricks from. It's absolutely essential to the project; nothing is going to get built without the bricks. But the architect can still plan the use of those bricks and the project can be entirely successful without them needing that knowledge.


I'd say it's a bit more like a truck driver who doesn't know how an engine or transmission works.

Sure, you can learn the habits just fine, and maybe go your entire career without knowing what goes on under the hood. But sooner or later you're going to get stuck on the side of the road waiting for someone else to come out and fix something trivial.


But then you can call AAA and get fixed in under an hour. I would guess the number of truck drivers that know enough about their engine or transmission to fix a trivial issue is <10%. For Uber drivers I would be flabbergasted if it was over 1%.

I actually think this is a pretty good analogy. A truck driver does not need to know how to fix their truck to do their job. A web developer does not need to know how to bitshift. For mechanics or low level programmers it is a different story.


But this is like an architect trying to lay bricks without knowing how


It really isn’t.


I mean, if you work as a Staff engineer in a department that only does web app development (say, working at Shopify on their rails codebase) I can easily see that you wouldn't come into any contact with bitwise operators at all. Rather, your time is probably better spent on scaling databases, refactoring codebases and other such higher-level things.


People can understand that without knowing about bit-shift operators in a given language.

I know about them because I had to. My first computer had 4K of RAM. When I hit the limits of Applesoft BASIC, which didn't take long, I had to learn 6502 assembler. Later, I've done things like writing protocol decoders, so I've also learned them in Python.

But I think it's perfectly reasonable to have a solid career and never use these. To forget that they ever existed, because one is solving problems where they're just not relevant anymore. Our field is too big to know everything, so I think it's fine that people focus on different aspects of the work.


Same here, I started on 6502 assembly and you had to know a lot of that stuff just because we didn’t have cool libraries like the STL and so on. You want to sort something, well you’d better write a sort routine.


The common use for bitwise operators I have seen is to represent multiple boolean states in a single variable. It's not required for most people at all. A more descriptive multi-value hash/struct with each state explicitly listed as a boolean will create more readable code. Unless there's a strong need for memory management or a binary protocol implementation, I don't see a need for it.


it's the attitude that's the problem. Rolling your own terrible solution without asking someone or just googling first.


It simply hasn't come up. I've written accounting and billing software that processes millions without using bit shifting, logistics solvers that run multithreaded depth-first searches on lazy graphs without bit shifting, ticketing systems that sold millions of tickets correctly without bit shifting, etc.

I'm sure there are plenty of use cases out there, I just haven't worked on them.


Would you have recognised the opportunity to use it, not knowing how it works?

Sometimes when we learn a new technique we suddenly see all the places it can be used that we didn't know about before. There's the old (probably apocryphal) story about the landscaper who was able to fire one of his staff when a customer told him he could calculate the third side of a right-angled triangle without measuring all 3 sides. What he did worked for him, he just didn't know there was a better way.

Anyway I do agree in this case you are probably correct. I think most programmers these days use bitwise logical operators for flags and rarely if ever bit shift outside of system code.


I almost had to learn it once for flags, but Go has iota and constants and Java has enums, and stuffing flags into an integer and storing that in Postgres or Redis would have made the problem arguably worse. Just used a Set.
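
(For context, the flags-packed-into-an-integer idiom I decided against looks roughly like this in Go; the flag names here are made up:)

  package main

  import "fmt"

  type Flag uint8

  const (
      FlagActive Flag = 1 << iota // 1
      FlagAdmin                   // 2
      FlagBanned                  // 4
  )

  func main() {
      flags := FlagActive | FlagBanned
      fmt.Println(flags&FlagAdmin != 0)  // false
      fmt.Println(flags&FlagBanned != 0) // true
  }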


It's not that they haven't come up in their job that's strange, but that they were never exposed to this level of base computing principle at any point in their career.


The right algorithm beats out low-level every time. Think of bitwise operators as a vector unit, but the scalar is a bit. That’s it.


I also have no idea how to properly do memory management; every language I've worked with has had garbage collection. I know the languages I write in use malloc or something to get more memory, but that's it. Never called it, don't know how it works. More flabberg-ammo for that gasty feeling.


I can understand how this situation happens, but honestly it really bothers me that it is the case. I see all this bloated slow software everywhere that pauses all the time and it is hard not to see "I've never done memory management" as part of the cause.


That might not be due to memory management - a lot of my work is in Go and Java, selling very high demand inventory to millions of people (opening night movie tickets across tens of thousands of screens and shows across the entire country) and the APIs are not slow and do not (noticeably) GC pause. But both Go and Java / JVM allow people to do high-value work without worrying about memory management (other than leaks of course). No allocations, free / release, no pointer arithmetic. Even the concept of pointers in Go isn't actual memory addresses but value/reference semantics. The only time I came close to having to manage memory was iOS apps, but then ObjectiveC added ARC and Swift came along, so nope.


It certainly feels like a GC pause, but it could be a lot of things. Regardless of the cause I feel like the fact that people don't think about how memory is managed is indicative of the real problem: people are programming computers without really understanding how they work. And while that is fine, even encouraged, for a lot of use cases I think anyone calling themselves a professional programmer should be held to a higher standard.


So how many levels or layers down or up should we expect someone to go? Say we take Go / Java as a starting point, L-1 would be C / C++, L-2 would be Assembly / SIMD / AVX processor instructions, L-3 would be chip design / architecture, L-4 would be diode chemistry and fabrication, L-5 is quantum mechanics?

And on the other side L+1 would be databases, L+2 would be system architecture / systems design, L+3 would be user experience and communication, L+4 organisation and team design, L+5 would be LISP?

How many levels would someone have to straddle to meet the higher standard? And does it matter which level their midpoint is?


I too despair when I remember the days of software instantly responding to keypresses. But I fear that no matter how much today's developers knew about computer architecture software would still be slow, because latency is like a gas: it expands to take up all available latency-toleration space in a program's audience. Compressing it back is expensive, ergo it's a race to the bottom where business leadership would not let those developers actually take the time to make software more responsive.


>>I see all this bloated slow software everywhere that pauses all the time and it is hard not to see "I've never done memory management" as part of the cause.

There's no way your ordinary programmer would have done memory management better than the GC options available with something like the JVM.

This is in fact why Java/Python/Perl exist in the first place.

Slow software is better than software that crashes often.


I started with C 30 years ago, wrote my PhD thesis in pain, tears and C, and then developed at a more or less amateur level until today (thankfully not in C anymore).

I read about bit-shifting operations but never, ever had the need to use them. So I did not push further and just know that they exist and that they look like <<.

I hope I did not make your flabbergastering worse.

Ah, I also wrote a (tiny) bit of the Linux kernel in ~1994 (what today would be modules).

And still survived without <<


I think programming has become so high-level that there's a lot of fundamental computer science that is largely ignored. I mean, I'm sure this person has no idea what an ISA, cache, CAM, or pipeline is either, but you don't need to know to write front-end JavaScript.

It does unnerve me as well, since it is so fundamental in my mind. But hey, maybe there is just so much to learn there's no time to teach computer architecture anymore?

If you want to feel humbled, read the book "Hacker's Delight". It is about 300 pages of mostly bitwise algorithms and operations. I thought I knew a thing or two from decades of coding, but this book goes so deep that my brain just shut off.

https://www.powells.com/book/hackers-delight-1st-edition-978...

EDIT: Upon further pondering, bitwise operations make certain assumptions about machine word size for that context. It could be dangerous for higher level languages to allow bitwise ops if the # of bits isn't known, as this context of instructions is generally tied to the hardware: the opposite of high level languages!


Nor do companies care that you know this when applying for a web developer position. IMO they should care, since it shows you're capable of learning about different areas within the software engineering field.


We definitely care if you’re capable of learning other fields - we check for breadth in static and dynamic languages, SQL and NoSQL knowledge, UX and systems design, performance optimisation, a sense of aesthetics, writing and communication skills and more. We’d rather people spend time learning these than binary arithmetic or regex operators or SIMD instructions.


Yeah, not everybody has to work with networks or cryptography, but I would've imagined that people at least read libraries that they use (to figure out what's going on under the hood) and inspect/debug low-level stuff from time to time.


A lot of software development is just "executing recipes" and there's nothing wrong with that: https://youtu.be/Nb2tebYAaOA?t=1363


How do you simultaneously know what bit shifting is but then not know that you can easily avoid it completely for a career?


I know that it's the manipulation of binary numbers, and I do know what binary numbers are. But I've never operated on binary numbers professionally. I did do some AND OR NOR NOT XOR gate diagrams in high school and college, but haven't looked at binary numbers or thought in binary numbers since then.


Ok, so it isn't as though you weren't exposed to the concept at least. The way you phrased what you said made it sound like boolean logic itself might be alien to you. Still, there's such a tiny conceptual distance from the gates to bitwise operators that I'm still a bit surprised you didn't acquire it by osmosis.


The Boolean logic part is applicable everywhere, so that's actually unrelated to binary arithmetic or bit-shifting. Stuff like De Morgan's theorem etc. I practice more in the context of simplifying if-else statements than binary operations. So there hasn't been enough significant overlap to spread into.


Been writing code for pay since about 2000.

I think I've bit-shifted in anger, like... twice.

I 1000% could not do anything useful with them without consulting a reference, as with most anything else I don't do on an ~weekly basis.

[EDIT] rather, that's all bitwise operators. I know I've used them once (in a "write it once, never touch it again" kind of context) and maybe one other time but recollection's hazy. Just a basic bit-packing thing IIRC, for the one I recall.


Just wait until Copilot is widely used. There will be a whole group of cargo cult programmers.


I mean no offense, but honestly I'm not a bit flabbergasted at all that anyone has your view. I keep seeing people acting as if the bubble they are in is a good representation of the world as a whole, and every indication that maybe they are wrong is met with a big surprise.

If you are truly flabbergasted by this you need to try to find a way out of your bubble so you can understand reality better.



10 years of one year's experience, probably.


Heh, no, I'm happy to say I've worked in a wide range of industries, countries, programming languages, datastores, team compositions, roles, front & backend, side projects and own products. I'm just saying none of them required shifting bits.


Bitwise shifting isn't that important, unless you happen to work with binary formats and protocols, which should be a distinct low-level layer in any app, and ideally relegated to a library that does it better.

Back in time it was known for being a fast way to multiply and divide by powers of 2. This hasn't been true for a long time now, and in the rare cases it is, most optimizing compilers will bitwise shift where in your source code you divide or multiply.


> Back in time it was known for being a fast way to multiply and divide by powers of 2. This hasn't been true for a long time now, and in the rare cases it is, most optimizing compilers will bitwise shift where in your source code you divide or multiply.

Shifting is still faster than multiplication--a shift is a 1-cycle operation, multiplication typically 3 cycles. Division is somewhere in the region of a dozen cycles or more. (Also, division is one of the assembly operations whose time is data-dependent). So it's faster to replace a multiplication with a single shift (but not multiple shifts), and to replace division with shifts or even multiplies--there's a fast division replacement that replaces a division by a small constant with a multiply-hi and shift.

Will compilers replace multiplies and divides with shifts? Kind of. If you multiply a value with a constant (after inlining/constant propagation/etc.) that's a power of two, or if you multiply it with an expression 1 << var, then that will be converted to a shift.

For division, it's even more complicated because shifts and division by power of 2 are not equivalent if the type is signed: -1 / 2 is 0 but -1 >> 1 is -1, so the transform is not legal unless the compiler can find out either a) the value is never negative or b) the division is exact.
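
A minimal illustration of that mismatch (in Go, whose / also truncates toward zero and whose >> on a signed int is an arithmetic shift):

  package main

  import "fmt"

  func main() {
      x := -1
      fmt.Println(x / 2)  // 0  -- division truncates toward zero
      fmt.Println(x >> 1) // -1 -- arithmetic shift rounds toward negative infinity
  }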


With a DSP core it often gets even better; the bit shift might be free. Say you have the instruction ACC = ACC + P << PM; that could compile to a single instruction even if you're not writing assembly directly.

Additionally if you're ever working directly with hardware, writing drivers, or firmware, bit-shift operations are often the best (and most common) way to get stuff done.


I wouldn't call it better because CPUs have a similar "sum and multiply" instruction, and GPUs are pretty much made of such units.

The problem with bitshifting is that it's just a special case of multiplication and division. So if you want your hardware utilized, you don't scale bitshifts, that'd be largely useless. You scale multiplication and division, and this is what modern hardware does.

Bitshifting is very simple in terms of hardware, but scale beats simplicity.



> unless you happen to work with binary formats and protocols

ie, embedded systems and microcontrollers - which is where binary formats and bitwise operations still rule out of necessity (limited processing power, precious cycle counts per operation, limited memory and storage, etc).

I highly doubt your average web or app developer will ever need to use bitwise operations. They're harder to read and understand quickly, and none of the limitations that make bitwise operations attractive exist on devices with gigabytes of ram and terabytes of storage.


>"They're harder to read and understand quickly"

Disagree. Never had a problem with bits. Also, it is one of the basics of general computing, so it would not hurt to know about the subject.


Doesn't hurt to know it, sure, but it's entirely not useful for a large majority of developers these days.

Something like:

    (some_var & 0x003ff000) >> 12;
Or even:

    _displayControl &= ~DISPLAY_ON;
Is not immediately clear what is going on (other than some masking and shifting is happening), and these are simple cases without a lot going on. To understand these operations, we have to figure out what's significant about these numbers, what the variables might be here, etc.

It's a lot easier to understand:

    display.disable();
Which is what most higher-level-language developers (web, apps, even backend, and more) will use and expect to find in a codebase.


What you're saying: "In practice good modern API design prefers explicit intent over low-level details like exposing bitshifting for flags"

What people are saying to you: "You say bitshifting is complex, so if I say bitshifting is simple, it means I'm smarter than you"


>"so if I say bitshifting is simple, it means I'm smarter than you"

To BulgarianIdiot: This is quite a conclusion.


>"Doesn't hurt to know it, sure, but it's entirely not useful for a large majority of developers these days"

I read about a lot of things that are not immediately useful for me. Still, somehow it falls into the nooks and crannies of my brain and helps me to solve various problems when/if the need arises. I do not even remember the exact things, just a general understanding of what is available and roughly how it functions. If I need details, Google does the rest.

>"It's a lot easier to understand:

    display.disable();"
No arguments about this one. Even though I do work with bits sometimes, I wrap those into something digestible like your example. Inlining helps avoid losing performance when it is needed.


I think it’s important though to know that these operators exist.


More than that, I think it is important to have a certain level of understanding about how computers actually work if you're going to be programming them as a career, and boolean logic and bitwise operators are a very basic part of that.


How computers work is a very stretchy term. And what you need to know depends what you're working on. For example if you're scripting you don't need to know a thing about how computers work, your concerns are mostly at the software level.

If you're writing a high-performance server or a game engine, you need to understand the topology of computers. Network performance and protocols, disk, memory and cache architecture. But you still don't need to understand every machine instruction on the CPU.

We're risking falling into the trap of "more is more", like a chemistry teacher who is adamant that you absolutely need to understand covalent, ionic and other bonds, because everything around you is chemistry, including you. I mean, how would you find a job as an accountant and start a family without understanding chemistry?


I take your point about the risk of falling into the trap of "more is more", but we're not talking about how to build NAND gates - we're talking about how integers are represented by bits and how that knowledge can be leveraged. For instance, if the author of the code showcased in the article had a bit more knowledge about how integers are stored internally, they would have realized they could have simply divided by 2^22 even if they didn't have access to a low-level bit-shift operator.


The solution in the article is a bit of a mystery. To come up with that particular solution you must already know what bits are, because the first step is formatting the number as bits. Everyone, even people clueless about computers, knows that computers are "ones and zeroes", i.e. binary.

Keep in mind "The Daily WTF" is widely suspected to be making up most of its articles.


Which OP obviously does.


It's not a big deal that you don't know about this stuff. You can get very far as, for example, a web programmer without ever having to explicitly shift a bit.

If you are curious, bit shifting is essentially multiplication or division by powers of 2. It is faster than doing the actual arithmetic.

Bitwise operators just perform logical bitwise and, or, xor operations on the bit representations of two numbers.

This is especially useful if you have mapped some data to binary 0-1 vectors and those operations are semantically meaningful for the representation.

Just as an example, you can calculate the Hamming distance between two 0-1 vectors by taking the XOR of their integer representations and then counting the 1 bits in the result, for example by repeatedly checking the lowest bit and shifting toward 0.
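
A rough sketch of that in Go (nothing optimized, just the idea):

  package main

  import "fmt"

  // hamming counts the positions at which a and b differ.
  func hamming(a, b uint) int {
      x := a ^ b // 1 bits mark the differing positions
      count := 0
      for x != 0 {
          count += int(x & 1) // accumulate the lowest bit
          x >>= 1             // shift toward zero
      }
      return count
  }

  func main() {
      fmt.Println(hamming(0b1011, 0b1101)) // 2
  }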


One of the few positive and helpful posts in this thread. The anger and fighting seems to be invited much more. Welcome to the internet


No shame in that.

You're going to see lots of people scoffing at a failure to understand such "foundational" aspects of programming but I would not agree with them.

In a "traditional" (let's say, up through the 90's? perhaps longer? maybe it's still that way today?) computer science education with lots of C and assembly, you do a lot of twiddling of individual bits. Either because you're trying to pack multiple values into a single integer (bits 0-3 are the player's score, bit 4 is the number of lives remaining, bits 5-8 are the time remaining, etc) or for certain kinds of low level math. So, folks who went through that sort of education will tend to view this as foundational knowledge.

There will always be a need for such bit-twiddling, of course. Folks writing low level code, binary network protocols, etc.

However, as you may already be thinking to yourself, most software engineering work today decidedly does not involve that sort of thing.

Essentially, IMHO you just need to know that this stuff exists - to know that most languages give you ways to twiddle bits in case you actually need to twiddle bits someday. Bitwise stuff is generally pretty simple; you can just get familiar with it if you ever actually need it.


I disagree. Would you fly with a pilot who never learned the basics of lift and drag? Or hire a builder who didn't understand loads and support? But we call people professionals who build software with no understanding of computing fundamentals?

Even getting an associates degree in networking required me to understand how binary operations work and do variable length subnetting by hand, even though subnet calculators are readily available.

Maybe I'm being off base here in considering it a computing fundamental, but it is difficult for me to see things like this and not associate it with my super-computer phone routinely pausing for several seconds while I am typing.


This comes up once in a while, both when I talk to people who've been in the field much longer than me and when I interview new folks. We had a couple of people try to add bit shifting and memory management into the interview process, and we're a shop that does APIs on Go & Java, apps on Ruby/Rails, frontend on React/HTML/CSS, mobile apps on Kotlin, Swift, and data on Postgres, Redis, S3, DynamoDB etc. Most people in this kind of shop simply haven't heard of binary operations or malloc/free and none of our codebases have them. Lots of people had a hard time letting go of the notion that these skills were "foundational" or coming to grips with candidates not knowing this but still being effective developers.


Again, while I can see where you're coming from, the fact is I really do find it difficult to believe that they can be all that great while looking around at all the painfully slow software I have to use that was deployed by billion dollar companies and worked on by an army of developers who, presumably, were hired with that very mindset.


I really disagree with this. I've spent time optimizing web apps (both client and server side), and I've never used bit-twiddling to do it. Your application will be very fast and have none of that.

The performance problems you run into on the server side are generally related to bad data access patterns with an external database or caching layer. They are solved by writing better queries, or organizing the data more efficiently. As an end user of those systems, you do not interact with the raw bytes yourself. Of course if you're actually writing a database engine or a cache, then you would.

On the front-end, it's a bit more complex. Often what makes apps slow is actually the way their assets are loaded. For example, using web fonts can incur a major performance hit if you're not very careful. Many shops have moved away from using web fonts for this reason. Similarly, loading javascript at the wrong time can cause things to be slow, because the browser has to wait for the JS to be loaded. Beyond that, slowness is often about inefficiently rendering data, or again loading more data than you need. To make a fast app, you mostly need to be aware of how browsers load and render JS, CSS and HTML. Bit twiddling is not really relevant.

But really what this comes down to is that no one wants to pay for performance. An app that's kind of slow still pays the bills, and engineers are not in the driver's seat to prioritize performance. Instead, the focus is on finding ways to get people to pay you more money. It's much easier to sell features than performance, so that's where the focus is.


    looking around at all the painfully slow software I have to use
Unsound fundamentals (when to use the right data structures, etc) surely do contribute to software bloat but I don't think it's anywhere near the largest contributor to that problem, not by a long shot.

The overwhelming issue IMO is a lack of time allocated for developers to actually focus on perf. I've been doing this for years and I almost always need to fight+claw for that time, or just do it guerilla style when I'm really supposed to be doing something else.

It certainly is true that a lot of performance issues could be avoided if developers made better choices in the first place, choosing the correct data structures and so on, and stronger fundamentals (though not bit twiddling in particular) would definitely help with that. But in most projects I've worked on, requirements change so rapidly it's tough to bake perf in from the start even if you're willing and capable.


Is there any evidence that engineers who do know all of the fundamentals you expect them to actually produce faster software? I would bet that a lot of the sluggish apps you have in mind were written by people with great fundamental comp sci knowledge.

Google probably hires for fundamentals as much as any company in the world and yet some of their main apps (ie. gmail) have abysmal performance. There are just so many factors that can result in poor performing software it's weird to me that you are so focused on something that a great many engineers have never needed to use in practice.


> Most people in this kind of shop simply haven't heard of binary operations or malloc/free

And this is how the industry reinvents the wheel every 10 years.


Or how it moves forward? I also don’t know about vacuum tubes, tape drives, GOTO, soldering, capacitors, IC schematics or assembly. So Steve Woz is a role model, but today’s Woz will write Swift or Go, or god-forbid, JavaScript.


You are making a strawman.

If people forget lessons from the past and redo the same mistakes over and over this is the opposite of technology moving forward. It hampers progress. This is why it's called "reinventing the wheel".

Yet, if you want an extreme example here's RISC-V *currently* adding support for vector instructions.

Introduced on DEC Vax in 1970 and forgotten for a good while.

https://riscv.org/wp-content/uploads/2015/06/riscv-vector-wo...


> vector instructions ... forgotten for a good while.

Not quite "forgotten" all of that time, though. Didn't x86 add them around the turn of the century? And probably other CISC architectures too, now and then over the decades. It's not like they went away totally, and only popped up again right now.


Apparently you do "know about" all those things, at least on some level -- otherwise, how would you know to mention them?


How do these developers fix implementation bugs? If Go, Ruby, etc. segfaults continuously with some code, does all development just stop? Understanding these things doesn't make you an ineffective developer; it just allows you to go deeper. Also, learning low-level programming gives you the knowledge to write performant solutions in high-level languages.


No, you are right, it’s fundamental. It’s like not knowing what memory is, let alone how it’s managed. This isn’t an attempt at gatekeeping and I do understand that it’s possible to write useful software without understanding the fundamentals of von Neumann machines. But how is it possible to not wonder about how these magical boxes actually represent the software we write in memory? It’s not like you need a comprehensive understanding of solid state physics here.


I did plenty of the usual undergrad bitwise stuff in school and recently dove back into K&R C to refresh myself, so in a lot of ways I agree.

    But how is it possible to not wonder about how these 
    magical boxes actually represent the software we 
    write in memory? 
I mean, we all draw the line somewhere. Somewhere, there's a chip designer wondering how you have spent so long in this industry without bothering to get familiar with chip lithography techniques and such.


I get where you're coming from with the metaphor, but I see these types of comparisons a lot when talking about computing "fundamentals" and they've never really felt right to me.

Even though pilots may know the basics of lift and drag, they have abstractions over those things (tools, processes) to manage them. That really isn't any different than saying "I get what bit manipulation is and why it's there, but have never needed it".

Also - you _can_ learn fundamentals on the fly in software dev. Sure, not everyone has the drive to, but you can't reasonably google "how does lift/drag work" as a pilot who is flying the plane :)


    I disagree. Would you fly with a pilot who 
    never learned the basics of lift and drag?
Lift and drag are applicable to every flight. Bit twiddling is not relevant to every software development effort.

    But we call people professionals who build 
    software with no understanding of computing 
    fundamentals?
Broadly speaking, I'd actually agree with you here! Solid fundamentals are good and vital and a lot of developers don't have them.

I just don't think bit-twiddling is one of the particular software development fundamentals I'd deem necessary.


> Would you fly with a pilot who never learned the basics of lift and drag? Or hire a builder who didn't understand loads and support? But we call people professionals who build software with no understanding of computing fundamentals?

Try going to a construction site and asking the people working there about the properties of the materials they use; you're in for a surprise.


How many of those people are going to describe themselves as engineers though?


Probably no one, because that's not the culture; in software everyone is an engineer because that's what we call people. In France we sometimes call cleaners "technicien de surface", which roughly translates to "surface technician". That's obviously not the kind of technician we usually think about, but in that context it's clear.


In Sweden, we went from "städare" (cleaner) to "lokalvårdare" ("premises caretaker"), and thence -- sarcastically! -- to "sanitetstekniker" ("sanitation technician")... in the 1980s, IIRC.


Ah I see why you're getting so much disagreement in this thread. Software job titles have no precise definitions in my experience. It would never occur to me to be astonished that a Software Engineer doesn't know something, while being unfazed by a Software Developer with the same knowledge and experience. ICs most often don't even pick their own titles; the company comes up with whatever they feel like.


"I don't need bitshift" is not enough.

You might never have to do a bitshift when writing analytics software in Python, sure, but wanting to understand how a computer works is necessary to be a good developer.

Curiosity is the point here. And it pays off in the long term.

If you don't look and explore outside of your comfort zone you'll end up missing something and doing poor design or reinventing the wheel quite a lot.


I did plenty of the usual bitwise stuff in C in my undergrad years and recently dove back into K&R C to see if I could re-learn it. No lack of curiosity here.

In general I agree with you; I just don't think demanding that my developers know bitwise stuff is the particular hill upon which I'd like to die. There are other fundamentals I'd value far more highly and would consider to be hard requirements such as understanding data structures and some decent idea of big-O, and a few other things as well.


> hard requirements such as understanding data structures and some decent idea of big-O,

That's the problem with current software. Web devs and others who have gone through some sort of a CS curriculum know only high-level stuff and big-O, but have no idea how a computer works. Big-O (mostly) only kicks in when large numbers of items are involved; meanwhile what matters is how slow each pass of the iteration is (and even when big-O kicks in, it just multiplies the base unit, so if the base unit is slow it still counts; it will scale, as they like to say :-) ).

If you only know and trust what the high level gives you, you have no idea how slow each base unit (operation) is, or how much data movement and memory each may require. You can believe that copying a whole memory area (a data structure) is the same as copying a native integer, since it is the same operation in the high-level language, except one can be a simple register-to-register transfer, while the other means performing a loop with many memory accesses, both for reading and writing. You can think that exponentiation is the same as an addition, since both are provided as primitives by the high-level language, except one is a native single-cycle CPU operation, while the other means looping over a couple of instructions because the CPU does not have hardware exponentiation. You can think that floating-point calculations are as fast as integer operations, especially when some of the favourite languages of that demographic found it clever to provide only a single type of number and implement it as FP, while FP is still slower (except for division).

That's a bit the theme of the original post. Instead of doing an actually simple operation, the author does a lot of things without having any idea how much pressure they put on the hardware or how many resources they use: allocations, several loops. After all, it looks like a chain of simple operations in the high-level language; the author doesn't know which parts translate to basic operations, which translate into a complex, loop-intensive and/or resource-heavy stream of instructions, and which are just there to please the compiler and are not translated into anything.

The cure for some of those things is, however, quick and simple: looking, once in one's life, at the instruction-set chapter of a CPU's PDF manual should suffice to somewhat ground the reader in reality. It doesn't bite.


I am actually very curious, but I curate what to learn in the other direction - databases, multi core programming, systems architecture, organisation architecture, consumer and business psychology, sales and marketing, game theory, communication, understanding business processes, that sort of thing.


...and that's why a good interview process requires many interviews on many topics.

If you can prove that you have in-depth knowledge on a lot of topics a single datapoint about knowing bitshift will not be an issue.


Exactly. The other thing I get dirty looks for from some people is when I tell them I've never done memory management. Every language I've worked in professionally has garbage collection, and I've also never done pointer arithmetic.


I never have used a bitshift operator in production code but I think it’s important to at least know that these fundamental operators exist.

Also, whenever I see code that takes a number, converts it to string, does some manipulation and then converts back to number , I always assume that this was a quick hack that can be done better.


> I think it’s important to at least know that these fundamental operators exist.

And some inkling of what they're good for. I have that too, although I'd have to look up the exact syntax of the operators (i.e. even what symbols they're actually written as) in pretty much every language I know that has them. Heh, look up if it has them to begin with...

> Also, whenever I see code that takes a number, converts it to string, does some manipulation and then converts back to number, I always assume that this was a quick hack

It's usually the other way around, IME: You get what should be a number as a string, have to clean it up (thousands separators, different decimal "point" (comma, dammit!) symbols, etc), convert it to a number to do some calculation, then apply the formatting back onto the result.

(Ironically, the best Spätzle I have ever eaten were neither in Bayern nor in my own Ländle, but at the Dampfbäckerei in Berlin.)


Ben Eater’s YouTube videos are a great crash course on how computers work, and will make this kind of stuff (and much more) way more obvious if you watch.


Everything in a computer is binary. An 8 bit char has 8 bits. Bitwise operators operate on those bits. So a char with value 253 is represented in memory as 11111101. Bitwise operator >> would shift all those 1s to the right, resulting in 01111110. Bitwise & runs the AND operation between two values at a bit-level. Same for |, XOR, etc.
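
A quick way to see it in Go:

  package main

  import "fmt"

  func main() {
      var c uint8 = 253
      fmt.Printf("%08b\n", c)    // 11111101
      fmt.Printf("%08b\n", c>>1) // 01111110 -- shifted right by one
      fmt.Printf("%08b\n", c&15) // 00001101 -- AND with 00001111 keeps the low 4 bits
  }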

Hope that helps.

Here come the well actuallys.


That's clear enough, but I've never been able to wrap my head around why anyone would want to muck about with that. If I was working on the OS kernel or drivers or network implementations maybe? But even then if one wanted to do math with numbers wouldn't they do it with int and let the compiler optimise?


The usual reason is packing/unpacking bitfields.

To extract the hours/minutes/seconds from a DOS timestamp

  hour = (timestamp >> 11) & 0x1f
  min = (timestamp >> 5) & 0x3f
  sec = (timestamp & 0x1f) * 2
To check if a byte is a UTF-8 continuation byte

  (x >> 6) == 0b10
To pack 5-bit r, g, b values into a 16-bit integer

  (uint16)r | ((uint16)g << 5) | ((uint16)b << 10)
To swap the bytes of a uint32 (little endian <-> big endian conversion)

  (x >> 24) | ((x >> 8) & 0xff00) | ((x << 8) & 0xff0000) | (x << 24)


Depends on what you are doing. I personally have not had to do it in a few years. But typically for me it is if I am trying to move data between architectures (big vs little) and the native libs do not have the proper byte swapping things. Some of the older archs had some seriously odd ways of filling in bits to make a word. In some rare cases to save some bytes somewhere. Basically stuff them into some other field. For example I have two 4-bit values and stuff them into one 8-bit number. That takes up half the space. Also another use not seen much anymore was pixel bashing on a monitor. You had to know which bit a dataplane sat in so it would render correctly. These days it is usually 4 bytes, and even then only 3 are usually used.

These days you do not see bit-packed fields as much, but if you want to pop them out, sometimes shifting and masking is more handy. Other times it gets in the way. Just depends on what you are doing. Also, as storage and memory have increased, the need for bitpacking things has decreased radically. But if you are doing something super critical it may be worth it just so your data fits into a cacheline and the extra instructions are worth not going out to main memory. The difference between 2MB total space to do something and 32GB is pretty immense, so packing was borderline a necessity a few years ago. Now not so much.

Win32 OLE/COM error codes were a spot where masking and shifting was needed if you wanted to decode what was going wrong. If I remember correctly they did it that way so they could return the error code in 1 CPU register. But at a cost of code complexity.

If you do any automotive CAN bus work it is a veritable rabbit hole of playing with bits.


> But typically for me it is if I am trying to move data between architectures (big vs little) and the native libs do not have the proper byte swaping things.

Hm... Do any processors have native instructions for that? Like, swap "AA BB CC DD" into "DD CC BB AA", or perhaps "CC DD AA BB"? Maybe moving stuff from big- to little-endian or vice versa is too niche for that to be worthwhile to implement in silicon (or microcode).

Anyway, speaking of endianness, ObPetPeeve: US date formatting -- fricking middle-endian! (And now, looking for that pithy quote about middle-endianness I can't find, I note that according to https://en.wikipedia.org/wiki/Endianness some machines actually were middle-endian. Sheesh...)


It's super common for embedded devices and kernel development. Yeah, compilers can mostly figure it out, but if you're really looking at the bits of a value, like in the example, you should operate on them bitwise; it's clearer, and it definitely leaves no room for the compiler to leave it in an inefficient state.


Yeah, I'm a little embarrassed to say I'd have used an array to do what's in the example. I wouldn't have done all the unnecessary string conversions, but if I have a sequence of 64 things and I want the first 22 things, that's a slice operation in my head, not a bit-shift, even if the things are bits.


A great use of bitwise operators I came up with recently was to define a unique key with two sides (source and destination) in a single int. Just allocate one bit of the int to defining which side is represented and you have partitioned the space; everything after that is an exercise in masking and bit-flipping to determine matches.

Compare with the "sidecar a bool" approach that tends to come to mind to solve this task without bitwise operations; now you have to define a whole structure for it. It works, but it ain't elegant.
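
Something like this minimal sketch, assuming the flag lives in the top bit of a u64 (the exact layout isn't given above, so the names and bit choice are mine):

    const SIDE_BIT: u64 = 1 << 63; // 0 = source, 1 = destination

    fn source_key(id: u64) -> u64 {
        id & !SIDE_BIT
    }

    fn dest_key(id: u64) -> u64 {
        id | SIDE_BIT
    }

    fn same_underlying_key(a: u64, b: u64) -> bool {
        // compare with the side bit masked off
        (a & !SIDE_BIT) == (b & !SIDE_BIT)
    }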


Fixed point math, cryptography, setting and testing flags in C code (or C-like API) are just some of the most common scenarios I use bit shifting for.

I've never written any kernel or driver code though.


That's like a carpenter who has never learned how to use a circular saw. Sure, you can work your way around it with all the other tools you have, but why would you? Just put 1 hour into studying it, and have a new tool under your belt.


> never learned how to use a circular saw.

More like a hand plane, or perhaps a chainsaw. Sure, they're tools that work on wood, but you can do lots of woodwork without ever needing one. Once upon a time I diddled Gray codes and similar, and the elegance of bit banging is lovely and foundational... but someone who makes ships in bottles need never know how to cut down a tree. To mash a metaphor.


Right, but we could also say this about thousands of other things in the world of software engineering. There is not enough time to learn everything, or even 1% of it.

It's often best to have "wide, shallow" knowledge that is selectively deep in some places. I don't have much experience with time-series databases, but I know what they are and I'm familiar with their use cases. So if the need ever arises I'll be able to reach for the correct tool and invest my learning time wisely.


Then again, should take about 5 minutes tops to get bitshifting enough that you know where to look if you ever need it.


Personally I'd disagree with the only 5 minutes. I'm comfortable with bit shifting, and I've done some ATmega projects without Arduino (just plain C), some of which were bitbanging.

It took me several hours over the course of a couple days for the light to dawn. For some reason it just didn't want to click.

And it still took practice to get comfortable with it.


My point was that you don't need to be comfortable with it unless you need it. Learn enough to know what it does, so you know when you see a use-case and can come back and do the hard work.


Yeah, but I've filed it along with "learn every regex operator". Might be interesting, but not sure where it would be useful these days.


Off topic, but I used to have "learn every regex operator" in that category but, after I actually did, I've been amazed at how often regex operators come in handy.


I didn't know much about it either until I started bit banging old gaming console controller protocols with an Arduino and other microcontrollers. Here it becomes necessary, since you don't have a whole lot of processing power available on the lower-level Arduinos, and I used a hell of a lot of ATmega assembly to help speed things along.

Since learning it there, I have only used XOR in a JavaScript frontend I wrote, because I can. Otherwise I would have just used an if statement to check for 1 and then change it to 0 and vice versa.


I learned how bit-shifting and bitwise operators worked while obtaining my BSCS degree. I even used them in a programming task in my Numerical Methods class, and my programs ran much, much faster than everyone else's in that class. Like, orders of magnitude faster.

But I’ve never used that knowledge outside of my degree, or in any of the programs I’ve worked on since 1989.

So, interesting knowledge to have, but I can’t say it was actually useful in any professional setting.


Along the lines of AnIdiotOnTheNet's points, I'm surprised simple curiosity never led to browsing the bitwise operators - they're core features of every programming language (the languages themselves, not the standard libraries).

Bitwise Or is used all the time for flags in higher level languages (e.g. regex flags in python). And dealing with any kind of efficient binary file format or network protocol usually requires their use.

Guess these are things you could never encounter at a job, but it does worry me - modern software performance on our supercomputer phones and desktops is often atrocious, and needlessly so. Using compact binary representations has a lot of benefits - lets you do a lot more work in a single cache line (fetching from RAM is 100x slower), saves storage space (when you're dealing with a million or a billion of something) and is often easier to understand and maintain, surprisingly, than an over-abstracted class hierarchy that does everything it can to bury and hide the data your program works with.


Wow, you really kicked the hornet's nest with this one! My hot take is that people not knowing this is a good thing and is a sign of progress. Our (range of) languages are just - better - now. We still have languages for situations where people need to squeeze the performance out in resource constrained environments. We also don't have people wrangling raw pointers and bits for web applications because we have performant abstractions over all that generally extraneous detail. And that enable productivity gains.

For example C# has the [Flags] enum attribute which allows an enum variable to have multiple values. These enums are meant to have values which are powers of 2 like:

    [Flags]
    enum Options {
        IgnoreCase = 1, // 1 << 0
        Multiline = 2,  // 1 << 1
        Global = 4,     // 1 << 2
        Foo = 8,        // 1 << 3
        Bar = 16,       // 1 << 4
    }
And you can just use these things without having to really think about what they're doing, e.g.:

    var myOpts = Options.Multiline | Options.Foo;
    var isGlobal = myOpts.HasFlag(Options.Global);
This is all bitwise stuff under the hood. For example:

    var myOpts = Options.Global | Options.IgnoreCase; // 5.
    var isGlobal = ((int)myOpts & 4) > 0;

I've ended up doing a lot of it at a low/raw level in C# when working with binary formats but most people just don't need to know it and that's great! Not working with that level of detail is good because it sure is easy to get it wrong. And it enables us to go faster and build quicker.

Sure HasFlag is less performant than actually doing the bitwise operations in most cases but it's an irrelevant thing to try to optimise versus caching, the database, CDN, authentication, permissions checks, etc. People getting worked up about the whole thing smacks of premature optimisation.


I think you could learn how bit-shifting and bitwise operators work in an afternoon or two. The concepts aren't very difficult, but a lot of people don't know them, because they haven't taken a Computer Architecture class or they haven't had to use them in their domain.


It’s worth learning about. Fundamentally, computer processors don’t do anything more than add, subtract and bit shift, with comparison, GOTO, and moving numbers to and from memory to make all the magic happen.

The operators exist because they map directly to machine code.


I know how they work, but I’m really bad at arithmetic in general, so anything more than just basic operations (I do consider the above article pretty basic) and I’ve struggled to build stuff that required a lot of binary operations on values.


Your code is nowhere near as egregious as the OP's. Actually, I would even suspect your code is made better by not using hacks, because you deal with bases (radix) that are not necessarily 2. And using strings is a necessity since your output is a string.

The original article, instead, is a straightforward bit extraction, from integer to integer. For people who do know bitwise operators, the simple shift reduces mental load (at least according to OP, and I strongly agree).


You don't need bit shifting. You can just divide or multiply by powers of 2. (And most of the time the compiler will emit the corresponding bit shifting code.)


Wrong. (-1/2) is 0, (-1>>1) is -1 in C/C++/Java/C#...


Wrong. -1 >> 1 produces an implementation-defined value in C.

-1/2 used to also be implementation-defined; C99 started to require truncation toward zero.

That actually makes division unsuitable for simulating bit shifting; you generally want a right shift of a negative quantity to have predictable behavior: either extend the sign, or bring in a zero. Truncating division toward zero doesn't do either; it propagates the sign, until zero is hit.

If you can, avoid signed types if you're working with bit fields.
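
The difference is easy to see in Rust, where >> on a signed integer is defined as an arithmetic (sign-extending) shift, while / truncates toward zero:

    fn main() {
        assert_eq!(-1i32 / 2, 0);
        assert_eq!(-1i32 >> 1, -1); // sign bit is copied in
        assert_eq!(-7i32 / 2, -3);  // truncates toward zero
        assert_eq!(-7i32 >> 1, -4); // floor(-7 / 2)
    }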


Yeah, the signed case needs some care (and in fact has special shift operators in instruction sets).


> At the risk of sounding stupid, I have no idea how bit-shifting and bitwise operators work, though I've worked in the field for over a decade and am now a Staff Engineer or whatever. There, I said it.

That sounds stupid.

There, I said it.


Maybe you'll end up needing it someday and then we'll have The Big Little Guide to Bitwise Operators.

Will also take the opportunity to thank you for the message queues one, very good read!


they tend to come up when you're at the edge of your resource limitations, e.g. microcontrollers with kilobytes of ram, so you're stuffing data into custom bitfields, or quickly parsing certain bits out of a bitstream and taking atomic actions while ignoring the rest of the sample as fast as possible


Everyone goes through this phase where we laugh at non-optimal solution to a problem, and proudly cut a line or ten, or hundred from someone's solution.

But after 20 years in the industry, I find this trivial and expected even in code by people who are quite smart. Sometimes a solution evolves in a way where the initial variant wasn't subject to simplification, while the current version is. Sometimes there's an edge case that's non-obvious and solved by a complicated solution (comments help here). Sometimes the coder had their mind elsewhere so they approached the problem in a seemingly odd manner.

But none of this matters in the big picture. What matters is that all this code is wrapped in a function that goes like "getFirst42Bits(int)" and it returns that. And so the only thing that matters is whether this function has the right input and output for the bigger problem to solve. The implementation is a detail.

Of course, I'm excluding vulnerabilities and vast performance problems in hotspots; those matter.

But a specific instance of inefficient solution doesn't matter. A pattern of inefficient solutions may be ground for being demoted or laid off. But any instance is irrelevant. WE ALL DO THIS from time to time, but we don't realize it.

Heck, the primary reason why I hate most "web frameworks" is that they do in 2000 lines of boilerplate what I can do with my own APIs in 100.


  > Sometimes a solution evolves in a way
Yeah, forgot all about that detail -- reminded me of one the other day. I had finished up a really painful feature add, involving altering code in like 7 different classes due to a data model change, only to find out that four of the seven classes were unused, and the remaining three provided small bits of unrelated data needed for previous versions.

This feature had undergone six changes of the back-end that was responsible for providing the service it needed. Each time a new pipeline was added, the old pipeline was removed from the container but the code was left behind and I'd just added a 7th.


> But none of this matters in the big picture. What matters is all this code is wrapper in a function that goes like "getFirst42Bits(int)"

I disagree. Imo, `getFirst42Bits(snowflake)` is worse than `snowflake >> 22` — not only is it longer, but it also hides away the implementation in a way that could obscure bugs/require `jumpToDefinition` debugging.

Abstraction is often useful, but this seems like a textbook case of over-abstraction.


I disagree with your disagreement. I don't know what 22 is and how that is related to the initial requirements: get the first 42 bits. (Non-)ironically the requirement matches the suggested function name.

What I see is just a rare operator and a magic number.


64-42 = 22

This is quite obvious if you have spent some time using bitshift operators, just like inheritance and traits and all that are quite obvious if you have spent some time with OOP. If you have never programmed in C or C++ or the likes then you probably don't have a strong intuition about 8,16,32,64 and the bit operators. You probably don't know how to express a couple of common numbers in hexadecimal either.

Different programming. Not better or worse.

For the record, I would have had no fucking idea what that code snippet in the article was supposed to be doing had I not known the objective or function name beforehand.


> 64-42 = 22

I think they knew that. Their point was probably that it's better for code to directly express the requirements rather than meet the requirements through the expression of other data that isn't immediately obvious to be related at a glance. In this case, it isn't obvious that the 22 is related to the 42 in the requirements until you notice that the int size is 64. `snowflake >> ((sizeof snowflake) * 8 - 42)` is better. Even better would be putting the 42 in a descriptive constant.
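
A hypothetical Rust version of that suggestion (the constant name is made up; nothing here says what the 42 bits actually mean):

    const HIGH_FIELD_BITS: u32 = 42;

    fn first_42_bits(snowflake: u64) -> u64 {
        snowflake >> (u64::BITS - HIGH_FIELD_BITS)
    }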


Not to brag but to me it was obvious. You just kind of get an intuition for these things. It's okay to use that intuition to program more expressively.


You realize I used this placeholder name just as an example, since I have no idea what function this code was a part of.

That said, a one-liner function that's semantically specific to an app (where 42 bits is important for some reason) is absolutely harmless. The compiler will inline it, while for you it represents an opportunity to refactor this in one place. Because the assumption is you call a function at least twice, or it wouldn't be a function in the first place (some exceptions but mostly it's the rule).


The compiler may inline it, but my brain can’t.


> And sure, `snowflake >> 22` would be shorter, clearer, and far more efficient, but it wouldn't let you show off all the cool language features you have at your disposal, and isn't that what's really important?

I don't think the over-complicated code was motivated by a desire to show off — it was written by someone who doesn't understand how bit shifts work, which is still potentially troubling but very different


Or that simply has never had reason to use them so didn’t think to look for them.


I never found C-style operators for bit twiddling to be intuitive (compared to assembly), even if you use the left angle bracket to shift left.


While this example is pretty simple and I agree that a bit shift is a fine solution, I've recently been doing work on some bit-oriented HDLC-like protocols, and I'm convinced Erlang is the only language that can manipulate binary data very well at the bit level.


Agreed. I was writing a port of the redis protocol to erlang for a personal project that's a server using said protocol as an interface for distributed MPSC "locking", and it was incredibly simple to implement because of erlang's binary strings and pattern matching. Same with base64/hex/etc. manipulations of strings into binary data. I've read a fair bit about how erlang is great for protocols, but hadn't experienced it myself (professionally or personally) until I decided to implement this project.


elixir too, since the underlying technology is the same (arguably it's even easier to read in elixir than erlang). This was super helpful when I was writing a dhcp server.


Out of curiosity, what makes the Erlang syntax so good?


It's super easy to construct binaries and bitstrings, even when you need to work with awkward sizes or set bits at weird positions, and IMO it looks a lot cleaner than a lot of shifting and boolean operators. For example, constructing an 8-bit LAPB Information transfer control field might look like:

    build_info_control(SendSeq, RecvSeq, Poll) ->
        <<0:1, SendSeq/bitstring, Poll/bitstring, RecvSeq/bitstring>>.
Where the `SendSeq` and `RecvSeq` are 3-bit sequence numbers... say one of them is `<<2:3>>` (0b010) and Poll is a single bit (you can see the format here: https://imgur.com/a/9bVBM0W). This field would then be stuck into a binary representing the entire frame.

This could probably be made a little simpler, though, as I'm still rather new to Erlang.

Note you also do get bit shift operators for cases when they’re an appropriate solution (`bsl` and `bsr`)


(As someone strongly against adding new features to languages,) I've never understood where there would be any need for anything more complicated than list concatenation when assembling bits. In some language with list notation, (++ being concatenation of two lists) you could do something like this:

[0] ++ sendSeq ++ recvSeq ++ [poll]

say sendSeq was one bit but we still wanted it to take 3 bits:

[0] ++ pad 3 sendSeq ++ poll ++ recvSeq

given some "pad" function defined in a library

The only reason it's "hard" in C is because the operations are induced from what the machine can do efficiently and without "a sufficiently smart compiler".


C has the concept of bit fields, and this could be specified as:

    struct build_info_control
    {
      unsigned char         : 1;
      unsigned char SendSeq : 3;
      unsigned char Poll    : 1;
      unsigned char RecvSeq : 3;
    };

    struct build_info_control control;

    control.SendSeq = sendSeq;
    control.Poll    = true;
    control.RecvSeq = recvSeq;
The downside of bit fields is that they are implementation specific---you'll have to check the compiler documentation to see if the bits are specified "most significant bit to least significant bit" or "least significant bit to most significant bit" (if that is, indeed, important to the problem).


https://erlang.org/doc/programming_examples/bit_syntax.html

> say sendSeq was one bit but we still wanted it to take 3 bits:

> [0] ++ pad 3 sendSeq ++ poll ++ recvSeq

In Erlang you could do this to get something padded (or truncated) to the correct size:

  << 2:3 >>  % corresponds to the 3 bits <<010>>
So you can do something like this to ensure the precise bit size of each field:

  << 0:1, SendSeq:3, Poll:1, RecvSeq:3>>


Yeah, my point was that it is trivial to define a pad function and call it, instead of needing special syntax like Erlang's. But now I see there is deconstruction too, which is arguably more useful to take the time to add syntax for.


The bit syntax in Erlang is for deconstruction and assignment (pattern matching), not just assembling.

The entire language is built around pattern matching to the point where it simply wouldn't exist otherwise, and I assume binary protocol handling was a core requirement when the language was designed since it was created for telecom switches.


I learned these operators in a college course I took while in High School[0]. I remember I had to make flash cards with examples and memorize throughout the semester. I kept mixing up & and | in comparison/setting.

It's still rare that I use anything outside of those two operators in work code, and when I do it tends to get commented as a premature optimization during code reviews.

[0] That's not a humble-brag; context provided b/c it was a time when, due to 7 hours a day spent taking tests/memorizing and also learning algebra through calculus, I was at my absolute peak for learning this sort of thing.


> I never found C style operators for bit twiddling to be intuitive (compared to assembly) even if you use the left angle brace to shift left.

What I don't get: Are those operators present in C++ too -- wouldn't they clash with the "streaming" thingamajigs, isn't that what "<<" means nowadays?


The existence of a single pair of operators, instead of one for signed and another for unsigned shifts (with good names), makes the operators impossible for me to remember.

Ok, the article just made me remember they are not signed. But it's certain that the next time I need them, I won't remember it again.


Or perhaps for a rarely called function, and which was quicker to write via an inefficient but familiar method rather than spend any time figuring out how to do it in an optimal and correct way.


My guess is this was written as a joke. It's rust, so the person writing this has to have some programming knowledge.


Looking at the most recent stackoverflow developer survey[1], most of the developers expressing interest in Rust would be moving from high level languages like Javascript and Python. It doesn't surprise me that developers with that background wouldn't be aware of systems programming concepts like bitshifts. I've seen similar code in Java (convert to boolean array, move bits down, convert back to integer) that was written in earnest.

[1] https://insights.stackoverflow.com/survey/2021#section-worke...


"some programming knowledge" ≠ "knowledge of bit operators"

Imo, Rust makes a very good second language for someone who first learned Javascript or Ruby — and someone who used those languages primarily for web dev may never have needed to learn about bit shifts


I find it hard to believe that someone simultaneously doesn't know the first thing about bit operations, but has come to the conclusion that they need to find the first 42 bits.

Also, JavaScript's handling of numbers is so weird that it's one of those things that comes up just by virtue of its weirdness. For that reason I'd expect a JS programmer to at least have come across bit operations.

(edit - I take the original post as a joke too, and that the humour is derived from the juxtaposition)


Both very good points -- I hadn't originally assumed it was a joke, mostly because I don't write `rust` code and I don't know how unlikely it would be to come to that solution.

I know I've seen examples at least as odd as this in the past in languages I'm familiar with -- where, say, the developer was really good with LINQ or something and made five calls instead of just accessing an array by index -- that were meant "intentionally", but were bizarre to someone not sitting in their brain at that moment.

I hadn't, however, thought about either of the points you mentioned.

You're quite the detective. :)


Most JS programmers are utterly unaware of how their "numbers" work.


The author could have been unaware that it's possible to program computers without knowledge of binary arithmetic and bit level operations.


  > which is still potentially troubling but very different 
I'm having a really difficult time remembering the last time I had to even use a bit operator at work[0]. For a lot of developers, the need to keep that operator in their head is going to rank way lower than, say, how to properly configure CORS in `nginx`.

You always have to judge a developer's gaps/knowledge given their context. We're dealing in technologies that have sub-sub-sub-sub-disciplines that a person may have expertise in which result in them missing huge portions of the languages' features. I've learned that this hyper-specialized developer can be as effective as anyone -- the ability to rapidly acquire, integrate and use new knowledge with whatever tooling is required is far more important to me than whether or not a developer has large gaps in the tools we use[1].

I'm nit-picking a little, but it's a habit of mine to turn around discussions at my office that go something like "How can any web developer not know (x)?"[2] Ask a developer on a conference call if they're familiar with some obscure technology, and you'll hear ranges of qualified "Yes" responses and a whole bunch of clacking while each Googles various bits of information. How much easier it is if everyone just says "No, I don't" and asks the person who brought it up to explain. It's really hard to get developers past "imposter syndrome" and open to admitting their gaps so they can be accounted for in planning. Even where I'm at, now, I see it from time to time, and this is at a shop with excellent culture and a focus on projects/products that are new inventions/creations.

And, unfortunately, it causes me to judge that individual unfairly, myself -- I've been writing code since I was 13 years old. I have written a lot of code. My goal is that the best code I wrote 6 months ago should at least slightly embarrass me, today.[3] And I guarantee, over the years, I've written code that is worthy of posting on a site like this. I can't imagine what kind of insanity my first "useful" rust program will end up looking like, but given the memory-safety rules imposed, I'm expecting to have to integrate a new mental model that will result in some comical implementations.

For me -- this work has a way of either humbling you or downright beating you up with how often you're reminded about what you don't know. I think it gets worse as you learn more, almost feeding the whole "ageism" in the industry -- middle-aged developers are far more honest about what they don't know and some are just beaten to death by the treadmill. I have watched, over the years, as developer after developer has left for "management" or left a shop like the one I'm at for a global multinational to work on the most boring internal enterprise-y waste -- consistent/easy, but pointless. For the few that I was close enough to that I got more than just "PR answers", the crux of it was burnout from feeling like you actually "know" less and less about what you're working on[4]

[0] Outside of `enum`s in my language of choice, which perform best when flags are checked/appended using operators. But even that is an unnecessary optimization nearly always, there's a ".HasFlag()" method that I simply forget about because it was taught to me in my teens, it's implemented consistently in a lot of languages and it's muscle memory at times.

[1] I've recommended the hire of individuals who, outside of "they have been writing C# code for a decade", they hadn't touched the part of the framework the job was for -- i.e. they were Desktop devs applying to Web, Web applying to mobile, and many candidates lack even basic Docker/shell knowledge, etc. All ended up picking up what they were missing on-the-job just like the rest of us.

[2] Because, unlike every other field, when "(x)" became "old", it had been on its 5th version, 4th major API change and was born about three and a half minutes ago.

[3] Sometimes I'm unfair -- the reason I'm embarrassed is because I didn't use a very obvious feature, but at the time I'm looking at the code I'm also forgetting that said feature didn't exist back then. But often it's just because I've continued to read/study/do.

[4] These were not developers who were suffering from mental decline with age and were among the better I've worked with. I really think it's a case of "learning about something new" simultaneously adding to "discovering that your skillset does not include this massive body of knowledge around your small usage of 'something new'". You learn way more about how much you don't know than you add to your total knowledge making you feel 'net dumber'.


I think it comes down to this: as a developer, part of your job is tuning your sensor for moments of "there has to be an easier way to do this, let me look it up".

I wouldn't criticize a developer for being unfamiliar with bit shifts because I've never had to use it for work either. The part I'm incredulous about is that there wasn't a step in the middle where the developer looked up bit manipulations.


There's a lot of discussion about "should most devs be expected to know this off the top of their head these days?" that I think misses the point entirely.

A much more relevant question is: when you are working on code that eventually needs a "give me the first 42 bits of a 64-bit value" function, do you: (a) do some basic research into binary operators since you're at least aware that binary operations are a thing, or (b) just whip up the first loop implementation that comes to mind?

If you're verifying your code anyway - which I sure hope you are when writing a delightful loop like the one in the linked article - it won't take you long to check out what the bitwise operators do.


* deleted... in hindsight my silly comment doesn't add to the discussion.


Isn't this what engineers that build tangible things do?


Consider a spectrum ranging from "new hire" to "understands Duff's Device[1] at a glance".

The code needs to default toward the former, with well-socialized exceptions.

[1] https://en.m.wikipedia.org/wiki/Duff's_device


What's really remarkable about this, as everyone argues whether it's good for a "software engineer" to have even the most basic understanding of how a computer works, is the magnitude of extra work being done here as the computer sees it versus what the coder sees.

The coder sees one version being one line with an operator they don't know and the other version being 10 lines of code that look a little dense but it's all stuff they know.

The computer sees one instruction versus thousands or maybe tens of thousands.

Now is it silly to check if a multiplication is by 2 and then use shifts if you're writing a high level language and it's a one off/non-critical operation? Yes it is. But if you're decoding/encoding binary data as a lot of people have conjectured then you're probably doing a lot of it and this really matters.


Slightly disturbed by the article's use of the term "first bits" for describing the most significant ones.

[Sure they are the leftmost ones when writing down the number, but all CPUs are LSB-first now... Plus, I'm a loyalist to the Emperor of Lilliput :-).]


Dunno, RFC 791 states in its Appendix B: "Whenever an octet represents a numeric quantity the left most bit in the diagram is the high order or most significant bit. That is, the bit labeled 0 is the most significant bit." which I read as "left bits", numbered from 0 (i. e. the "first bits") are those most significant.


Yes but that’s network byte order. It’s pretty natural to consider (var & 0b1) the “first bit”. (especially if coming from certain HW)

Really “first bit” can mean whatever you want so the phrase just shouldn’t be used


No, I am pointing out the _bit_ order inside a byte, not _byte_ order. (The mentioned Appendix describes both, I am quoting the intra-byte part.)


If you've only ever worked with weakly typed languages you might be tempted to think that 'everything is/should be/can be a character string.'

It's likely that the author doesn't know about bitwise operators. This is similar to not knowing about modulus division (like in the classic 'fizzbuzz' test). Surprisingly, many developers just don't know about these things, or if they do, they don't recall these features when designing a solution.


That's... actually not that terrible for enterprise code? Sure it's verbose and inefficient, but it does one thing, and it's trivial to fix without rewriting anything else.

An actual experienced enterprise developer would have created BitManipulator, BitManipulatorConfig, BitManipulatorFactory, ShiftBitManipulatorImpl, XorBitManipulatorImpl, and BitManipulatorRetryWrapper, and you will be required to specify which operations to use by adding a yaml config file, which includes thread pool options and has five different versions in the git repository and three more in k8s clusters.


We cannot both lament that software has become so incredibly bloated and high-latency/unresponsive and yet applaud like there's no tomorrow that it's possible to develop for decades without knowing what bit-shifting is nor how to do it...


I was hoping that the punchline was going to be that the Rust compiler optimizes this down to a bit shift.


In the past I would use “Five Essential Phone-Screen Questions”. The last question is on bit twiddling. A sample question might be “count how many bits in this byte are 1”.

I don’t know if it is fair to assume that all developers know this stuff. When interviewing engineers who claimed experience with anything embedded (MCU programming or field buses like CAN or Modbus), I don’t know how it would be possible to program anything at the lowest layers without bit shifting. But even in an embedded world you spend most of your time writing application logic.

Outside of the embedded world you can get away with representing things without worrying about bits. Particularly in higher-level languages it might be more natural to have a struct with several named bools or an array of bools.

One language that makes working with bitfields much more pleasant is C#. You can apply a Flags attribute to an enum type. At runtime it is represented by an int, so it’s fast. But it also has handy methods for converting to and from a string form that gives names to all the bits that are set. So it’s fairly friendly for logging and debugging.

https://sites.google.com/site/steveyegge2/five-essential-pho...


Here's an issue I have with that question.

> A sample question might be “count how many bits in this byte are 1”

Use the Popcnt assembly instruction.

> But what if the popcnt assembly instruction doesn't exist on your platform?

Popcnt exists on POWER9, ARM, x86, NVidia PTX (GPU Assembly), AMD RDNA, AMD CDNA, AMD GCN.

> But no, really this interview is about seeing if you're able to answer the question in a manner that's familiar to the interviewer... and not really how you'd solve the problem in practice.

Fine fine fine. I'll open up my "Art of Computer Programming Volume 4A, Bitwise Tricks" and copy/paste the algorithm specified therein. I don't keep the particulars of how to implement popcount in my brain anymore because its implemented in most CPUs / GPUs these days, AND I can look it up if you happen to be on an uncommon platform in pretty standard bitwise-tricks based programming books.

--------

But really, the interviewer wants to see you do the weird "(x & 0x5555) + ((x & 0xAAAA) >> 1)" (etc. etc.) routine.

But really, the practical answer (aka: use a single-cycle popcnt instruction available on all mainstream desktop-class CPUs and GPUs I can possibly imagine in today's world) is still the "wrong answer".

---------

I think a better interview question would be: Demonstrate mastery of horizontal SIMD-programming, __such as__ implementing the popcnt instruction out of bitshifts, adds, and "AND" instructions.

Or, demonstrate said parallel programming mastery over any other similar problem (be it lookahead carry addition, or kogge-stone bishop/rook moves in chess, or whatever).

That's what you really are trying to figure out when you ask the question. Anyone can solve "popcnt" or look it up with a quickie search (either on the internet, or reading through a book). But the overall methodology generalizes to many problems in the SIMD-programming space, so it's an important skill to have.

But also, it's important for the interviewer to demonstrate having this skill to the interviewee. Otherwise, you lose the interviewee. (Skilled individuals want their skills recognized. If you fail to recognize their skill during the interview, they'll seek someone else.)


I've never actually had an interviewer reject my solution for not being identical to the one they had in mind. Are you basing this on real experience, or assumption? If I asked you that as a "do they know about bits and bytes at all" question and you said "here's an assembly function that's available everywhere" I'd be like "great, they clearly are more familiar than I am about that space."

The closest I've run into has been the interviewer who's like "use any language you like" who then couldn't seem to grasp that `.each` in Ruby is basically the same as `for(x : things)` and kept bugging me about it until I said fuck it and just used Python instead.


Your answer is great; however, the question comes with a disclaimer:

> Please understand: what I'm looking for here is a total vacuum in one of these areas. It's OK if they struggle a little and then figure it out. It's OK if they need some minor hints or prompting. I don't mind if they're rusty or slow. What you're looking for is candidates who are utterly clueless, or horribly confused, about the area in question.

If you follow this, the way you reply is perfect. Someone using divisions would work too. Someone simply mapping all the values of a byte to the number of bits that are 1 would work too.


Man, I wish languages gave us easy access to these sorts of instructions. So many times I've written a loop to count how many times a number is divisible by 2, when there has to be an instruction for that... yes, notionally I could drop down into assembly, but that would require learning how to do so in the relevant language!

Out of curiosity, what is this weird way of computing popcount you mention? I haven't seen it before.


> Out of curiosity, what is this weird way of computing popcount you mention? I haven't seen it before.

So lets just do 8-bits at a time, in binary. It makes it clear what's going on.

Your 8-bit number is: x = x0 x1 x2 x3 x4 x5 x6 x7

* 0x55 is 01010101 in binary.

* 0xAA is 10101010 in binary.

* (x & 0x55) is (0 & x0) (1 & x1) (0 & x2) (1 & x3)...

* (x & 0xaa) is (1 & x0) (0 & x1) (1 & x2) ...

* (x & 0x55) + ((x & 0xaa) >> 1) is: |x0 + x1|,|x2 + x3|,|x4+x5|, |x6+x7|, where |blahblah| is a 2-bit number taking up 2-bits of the 8-bit number.

* step2 = (x & 0x55) + ((x & 0xaa) >> 1)

* (step2 & 0x33) is now |00| |x2+x3| |00| |x6+x7|

* (step2 & 0xcc) is now |x0+x1| |00| |x4+x5| |00|

* (step2 & 0x33) + ((step2 & 0xcc) >> 2) is //x0+x1+x2+x3// //x4+x5+x6+x7//, where //blah// is a 4-bit number.

You probably can figure out the last step at this point. Take the bottom 4 bits and add them with the top 4 bits.

A 2-bit number (such as |x0+x1|) can only take on 3 values: 0, 1, or 2. As such, |x0+x1| will never "carry" outside of the 2-bit space you've allocated for it.

Similarly, a 4-bit number (such as //x0+x1+x2+x3//) can only take on 5 values: 0, 1, 2, 3, 4. The 4 bits you've allocated can take on values all the way up to 15, so you also never have to worry about overflow.
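
Put together, the 8-bit version walked through above is just three lines (sketched here in Rust):

    fn popcount8(x: u8) -> u8 {
        let pairs = (x & 0x55) + ((x & 0xaa) >> 1);           // four 2-bit sums
        let nibbles = (pairs & 0x33) + ((pairs & 0xcc) >> 2); // two 4-bit sums
        (nibbles & 0x0f) + (nibbles >> 4)                     // final total
    }

    fn main() {
        assert_eq!(popcount8(0b1111_1101), 7);
    }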


Deliberately more obviously 'weird' version, from memory:

  uint64_t x = ...;
  for (size_t i = 1; i < 64; i *= 2) // loop unroll
    {
    uint64_t m = UINT64_MAX / ((UINT64_C(1) << i) + 1); // constant-fold
    // divide 2^(2k)-1 by 2^k+1 to get 2^k-1 in each group
    // 2k-bit group means 2^k-1 sets the bottom half

    x = (x & m) + (x >> i & m);
    // we could save an AND where no carry is possible, but the 1-bit and
    // 2-bit steps need 2 ANDs to prevent carry between groups:
    // x = (x + (x >> i)) & m;
    }


Rust has methods for these.

Most C/C++ compilers have some variation on __builtin_ctz, __builtin_popcount, or the like. C++20 adds portable function templates in the <bit> header (std::popcount, std::countr_zero, etc.) rather than builtins.

Java has Integer.bitCount, Integer.numberOfLeadingZeros, etc. since Java 5.

I haven't looked at Go, Swift, D, Zig, Nim, etc., but I suspect that all of them will have some variation on this theme.
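
For example, the Rust ones look like this (a quick sketch; these typically lower to the corresponding instructions where the target supports them):

    fn main() {
        let x: u64 = 0b1011_0000;
        assert_eq!(x.count_ones(), 3);
        assert_eq!(x.trailing_zeros(), 4); // "how many times divisible by 2"
        assert_eq!(x.leading_zeros(), 56);
    }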


Because ideally you want a language that can express the intent without hard coding for a specific arch


I agree with everything you say. Whether someone gets this question right or wrong does not really tell you anything about whether you should hire them.

In my defense, at the time I was asking this question I was fresh out of college and my company gave no training or guidance on how to interview.


Oh, don't worry. I'm not necessarily criticizing you in particular (I wasn't part of that interview :-) ). It's just kind of an in-the-abstract sorta thing.

If anything, I'm trying to agree with your original post, and just kinda ranting a bit myself on a similar subject.


Even LLVM will translate the popcount intrinsic into the code you want for you! Just use it freely.


Well, these days LL PLs tend to have the popcount intrinsic...


This is sort of a "if all you have is a hammer, everything looks like a nail" issue.

I have a very smart friend who doesn't code, but is a math wiz and makes a ridiculously high salary at a big tech company. She'll do things with spreadsheets that seem like a ridiculously complicated and inefficient way to do things that I could efficiently do with a couple lines of code.

But of course, she can also quickly do things with spreadsheets that would take me forever to code up. She earns that big salary.

This is like that. I guess it is unfortunate that this developer never ran into bitwise operators, but honestly, I have hardly used them in decades (I learned to code in C, now I'm mostly JavaScript and Typescript). Obviously the developer has a lot of really useful skills, and in this case was able to solve the problem, if not optimally.

I bet that developer is a lot more productive in general, than me from 25 years ago who knew bitwise operators but none of that other crazy stuff.


The problem here is not whether the programmer knows that bitwise operators exist. It's that s/he couldn't ask someone or google it before implementing a terrible solution.

Imagine all the other terrible things this programmer has done in this codebase


Having written an int62_t class in C++ last year, I now feel I should return to it and verify that the bitshift version I wrote is faster than this. I mean, it's written in Rust, which apparently is the new hotness, so I should really make sure, right?

I agree with the comments that point out that not knowing about shift operators isn't really that surprising, but believing you have a reason to want 42 bits of a 64 bit value without knowing about shift operators is.


> I agree with the comments that point out that not knowing about shift operators isn't really that surprising, but believing you have a reason to want 42 bits of a 64 bit value without knowing about shift operators is.

Come to think of it, the article doesn't say why they wanted to do that in the first place. At least one comment mentions having stuff (booleans) encoded in bits and extracting those that go together, and of course that could be it, but we don't know. My own first thought, on seeing the variable name "snowflake", was something along the lines of "What, does the Snowflake database store integers as 42-bit values or something?"


Regarding practical applications, shift operators are only useful to slice and align masks and bitfields: something meaningful, simple and basic, but usually done only in specific domains (e.g. handling a certain style of compact file formats), which it is perfectly normal to avoid for a whole lifetime.

On the other hand, in theory bit shift operators are more important: if you don't know what they do and why you would want to do it, and if you don't know that they are fast and cheap to implement in a microprocessor and why, you cannot possibly have a good grasp of how computers work, and you shouldn't be allowed to use any remotely "low-level" programming language.


Wow, the thedailywtf.com sure brings back memories.

I used to read it daily (no pun intended) until I discovered HN, and I actually picked up a lot of knowledge about best practices and so on by reading the criticisms in the comments.


Obviously the author of the code didn’t know or understand the bit shift operator and came up with a less optimal solution.

I cannot believe that was done just to be cool.


> I cannot believe that was done just to be cool.

No, that was just the DWTF being sarcastic.


No problem, we just have to improve our compilers until the above actually compiles to a single machine instruction... ;)


> And sure, snowflake >> 22 would be shorter, clearer, and far more efficient, but it wouldn't let you show off all the cool language features you have at your disposal, and isn't that what's really important?

And there goes the life of a software maintainer down the drain!


I'm hoping that the Rust compiler is good enough to compile this back to a bit-shift instruction?


You are surely joking. It would have to make a static inference that a numeric variable is converted to a string variable, then processed as a string to base 2 representation, then truncated as a string that can only contain the '1' and '0' characters, then converted back into binary.


> It would have to make a static inference

Not necessarily a static inference... could be speculative.

Code folding golf is a bit of a hobby of mine. For example you can optimise through the string-integer conversion by storing a special string that remembers it was originally an integer and returning that value again instead of re-parsing. I work on a compiler that does that one for real, for example.

I don't seriously think Rust does this right now, but would be a challenge to try.


That would go way further than existing optimizers can go.


And a lot further than we should want them to attempt to go.


That would be impressive, could someone check?


No need to check, that would be almost impossible, unless they could read the programmer's mind for the intent...


Which I think is the argument behind declarative languages. If we can encode the intent precisely, then at least in theory the compiler can find the optimal solution for the given architecture and other constraints. Of course in practice that probably requires strong AI to accomplish.


Unfortunately, but expectedly, it cannot reason about it [1]. Even if it could I'm not sure it would be allowed to elide the allocation(s).

[1] https://godbolt.org/z/63xEcfaEE


The metaphysics of that are going to be interesting. "Do what I mean" programming meets all the stories about why one shouldn't accept bargains with fairies or djinn.

"I can't let you use `strcpy()` there, Dave" is the least of it.


Pity, if it ever does then I'll put aside my distaste for the whole "curl <url> | sh" thing and learn the language.


Could Graal? That would be pretty impressive.


Others have mentioned microcontrollers and using shifts for efficiency, but they also come up because system services are often described and accessed by individual bits. So for instance (in C) xxx |= (1 << n) will set bit n and xxx &= ~(1 << n) will clear it.
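
The same idioms, sketched in Rust for comparison:

    fn set_bit(x: u32, n: u32) -> u32 {
        x | (1 << n)
    }

    fn clear_bit(x: u32, n: u32) -> u32 {
        x & !(1 << n)
    }

    fn bit_is_set(x: u32, n: u32) -> bool {
        (x & (1 << n)) != 0
    }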


There's how things can be done and how things ought to be done. This if nothing else demonstrates a learning path.

And yes, I am shocked that the developer didn't know about bit shift operations... but hey, modern development is just different I suppose.


I assume "first" here means "highest order"? That's what the solution does.

Assuming it's an unsigned shift, of course. If it's signed you will get the most significant bit replicated, and I doubt that's what you want.


I've written shell scripts that had this level of convolution in string manipulation.


Does Rust's `:b` formatter always print leading zeroes? Otherwise the code will not only be inefficient, but wrong for half the input space.

Or perhaps that was the original goal, to find the most significant one and the 41 bits that trail it...
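
For what it's worth, a quick way to check (a sketch; I don't know which format string the original code actually used):

    fn main() {
        // Without an explicit width, {:b} prints no leading zeros;
        // {:064b} pads to the full 64 bits.
        assert_eq!(format!("{:b}", 5u64), "101");
        assert_eq!(format!("{:064b}", 5u64).len(), 64);
    }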


The snippet looks made up. It requires a programmer skilled enough to understand iterators and compose them in a rather complex operation, but not skilled enough to know about shifts or look for “bitwise operators rust” on Google.


That wouldn't be surprising, as you encounter iterators everywhere in Rust, but don't encounter bitwise operation unless you're working in a specific domain. What is surprising is to suddenly need the 42 first bits of a number.


We need to start a repo for Enterprise Right Shift.


It's funny because we all did this the first time we actually needed <<


Use AND with a constant of 42 ones and 0s for the rest.

Can't do it faster :)


"But I can do it in one line."

Congratulations, you are the best coder.


I am hoping this was meant to be a joke. If not, fire him ;-)



