Hacker News | mwsherman's comments

Shameless plug, if one wishes to track down allocations in Go, an allocations explorer for VS Code: https://marketplace.visualstudio.com/items?itemName=Clipperh...


That looks nice. Going to give it a try.


I’ve found C#’s frozen dictionary to be useful: https://learn.microsoft.com/en-us/dotnet/api/system.collecti...

It’s optimized for fast reads in exchange for expensive creation.


There is mention of how len() is bytes, not “characters”. A further subtlety: a rune (codepoint) is still not necessarily a “character” in terms of what is displayed for users — that would be a “grapheme”.

A grapheme can be multiple codepoints, with modifiers, joiners, etc.

This is true in all languages, it’s a Unicode thing, not a Go thing. Shameless plug, here is a grapheme tokenizer for Go: https://github.com/clipperhouse/uax29/tree/master/graphemes
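To make the three levels concrete, here is a minimal sketch using only the standard library. The helper name `counts` is made up for illustration. The standard library has no grapheme counter, so the grapheme counts appear only in the comments.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// counts returns the byte length and the rune (code point) count of s.
// (Hypothetical helper, named for this sketch only.)
func counts(s string) (bytes, runes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	// "e" followed by U+0301 COMBINING ACUTE ACCENT: one grapheme on screen.
	b, r := counts("e\u0301")
	fmt.Println(b, r) // 3 2

	// Two regional indicator symbols (C + A): one flag grapheme on screen.
	b, r = counts("\U0001F1E8\U0001F1E6")
	fmt.Println(b, r) // 8 2
}
```

In both cases a user would see a single "character", which neither len() nor the rune count reports.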


Here’s my favorite post on the subject https://adam-p.ca/blog/2025/04/string-length/


Finally an article that doesn't pretend grapheme clusters are the be-all end-all of Unicode handling.

I'm saving this one. Not exactly how I'd explain it, but it's simplified enough to share with my current co-workers without being misleading.


len() also returns int rather than uint/uint64 in Go.

I do not use Go but ran into this when I had to write a Go wrapper for some Rust stuff the other day. I was baffled.


In Go, string effectively serves as a read-only slice, if we are talking about bytes.

ReadOnlySpan<T> in C# is great! In my opinion, Go essentially designed in “span” from the start.


Yeah, I think the C# team was definitely influenced by Go with their addition of Spans.

Interesting approach regarding using strings as containers for raw bytes, but when you create one over a []byte I believe it makes a copy almost always (always?) so you can’t get a zero-cost read-only view of the data to pass to other functions.


That’s true, converting in either direction will typically allocate. Which it must, semantically.

One can use unsafe for a zero-copy conversion, but now you are breaking the semantics: a string becomes mutable, because its underlying bytes are mutable.

Or! One can often handle strings and bytes interchangeably with generics: https://github.com/clipperhouse/stringish
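As a minimal illustration of the generics idea (not the linked package's API; `byteSeq` and `countByte` are names invented for this sketch), a type constraint over string and []byte lets one function serve both without converting:

```go
package main

import "fmt"

// byteSeq is a hypothetical constraint: anything whose underlying
// type is string or []byte.
type byteSeq interface {
	~string | ~[]byte
}

// countByte reports how many times c occurs in s.
// It only uses len and indexing, which are valid for both types.
func countByte[T byteSeq](s T, c byte) int {
	n := 0
	for i := 0; i < len(s); i++ {
		if s[i] == c {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(countByte("a b c", ' '))         // 2
	fmt.Println(countByte([]byte("a b c"), ' ')) // 2
}
```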


Like everything else in Go, spans predate it by a few decades.


Systems languages from the 1980s already had the span concept.

In some of them you will find it under the name "open arrays".



Shameless plug, you may wish to do Lucene-style tokenizing using the Unicode standard: https://github.com/clipperhouse/uax29/tree/master/words


Got to admit, first impressions: this is pretty neat. I'll spend some time with it. Thanks for the link :)


I move between Go and C#. I wrote a zero-allocation package in Go [1] and then ported to C# — and the allocations exploded!

I had forgotten, or perhaps never realized, that substrings in C# allocate. The solution was Spans.

Notably, it caused me to realize that Go had “spans” designed in from the start.

[1] https://github.com/clipperhouse/uax29
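One way to check the Go side of this claim is testing.AllocsPerRun, which also works outside of tests. This is only a sketch: the helper names and the package-level `sink` are made up for illustration, and strings.Clone stands in for "a substring that copies".

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink is package-level so the compiler cannot discard the results.
var sink string

// sliceAllocs measures average allocations for taking s[i:j].
func sliceAllocs(s string, i, j int) float64 {
	return testing.AllocsPerRun(100, func() { sink = s[i:j] })
}

// cloneAllocs measures the same substring, but forced into a fresh copy.
func cloneAllocs(s string, i, j int) float64 {
	return testing.AllocsPerRun(100, func() { sink = strings.Clone(s[i:j]) })
}

func main() {
	s := "hello, world"
	fmt.Println(sliceAllocs(s, 7, 12)) // 0: a substring is just a new (pointer, length) pair
	fmt.Println(cloneAllocs(s, 7, 12)) // at least 1: the copy needs its own memory
}
```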


Strings in C# are immutable.

To work with strings you should use StringBuilder.


  > Strings in C# are immutable.
Yes, but

  > To work with strings you should use StringBuilder.
It helps combine strings together. The author needed the opposite - split/slice strings.


Eric Lippert describes the difference between immutability and what he calls "persistence" and explains why C#/.NET copies the string contents to make a substring: https://stackoverflow.com/a/6750591/814422

Go's strings are also immutable and yet substrings share the same internal memory. Java/JVM also has immutable strings and yet substrings shared the char[] array of the parent string up until Java 7, when they switched to copying instead (for the same reason as .NET): https://mail.openjdk.org/pipermail/core-libs-dev/2012-June/0...


That SO link is really good, thank you for the comment.


No, slices in Go are more akin to ArraySegment but with resizing/copy-on-append. It does not have the same `byref` mechanism .NET supports, which can reference arbitrary memory (GC-owned or otherwise) in a unified way as a single (special) pointer type.


This is wrong.

Slices in Go are not restricted to GC memory. They can also point to stack memory (simply slice a stack-allocated array; though this often fails escape analysis and spills onto the heap anyway), global memory, and non-Go memory.

The three things in a slice are the (arbitrary) pointer, the length, and the capacity: https://go.dev/src/runtime/slice.go

Go's GC recognizes internal pointers, so unlike ArraySegment<T>, there's no requirement to point at the beginning of an allocation, nor any need to store an offset (the pointer is simply advanced instead). Go's GC also recognizes off-heap (foreign) pointers, so the ordinary slice type handles them just fine.

The practical differences between a Go slice []T and a .NET Span<T> are only that:

  1. []T has an extra field (capacity), which is only really used by append()
  2. []T itself can spill onto the managed heap without issue (*)

Go 1.17 even made it easy to construct slices around off-heap memory with unsafe.Slice: https://pkg.go.dev/unsafe#Slice

(*): Span<T> is a "ref struct" which restricts it to the stack (see https://learn.microsoft.com/en-us/dotnet/csharp/language-ref...); whereas, []T can be safely stored anywhere *T can
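A small sketch of the pointer/length/capacity triple in safe Go. The `window` helper is made up for illustration.

```go
package main

import "fmt"

// window returns a slice over the middle of the array. No offset is
// stored: the slice's pointer is simply advanced to &arr[1].
func window(arr *[5]int) []int {
	return arr[1:3]
}

func main() {
	arr := [5]int{10, 20, 30, 40, 50}
	s := window(&arr)

	fmt.Println(len(s), cap(s))   // 2 4 (capacity runs to the end of arr)
	fmt.Println(&s[0] == &arr[1]) // true: an interior pointer

	// append is the one place capacity matters: this fits within cap,
	// so it writes into arr[3] instead of allocating.
	s = append(s, 99)
	fmt.Println(arr) // [10 20 30 99 50]
}
```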


(can't respond directly and don't have the rep to vouch)

> Span bounds are guaranteed to be correct at all times and compiler explicitly trusts this (unless constructed with unsafe), because span is larger than a single pointer, its assignment is not atomic, therefore observing a torn span will lead to buffer overrun, heap corruption, etc. when such access is not synchronized, which would make .NET not memory safe

Indeed, the lack of this restriction is actually a (minor) problem in Go. It is possible to have a torn slice, string, or interface (the three fat pointers) by mutably sharing such a variable across goroutines. This is the only (known) source of memory unsafety in otherwise safe Go, but it is a notable hole: https://research.swtch.com/gorace


Go pointers can point at the stack or inside objects just fine; they are exactly as expressive as C# unsafe pointers (i.e. more expressive than `ref`).

What Go can't do is create a single-element slice out of a variable or pointer to it. But that just means code duplication if you need to cover both cases, not that it's not expressible at all.


> What Go can't do is create a single-element slice out of a variable or pointer to it.

  var x int
  s := unsafe.Slice(&x, 1)
  fmt.Println(&x == &s[0])
  // Output: true


Good catch! That takes care of the unsafe pointer case, but not the safe ref case.

There's no reason for this to be unsafe - you're asking for a 1-element slice, and the compiler knows that the variable is always going to be there as long as the reference exists.

In C#, `Span<T>` has a (safe) constructor from `ref T`.


Putting on eng manager hat, the problem to solve is that this regression went undetected, not that Safari is slow.

The solution is a test that fails when Chrome and Safari have substantially different render times.


> The solution is a test that fails when Chrome and Safari have substantially different render times.

That test will be disabled for being flaky in under a week because the CI runners have contention with other jobs, causing them to randomly be slower and flake, and the frontend team does not want to waste time investigating flakes.

"Just have dedicated runners with guaranteed CPU performance", but that's the CI platform team's issue, the frontend and testing teams can't fix it, and the CI infra team won't prioritize it for a minimum of 5 years.


C# has implemented some SIMD for IndexOf: https://github.com/dotnet/runtime/pull/63285


Here’s a way to write tests next to the code: https://clipperhouse.com/go-test-csharp/

(Whether I recommend it, not sure! I did it and then undid it, with suspicion that tests were taking longer due to, perhaps, worse caching of build artifacts.)

