vaastav's comments

How is this different from/similar to Barrelfish?


mainline vs abandoned.


Is it tho? It's only understandable to me if I also read the subtitles. Otherwise it just sounds like random noise.


The garbage collector in Go is optional. You can switch off garbage collection by setting the environment variable GOGC=off.

More info about GOGC: https://dave.cheney.net/tag/gogc


Only very theoretically. In Go, you don't control whether memory goes on the stack or the heap, and heap escape analysis is notoriously unpredictable. There is no explicit free. You would have to write Go in a completely crazy way to be able to turn off the GC and have the program not grow unbounded.

You might think "I'll just use static buffers everywhere", but allocations can occur in unexpected places. The compiler does some very basic lifetime analysis to eliminate some obvious cases (loops...), but it's really hard to avoid in general.


Ok, so garbage collection is optional, but how about garbage generation? Is there any way to manually clean up resources with GOGC=off, or will memory usage continue to grow unbounded as new objects are created?


It grows unbounded. I wasn't recommending that one set GOGC=off, just remarking that one could, should they choose to do so.

EDIT: Sorry, I misunderstood part of your question. The memory grows unbounded unless you call runtime.GC(), which triggers garbage collection. But this is a blocking call and essentially blocks your whole program.


I think in theory you can write code (or do some tricks) to avoid all heap allocation.


"In theory" being the operative words there. Turn on heap escape analysis sometime; you'll be surprised how hard it is to avoid.


You mean not growing the heap or literally no allocation?


Arguably, one can't truly say the GC is optional unless the language and its libraries were designed to work without it. That's the case in languages like Vlang, where the GC was added later. If turning the GC off cripples the functionality and usefulness of the language, then there is little point in using the option, or in claiming the GC is optional.

Probably a better argument for Go (and other languages like Java) is how "tweakable" the GC is, or at least to describe it as: the GC can be turned off, but it's not designed or useful to do so.


Well, then so it is in Java, with the Epsilon “GC”. But not collecting garbage is very different from manual memory management.


Not sure this is really required. Most cases in Go are served well by goroutines, and for yield/resume semantics, two blocking channels are enough. This seems to add complexity for its own sake, and I'm not sure it actually adds any new power to Go that didn't already exist.


Goroutines + channels add an enormous amount of overhead. Using them as iterators is basically insane.


Why? Channels are already iterable using the range keyword.

    ch := make(chan int)
    go func() {
        for i := 0; i < 100; i++ {
            ch <- i
        }
        close(ch)
    }()

    for i := range ch {
        fmt.Println(i)
    }
That is very simple.


"Enormous amount of overhead" is the operative phrase. In general, you want your concurrency operations to be significantly smaller than the payload of the operation. In the case of using channels as iterators with goroutines behind them, it works fine for something like a web page scraper, where the act of fetching a web page is enormously larger than a goroutine switch, but as a generalized iteration mechanism it's unusably expensive because a lot of iteration payloads are very small compared to a channel send.

I've encountered a lot of people who read that on /r/golang and then ask "well, why are send operations so expensive?", and it's not that. It's that a lot of iteration operations are on the order of single-digit cycles and often very easy to pipeline or predict. No concurrency primitive can keep up with that. A given send operation is generally fairly cheap, but there are enough other things that are still an order or two of magnitude cheaper than even the cheapest send operation that if you block all those super-cheap operations on a send, you're looking at multiple orders of magnitude of slowdown. Such as you would experience with your code.


But if your iteration is so fast then you don’t need a channel at all. Just use a plain for loop.


Yes, that's exactly what the article is about: how we can model "plain for-loop" levels of performance in the presence of complex code (intricate state at multiple levels and high potential for nesting iterators) that is supplying the loop(s).


Then you lose the separation of iteration from the looping construct.

You can use a function or a method, but you lose the nice continuation/generator ability to write a function that can do complicated yields without having to write the state machine yourself, plus you run a risk that the call won't inline in which case you're incurring non-trivial function call overhead.

The problem with iteration in Go isn't that you can't solve any given individual problem; the problem is that you can't solve all of them simultaneously the way you can in Rust or Python. (Though one of these days I want to get around to benchmarking Python's iteration against Go's channel-based iteration; I'm not actually sure which would win. What Go considers dangerously slow can still be baseline performance for other languages.) So you can get a defeat-in-detail sort of thing where a person cites a problem, and someone posts a solution to that, and then they cite another problem, and there's a solution for that, and then there's another problem, and a solution is posted for that, and all the solutions do indeed more or less solve the given problems... but you can't combine them into one.


How do you solve the tree fringe problem with a for loop?


Since all recursive programs can be converted into iterative programs, you can "simply" (not always simply) convert recursive solutions like McCarthy's Lisp solution into a loop: https://dl.acm.org/action/showFmPdf?doi=10.1145%2F1045283 (page 5)

  (DE SAMEFRINGE (X Y)
         (OR (EQ X Y)
             (AND (NOT (ATOM X))
                  (NOT (ATOM Y))
                  (SAME (GOPHER X) (GOPHER Y)))))

  (DE SAME (X Y)
         (AND (EQ (CAR X) (CAR Y))
              (SAMEFRINGE (CDR X) (CDR Y))))

  (DE GOPHER (U)
         (COND ((ATOM (CAR U)) U)
               (T (GOPHER (CONS (CAAR U)
                                (CONS (CDAR U) (CDR U)))))))
Coroutines are not necessary for solving that problem (though they do offer a neat solution to it).


The results of that conversion are not fun


Aside from the massive performance penalty, cache thrashing and context switching, this code will also leak a goroutine (and so, memory) if you don't finish receiving from `ch`. It's more brittle, longer to write, less local and in every other way worse than a for loop. Why would you ever do it?


Race conditions. With coroutines you're not supposed to have to deal with races, if I'm understanding the motives.


You can write to a channel concurrently.


And to make the concurrency safe you have to pay the price of synchronization.

https://go.dev/tour/concurrency/8

In A Tour of Go, concurrency section, "Equivalent Binary Trees" is an example of paying the price when you don't need it.


It's not that you can't, it's that it's very expensive.


It’s not insane at all. How did you come to that conclusion?

* Mutex lock+unlock: 10ns

* Chan send buffered: 21ns

* Try send (select with default): 3.5ns

Missing from here is context switches.

In either case, how much the overhead matters is proportional to how fast each iteration is. I have channels of 64k byte slices, and the channel ops don’t even make a dent compared to other ops, like IO.

You should absolutely use channels if it’s the right tool for the job.

Fwiw, I wouldn’t use channels for “generators” like in the article. I believe they are trying to proof-of-concept a language feature they want. I have no particular opinion about that.


> Missing from here is context switches.

Exactly. From rsc's previous post[1]:

> On my laptop, a C thread switch takes a few microseconds. A channel operation and goroutine switch is an order of magnitude cheaper: a couple hundred nanoseconds. An optimized coroutine system can reduce the cost to tens of nanoseconds or less.

[1]: <https://research.swtch.com/pcdata>


Yeah, I 100% understand wanting to optimize this for something like generators if we imagine them as first-class constructs. But they’re not at all a replacement for channels; they would be an addition, or a specialization. I’ve never seen real-world Go code that has needed it, but maybe this will change with generics. It’s worth keeping an eye on, at least.

Channels otoh are very versatile: everything from spsc to mpmc with buffering and starvation protections, fast cancelation and notifications, etc etc. They’re not perfect, but it’s a helluva bang-for-the-buck for a single primitive. Literally all you have to do for performance is add buffering and coalesce “units of work”, and you’re good to go.


Sure but that's what the implementation of the coro in this post uses under the hood. Not sure how this is any better wrt overheads.


Where did you get the idea that it uses the same thing under the hood? The article clearly says:

..Next I added a direct coroutine switch to the runtime, avoiding channels entirely. That cuts the coroutine switch to three atomic compare-and-swaps (one in the coroutine data structure, one for the scheduler status of the blocking coroutine, and one for the scheduler status of the resuming coroutine), which I believe is optimal given the safety invariants that must be maintained. That implementation takes 20ns per switch, or 40ns per pulled value. This is about 10X faster than the original channel implementation.


The runtime switch was buried in the last paragraph of the article. All of the code was using goroutines and channels....


It does mention in an earlier section:

... That means the definition of coroutines should be possible to implement and understand in terms of ordinary Go code. Later, I will argue for an optimized implementation provided directly by the runtime,..


> Not sure how this is any better wrt overheads.

At the end he implements an experimental runtime mechanism that permits a goroutine to explicitly switch execution to another goroutine rather than using the generic channel scheduling plumbing.


It's in the last paragraph of the article... very easy to miss, given the code uses goroutines and channels.



This is covered in the article.


So basically a normal phone from the mid-00s


Always remember to clean up the .tex files before submitting to arxiv


You can export Jaeger traces via OTel as well. I am assuming the question here is why there is a different SDK, rather than just re-using the standardized OTel APIs + libs for tracing and providing a simple exporter for Traceo.


As a potential user, I am quite unsure why I would use this over something like TeXstudio, or even VSCode with some LaTeX plugins. Could you tell me/us some pros/cons of this?


I am the co-author of a very similar application (CoCalc for LaTeX) and our landing page https://cocalc.com/doc/latex-editor.html lists many of the reasons people use it. A number of the reasons apply also to JupyterLab LaTeX, or will soon. A quick summary: realtime collaboration, having the paper you're writing and the data you're computing in the same place, having a very high-resolution history of edits, using latex in course management, using a Chromebook or other lightweight client, and zero configuration support for PythonTex and R (knitr). Note that some of these reasons for using JupyterLab or CoCalc are things that https://www.overleaf.com/ doesn't provide.


Having your data and your reports all in one place seems like the big selling point and I see this as pretty exciting.


> realtime collaboration, ... having a very high-resolution history of edits

It's my understanding that JupyterLab doesn't support concurrent editing or version control, unless these are baked into this extension?


Coming soon - they just closed their concurrent editing ticket! :-). https://github.com/jupyterlab/jupyterlab/issues/5382#event-4...


Same here. It's the most painful limitation of Jupyter, in my experience.


Maybe you could host it like Overleaf? That eliminates the need to install LaTeX, which can be quite space-consuming.


Using the TeX Live package manager plus a Python auto-package-installer wrapper for latexmk, I was able to get my LaTeX install down to 300MB.


And observing their technology might switch it from working to non-working


That kind of thing happens more often than I'm comfortable with.


What about false positives? How did you account for that?


You make your peace with the fact that you'll have a certain rate of false positives, where you'll intentionally lose some legitimate business in order to keep most of the "ecosystem" cleaner. Perhaps an unsatisfying answer, but that's it.

It's not a situation like putting someone in prison where "beyond all reasonable doubt" is the appropriate mark; you can refuse to do business based on mere suspicion that may be mistaken. There's a limit where extra investigation or appeals is too costly compared to just accepting the lost revenue, and for small-scale customers, that limit is quite low. With fraud detection, you have to balance the tradeoff between false positives and false negatives, but you'll certainly have both.


In Google’s case, this is not enough. They exert so much control over the online advertising industry that it’s simply unfair to ban anyone with no explanation or recourse. It should be illegal. It’s almost impossible to effectively monetize an app or website using ads without including various Google technologies and services, and that’s Google’s own doing: they’re the ones who purchased all of those companies and integrated their own products in a way that makes them inseparable.


I also worked in adtech.

We viewed them as a cost of doing business. Some small accounts got nuked. :shrug: If we had to have humans investigate everything, and produce reports / interpretations that nuked customers found satisfactory, we wouldn't have been willing to service accounts under probably $40k/year.

And keep in mind the ecosystem is filthy with fraud, particularly on the low end. There very much are groups of organized thieves actively exploiting adtech.

And as @PeterisP says... look, we're not a court. We're a private business that is refusing to do further business with someone. Our right to do this was very clearly explained before the beginning of any relationship, and agreed to by that someone. If that someone doesn't like it, their recourse is to not do business with us.


> And as @PeterisP says... look, we're not a court. We're a private business that is refusing to do further business with someone. Our right to do this was very clearly explained before the beginning of any relationship, and agreed to by that someone. If that someone doesn't like it, their recourse is to not do business with us.

So you are basically justifying Google's behavior. You are not a court, that's right. But every ban should be easy and quick to challenge in court to settle the issue. Obviously, real fraudsters will never appeal like that, because they know they would incur even bigger problems.


> Obviously real fraudsters will never appeal like that

I now know you have no experience at all fighting online fraud. People, um, lie.

On a serious note, if you require a prosecutable ban process -- whatever that means, because prosecuting is something the government does -- where you'll end up is my original point. Ad companies will refuse to do business with publishers that aren't above some minimum threshold. My guess is $40k a year. Because remember, eg google or whoever keeps about 1/3 of that money, so a $1k/mo minimum to staff humans and deal with arguing feels ballpark reasonable.

Separately, I'm not justifying anything. I'm explaining the economics driving behavior. If you want to be mad at me for behavior I don't control or influence... :shrug:


Their recourse is to not do business, full stop, because Google has monopolized the ad space.


People have this misconception that ad networks somehow get joy from turning people off for no reason. Every ad shown is a penny in their pocket, even if it's 100% fraud. It's when advertisers start asking for money back that investigations are launched and accounts are terminated.

There are literally no false positives. It may be fraud, it may be an ad that's too close to a back button and gets accidentally clicked, it could be ads that don't display right. But at the end of the day, it is a revenue decision.

