Doesn't this depend a lot on private vs. company usage? There's no way I could spend more than a few hundred on my own, but when you run prompts over 1M entities in some corporate use case, the costs add up no matter how cheap the model is.
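A rough back-of-envelope sketch (every number below is a made-up assumption, not a quoted price) of why the corporate case adds up:

    # Illustrative only: cost of one prompt per entity at scale.
    entities = 1_000_000              # batch size from the comment above
    tokens_per_prompt = 1_500         # assumed prompt + completion tokens
    price_per_million_tokens = 0.50   # assumed blended $/1M tokens for a cheap model

    total_tokens = entities * tokens_per_prompt
    cost = total_tokens / 1_000_000 * price_per_million_tokens
    print(f"~${cost:,.0f} for one pass over all entities")  # ~$750 under these assumptions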
My take on this is quite cynical: the post reads to me like a post-hoc justification of some strange newly introduced behaviour.
Please order the answer in the order the resolutions were performed to arrive at the final answer (regardless of cache timings). Anything else makes little sense, least of all in the name of some micro-optimization (which could likely be approached in other ways that don't alter behaviour).
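For illustration only (the hostnames and address are made up), the ordering being asked for would look like this: each CNAME hop listed in the order it was resolved, with the terminal address record last.

    # Hypothetical answer section, printed in resolution order.
    answer_section = [
        ("www.example.com.",         "CNAME", "cdn.example-cdn.net."),
        ("cdn.example-cdn.net.",     "CNAME", "edge.provider.example."),
        ("edge.provider.example.",   "A",     "203.0.113.7"),
    ]
    for name, rtype, value in answer_section:
        print(f"{name:26} {rtype:5} {value}")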
The DNS specification should be updated to say CNAMEs _must_ be ordered at the top rather than "possibly". Cloudflare was complying with the specification; Cisco was relying on unspecified behavior that happened to be common.
The only reasonable interpretation of "possibly prefaced" is that the CNAMEs either come first or not at all (hence "possibly"). Nowhere does the RFC suggest that they may come in the middle.
Something has been broken at Cloudflare for a couple of years now. It takes a very specific engineering culture to run the internet, and it's just not there anymore.
I feel like the data to drive the really interesting capabilities (biological, chemical, material, etc.) is, for the most part, not going to come from end users.
It's the other way around. You gather user data so that you can better capture the user's attention. Attention is the valuable resource here: with attention you can shift opinions, alter behaviors, establish norms. Attention is influence.
Okay, something just clicked in my brain. Do higher temperatures essentially unlock additional paths for a model to go down when solving a particular problem? If so, for some particularly tricky problems you could perform many evaluations at a high temperature in the hope that the model happens to take the correct approach in one of them.
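A minimal sketch of that idea, assuming a Hugging Face causal LM (the model name, temperature, and the verify() check are placeholders, not a recommendation):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "Q: <some tricky problem>\nA:"
    inputs = tok(prompt, return_tensors="pt")

    # Sample several completions at a high temperature so the model explores
    # different solution paths, then keep whichever one passes a check.
    outputs = model.generate(
        **inputs,
        do_sample=True,
        temperature=1.5,          # higher than usual to widen the search
        num_return_sequences=8,   # many independent attempts
        max_new_tokens=128,
        pad_token_id=tok.eos_token_id,
    )
    candidates = [tok.decode(o, skip_special_tokens=True) for o in outputs]

    def verify(answer: str) -> bool:
        # Placeholder: in practice this would be a unit test, checker, or grader.
        return "42" in answer

    best = next((c for c in candidates if verify(c)), candidates[0])
    print(best)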
Edit: What seems to break things is that high temperature /continuously/ makes the model's output less stable. It could be useful to use a high temperature until it's evident the model has started down a new approach, and then sample at a lower temperature from there.
Decaying temperature might be a good approach: generate the first token at a high temperature (like 20), then multiply the temperature by 0.9 (or some other scaling factor) for each subsequent token until you reach your steady-state target temperature.
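A minimal sketch of that decay schedule, assuming a Hugging Face causal LM and a manual token-by-token sampling loop (the model name and the specific numbers are placeholders):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    def sample_with_decay(prompt, start_temp=20.0, decay=0.9, floor=0.7, max_new_tokens=64):
        ids = tok(prompt, return_tensors="pt").input_ids
        temp = start_temp
        for _ in range(max_new_tokens):
            with torch.no_grad():
                logits = model(ids).logits[:, -1, :]      # next-token logits
            probs = torch.softmax(logits / temp, dim=-1)  # temperature-scaled distribution
            next_id = torch.multinomial(probs, num_samples=1)
            ids = torch.cat([ids, next_id], dim=-1)
            if next_id.item() == tok.eos_token_id:
                break
            temp = max(floor, temp * decay)               # decay toward the steady-state temperature
        return tok.decode(ids[0], skip_special_tokens=True)

    print(sample_with_decay("Once upon a time"))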
I think yes. Recently I was experimenting with NEAT and HyperNEAT solutions and found this site. At the bottom it explains how novelty search yields far better solutions. I would assume that a reasonably high temperature may also produce more interesting solutions from an LLM.
It depends on the business for sure. Kube is overkill until you have someone on your team whose specialization is infra. Then that person will probably be spearheading kube anyway :)