
I would absolutely love if you brought Canadian maps to this.

I work for Build Canada and I would love to see some maps from the fur trade and early exploration to tell stories.

If you want to chat my email is brendan at buildcanada.com


Hey Brendan! I'd also love to add Canadian maps; it's been a huge request from my users and something I've been wanting to focus on all year. A big challenge in bringing the service to new regions is simply data access, both to raw hi-res map imagery and to satellite, LiDAR, etc., so digging into what the Canadian government offers is on my to-do list. Brave new world for me.

Will absolutely reach out to connect!


There are maps of Toronto


One thing that surprised me (though I should've known it wouldn't be great at this) is that it's terrible at creating bounding boxes around things it's not trained on (like bounding parts on a PCB schematic).


So this tells us that it does not _understand_ what it is doing, really. No real intelligence here. Might as well use an old-school YOLO network for the task.


It's just behaving like a child. A child could draw a bounding box around a dog and a cat, but would fail if you told them to draw a box around the transistors of a PCB. They have no idea what a transistor is, or what it looks like. They lack the knowledge and maturity. But you would never claim the child doesn't _understand_ what they're doing, at least not to imply that they're forever incapable of the task.


Yeah, but a child does one-shot learning much better. Just tell it to find the black rectangles and it will draw boxes around the transistors of a PCB, no extra training required.


Perhaps. But I think you'll find there are a lot of black rectangles on a PCB that aren't actually transistors. You'll end up having to teach the child a lot more if you want accurate results. And that's the same kind of training you'll have to give to an LLM.

In either case, your assertion that one _understands_, and the other doesn't, seems like motivated reasoning, rather than identifying something fundamental about the situation.


Then you explain that transistors have three wires coming off of them.


I mean, problem solving with loose specs is always going to be messy.

But at least with a child I can quickly teach it to follow simple orders, while this AI requires hours of annotating + training, even for simple changes in instructions.


Humans are the beneficiaries of millions of years of evolution and are born with innate pattern-matching abilities that we don't need "training" for; essentially our pre-training. Of course it is superior to the current generation of LLMs, but is it fundamentally different? I honestly don't know one way or the other, but judging by how capable LLMs are despite all their limitations and their lack of evolutionary history, I wouldn't bet against it.

The other problem with LLMs today is that they don't persist any learning from their everyday inference and interactions with users, at least not in real time. That makes them harder to instruct in a useful way.

But it seems inevitable that both their pre-training, and ability to seamlessly continue to learn afterward, should improve over the coming years.


> It's just behaving like a child.

No it's not.


Amazon - Ecommerce (1994), AWS (2006)

Microsoft - Programming Language (1975), Operating Systems (1981), Office Suite (1983)

Meta - Facebook (2004), Instagram (2010)

I would argue Microsoft is unique because of how badly IBM screwed up.


Microsoft also has Azure and its gaming division.


I read that as the size of the file it's transferring, so each operation would be bigger and therefore slower.


I have some really old code that pretty much does this, I'll see if I can find it.


Ugh, I don't have it. It was from before I used git.

Basically, to do this you run a CUPS server that exposes itself as a network printer and prints to a specified PDF directory. Then you have a program watching that directory for new files; when a new one appears, it opens whatever PDF viewer you want in full screen.

Set up a shared PDF printer: https://askubuntu.com/questions/1310867/how-to-set-up-shared...
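For the watcher half, here's a minimal sketch (polling rather than inotify, for simplicity). The output directory and the `evince --fullscreen` viewer command are assumptions you'd adjust for your own setup:

```python
#!/usr/bin/env python3
# Watch the CUPS-PDF output directory and open each new PDF fullscreen.
# OUTPUT_DIR and VIEWER are assumptions; adjust for your distro/viewer.
import subprocess
import time
from pathlib import Path

OUTPUT_DIR = Path("/var/spool/cups-pdf/ANONYMOUS")  # typical cups-pdf output dir (assumed)
VIEWER = ["evince", "--fullscreen"]                 # any viewer with a fullscreen flag

seen = set(OUTPUT_DIR.glob("*.pdf"))  # ignore PDFs that already exist at startup

while True:
    current = set(OUTPUT_DIR.glob("*.pdf"))
    for pdf in sorted(current - seen):
        subprocess.Popen(VIEWER + [str(pdf)])  # don't block; keep watching
    seen = current
    time.sleep(1)
```

You'd run this on the machine driving the display; CUPS handles everything up to the PDF landing in the directory.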


I love the Blueprint Understanding.

One thing I've been thinking about is whether you could use a model like this as a first pass for permitters who review blueprints (like a GitHub Actions CI/CD check).

Many developers use the regulatory side of various engineering approval processes as a quality-control check, which costs money and time for the regulator tasked with enforcing a standard.

It would also be good for speeding up the developer workflow: hey, this thing looks weird, did you really mean to do this?

And then further on, you could add a way to check for constructability. My framer friends often get annoyed at whichever engineer designed the structure, because the design is materially inefficient or hard to construct.


Absolutely! That's our end goal: to remove the painful back-and-forth of permitting. Once you solve blueprint understanding, the possibilities are enormous, from spell-check to material efficiencies, etc.


You're misreading it; there are two different runs, a low-compute and a high-compute run.

The number for the high-compute one is ~172x the first one according to the article, so ≈ $2,900 (which implies the low-compute run cost roughly $2,900 / 172 ≈ $17).


What's extra confusing is that in the graph the runs are called "low compute" and "high compute", while in the table they're called "high efficiency" and "low efficiency". So the high and low got swapped.


Azure charges differently based on deployment zone/latency guarantees; OpenAI doesn't let you pick your zone, so it's equivalent to the Global Standard deployment (which is the same cost).

[0] https://azure.microsoft.com/en-us/pricing/details/cognitive-...


I'd be interested to know if anyone is seriously using the Assistants API. It feels like such a lock-in to OpenAI's platform when you can alternatively just use completions, which are much more easily interchanged.


I do, and I built an Assistants API compat layer for Groq and Anthropic: https://github.com/supercorp-ai/supercompat

I'd argue that the Assistants API DX beats the manual completions API.


Aye, but your FinOps team will be complaining even with simple use.


Using the Assistants API in prod used to suck because it would send the full conversation on each message. But last month they added an option to send truncated history, so it's no longer $2 a pop, thankfully. Also, Grok, Haiku, and Mistral are cheap.
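For reference, a hedged sketch of what that looks like with the OpenAI Python SDK's `truncation_strategy` option on run creation; the model name is a placeholder:

```python
# Sketch using the OpenAI Python SDK (v1.x, Assistants API v2).
from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(model="gpt-4o")  # placeholder model
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="Hello!"
)

# Send only the last few messages per run instead of the whole thread,
# which is what keeps long conversations from costing $2 a pop.
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
    truncation_strategy={"type": "last_messages", "last_messages": 5},
)
```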


Are you using Assistants API v2 with streaming?


Yeah, I do both in prod and in the lib. In the lib I even ported Anthropic's streaming API to be OpenAI-compatible. I'll write the docs over the coming days if there's interest.


I've indeed refused to work with some providers that offer only a chat interface and not a completion interface, because it made the communication "less natural" for the model (like adding new system messages in between for function calling on models that don't officially support it, or adding categories other than system/user/assistant).


Great points. Don't even get me started on how function calling in other LLMs costs me tokens; something OpenAI provides out of the box. I'm also not a big fan of OpenAI's lock-in. Right now I'm on a huge Claude 3 Haiku kick. That said, OpenAI does seem to get the APIs right, and my hunch is the new Assistants API is going to potentially disrupt things again. Time will tell.
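For context, a minimal sketch of OpenAI-style function calling via the `tools` parameter on chat completions; the `get_weather` schema and the model name are toy placeholders:

```python
from openai import OpenAI

client = OpenAI()

# A toy function schema; the model decides when to call it.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4-turbo",  # placeholder model name
    messages=[{"role": "user", "content": "What's the weather in Toronto?"}],
    tools=tools,
)
# The model replies with a structured tool call instead of prose.
print(resp.choices[0].message.tool_calls)
```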


I would love to be using Claude, but you can't get API access (beyond an initial trial period) in the EU without providing a European VAT number. They don't want personal users, or people to even learn and experiment, I guess.


You can use the Claude APIs via OpenRouter with a pre-paid account.
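OpenRouter exposes an OpenAI-compatible endpoint, so a minimal sketch is just the standard client pointed at a different base URL; the model slug here is an example, check their catalog for current names:

```python
from openai import OpenAI

# OpenRouter speaks the OpenAI chat completions protocol.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",  # your OpenRouter key (pre-paid credits)
)

resp = client.chat.completions.create(
    model="anthropic/claude-3-haiku",  # example slug
    messages=[{"role": "user", "content": "Hello, Claude!"}],
)
print(resp.choices[0].message.content)
```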


Thanks, this did the job!


Interesting, would Amazon Bedrock be an alternative? That's how I use Claude.


I'd guess it's more likely about the additional programming needed to meet GDPR compliance requirements.


Opus is really cool. I've found it to have a few persistent bugs in what I initially assumed was tokenization but now wonder might be more fundamental. Still, modulo a few typographical-level errors, I personally think it's the most useful of the API-mediated models for involved sessions.

And there are some serious people at Anthropic; they'll get the typo thing if they haven't already (it's been a busy week and change, and they easily could have shipped a fix that I overlooked).


> Don't even get me started about how function calling in other LLMs costs me tokens. Something OpenAI provides out of the box.

Not sure what you mean by this.


I have some assumptions/guesses about how billing works. I'm going to do a post on this on my unremarkable.ai blog; please do sign up for posts there, no spam. I could be right or wrong, but I need to do some experiments and publish later.


I'm not sure you're talking about the same thing: OpenAI specifically has an "Assistants API" that manages long-term memory and tool usage for the consumer: https://platform.openai.com/assistants

I'd guesstimate 99% of people using LLMs are using instruct-based message interfaces that have a variation of system/user/assistant. The top models mostly only come as completion models, and even Anthropic has switched to a message-based API.


I've used it, and in some cases it's taken days or weeks of development off the path to testing the market.

In some cases the lock-in is what it is for now, because a particular model really is so far ahead, or staying ahead.

That doesn't mean other options won't become available, but it does matter to relate your needs to your actions.

Getting something working consistently, for example, might be the first goal, and learning to implement it with multiple models might be secondary. In some cases, the chances of that increase the later other models are explored.

It should be possible to tell pretty quickly whether something works in a particular leading model, how others compare to it, and how to track the rate of change between them.


I know at least one team at work is using the Assistants API, and I'm talking with another team that is leaning pretty heavily toward using it over building a custom RAG solution themselves, or even over other in-house frameworks.


I use it almost exclusively (I've even developed a Python library for it, https://github.com/skorokithakis/ez-openai), because it does RAG and function calling out of the box. It's pretty convenient, even if OpenAI's APIs are generally a trash fire.


This generally resonates with what we've found. Some colour based on our experiences:

It's worth spending a lot of time thinking about what a successful LLM call actually looks like for your particular use case. That doesn't have to be a strict validation set; `% of prompts answered correctly` is good for some of the simpler prompts (a minimal sketch of that below), but it breaks down as they grow and handle more complex use cases. In an ideal world…
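For the simple end of that spectrum, a toy sketch of a `% of prompts answered correctly` check, where `ask(prompt)` is a hypothetical wrapper around whatever model call you're evaluating:

```python
# Toy "% of prompts answered correctly" harness over a tiny
# hand-written validation set. ask(prompt) is hypothetical (not shown).
cases = [
    ("What is 2 + 2? Reply with just the number.", "4"),
    ("What is the capital of France? Reply with just the city.", "Paris"),
]

def accuracy(ask) -> float:
    """Fraction of cases whose expected answer appears in the response."""
    hits = sum(expected in ask(prompt) for prompt, expected in cases)
    return hits / len(cases)
```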

> chain-of-thought has a speed/cost vs. accuracy trade-off

A big one.

Observability is super important and we've come to the same conclusion of building that internally.

> Fine-tune your model

Do this for cost and speed reasons rather than to improve accuracy. There are decent providers (like OpenPipe; relatively happy customer, not affiliated) who will handle the hard work for you.

