I didn't see this in the blog post, but did you train this from scratch or fine-tune an existing base model?
If from scratch, it's quite impressive that the model can understand natural-language prompts (English, presumably) from such a small, targeted training set.
I work with large text datasets, and I typically have to go through hundreds of samples to evaluate a dataset's quality and determine if any cleaning or processing needs to be done.
A tool that lets me sample and explore a dataset living in cloud storage, and then share it with others, would be incredibly valuable, but I haven't seen any tools that support long-form non-tabular text data well.
This is also an area that I'm starting to explore with LLMs. I love the idea that you could take a bunch of messy data, tell Datasette Cloud "I want this imported into a table with this schema"... and it does that.
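To make that concrete, here's a rough sketch of the kind of pipeline I'm imagining. sqlite-utils works as shown, but map_to_schema is hypothetical - in practice it would be a prompt to whatever LLM you're using plus some JSON parsing - and the table and column names are invented for the example:

    import sqlite_utils

    db = sqlite_utils.Database("imports.db")
    # Declare the target schema up front so the LLM has something concrete to map into.
    db["events"].create({"id": int, "title": str, "date": str, "venue": str}, pk="id")

    def map_to_schema(raw_text):
        # Hypothetical: prompt an LLM with the raw text plus the target column
        # names/types, and parse its JSON output into row dicts like this one.
        return [{"id": 1, "title": "Example event", "date": "2024-01-01", "venue": "TBD"}]

    with open("messy_scrape.txt") as f:
        rows = map_to_schema(f.read())

    db["events"].insert_all(rows, pk="id")

Datasette could then sit on top of imports.db for the sampling and sharing part.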
Thank you for open sourcing this. More competition in the budding metrics ecosystem is good for end users.
It seems like you think MetricFlow should be the data mart layer and not just the metrics layer. If that's true...why? Why would I join my fact and dimension tables in metricflow instead of in dbt? One of the value adds of dbt is that it centralizes business logic in a single place. Joins are business logic. The industry seems to be moving towards creating very wide data mart tables in dbt and surfacing them to the semantic layer 1:1, or building the metrics layer on top of them.
I'd say we think MetricFlow should be able to provide consistent, correct answers to any reasonable query end users of the metric model might ask. To do this across the various data warehouse layouts our users are likely to encounter, we have to support dimensional joins. That doesn't mean MetricFlow should displace data mart services - on the contrary, I contend MetricFlow works best when layered on top of a warehouse built on centralized logic for managing its data layout. As an example, we generally push our customers to rely on the sql_table data source definition and push any sql_query constructs down to whatever warehouse management layers they have in place.
That said, you need to support joins, at least in some limited scope, in the semantic metric layer for it to be broadly useful. Consider this scenario - you have your dbt models producing wide tables for reasonable measure/dimension queries, and you have MetricFlow configs for the metric and dimension sets available in your data mart. Now imagine you've also got your finance team hooked up to a Google Sheets connector, and they're looking at revenue and unique customers by sales region. Cool, your wide table has that built in, no joins needed.
But what if they want something new? Let's say they want to know how they're doing against the target addressable market in each country. Should they have to submit a ticket to the data engineering team to add customer.country.market_size to your revenue table? Or should they be able to do "select revenue by customer__country__market_size" and get the report they need?
Our position is that we want to facilitate the latter - people getting what they need and knowing that, as long as it's been defined properly in the model, it's going to produce reasonable results. If your particular organization wants all of those joins run through a data mart ticket queue and surfaced as fully denormalized underlying tables, that's fine by us, but most likely that's not what you want. You'd rather have some visibility into the kinds of joins people are requesting, then build out your data mart to serve the common requests more efficiently, while still letting people ask new questions of the data without a long development feedback loop.
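To make the example concrete, the scenario above might be configured with data sources along these lines - a simplified sketch with invented table and column names, details elided, and each data_source normally living in its own YAML file:

    data_source:
      name: transactions
      sql_table: analytics.fct_transactions
      measures:
        - name: revenue
          agg: sum
          expr: amount_usd
      identifiers:
        - name: customer            # join path out to the customers data source
          type: foreign
          expr: customer_id
    ---
    data_source:
      name: customers
      sql_table: analytics.dim_customers
      identifiers:
        - name: customer
          type: primary
          expr: customer_id
        - name: country             # and from customers out to countries
          type: foreign
          expr: country_code
      dimensions:
        - name: sales_region
          type: categorical
    ---
    data_source:
      name: countries
      sql_table: analytics.dim_countries
      identifiers:
        - name: country
          type: primary
          expr: country_code
      dimensions:
        - name: market_size
          type: categorical
    ---
    metric:
      name: revenue
      type: measure_proxy
      type_params:
        measure: revenue

With the identifiers declared, a request like "mf query --metrics revenue --dimensions customer__country__market_size" can resolve the joins itself, while the tables behind sql_table stay under whatever centralized warehouse logic you already have.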
For any dbt users, their reliability package has the best and most comprehensive way to upload artifacts directly to the warehouse after a dbt invocation.
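For anyone who'd rather see the core pattern before pulling in a package: after the invocation, parse target/run_results.json and load one row per node into a warehouse table. A minimal sketch, with sqlite3 standing in for a warehouse client and an invented table name:

    import json
    import sqlite3

    with open("target/run_results.json") as f:
        artifact = json.load(f)

    # One row per executed node: id, status, and runtime in seconds.
    rows = [
        (r["unique_id"], r["status"], r.get("execution_time"))
        for r in artifact["results"]
    ]

    con = sqlite3.connect("warehouse.db")  # swap in your warehouse connection
    con.execute(
        "create table if not exists dbt_run_results"
        " (unique_id text, status text, execution_time real)"
    )
    con.executemany("insert into dbt_run_results values (?, ?, ?)", rows)
    con.commit()

A proper package layers a lot on top of that (manifest.json, sources, schema handling), which is where the comprehensive part comes in.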
Thank you! We believe that this upload is super valuable and could unlock a lot of additional use cases. We are already working on some of these and will release them in the next few weeks.
> And I wonder if a person free of cognitive distortions would even be referred to as human, as the quote goes:
There's a difference between emotional intuition and emotional reasoning (the cognitive distortion in OP's example).
Emotions are extremely valuable for decision-making (e.g. this house ticks all my boxes, but do I love it?) and for making judgements (e.g. this situation does not feel right to me).
Emotional reasoning is when people distort reality to fit their (often self-destructive) emotional impulses, discarding physical evidence in favor of their emotions.
Thanks so much for your support from the very beginning! We've only been in market publicly for a little over a year now, actually. We figured better late than never (and it still feels early for us!) :)