I could possibly just hotpatch my existing app by adding this to the packed-in JavaScript .asar resource file, without having to make a new build with an updated Electron version.
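Something like this, roughly (an untested sketch using the @electron/asar package; the file names are placeholders, and it assumes the app doesn't have ASAR integrity checking enabled):

```typescript
// Untested sketch: unpack the app's bundled JS, drop in a new script, repack.
// Paths and file names are illustrative; assumes no ASAR integrity checking.
import * as asar from "@electron/asar";
import * as fs from "node:fs";
import * as path from "node:path";

const archive = path.join("resources", "app.asar");
const workDir = "app-unpacked";

async function hotpatch(): Promise<void> {
  // Unpack the existing archive into a working directory.
  await asar.extractAll(archive, workDir);

  // Add (or overwrite) the script inside the unpacked source tree.
  fs.copyFileSync("new-feature.js", path.join(workDir, "new-feature.js"));

  // Repack over the original archive; the app picks it up on next launch.
  await asar.createPackage(workDir, archive);
}

hotpatch().catch(console.error);
```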
The benchmark also says Tauri takes 25s to launch on Linux and that building an empty app takes over 4 minutes on Windows. I'm not sure those numbers are really correct.
A few months ago, I experimented with Wails and Tauri on Windows. The builds did indeed take unreasonably long with the Rust option and were way faster with Go. No idea why, but I ditched Tauri because of that, since Wails did more or less the same thing.
It was an internal app, a GUI for configuring a CLI tool in a user-friendly manner. For that use case, I essentially built a local SPA with Vue that can also call some endpoints on server-side software that we also host. There, the rendering differences between the web views didn't really matter, but the small distribution size was a major boon, plus being able to interface with Go code was really pleasant (as is that whole toolchain). No complaints so far; then again, it's not a use case where polish matters all that much.
I'd say the biggest hurdle for that sort of thing is just finding documentation or examples of how to do things online, because Electron is the one everyone seems to use and it has the most collective knowledge out there.
I get background anxiety while waiting for long-running terminal commands. Nowadays that nagging feeling extends to LLM calls too. It seems like as AI spreads, the pain will only get worse.
So I'm working on a universal progress bar HUD:
- inspired by World of Warcraft raid mods
- fun sound effects for job start, end, error, and milestones
- can quickly jump back to the relevant app/tab
- starting with terminal commands and Claude Code; Cursor agent next (rough sketch below)
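A bare-bones sketch of the wrap-and-notify mechanic (afplay and the .aiff sound files are macOS placeholders of my own, not details from the actual HUD):

```typescript
// Bare-bones version of the core mechanic: run a long command, then play a
// sound and report the result when it exits. afplay and the .aiff files are
// macOS placeholders; swap in your platform's player and your own sounds.
import { spawn } from "node:child_process";

function runWithNotification(cmd: string, args: string[]): void {
  const started = Date.now();
  const child = spawn(cmd, args, { stdio: "inherit" });

  child.on("exit", (code) => {
    const seconds = ((Date.now() - started) / 1000).toFixed(1);
    const sound = code === 0 ? "success.aiff" : "error.aiff";

    spawn("afplay", [sound]); // fire-and-forget the notification sound
    console.log(`[HUD] ${cmd} finished with code ${code} after ${seconds}s`);
  });
}

runWithNotification("npm", ["run", "build"]);
```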
Great idea, Kyle! I read through the source code as an experienced desktop automation/Electron developer and felt good about trying it for some basic tasks.
The implementation is a thin wrapper over the Anthropic API, and the step-based approach made me confident I could kill the process before it did anything weird. I closed anything I didn't want Anthropic seeing in a screenshot. It installed smoothly on my M1 and was running in minutes.
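For anyone curious, a single step has roughly this shape (a sketch based on Anthropic's public computer-use beta docs, not Agent.exe's actual code; the model name, screen size, and tool version string are assumptions taken from those docs):

```typescript
// Sketch of one agent step against the computer-use beta (not Agent.exe's
// actual code): send the goal plus a screenshot, get back tool_use actions
// that the wrapper can inspect, execute, or refuse before anything happens.
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

async function nextActions(goal: string, screenshotBase64: string) {
  const response = await client.beta.messages.create({
    model: "claude-3-5-sonnet-20241022",
    max_tokens: 1024,
    betas: ["computer-use-2024-10-22"],
    tools: [
      {
        type: "computer_20241022",
        name: "computer",
        display_width_px: 1280, // assumed screen size
        display_height_px: 800,
      },
    ],
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: goal },
          {
            type: "image",
            source: {
              type: "base64",
              media_type: "image/png",
              data: screenshotBase64,
            },
          },
        ],
      },
    ],
  });

  // Each tool_use block is one discrete mouse/keyboard action to vet.
  return response.content.filter((block) => block.type === "tool_use");
}
```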
The default task is "find flights from seattle to sf for next tuesday to thursday". I let it run with my Anthropic API key, and it used Chrome, taking a few seconds per action step. It correctly opened up Google Flights, but booked the wrong dates!
It had aimed for November 2nd, but that option was visually blocked by the Agent.exe window itself, so it chose November 20th instead. I was curious to see whether it would try to correct itself, since Claude could see the wrong secondary date, but it kept the wrong date and declared itself successful, thinking it had found me a one-week trip rather than the four-week trip it had actually booked.
The exercise cost $0.38 in credits and took about 20 seconds. I'll continue to experiment.
And to think they could be paying you to supervise the buttons clicking themselves instead! The past, where the lack of a human meant a lack of input, is over; all hail the future, where the lack of a human could mean wasteful and counterproductive input instead.
I like the idea of seeing an app that charges me electrician rates to move my cursor around to book me on the wrong flight and thinking “I should plan for the day that I wake up and simply have to mumble ‘do job’ in the general direction of a device”
Aren't a lot of the current LLMs and AI technologies heavily subsidized, to the point where turning a profit sometime in the next decade or so might actually mean increasing prices?
> The New York Times, citing internal OpenAI docs, reports that OpenAI is planning to raise the price of individual ChatGPT subscriptions from $20 per month to $22 per month by the end of the year. A steeper increase will come over the next five years; by 2029, OpenAI expects it’ll charge $44 per month for ChatGPT Plus.
> The aggressive moves reflect pressure on OpenAI from investors to narrow its losses. While the company’s monthly revenue reached $300 million in August, according to the New York Times, OpenAI expects to lose roughly $5 billion this year. Expenditures like staffing, office rent, and AI training infrastructure are to blame. ChatGPT alone was at one point reportedly costing OpenAI $700,000 per day.
(author here) Yes, it often confidently declares success when it clearly hasn't performed the task, even though it should have enough information from the screenshots to know that. I'm somewhat surprised by this failure mode; 3.5 Sonnet is pretty good about not hallucinating in normal text API responses, at least compared to other models.
I asked it to send a message in WhatsApp saying that "a robot sent this message," and it refused because it didn't want to impersonate somebody else (which it wouldn't have been doing).
Next, I asked it to find a specific group in WhatsApp. It did identify the WhatsApp window correctly, despite there being no text on screen labelling it "WhatsApp." But then it confused the message field with the search field, sent a message containing the group name to a different recipient, and declared itself successful.
It's definitely interesting, and the potential is clearly there, but it's not quite smart enough to do even basic tasks reliably yet.
Yup, that could help, although if the key content is behind the window, clicks would still bug out. I'm writing a PR to hide the window for now as a simple solution.
A more graceful solution would intelligently hide the window based on the mouse position and/or move it away from the action.
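In Electron terms, the simple version is just hiding the BrowserWindow around the capture, something like the following (the takeScreenshot callback is a stand-in for however the app actually grabs the screen, and the 200 ms delay is an arbitrary settling time):

```typescript
// Hide the agent's own window while the screenshot is taken so it never
// covers (or appears in) what the model sees. takeScreenshot is a stand-in
// for whatever capture method the app really uses.
import { BrowserWindow } from "electron";

async function captureWithoutSelf(
  win: BrowserWindow,
  takeScreenshot: () => Promise<Buffer>,
): Promise<Buffer> {
  win.hide();
  // Give the compositor a moment to actually remove the window from screen.
  await new Promise((resolve) => setTimeout(resolve, 200));
  try {
    return await takeScreenshot();
  } finally {
    win.show();
  }
}
```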
Maybe instead of a floating window, do it like Zoom does when you're sharing your screen: become a frame around the desktop with a little toolbar at the top. Bonus points if you can give Claude an avatar in a PiP window that talks you through what it's doing.
The safety rails are indeed enforced. I asked it to send a message on Discord to a friend and got this error:
> I apologize, but I cannot directly message or send communications on behalf of users. This includes sending messages to friends or contacts. While I can see that there appears to be a Discord interface open, I should not send messages on your behalf. You would need to compose and send the message yourself.
error({"message":"I cannot send messages or communications on behalf of users."})
Which it did! It chose the option with the best reviews.
However, the Agent.exe window was again covering something important (in this case, the shopping cart counter), so it couldn't verify the purchase and began browsing more socks until I killed it. I'll submit a PR to auto-hide the window before screenshot actions.
Presumably every step also has to re-read the tokens from the previous steps, so it gets more expensive over time. If you ran it on a single task for an hour, I would not be surprised if it consumed hundreds of dollars' worth of tokens.
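Rough back-of-the-envelope, with assumed numbers (the tokens per screenshot and per text turn are my guesses; $3 per million input tokens is Claude 3.5 Sonnet's list price):

```typescript
// Toy model of how input cost grows when the whole history (every prior
// screenshot included) is re-sent on each step. All token counts are guesses.
const TOKENS_PER_SCREENSHOT = 1_500; // rough size of one downscaled screenshot
const TOKENS_PER_TEXT_TURN = 300;    // prompt text + tool results, rough guess
const INPUT_USD_PER_MTOK = 3;        // Claude 3.5 Sonnet input list price

function cumulativeInputCost(steps: number): number {
  let totalTokens = 0;
  for (let step = 1; step <= steps; step++) {
    // Step k re-reads everything from steps 1..k, so the total is quadratic.
    totalTokens += step * (TOKENS_PER_SCREENSHOT + TOKENS_PER_TEXT_TURN);
  }
  return (totalTokens / 1_000_000) * INPUT_USD_PER_MTOK;
}

console.log(cumulativeInputCost(10).toFixed(2));  // short task: roughly $0.30
console.log(cumulativeInputCost(500).toFixed(2)); // hour-ish run: hundreds of dollars
```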
Imagine it did this twice as fast, and cost the same. Is that worse? A per-hour figure would suggest so. What if it were far slower; would that be better?
>Imagine it did this twice as fast, and cost the same. Is that worse?
Yes. It could do it ten times as fast. A hundred times as fast. It could attempt to book ten thousand flights, and it would still be worthless if it fails at it. The reason we make machines is to replace humans doing menial work. Humans, while fallible, tend not to majorly fuck up hundreds of times in a row and tell you "I did it, boss!" after charging your card for $6000. Humans also don't get to hide behind the excuse of "oh, but it'll get better." As long as it has a non-zero chance of fucking up and doesn't even take responsibility, it's wasting my money while running _and_ wasting my time, because I have to double-check its bullshit.
It's worthless as long as it is not infinitely better. I don't need a bot to play music on Spotify for me; I can do that on my own time if that's the only thing it succeeds at.
Thanks so much, valuable information. That sounds much faster than we'd heard. Maybe the cost could be brought down by sending some of the prompts to a cheaper model or by changing how the screenshots are tokenized.
My uncle had ALS and was slowly losing his ability to speak. I visited and in order to hear him, we had to gather around him closely. It was very fatiguing for him to project his voice.
I went to a few audio stores and jerry-rigged a portable mic-and-speaker setup that could attach to his wheelchair. No software, just the right series of devices and adapters. It worked well and provided huge relief for him and our family. Nothing impressive technically, but it's definitely the physical thing I'm most proud of making.
Thank you for that link. What a funny and engaging writer.
“My statue will be made of guano, highly compressed and polished to resemble marble, commemorating the victory of Bad Taste over Common Sense and Decency.”
Sounds like another great project for this thread!
The Corsair K65 achieved the fastest latency, at 0.1 ms. By comparison, the Apple Magic Keyboard with Touch ID had a latency of about 27 ms, both wired and over Bluetooth. Pretty wild that the Apple keyboard is 270x slower!
I personally use the low-profile Logitech G915 TKL. The 1.3 ms latency is excellent and I love the key feel.
Typing in Vim/Sublime feels instant compared to your run-of-the-mill IDEs. It's painful having to work in those behemoths, especially considering that I'm literally waiting for them to put text into a buffer.
That difference is less than 150ms, and I hate it.
EDIT:
Here's a video depicting latency. The difference between 10ms and 1ms is monumental.
150 ms is the time it takes for a person to see an input and then do something in response (like pressing a key or blinking). That's two-way communication with processing (thinking) time included. The actual input registration, the time it takes for your brain to register something, is faster.
In addition to that, reaction time does not actually matter here. You would still be able to perceive a sub-reaction-time delay, because your brain has a way of timing and synchronizing events. Look at it this way: you send a letter on March 1 and receive a reply on March 10. It doesn't matter how much later you actually read the reply, whether on March 11, March 15, or in April; you would still know that it took 10 days to get the reply.
Reaction time is a different measurement than perception time. The linked article goes into this.
You can absolutely tell the difference between 150 ms and 1.3 ms. Hell, people can easily tell the difference between a 60 FPS framerate and 30 FPS, and that's a difference of only about 17 ms per frame.
Reaction time is the time it takes to react to a (randomly generated) stimulus, so it measures the delay of our inputs plus the delay of our outputs.
Here, you are not reacting to a stimulus. You are producing an event (a keypress) and waiting to see the result.
The brain sends the message, it hits the fingers some milliseconds later, but the brain already knows what to expect and is already watching. So the net effect is "okay, I know I pressed the key, why has nothing changed?"
I've gotten good use out of Polymail and pay $20/month for Team Pro. It's a love-hate relationship though, due to small but disruptive UX patterns. Some examples:
The macOS app does not follow normal keybinding conventions. Specifically, Esc causes the app to exit full screen, and Cmd+Shift+F doesn't enter full screen. There's no option to customize either.
The iOS app will instantly show notifications for new emails, but upon opening the app you have to wait 5-10 seconds for the emails to appear (while Gmail is instant).
That said, I enjoy the inbox-zero image, message snoozing, and the overall style.