Congratulations on the new release! I've seen some forum discussions on this in the past, and I'd imagine it's a frequently debated topic. However, I'd like to ask about the technical feasibility of implementing a feature similar to Ableton's 'Warp' within Ardour. I understand that Ardour and Ableton have fundamentally different architectures and that different DAWs can prioritize different workflows. Given the current state of the codebase and the development roadmap, I'm curious how realistic the implementation of BPM-synced time-stretching actually is or if it remains significantly outside the project's scope.
The biggest issue here is that the best library for doing audio warping (ZPlane) is not available to us. We already do realtime audio warping for clip playback, just like Ableton, using RubberBand (and might consider using Staffpad at some point, which we have available for static stretches).
However, following the tempo map is a very different challenge than following user-directed edits between warp markers, and neither RubberBand nor Staffpad really offer a good API for this.
In addition, the GUI side of this poses a lot of questions: do you regenerate waveforms on the fly to be accurate, or just use a GUI-only scaling of an existing waveform to display things during the editing operation?
We would certainly like to do this, and have a pretty good idea of how to do it. The devil, as usual, is in the details, and there are rather a lot of them.
There's also the detail that having clips be bpm-synced addresses somewhere between 50% and 90% of user needs for audio warping, which reduces the priority for doing the human-edited workflow.
>do you regenerate waveforms on the fly to be accurate, or just use a GUI-only scaling of an existing waveform, to display things during the editing operation
just use GUI scaling, and only IF the former (regenerating on the fly) is too challenging
You often want sample-accurate waveform visualization when tuning time- or pitch-warped samples, so you can set start and loop points at zero crossings and avoid clicks without needing fades.
Overwhelmingly, there's no such thing as a zero crossing. Your closest real-world case is a point in time (between samples) where the previous sample is positive and the next one is negative (or vice versa). However, by truncating the next sample to zero, you create distortion (and if the absolute value of the preceding sample is large, very significant distortion).
Zero crossings were an early myth in digital audio promulgated by people who didn't know enough.
Fades are always the best solution in terms of limiting distortion (though even then, they can fail in pathological situations).
There's definitely such a thing as a zero crossing: it's where sign(x[n-1]) != sign(x[n]) (or rather, there's "no such thing as a zero crossing" only in the same way there's no such thing as a peak). Picking a suitable `n` as a start/end point for sample editing is a judgement call, because what you're trying to minimize is the step between two samples, since it's conceptually a unit impulse in the sequence.
I don't think people who talk about zero crossings were totally misguided. It's a legitimate technique for picking start/end points of your samples and tracks. Even as a first step before BLEP or fades.
Theoretically, it makes sense (go look at any of the diagrams of what a "zero crossing" is online, and it totally does).
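For what it's worth, that sign-change definition is trivial to scan for. A minimal sketch in Python (the helper name is made up for illustration):

```python
def zero_crossings(x):
    """Return indices n where the sign differs between x[n-1] and x[n]."""
    return [n for n in range(1, len(x)) if (x[n - 1] >= 0) != (x[n] >= 0)]

# short synthetic buffer: the crossings fall *between* samples, not on one
print(zero_crossings([0.8, 0.3, -0.4, -0.9, -0.2, 0.5]))  # -> [2, 5]
```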
The problem is that sign(x[n-1]) != sign(x[n]) describes a place where two successive samples differ in sign, but no sample actually has a value of zero. Thus, to perform an edit there, if your goal is to avoid the click that truncating at a non-zero sample value would cause, you need to add/assign a value of zero to a sample. This introduces distortion - you are artificially changing the shape of the waveform, which implies the introduction of all kinds of frequency artifacts.
Zero crossings are not computed by finding a minimum between two consecutive samples - that would almost never involve a sign change. And if they are computed by finding the minimum between two consecutive samples that also involves a sign change, there's a very good chance that you'll be a long way from your desired cut point, even if you ignore the distortion issue.
It really was a completely misguided idea. If the situation was:
sign(x[n-2]) != sign(x[n]) && x[n-1] == 0
then it would be great. But this essentially never happens in real audio.
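To make the "essentially never happens" point concrete, here's a sketch: sample a sine whose frequency doesn't divide the sample rate evenly (441 Hz at 48 kHz is an arbitrary choice), and both samples flanking the first sign change come out non-zero:

```python
import math

sr = 48000
f = 441.0  # does not divide the sample rate evenly
x = [math.sin(2 * math.pi * f * n / sr) for n in range(1000)]

# first sign change: the waveform crosses zero *between* these two samples
n = next(i for i in range(1, len(x)) if (x[i - 1] >= 0) != (x[i] >= 0))
print(n, x[n - 1], x[n])  # neither neighbour is zero; zeroing one adds a step
```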
> Thus, to perform an edit there, if your goal is to avoid a click by truncating with a non-zero sample value, you need to add/assign a value of zero to a sample.
No, you (the editor, not an algorithm) look at the waveform and see where the amplitude begins to significantly oscillate and place the edit at a reasonable point, like where the signal is near the noise floor and at a point where it crosses zero. There's no zero stuffing.
This kind of thing isn't computed, a human being is looking at the waveform and listening back to choose where to drop the edit point. You don't always get it pop-free but it's much better than an arbitrary point as the sample is rising.
I mean, you could use an algorithm for this. It would be a pair of averaging filters with something like a VAD, but with lookahead, picking an arbitrary point some position before activity is detected (peak - noise_floor > threshold), which could be where avg(x[n-N..n]) ~= noise_floor && sign(x[n]) != sign(x[n-1]).
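Sketching that idea in code (everything here - the window size, threshold, and function name - is a made-up illustration, not a real implementation):

```python
def pick_edit_point(x, window=32, threshold=0.05):
    """Hypothetical sketch: find where the short-term level rises above the
    noise floor (with lookahead), then back up to the nearest sign change."""
    # noise floor estimated from the opening samples
    noise_floor = sum(abs(s) for s in x[:window]) / window
    for n in range(window, len(x)):
        # short-term average level, looking *ahead* of n
        level = sum(abs(s) for s in x[n:n + window]) / window
        if level - noise_floor > threshold:  # activity detected ahead
            # back up to the nearest preceding sign change
            for m in range(n, 0, -1):
                if (x[m - 1] >= 0) != (x[m] >= 0):
                    return m
            return n
    return 0

# low-level alternating 'noise', then a loud burst starting at sample 100
x = [0.001 if i % 2 else -0.001 for i in range(100)] + [0.5] * 50
print(pick_edit_point(x))  # lands shortly before the burst
```

The "lookahead" is the x[n:n+window] slice: the level is measured ahead of n, so the returned point sits just before the detected activity, at a sign change.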
> You don't always get it pop-free but it's much better than an arbitrary point as the sample is rising.
I agree with this, but that doesn't invalidate anything I've said. When you or a bit of software decide to make the cut at x[n], you are faced with the near certainty that x[n] != 0. If you set it (or x[n+1]) to zero, you add distortion; if you don't, the risk of a pop is significant.
By contrast, if you apply a fade, the risk of getting a pop is negligible and you can make the cut anywhere you want without paying attention to 1 sample-per-pixel or finer zoom level and the details of the waveform.
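For illustration, even a short linear fade-out guarantees the last sample is exactly zero no matter where the cut lands (a sketch; the fade length and function name are made up):

```python
def cut_with_fade(x, cut, fade_len=32):
    """Truncate x at `cut`, replacing the hard edge with a linear fade-out."""
    out = list(x[:cut])
    fade_len = min(fade_len, len(out))
    for i in range(fade_len):
        # ramp the last fade_len samples down to exactly zero
        out[len(out) - fade_len + i] *= 1.0 - (i + 1) / fade_len
    return out

y = cut_with_fade([0.5] * 100, 80, fade_len=8)
print(y[-3:])  # -> [0.125, 0.0625, 0.0]
```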
Thanks very much, this sub-thread has been illuminating for me, and has the compelling quality of being obvious-in-retrospect. I now wonder what my MPC is doing, exactly, when I make an action at what appears to be a zero point. Thanks.
I had an excellent experience using niri to manage visuals for a university event recently. I was handling background loops, videos, and slides, and since it was a last-minute setup I had to improvise the entire workflow. The scrollable layout was perfect for this: I organized the windows horizontally in the exact sequence they needed to appear, which effectively turned a mix of an image viewer, video player, browser, and PDF readers into a seamless presentation (I could just have used the browser, but it would have felt clumsy in comparison). The ability to keep everything in fullscreen and instantly exchange windows between displays via keyboard shortcuts made the transitions almost invisible to the audience.
PS although the add-on was removed from Mozilla's add-on store (AMO) because of a DMCA takedown notice, it's still signed and manually checked for security by Mozilla (hence the delay in signing).
That explains why I couldn't find it. I believe this is the most comprehensive and up-to-date paywall bypasser out there.
What is the actual complaint here? Are people demanding commercials be beautiful? Before being AI slop, it is marketing slop. Why are they demanding 'soul' from an ad in 2025? Everything in this late-stage capitalist landscape is slop. They could have filmed it with real actors (or just reprised a spot from 15 years ago) and it wouldn't make any difference.
Because it's the TV yelling at you something along the lines of "Hey look, we replaced the creativity of dozens of people with this shitty result from a prompt, your job is next."
The fact that it's so bad that it obviously doesn't adhere to any sort of quality standards we expect from humans is just adding insult to injury. It tells people "AI doesn't even need to be better at your job than you to replace you."
Companies generally want their ads to not be horribly off-putting.
> They could have filmed it with real actors (or just reprised a spot from 15 years ago) and it wouldn't make any difference.
I mean, it was conceptually bad to start with, but also it has a lot of unsettling AI video stuff (in particular, broken physics) that you wouldn't get with a real ad.
"In 2025, YouTube started rolling out a new streaming protocol, known as SABR, which breaks down the video into smaller chunks whose internal URLs dynamically change rather than provide one whole static URL. This is problematic because it prevents downloaders (such as yt-dlp) from being able to download YouTube videos at resolutions higher than 360p due to only detecting format code 18 (which is the only format code available that doesn't use SABR). So far, this issue has only affected the web client, so one workaround would be to use a different client, such as tv_embedded (where SABR has not yet been rolled out to), so for instance in yt-dlp you could add --extractor-args "youtube:player_client=tv_embedded" to use that client. It is not known how long this workaround will work as intended, as YouTube rolls out SABR to more and more clients."
Sorry, I saw the submission (no votes and aging), upvoted it and left the comment thinking the post would die. But someone thankfully did what I should have.