What Does a Privacy-First AI Translation Stack Look Like?

Pbiz@lemmy.world · 22 days ago

Sergio@piefed.social · 22 days ago

lel I worked on a couple speech interface projects back in the 00s before all these corporate spyware platforms emerged. Naturally, it was all on-device (or a local server we controlled). This was more R&D/prototype stuff so it wasn’t as robust as systems nowadays, but the software is still out there:

Speech Recognition: https://cmusphinx.github.io/
we weren’t doing translation so idk about that
Text-To-Speech: https://github.com/festvox/festival
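
For anyone who wants to wire pieces like these together, here is a rough sketch of how an on-device chain could be laid out: recognition, then translation, then speech output, each as a swappable function. Everything in it is an assumption for illustration. The class name and the three `fake_*` stages are hypothetical stand-ins, not the CMU Sphinx or Festival APIs, and a real setup would replace them with calls into whatever local engines you run.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LocalSpeechPipeline:
    """Chains three local stages; nothing in this class touches the network."""
    recognize: Callable[[bytes], str]    # audio -> source-language text
    translate: Callable[[str], str]      # source text -> target-language text
    synthesize: Callable[[str], bytes]   # target text -> audio

    def run(self, audio: bytes) -> bytes:
        text = self.recognize(audio)
        translated = self.translate(text)
        return self.synthesize(translated)


# Hypothetical stand-ins so the sketch runs without any engine installed.
def fake_recognize(audio: bytes) -> str:
    # Pretends the audio bytes are already the spoken text.
    return audio.decode("utf-8")


def fake_translate(text: str) -> str:
    # Tiny lookup table; unknown input passes through unchanged.
    return {"hello": "hola"}.get(text, text)


def fake_synthesize(text: str) -> bytes:
    # Pretends the encoded text is the synthesized audio.
    return text.encode("utf-8")


pipeline = LocalSpeechPipeline(fake_recognize, fake_translate, fake_synthesize)
result = pipeline.run(b"hello")
```

Keeping each stage behind a plain function makes it easy to swap one engine for another without touching the rest, and to check that no stage quietly calls out to a remote service.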

Pbiz@lemmy.world · 16 days ago

That’s really interesting. Sometimes it feels like local AI is a new idea, but a lot of the foundations were already there years ago.

The difference now is that the models have become good enough that these kinds of workflows are practical for everyday users, not just research projects.

Sergio@piefed.social · 16 days ago

All AI was local until recently (late 2010s maybe?). It’s important not to let the cloud providers gaslight us.

“these kinds of workflows are practical for everyday users”

Kind of. A good system will still have a lot of design to it. If you just take an off-the-shelf LLM and do the minimal tuning for it to do the job, then you’ll get just another crappy system.

Pbiz@lemmy.world · 15 days ago

That’s a fair point. A good user experience usually comes from the engineering around the model, not just the model itself.

The AI gets most of the attention, but things like latency, workflow design, context handling, and reliability often make the difference between something people try once and something they actually use.