Here's a call out to see if anyone is interested in working together to
create a homebrew language translation tool. Using emerging powerful
free software tools like OpenNMT (https://opennmt.net/), we can actually
provide a concrete alternative to Google Translate and non-friends that
is self-hosting, configurable and extendable by mere mortals!
Proprietary, walled garden language translation services like Google
Translate, DeepL Translator and SYSTRAN are all built on public
research, data and investment. https://publiccode.eu/ makes the argument
against these services remaining privately owned and controlled.
Let's build our own! We're not starting from scratch. I propose to start
with an English<->Dutch (surprise!) translation tool. Rough steps might be:
1) Build a Dutch prototype from the official OpenNMT German translation tutorial.
2) Using OpenNMT TensorFlow integration, run a model training LAN party!
To build a useful translation tool, we need to train a model on really
huge amounts of data. We can surely find the data, but the computing
power is not so easy to come by (read: €€€ necessary). So, using OpenNMT's
TensorFlow integration (ironically a Google tool, but FLOSS), we can build
a local network of computers to run a model training in a reasonably
short amount of time.
If this is possible, who says we can't build a decentralised network of
computing resources to run a training to help improve the translation
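To make the "many machines, one training" idea a bit more concrete, here's a toy sketch of synchronous data-parallel training in plain Python. It is only a simulation under loud assumptions: there is no real network, no OpenNMT and no TensorFlow involved, the "model" is a two-parameter line fit, and every name in it (local_gradient, train, make_shards) is made up for illustration. The point is just the pattern: each machine computes a gradient on its own slice of the data, the gradients are averaged, and everyone applies the same update.

```python
# Toy simulation of synchronous data-parallel training.
# Assumption: this is NOT OpenNMT or TensorFlow code; all names are invented.
import random


def local_gradient(w, b, shard):
    """Mean-squared-error gradient of y ~ w*x + b on one machine's shard."""
    n = len(shard)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in shard) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in shard) / n
    return grad_w, grad_b


def train(shards, steps=500, lr=0.05):
    """Average the per-machine gradients each step, then update shared weights."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        # On a real LAN each of these calls would run on a different computer.
        grads = [local_gradient(w, b, shard) for shard in shards]
        grad_w = sum(g[0] for g in grads) / len(grads)
        grad_b = sum(g[1] for g in grads) / len(grads)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def make_shards(num_machines=4, points_per_machine=25, seed=0):
    """Generate noise-free points on y = 3x + 1 and deal them out to machines."""
    rng = random.Random(seed)
    total = num_machines * points_per_machine
    xs = [rng.uniform(-1, 1) for _ in range(total)]
    data = [(x, 3.0 * x + 1.0) for x in xs]
    return [data[i::num_machines] for i in range(num_machines)]


if __name__ == "__main__":
    w, b = train(make_shards())
    print(f"learned w={w:.3f}, b={b:.3f}")
```

With equally sized shards, averaging the per-machine gradients gives the same update as one big machine looking at all the data, which is why adding computers can shorten training without changing the result. A real setup would also have to ship gradients over the network, which is where most of the practical pain lives.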
3) Make It Homebrew-able
At this point, we'll want to experiment with how to package the tool,
document it, make it configurable, self-hostable, extendable etc.
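As one possible shape for "configurable and extendable", here's a minimal sketch of a plug-in registry where the translation engine is chosen by name from a config. Everything in it is hypothetical: the names (BACKENDS, register, translate, "toy-en-nl") are not from OpenNMT or any existing project, and the toy backend is a two-word lookup table standing in for a trained model.

```python
# Hypothetical sketch of a pluggable translation backend.
# Assumption: none of these names exist in OpenNMT; they are illustrative only.
from typing import Callable, Dict

# A backend is any function that maps source text to translated text.
Backend = Callable[[str], str]

BACKENDS: Dict[str, Backend] = {}


def register(name: str) -> Callable[[Backend], Backend]:
    """Decorator that makes a backend selectable by name."""
    def decorator(fn: Backend) -> Backend:
        BACKENDS[name] = fn
        return fn
    return decorator


@register("toy-en-nl")
def toy_en_nl(text: str) -> str:
    """Stand-in word-for-word lookup; a trained model would replace this."""
    lexicon = {"hello": "hallo", "world": "wereld"}
    return " ".join(lexicon.get(word, word) for word in text.lower().split())


def translate(text: str, config: Dict[str, str]) -> str:
    """Pick the backend named in the config and run it on the text."""
    name = config["backend"]
    if name not in BACKENDS:
        raise KeyError(f"unknown backend: {name}")
    return BACKENDS[name](text)


if __name__ == "__main__":
    print(translate("Hello world", {"backend": "toy-en-nl"}))
```

The idea is that someone self-hosting the tool could add a new language pair or engine by registering one more function, without touching the rest of the code.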
If we get this far, we're doing very well :). However, further ideas:
You may be familiar with memrise.com, duolingo.com and reverso.net. All these
tools rely on some translation but also on some way to analyse the
language. Luckily for Dutch, we have the amazing
We could start to build some common language learning tools. We could
also try to cooperate with both: http://cls.ru.nl/languagemachines/
for further experimenting.
I'm no NLP hacker. I'm willing to learn though! Drop me a line if interested :)