Loomio

Language filter for diaspora - as a GSoC project

Pirate Praveen

We should detect the language of a post (and also give the user an option to specify it manually) and allow everyone to filter content based on the languages they know.
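As a rough illustration of the idea above, here is a minimal sketch in plain Ruby. Everything in it is an assumption made for illustration and not diaspora code: the `Post` struct, the `manual_language` field, and the tiny stopword lists are hypothetical, and the stopword counter is only a stand-in for a real language detector. Posts whose language cannot be detected are kept in the stream, which is a design choice rather than something decided in this thread.

```ruby
require "set"

# Hypothetical stand-in for a real detector: a few common stopwords per language.
STOPWORDS = {
  "en" => %w[the and is of to in that it],
  "de" => %w[der die das und ist nicht ich zu],
  "fr" => %w[le la les et est je que pas]
}.freeze

# Hypothetical post model; manual_language is the user's optional override.
Post = Struct.new(:text, :manual_language)

# Returns the language code with the most stopword hits, or nil if none match.
def detect_language(text)
  words = text.downcase.scan(/\p{L}+/)
  scores = STOPWORDS.transform_values { |stop| words.count { |w| stop.include?(w) } }
  best, score = scores.max_by { |_, s| s }
  score.zero? ? nil : best
end

# A manually specified language always wins over detection.
def language_of(post)
  post.manual_language || detect_language(post.text)
end

# Keeps posts in a known language, plus posts whose language is unknown.
def filter_stream(posts, known_languages)
  known = known_languages.to_set
  posts.select do |post|
    lang = language_of(post)
    lang.nil? || known.include?(lang)
  end
end
```

With this sketch, `filter_stream(posts, ["en"])` would hide a post detected as German but keep one that the detector could not classify.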

We suggested it as a Google Summer of Code idea, and one student is interested in it. We have one mentor who knows RoR, but if someone from the community could also support this, it would be awesome.

The discussions so far:
http://lists.smc.org.in/pipermail/student-projects-smc.org.in/2014-March/000076.html
http://wiki.smc.org.in/SoC/2014/Project_ideas#Language_filter_for_diaspora

Maciek Łoziński Thu 20 Mar 2014 10:20AM

Maybe language detection (and translation?) could be done client-side? Are there any JavaScript libraries out there? Or maybe some sort of third-party online service could be used?

Jonne Haß Thu 20 Mar 2014 12:43PM

I still heavily oppose mixing these two together. Language detection and filtering is an already discussed topic and I sense a global consensus for it.

However, post translation is a completely independent feature that has major implications for user experience and the daily usage of diaspora. On a technical level, it also has quite a large impact on the federation protocol.

It just makes no sense at all to mix these together, and I'd like to see a much more thorough discussion in our community about post translation before I consider time spent on the implementation details of how to do it justified.

Karthik Senthil Thu 20 Mar 2014 2:21PM

Due to the disadvantages of translation_tables in globalize, I have decided to replace the globalize gem with other language translation gems (which instead use APIs like Google or Bing). Here are a few of the gems that I have explored:

1) to_lang ( https://github.com/jimmycuadra/to_lang )
2) easy_translate ( https://github.com/seejohnrun/easy_translate )
3) language-translator

I am sure that, apart from these gems, external API calls could also be made directly for the same purpose.

There are also many client-side language translators supported by jQuery (using the Google API). However, these would be third-party software and could raise concerns about the security or robustness of Diaspora.

I am not completely sure whether these would relieve the server of load. Of course, translation can be kept as a secondary idea, to be implemented after the language detector feature has been implemented and rigorously tested. Any feedback on this suggestion?

Florian Staudacher Thu 20 Mar 2014 7:46PM

I agree with what @jonnehass said.
As the initial step the focus should be on detecting the language. That is not a small feature by itself.

Meanwhile we can keep thinking about a way to implement translation that is practical, but also aligns with security and privacy concerns that form the base of what Diaspora stands for.

Those are two separate features and should not be munged into one task.

Karthik Senthil Thu 20 Mar 2014 8:03PM

Hey,
Yes, the initial focus of this GSoC project will be the implementation of language detection and tagging relevant posts (as planned in the previous comments). However, there will be a parallel discussion on how to integrate the translation feature as well.

Ryuno-Ki Wed 26 Mar 2014 9:49AM

I'm against automatic translation through a third-party API.
If I'm interested, I can always paste the content into translate.google.com myself without sending data to it. I dislike the exposing …

Globulle Mon 26 Jan 2015 8:42PM

Hi, is there any news on this project? I feel it would be very useful to be able to filter out posts that are not comprehensible to the user.

Pierre-Yves Tue 7 Apr 2015 7:13PM

How about allowing users to specify the languages they know in their profile, so that it could at least be used to exclude from the streams the posts of other users who have no language in common with them?

For example, if I choose English and French and someone else has defined English and Italian, I might still see posts in Italian from that person if we use the same hashtag, but at least I wouldn't see posts from somebody who has defined Dutch, German and Spanish as favorite languages.

This is far from perfect (unless you only select one language), but it would still improve the streams without using a third-party API (like Google Translate), which is a problem for some people, and I suppose it shouldn't be a very complex development?
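The profile-based rule described above can be sketched roughly as follows. This is only an illustration in plain Ruby under stated assumptions: `Person`, `Entry`, `shares_language?` and `visible_entries` are hypothetical names, not part of diaspora, and treating an empty language list as "no preference, show everything" is an assumption of the sketch.

```ruby
# Hypothetical models: a person with the languages set in their profile,
# and a stream entry written by that person.
Person = Struct.new(:name, :languages)
Entry = Struct.new(:author, :text)

# True when reader and author have at least one profile language in common.
# Assumption: a person who set no languages is never filtered out.
def shares_language?(reader, author)
  return true if reader.languages.empty? || author.languages.empty?
  !(reader.languages & author.languages).empty?
end

# Keeps only the entries whose author shares a language with the reader.
def visible_entries(reader, entries)
  entries.select { |entry| shares_language?(reader, entry.author) }
end
```

As noted above, this filters by author rather than by post: an author with English and Italian in their profile still passes the filter for an English and French reader, even when a particular post is in Italian.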

Mikaela Suomalainen Sat 23 Apr 2016 6:35AM

I would like to have an option to filter out languages other than Finnish and English, as I don't understand other languages.

Currently I can unaspect people who mainly write in languages I don't understand, but there are still followed hashtags, which aren't restricted to one specific language, either because the same word exists in both languages or because everyone just uses that word instead of whatever the word in their own language is.