Maneno

Translating the technology words

Available in: English
03 03 2010
Countries:
AFRICA

The Kamusi Project has just tossed their hat into the ring of folks who are working to get African languages adapted to 21st century technology terminology. They made the official announcement about a site they've set up to try and further the goal of getting Swahili words adapted to computers. I mean, after all, according to Google Translate, 'computer' in Swahili is 'kompyuta', which for some reason I don't really buy as being a terribly Swahili word.

This frustration is echoed by Rebecca in a recent tweet:

Just done an interview abt @iHubNairobi with #BBCSwahili; I need to practice Swahili more....whats a domain in Swahili? More research needed

Again, according to Google Translate, 'domain' (as in web domain name) is 'miliki', which does sound like a proper Swahili word, but I'm assuming it has a completely different meaning, probably having to do with rule of land or something. And that's the issue: do you adopt some "3rd party" loan word for these purposes, or do you come up with a new word? Because let's face it, no one is probably going to call the 'web' 'mtandao', as it's just too long and everyone knows the word web now.

Let me emphasize that this doesn't just affect African languages. The problem exists everywhere, which is why a word like 'web' is just 'web' in Spanish despite there officially being no 'w' in the alphabet. It's also why speakers of Croatian will say SAD instead of Sjedinjene Američke Države for the United States of America, due to the length.

I suppose that in the end there needs to be a balance of ease with authenticity when it comes to adding new words to a language. I just hope that efforts like the ones from Kamusi and ANLoc (site appears to be down?) gain some traction, because it's a problem that isn't going to go away and will only get worse as time goes on. Just look at German, which currently has 8,000 loan words from English. At what point is your language (and thus, your identity) no longer yours?

When language reflects life

Available in: English
19 02 2010
Countries:
AFRICA
COTE D'IVOIRE

"Je suis low batt." In French, this literally means, "I am low battery." It doesn't make much sense on its own, but in the context of how Michaela Wrong talked about it in her book, "In the Footsteps of Mr. Kurtz" it is a phrase that she uses to sum up the great wealth of issues which plague the Congo. It harks back to when the first mobiles arrived in the country, which had a very annoying tendency to die after some 20 minutes of talking. Thus, the speaker would typically always have to tell the listener that they were out of battery and had to either switch phones or go charge. The term took on something of a life of its own and came to mean that something in general had run down.

We are at a point now here in Abengourou where the power cuts (or délestage if you will) have become regimented in that they're from 01:00 to 09:00 every day which coincides with no water during that time plus another 3-5 hours afterward as the system rebuilds pressure. Living life around this schedule is not what I would call choice, but it is doable, especially as you know that it's coming.

It just so happens that for the first time today, I heard of someone being asked how late he was out last night, to which he replied, "Oh, we were out past-cut," meaning past 01:00. I'm sure that others are saying to make sure not to flush the toilet until post-cut as well. As this way of life has become unfortunately ordinary (which is a shame as the resources do indeed exist) we have taken to incorporating it into everyday language. No one probably even notices this, but it happens all the time, like when we say, "go Google it" when we mean to look something up online or "grab a Kleenex" when we mean a tissue.

I don't have the perception that people in Africa do this any more or any less than anywhere else in the world, but I find it more noticeable given that when it happens, it's usually a bending of pre-existing words or phrases, whereas in North America or Europe, it's the straight up adoption of a product name given the constant media and marketing blasts that permeate those societies a good deal more. Of course, many people here in Côte d'Ivoire keep insisting that the word for pen is "bic" instead of "stylo" or to grab a "Lotus" (a local brand) instead of a "mouchoir" so, I suppose the jury is out to some degree even still. We humans do enjoy our products; the power cuts, not so much.

The Gate2Home Virtual Keyboard: Awesome in Typing Form

Available in: English
08 01 2010
Countries:
AFRICA

It appears that this site has been registered since 2006, but I'm not sure how long gate2home has been in its current incarnation, which quite honestly, kicks ass. This site is an onscreen, virtual keyboard that allows you to type with the characters you need in your language and then paste them into whatever text program you need. And while it may initially look like it's one of those hunt and peck things that you can use in other systems, it's not. You can actually type from your computer keyboard and it maps the characters accordingly. The fellow who created it did so out of personal need, which is where I find a great deal of the best projects come from.
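To give a rough idea of what "maps the characters accordingly" means, here is a minimal sketch in Python. It is not how gate2home is actually built (I have no idea what their code looks like); the function name and the tiny six-key table are my own, and the table only covers the top-left keys of the standard Russian layout as an illustration.

```python
# Hypothetical, partial key map: what you press on a QWERTY keyboard
# versus what the standard Russian layout puts on that same key.
QWERTY_TO_RUSSIAN = str.maketrans({
    "q": "й",
    "w": "ц",
    "e": "у",
    "r": "к",
    "t": "е",
    "y": "н",
})


def remap(typed: str) -> str:
    """Translate typed characters through the layout table.

    Characters that are not in the table pass through unchanged.
    """
    return typed.translate(QWERTY_TO_RUSSIAN)


print(remap("qwerty"))  # the six top-row keys in the Russian layout
```

A real virtual keyboard would need a full table per language, plus handling for shift and dead keys, but the core idea is just a lookup from the key you pressed to the character you wanted.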

The reason that this caught my attention was the fact that when you open up the initial dropdown to choose your language, right there, bam, at the top is Akan. But, the African languages don't just stop there. There is also Bambara, Bemba, Fulfulde, Ga, Hausa, seSotho, Lingala, Yoruba, and a wonderful slew of others. And naturally there are a lot of other languages beyond the African ones, such as the developer's original Hebrew.

Having dealt with installing keyboard language packs and dealing with all the issues around the fact that American operating system manufacturers don't really care about languages in general, this is a godsend. Obviously, I probably wouldn't use it on an incredibly long-term basis (language pack, you are there for me on that for now...) but for short things, or maybe even decently long ones, this is really, really cool.

For anyone who writes and speaks in a language with characters beyond the extended Latin set, I really recommend checking this out and seeing how well it works for you. Or just keep it in mind if you, like the developer, find yourself in an internet cafe trying to use the local keyboards. I'm still traumatized by the Belgian French layout.


Help out this world language list

Available in: English
05 01 2010
Countries:
AFRICA

Following on my prognostication for 2010, I came across a page on Wikipedia for the total number of language speakers in the world. I applaud the fact that this list was created as it is interesting to see. It's just a shame that some of the figures are insanely inaccurate, which is probably why it has been proposed to delete the article.

One of my barometers on anything to do with world languages is Croatian, or Serbo-Croatian if you will. On this chart, it is listed as the 50th most spoken language in the world. That really doesn't seem correct, and a great deal of the numbers are out of whack. Digging deeper, I see that some of the African language totals are worse than a stab in the dark. For the most obvious starter, take a look at Kiswahili. It's listed with 5 million native speakers and 80 million secondary speakers. Most accounts I've seen have it listed at 100-150 million speakers. Some documentation is needed there.

It's little things like this that make this list need a great deal of love and it's unfortunate to see that despite all the activity on it, so many of the figures are quite inaccurate. So, I ask of anyone out there with some language knowledge to document and contribute to this list in order to make it something a great deal more respectable, at least on the African front if nothing else.


2010: The year of language

Available in: English
02 01 2010
Countries:
AFRICA

When it comes to web technology trends, there is typically one that is the sexiest one for that year. For example, "mobile" was the one for 2009.

I'm going to go out on what I feel to be a rather thick limb and say that 2010 is going to be the year of language. We've been seeing multi-lingual efforts grow by leaps and bounds over the past years and it seems that we're getting to a point where most people I know say, "Hey, Google Translate doesn't just simply translate literally, but it's actually quite good." The web has matured in the possibilities it allows in being able to cross the borders formed by language.

Nowhere is this more the case than in Africa. I see 2010 as a pivotal year in African languages getting online. Jimmy Wales wants more African languages in Wikipedia and there has been a good deal of push by Google in this department with their Kiswahili Wikipedia Challenge that the Google Africa blog covered two weeks after it was over--how timely. But, the fact is that while all kinds of money and effort can be tossed at getting more African languages into a digital format, if it doesn't come from Africans, it's not going to take root.

While a great many African languages were alphabetized into Latin character sets a century ago by missionaries, it's unfortunate to see that despite this, so many languages, while spoken, are not able to be read or written (Kiswahili and a handful of others are indeed working to buck this trend). I would posit that while these alphabets exist, for the most part, they weren't created by those speaking the languages from birth. They were an artificial, external force that didn't stay around.

By comparison, a bit before the time that missionaries were traipsing about Africa, putting these historically oral languages to text, the Romantics in Europe were busy standardizing their languages. Pompeu Fabra, Vuk Karadžić, Ferenc Kazinczy, Alessandro Manzoni, and a slew of others were refining the languages that they had grown up with. But, instead of formalizing their languages in order to spread religion, they were doing so in order to spread the language.

It needs to be said that Amharic and other languages in Africa did indeed have established alphabets, but compatriots of these European Romantics were busily trouncing African languages through Colonialism. While enforcing English, French, Portuguese, and Spanish as lingua francas may have been practical (yet brashly inhumane) in the artificially created borders of a colony that may have had upwards of 100 or more languages and dialects, it set up a system that we still see in place today. This is especially so in Anglophone or Francophone African countries where the local languages are spoken on familiar, yet not official terms. There have been strides made to try and stem this linguistic undertow of the last century as seen in Tanzania, Mali, and others, where education in the local languages is either being proudly enforced or at least investigated.

The problem in all of this is that spreading a language in an official capacity is expensive and English has (like it or not) become the business language of the world. Dictionaries are not cheap to print and institutions are not set up overnight, let alone the fact that you need people able to read and write in these languages in the first place, and they are in constantly dwindling numbers. Creating the language institutions an entire country needs to function is not an easy thing to propose, especially if there are several languages to consider.

So enters the internet and more importantly, the point where we are at with language on the web in 2010. Wikipedia, Google, Facebook, Twitter, and others (such as this site) are all taking the fact seriously that any 21st century web business model now needs to include a multi-lingual environment to reach the maximum number of users.

Kiswahili has been the golden child in all of this, making use of many of these crowd-sourced technologies to bolster its online presence. While Google is trying to promote competitions, these linguistic efforts can be self-started and homegrown. In fact, to truly succeed, I think that they have to be, as people need to convince themselves first and everyone else second. Of course, many people will very well be asking, why bother?

Google doesn't need to destroy all the data it can't index because it's going to reach a point where if it isn't online, then it will disappear from our collective knowledge. We're at a pretty crucial tipping point where all the languages that are going to be carried forward with us need to get online now, or they will simply cease to exist due to the original speakers dying off or a language like English or French supplanting them. While a monolingual culture may seem easier for people, the fact of the matter is that your identity is tied up in your language and if you lose your language, you lose your culture. The global corporations would love for us all to have the same language and buying habits, but I'm of the opinion that in losing the languages and cultures which define us, we basically lose us.

So, let's keep the languages jumping as this new decade takes on the digital preservation of all our languages.


Balkans meet Africa (again)

Available in: English
05 12 2009
Countries:
CAMEROON

A couple of days ago, McAfee released the information that .cm was the most dangerous domain on the internet currently. It's not so much that Cameroon has more internet scammers, but more that the internet scammers of the world have turned to .cm domains to create malware sites when people mistype a .com address. Unfortunately I think that a lot of people are now going to associate the country of Cameroon with being full of nefarious net thugs, which is quite unfortunate, as I say it's simply not true. You can read a full breakdown of all of this in a PDF on McAfee's site. (Yeah, a PDF is kinda like a fax machine, if you were wondering...)
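To see why a mistyped .com is such a gift to scammers, here is a small sketch in Python of what happens when you drop one letter from "com". The function names are mine and the lookup table is deliberately tiny, holding only the three country codes relevant to this example, so treat it as an illustration rather than a real typo-squatting checker.

```python
# Illustrative subset of country-code domains, not a complete list.
CCTLDS = {
    "cm": "Cameroon",
    "co": "Colombia",
    "om": "Oman",
}


def dropped_letter_typos(tld: str) -> set:
    """Every string you get by leaving out exactly one letter of tld."""
    return {tld[:i] + tld[i + 1:] for i in range(len(tld))}


def typos_that_are_real_domains(tld: str = "com") -> dict:
    """Keep only the typos that happen to be country-code domains."""
    return {
        typo: CCTLDS[typo]
        for typo in sorted(dropped_letter_typos(tld))
        if typo in CCTLDS
    }


print(typos_that_are_real_domains("com"))
```

Every single one-letter-short typo of "com" lands on a country's domain, which has nothing to do with the people living in those countries and everything to do with how the letters fall.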

The funny thing in this is that as I state on my about page, rarely is it the case that I am able to combine my Croatian lineage with my interests in African tech. Well, it appears that this is one instance (again) where there is actually an overlap.

As a complete opposite to .cm, it turns out that the .hr domain for Croatia is one of the safest in the world--Croatia in Croatian is Hrvatska, thus the HR. Again, this doesn't mean that there are fewer scammers in Croatia, it just means that of those sites using the .hr domain name, there are fewer that are harmful on the web. Why is this?

To start out, a Croatian domain is considerably more expensive to register than a .cm, so that does play into things to some extent. Then there is the fact that unless you have a Croatian website, in Croatia, .hr sucks as a domain extension. No one in their right mind would bother to register that for typo mistakes because really, there are none to be had. This all makes it a safer domain purely due to being less desirable.

But beyond these two points, there is something else that plays into this in that you have to be a Croatian citizen or have a Croatian company (incorporated in Croatia) to purchase one of these very expensive domains. This in effect limits the possible buyers to a maximum of about 5 million. While that sounds like a lot, think about the fact that from what I found, it seems that anyone can register a .cm domain. This creates a potential pool of billions of buyers. Obviously your chances of having a couple of bad apples in the bunch rise a great deal with this.

I think that when saying .cm is the most dangerous domain on the internet (or .cn or .hk or whatever) there needs to be a total given along with this to state how many of these sites are actually registered by the people of that country. I'd bet good money that if you did that, you'd see that nearly none of the malware idiots are Cameroonians because the overall penetration of the internet in Cameroon is around 3% currently. So, people just don't have the access to go about creating some nefarious site when there are much better things like email, Facebook, or other communication tools to use when one has limited and very expensive net time.

In all honesty, I think we screwed up (or rather the US with ICANN screwed up) in creating non-country specific domains in the first place such as .com, .net, .org, .info, .travel, .biz, etc. I think that if we only had domain extensions per country to date and you had to be a citizen of that country to get one, things would look a great deal different in internet land and I have absolutely no idea who on the net would have the "most dangerous domain" honors.

The words are flying at Google

Available in: English
22 11 2009
Countries:
AFRICA

Google has really been busy on the language side of things lately. This wouldn't be news to anyone except translators and multilingual folks except for the fact that they introduced more African languages to their mix of available languages for translation and so it's suddenly become a good deal more important for Africa as it is boosting cross-communication abilities on many fronts.

First off is the new Google Translate. I use this system quite often, so I noticed right away when they made the switchover a couple of days ago. There were some bumps in the transition which I'm assuming were due to the work being done at off peak hours in the US, but very much on peak hours for those of us on UTC or UTC+1.

In general, I like the new format. It's definitely snappier overall for quick translations. What I don't like is that it's quite heavily AJAX driven (as are most things these days) and I'm curious as to how well it would perform in a low bandwidth setting. I'm hoping that someone can give that a go because, try as I might to throttle my connection, I can't seem to get it to downscale to the point that I feel is properly representative of a low bandwidth connection.

Something that's also rather new is the speaking voice for English target translations. This is really quite important as the English alphabet is complete garbage when it comes to writing how the language is spoken and I'm sure that non-English speakers will get no end of enjoyment out of wondering how on earth through, threw, and thru all sound the same. What would be nice is if, in addition to the Roman alphabet transliteration for languages like Chinese, they did this for English as well...

Of course, the big news in translation land is the automatic captions for YouTube. These are huge and quite frankly, it's about time that a major video platform finally added in some proper subcaptioning abilities. Sadly, it will probably mean the death of dotSUB, which is a platform that I like to varying degrees, but it's easy to understand why people were slow to add subtitles as it was a great deal of work to create and then translate the text. Google takes the approach of "machine bash in to shape. human refine. everyone love." and I think that it will work quite well overall. Obviously once they fully deploy the system and people start to use it more, we'll see it refined a great deal. But it's good that Google's YouTube brand has finally started making good use of the Google abilities such as machine translation.

My only wish is that Vimeo would do something similar and maybe because of this, they will. Honestly, they should just buy out dotSUB or something to that effect. Their interface, video quality, and overall ease of use are vastly superior to YouTube, with YouTube being kinda like a Spanish croissant in that it's okay overall, but once you dig into it, it kinda sucks a great deal...

Out of ICANN: Đómàíñš in extended characters

Available in: English
02 11 2009
Countries:
AFRICA
Tags:
icann, language

One of the big chunks of news to come out of the ICANN meeting in Seoul, Korea was a final timeline and implementation guideline to have internet domains in non-Latin characters. Honestly, I wasn't even going to write about it as I am much more interested in seeing how the implementation comes about and how it shakes down. But, in poking around for news about it, I came across this Pros & Cons article. I am nearly amused by the con comments as I'd really like to know if the people making them are a) English speakers and b) monolingual. They're just not well thought-out and such incredible straw man arguments that I would laugh if it wasn't the case that comments like these could derail the whole process of creating a proper multilingual internet.

Expanding beyond Roman characters also increases potential for site rip-offs that use homoglyphs, characters with identical or indistinguishable shapes.

Pfft. Then we should just shut down the internet and resolutely solve the problem. I mean, people die in car accidents every year. Should we not create new cars because people could die in the new cars when they're currently dying just fine in the old cars? This reasoning is not logical and sounds like a veiled attempt to excuse laziness in making this switch because hey, it works now, so why change?
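For anyone wondering what a homoglyph rip-off actually looks like, here is a short sketch in Python. The two names below look the same on screen, but one of them has a Cyrillic "а" in place of the Latin "a". The helper functions are my own and the check is deliberately naive (it just reads the first word of each character's Unicode name, which happens to be the script for these letters); real browsers and registries use more careful rules, and a name written entirely in look-alike Cyrillic letters would sail right past this.

```python
import unicodedata


def scripts_in(label: str) -> set:
    """Collect the script of each letter, using the first word of its Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


def is_mixed_script(label: str) -> bool:
    """Flag labels that mix letters from more than one script."""
    return len(scripts_in(label)) > 1


genuine = "example"
lookalike = "ex\u0430mple"  # \u0430 is CYRILLIC SMALL LETTER A

print(genuine == lookalike)        # the strings differ even though they look alike
print(scripts_in(lookalike))       # both LATIN and CYRILLIC show up
print(is_mixed_script(lookalike))  # so this one gets flagged
```

The point being that this is a detectable problem with a technical fix, not a reason to keep the whole domain system in Latin characters.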

Adding support for 100,000 international characters would make traditional keyboards insufficient input devices for accessing the entire Internet. As fellow PC World writer Jacqueline Emigh pointed out, it would be next to impossible to produce a keyboard that could support characters from every language under the sun.

Really? Are you serious? Depending on what I'm working on, I typically have up to four keyboards installed on my machine: English ISO, Spanish ISO (which also has the French characters), Croatian, and Cyrillic. I can probably type at least 1,000 different characters by easily swapping the active keyboard. I'm using Windows XP, which is old. Windows Vista and Mac OS X are even better in this department. We've had this "amazing" technology around for over a decade. It's easy to switch and it works fine. And really, if I need to go to a domain that has French characters in it, wouldn't I probably be using a keyboard that supported the French characters already? Also, the English QWERTY keyboard was designed to have you type slower, so isn't it about time to update it anyway?

I realize that people are shuddering to think that this could establish "language silos" on the internet. Only an English speaker would think this because currently, imagine how it is for a Russian typing with a Cyrillic keyboard to have to switch all the time to Latin characters just to enter a domain? The silos will develop no matter when and if they're going to develop. I think that due to all the language work that's going on these days, we are actually entering an age of far better cross-communication than ever before.

All of this doesn't affect Sub-Saharan Africa as much as other regions due to the fact that African languages (with the exception of Amharic) were alphabetized using Latin-based alphabets. But the one thing that would be great out of this is that a language such as Lingala, which was created with accented characters, doesn't get "Anglicized" as often when written on the internet and the characters actually stick around.

If you're an English speaker and don't currently have it, I recommend switching to the US International Keyboard. It doesn't ship as default with operating systems for some insane reason, but it offers up a huge swath of other characters to access just by using one additional key.


Why Francophone Africa is less dynamic than Anglophone

Available in: English
01 11 2009
Countries:
AFRICA

I realize that's a rather brash title and I don't like it at all, but it's not mine. It's actually from Neo's blog, Carnets d'un étudiant africain...exilé en Europe (Notebooks of an African student... exiled in Europe). Neo is a French-speaking African who writes really well, and if you don't speak French but have the patience to use Google Translate (it does have "hiccups"), I highly recommend reading his blog.

For some time, I have indeed wondered why it is that the English speaking countries seem to be doing better overall than the French speaking countries. Obviously, this hasn't always been the case, but currently countries like Ghana, Kenya, Tanzania, South Africa, and yes, Nigeria are either up and coming or in general, doing quite well. Then on the other hand, countries such as all the Guineas, both Congos, CAR, Cote d'Ivoire, and others still have a ways to go in a great number of issues. There are exceptions to this gross generalization of course in that Senegal and Benin are doing rather well and Zimbabwe is not. But overall it begs the question: were the British a better Colonial power than the French, Belgians, Germans, Italians, Portuguese, or Spaniards? The answer is no, not really because that's a lot like asking, which terminal cancer you think is the best to have. Amongst the field of choice, there are maybe slightly better options, but overall, it's all the same damned thing.

It's clear that Neo has thought a great deal about this as he's brought up points that I had never considered, which basically come down to the work ethic of Protestant vs. Catholic. Yes, European religions are still having an effect on 21st century Africa.

Je pense que c'est du au fait que les anglophones ont mieux intégré, en même temps que la colonisation, la culture économique néo-libérale... Cette culture économique est entièrement basée sur des concepts et une idéologie anglo-saxonne (anglaise du 18e et 19e siècle, puis américaine au 20e) et donc protestante.

I think it's due to the fact that the Anglophones better absorbed, along with colonization, neo-liberal economic culture... This economic culture is based entirely on Anglo-Saxon concepts and ideology (English in the 18th and 19th centuries, then American in the 20th) and is therefore Protestant.

...la France comme les autres pays latins s’inscrit dans une tradition chrétienne catholique qui a toujours encouragé les individus à privilégier la pauvreté et l'humilité (pour entrer plus facilement dans le royaume de Dieu), du moyen-Âge jusqu’au dix-neuvième siècle, où le capitalisme était vivement critiqué...

...France, like the other Latin countries, belongs to a Catholic Christian tradition that has always encouraged individuals to favor poverty and humility (to enter the kingdom of God more easily), from the Middle Ages until the nineteenth century, when capitalism was sharply criticized...

He talks in more depth about these issues, but it's a great argument to show that when countries in Africa were nonchalantly carved up between various European powers, they not only took on the common language of the Colonist, but also a great deal of their religious and cultural ideology. Again, a very good read and a very mighty HT to Twiga for pointing me to this article in the first place.


Hands on with Twitter Translate

Available in: English
09 10 2009
Countries:
AFRICA
Tags:
language

At Twitter, they've apparently gone public with the fact that they're opening up their translation system to the general public. The details are sparse other than to show that they're creating versions for German, French, Spanish, and Italian at the moment, which will augment the English and Japanese versions that are already there.

Just stepping into French is going to go a long way towards opening up Twitter to more Africans, where it is also quite popular with those who use it. But one thing that should be noted if you click on their signup link is that in the form for language, there is already the option to translate Twitter into Swahili. Very interesting and it goes to show that Twitter is most likely betting on that being a key language or they wouldn't have bothered to put the option in until people asked for it.

But there aren't many details beyond this. Thankfully, I was given a chance to see a demo of what they've got going last week, which I'm sure a great many other people are going to see shortly. It's an interesting system in that it presents the user with a phrase that they then need to translate. Text is nice and plain, although currently there is no mention of where that text is used. The user translates that and apparently, a lot like Facebook, translations will be voted upon and those that are the most accepted or reliable versions will get promoted into actual usage. Overall, it very much sticks to Twitter's simple design mentality.

I can see that it has a long way to go towards refinement, but at the same time, I can understand why they're releasing it to a wider audience in that it looks like it's time to let it out in the wild and see where the pieces fall. Of course, given the immediate nature of Twitter, I'm rather curious to see if "Translate Wars" will erupt where people keep trying to trump each other or get in to a bitch fight on Twitter over certain phrases (things like color vs. colour come to mind immediately) although languages where this could be a real problem (namely US vs. Original English and New World vs. Original Spanish) are not currently options for people to sign up to translate.

It's probably going to get Twitter a lot of press and rightly so as this push by companies to stretch in to multiple languages is rather a noteworthy advance and they're including Africa from the start as Afrikaans is in there as well as Swahili.
