Monthly Archives: August 2014

WHO FAQs on Ebola in N’ko and Vai; CDC radio spots in other languages

The ongoing Ebola outbreak shows no sign of letting up anytime soon.  For those affected in the region, effective public communication about resources and preparedness appears to be one of the major challenges.  Although communicating a message can only go so far, efforts are being made to serve populations that don’t readily understand English or French.  Examples can be found here:

Update (10/25/15):  Here is a copy of a more extensive article about Ebola, translated from Wikipedia material into Vai:

Ebola Article in Vai

As yet, there are not enough editors for a fully-fledged Vai Wikipedia to materialize (five are needed), but things are getting closer on that front.


Introduction to Inputting Amharic

For those of you who missed the Spring Africana Librarians Council (ALC) meeting, or are curious about how to input Ethiopic script into MARC library records, this may help serve as an overview.  It would work best for a user who has had some preliminary exposure to Amharic, or for someone working alongside a native speaker.  Ethiopic (or Ge’ez) script is also used for the Tigre and Tigrinya languages, among others.

Amharic Lesson

Update #1 (4/12/16):  Some of the more difficult distinctions to make visually are between syllables such as ሳ (sā) and ላ (lā); ሰ (sa), ስ (se) and ለ (la); or ጻ (ṣā) and ጾ (ṣo).  It is also important to keep in mind the distinction between glottal vowels, transliterated like ʼa (አ), and pharyngeal ones, transliterated like ʻa (ዐ): glottals are romanized using the alif, while pharyngeals are romanized using the ayn character.  It is also worth noting here that the Ethiopic calendar has thirteen months (twelve of thirty days each, plus one very short month), and runs seven to eight years behind the Gregorian calendar.
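When two syllables are hard to tell apart on screen, one way to double-check a record is to ask for each character's Unicode name.  The short Python sketch below does that for the look-alike groups mentioned above, using only the standard library.  The helper name describe is my own, and Unicode names are not the same as the romanization (for example, Unicode writes the fourth-order vowel as AA), so treat this as a cross-check rather than a transliteration tool.  Conveniently, the Unicode names for አ and ዐ use the same "glottal" and "pharyngeal" labels.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    # The second argument is a fallback in case a character has no name.
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{ord(ch):04X} {name}"


# Look-alike groups from the note above, typed as literal Ethiopic characters.
groups = ["ሳላ", "ሰስለ", "ጻጾ", "አዐ"]

for group in groups:
    for ch in group:
        # Only the code point and name are printed, so the output is plain ASCII.
        print(describe(ch))
    print()
```

Pasting a doubtful character from a record into describe() gives a quick, unambiguous answer about which consonant and vowel order it really is.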

Update #2 (5/25/16):  There is some movement in the direction of developing OCR (optical character recognition) for Amharic and Tigrinya, using the open source OCR engine Tesseract.  Look for the language packs listed here.

Euan Cochrane helped me find a free front-end product that works with Tesseract; I will be looking to pull the pieces together over the next couple of weeks to do some testing.
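For anyone who would rather try Tesseract directly than through a front end, here is a minimal sketch of calling its command-line program from Python.  It assumes that Tesseract is installed and on the PATH, and that the Amharic and Tigrinya language packs are installed under the codes amh and tir; the file names and the two helper functions are hypothetical, made up for illustration.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, output_base, lang="amh"):
    """Build the argument list: tesseract <image> <output base> -l <language>."""
    return ["tesseract", image_path, output_base, "-l", lang]


def run_ocr(image_path, output_base, lang="amh"):
    """Run Tesseract on one image and return the path of the text file it writes."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    # check=True raises an error if Tesseract exits with a failure code.
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    # Tesseract appends ".txt" to the output base name.
    return output_base + ".txt"


# Show the command that would be run for a hypothetical scanned page.
print(" ".join(build_tesseract_command("page.png", "page", "amh")))
```

Calling run_ocr("page.png", "page", lang="tir") would attempt the same thing with the Tigrinya pack, provided it is installed.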

Update #3 (5/26/16):  Preliminary testing of Tesseract in Amharic is moving ahead.  I started by giving it what should have been an easy test, a Wikipedia page on South Africa.  A sample line or two is as follows:

Input:  በደቡብ አፍሪቃ ሕገ መንግሥት መሠረት 11 ልሳናት በእኩልነት ይፋዊ ኹኔታ አላቸው።

Output: በየበበ አፍረፆሐገመገማሥን መሠረን ገገ ልስናን በአከልነን ርፋዊ ጤል አሳኘውዞ

The accuracy rate is running at about 42% so far.  More work is needed.
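For readers who want to score their own tests, one common way to put a number on OCR quality is character accuracy based on edit distance: count the fewest single-character insertions, deletions and substitutions needed to turn the OCR output into the original, and compare that to the length of the original.  The sketch below is only one such measure and is not necessarily how the figure above was arrived at; the function names are my own.

```python
def levenshtein(a, b):
    """Smallest number of single-character edits needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character from b
                            prev[j - 1] + cost))  # substitute, or keep a match
        prev = curr
    return prev[-1]


def char_accuracy(reference, ocr_output):
    """Share of the reference text recovered, from 0.0 to 1.0."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    distance = levenshtein(reference, ocr_output)
    return max(0.0, 1.0 - distance / len(reference))


# The sample line from the test above.
reference = "በደቡብ አፍሪቃ ሕገ መንግሥት መሠረት 11 ልሳናት በእኩልነት ይፋዊ ኹኔታ አላቸው።"
ocr_output = "በየበበ አፍረፆሐገመገማሥን መሠረን ገገ ልስናን በአከልነን ርፋዊ ጤል አሳኘውዞ"

print(f"Character accuracy: {char_accuracy(reference, ocr_output):.0%}")
```

Different measures (word-level accuracy, for instance, or one that ignores spacing) will give different percentages for the same page, so it helps to note which one is being used when comparing results.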


