These days I'm working on my final year project (FYP) to complete my Software Engineering degree at APIIT.
The topic of my FYP is "A Universal Translator for Mobile Phones". As soon as you hear the topic you may think of Star Wars, because those films had inter-galactic translators for communicating with aliens.
OK, getting back to my FYP: it will work as a Speech -> Speech translator. It will be helpful when you are abroad and can't speak or understand the local language.
Let's assume that you are in France and you only speak and understand English, but you need to talk with a French speaker. All you have to do is:
- Install my application on a mobile phone.
- Talk to the phone in 'English'.
- The phone will then recognize what you said by using its Speech Recognition engine.
- The sentence you spoke is now in text format.
- Then the phone will translate that sentence into 'French'.
- After that, the phone will speak the translated text in 'French' by using its Text-to-Speech engine, so that the other person can understand.
- This works vice versa too, so that you can hear what was said in French in English.
I think you now have a rough idea of how it functions:
Speech Recognition ==> Language Translation ==> Text to Speech.
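To make the three-stage flow above a bit more concrete, here is a minimal sketch in Python. It is not the FYP's actual code: the stage functions are hypothetical stand-ins (the "recognizer" just decodes bytes, the "translator" is a one-entry phrase table, and the "synthesizer" just encodes text), and only the chaining of the stages is the point.

```python
from typing import Callable

# Hypothetical stage types; real engines would replace the stubs below.
Recognizer = Callable[[bytes], str]    # audio -> source-language text
Translator = Callable[[str], str]      # source text -> target text
Synthesizer = Callable[[str], bytes]   # target text -> audio


def make_pipeline(recognize: Recognizer,
                  translate: Translator,
                  synthesize: Synthesizer) -> Callable[[bytes], bytes]:
    """Chain the stages: Speech Recognition ==> Translation ==> Text to Speech."""
    def run(audio: bytes) -> bytes:
        text = recognize(audio)          # speech -> text
        translated = translate(text)     # text -> translated text
        return synthesize(translated)    # translated text -> speech
    return run


# --- Stand-in stages, for illustration only ---
def fake_recognize(audio: bytes) -> str:
    # Pretend the audio bytes already contain the spoken words.
    return audio.decode("utf-8")


PHRASES = {"good morning": "bonjour"}  # toy English -> French phrase table


def fake_translate(text: str) -> str:
    # Fall back to the input when the phrase is unknown.
    return PHRASES.get(text.lower(), text)


def fake_synthesize(text: str) -> bytes:
    # Pretend the encoded text is the synthesized audio.
    return text.encode("utf-8")


english_to_french = make_pipeline(fake_recognize, fake_translate, fake_synthesize)
```

The reverse direction (French to English) would be a second pipeline built the same way with a different recognizer, translator and synthesizer.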
There are many challenges I have to face in developing this:
- I'm building the application to work on an average mobile phone, which has not been done yet. All the available solutions are for PDAs that run Windows Mobile OS and have higher hardware specifications.
- Homophone detection - A homophone is a word that is pronounced the same as another word but differs in meaning (e.g. to, too, two). The speech recognition engine cannot detect such errors. For example, it may recognize "I need to go home" as "I need two go home", which is incorrect. So I came up with a 'Homophone Detection and Correction' algorithm, which is AI based and uses 'part of speech' tagging.
- Sentence recognition - Current speech recognition APIs are only capable of recognizing a single word (a command). So I have to enhance the speech recognition to recognize sentences.
- Pronunciation modeling - A person can understand a sentence only if it is pronounced correctly, but Text-to-Speech pronunciation in mobile computing still has gaps. So I have to come up with a better and more understandable pronunciation mechanism.
Currently I have finished the research part and I am now in the design phase. I hope I'll complete it successfully.
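As a rough illustration of the homophone idea, here is a toy Python sketch. It is not my actual algorithm: the tag table, the bigram scores and the single homophone group are all made-up assumptions, and a real system would use a trained part-of-speech tagger. It only shows how the tags of neighbouring words can be used to pick between words that sound the same.

```python
# Toy data, assumed for illustration only.
HOMOPHONE_GROUPS = [{"to", "too", "two"}]

TAGS = {
    "i": "PRON", "need": "VERB", "go": "VERB", "home": "NOUN",
    "apples": "NOUN", "to": "TO", "too": "ADV", "two": "NUM",
}

# How plausible it is for one tag to follow another (higher is better).
BIGRAM_SCORE = {
    ("TO", "VERB"): 2,    # "to go"
    ("NUM", "NOUN"): 2,   # "two apples"
    ("ADV", "ADJ"): 2,    # "too big"
    ("VERB", "TO"): 1,    # "need to"
    ("VERB", "NUM"): 1,   # "need two"
}


def tag_of(word: str) -> str:
    """Look up a part-of-speech tag; unknown words get 'UNK'."""
    return TAGS.get(word.lower(), "UNK")


def group_of(word: str) -> set:
    """Return the homophone group containing the word, or an empty set."""
    for group in HOMOPHONE_GROUPS:
        if word.lower() in group:
            return group
    return set()


def correct_homophones(sentence: str) -> str:
    """Replace a homophone when another member of its group fits the context better."""
    words = sentence.split()
    for i, word in enumerate(words):
        prev_tag = tag_of(words[i - 1]) if i > 0 else "START"
        next_tag = tag_of(words[i + 1]) if i + 1 < len(words) else "END"

        def score(candidate: str) -> int:
            tag = tag_of(candidate)
            return (BIGRAM_SCORE.get((prev_tag, tag), 0)
                    + BIGRAM_SCORE.get((tag, next_tag), 0))

        best = word  # keep the recognized word unless a candidate scores strictly higher
        for candidate in sorted(group_of(word)):
            if score(candidate) > score(best):
                best = candidate
        words[i] = best
    return " ".join(words)
```

With these toy tables, `correct_homophones("I need two go home")` gives back "I need to go home", because a verb followed by "to" followed by another verb scores higher than a number followed by a verb.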