Translator
Translator is a tool that converts words and sentences S from one language ("A") to another language ("B"). The tool bases its translation on sentence pairs (A, B) that are stored in a database (usually a .csv file). It finds the pair whose sentence A best matches S and then returns the corresponding sentence B.
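As a minimal sketch of how such a database could be read, the snippet below loads sentence pairs from CSV data. The two-column layout (A first, B second) and the name `load_pairs` are assumptions made for illustration; the page does not specify the file format beyond ".csv".

```python
import csv
import io

def load_pairs(stream):
    """Read (A, B) sentence pairs from a CSV stream.

    Assumes two columns per row: sentence A, then sentence B.
    Rows with fewer than two columns are skipped.
    """
    return [(row[0], row[1]) for row in csv.reader(stream) if len(row) >= 2]

# In-memory sample standing in for a real file on disk.
sample = io.StringIO(
    'car,auto\n'
    '"Where did you meet [X]?","Waar heb je [X] ontmoet?"\n'
)
pairs = load_pairs(sample)
```

For a real file, the same function could be called with `open(path, newline="", encoding="utf-8")`, where `path` is a placeholder for the database location.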
The sentences in A and B can also contain variables. If S can be matched under an assignment of variables, then the translation is B with the same variables assigned. For example, suppose we have the tuple ("Where did you meet [X]?","Waar heb je [X] ontmoet?"). To translate the sentence S="Where did you meet Jan?", the translator notices that S matches the first element of the tuple under the assignment [X/Jan]. Therefore the translation becomes "Waar heb je Jan ontmoet?".
Sometimes it is desirable to have the substitutions for the match translated as well. For instance, consider the tuple ("Where did you find this <X>?","Waar heb je deze <X> gevonden?"), and suppose we have another tuple in the database, ("car","auto"). Now the sentence S="Where did you find this car?" matches the first element of the tuple under the assignment [X/car]. Because the variable is inside angle brackets, the translator attempts to translate the value of the variable itself, giving [X/auto]. To get the translated sentence, this assignment is applied to the second element of the tuple, which gives "Waar heb je deze auto gevonden?".
So we have two types of variables in the tuples:
variable | translation |
[Variable] | direct substitution into B |
<Variable> | substitution of translated variable in B |
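The two variable types can be illustrated with a small sketch. It is not the tool's actual implementation: the names `PAIRS`, `match` and `translate` are hypothetical, the database is an in-memory list, and for simplicity the first matching tuple is used rather than the best one. Falling back to the untranslated value when a `<Variable>` has no translation is also an assumption.

```python
import re

# Hypothetical in-memory database of (A, B) tuples.
PAIRS = [
    ("Where did you meet [X]?", "Waar heb je [X] ontmoet?"),
    ("Where did you find this <X>?", "Waar heb je deze <X> gevonden?"),
    ("car", "auto"),
]

# Matches either [Name] or <Name>.
VAR = re.compile(r"\[\w+\]|<\w+>")

def match(pattern, sentence):
    """Return {variable token: matched text} if sentence fits pattern, else None."""
    regex, tokens, pos = "", [], 0
    for m in VAR.finditer(pattern):
        # Literal text is escaped; each variable becomes a capture group.
        regex += re.escape(pattern[pos:m.start()]) + "(.+?)"
        tokens.append(m.group(0))
        pos = m.end()
    regex += re.escape(pattern[pos:])
    found = re.fullmatch(regex, sentence)
    return dict(zip(tokens, found.groups())) if found else None

def translate(sentence, pairs=PAIRS):
    """Translate sentence with the first matching tuple, or return None."""
    for a, b in pairs:
        binding = match(a, sentence)
        if binding is None:
            continue
        for token, value in binding.items():
            if token.startswith("<"):
                # <Variable>: translate the value itself before substituting.
                # Assumption: keep the original value if it cannot be translated.
                value = translate(value, pairs) or value
            b = b.replace(token, value)
        return b
    return None

met = translate("Where did you meet Jan?")
found_car = translate("Where did you find this car?")
```

With the tuples above, `met` is "Waar heb je Jan ontmoet?" and `found_car` is "Waar heb je deze auto gevonden?", matching the two examples in the text.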
Reverse translation
The translator can also reverse the translation, that is, translate sentences and words back from B to A.
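One simple way to picture reverse translation is to swap the two elements of every tuple, so that B becomes the source side. This is only a sketch under that assumption; the page does not describe how the tool implements the reversal, and `reverse_pairs` is a hypothetical name.

```python
def reverse_pairs(pairs):
    """Swap each (A, B) tuple to (B, A) so that B becomes the source language."""
    return [(b, a) for a, b in pairs]

pairs = [
    ("car", "auto"),
    ("Where did you meet [X]?", "Waar heb je [X] ontmoet?"),
]
reversed_pairs = reverse_pairs(pairs)

# Exact-match lookup in the reversed direction.
back = dict(reversed_pairs)
word = back["auto"]
```

Here `word` is "car". Because the variables appear on both sides of a tuple, the swapped tuples can be matched in the same way as the originals.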
Fragment handling
Another option of the translator is to translate fragments. Suppose we have a paragraph containing multiple sentences, each of which can be translated using one of the available tuples in the database. The fragment translator repeatedly applies the following steps until the entire paragraph is translated:
if text starts with any of " .,:{}-\n\t":
    literally copy that character into the translation
else:
    find the tuple that matches the most characters of text
    translate this part of the text using that best match
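The steps above can be sketched as a short loop. To keep the example small it handles only literal tuples without variables, and it copies one character unchanged when no tuple matches so that the loop always terminates; both are assumptions, as are the names `SEPARATORS` and `translate_fragment`.

```python
SEPARATORS = " .,:{}-\n\t"

def translate_fragment(text, pairs):
    """Translate text piece by piece, always using the longest matching tuple."""
    out = []
    while text:
        if text[0] in SEPARATORS:
            # Separator characters are copied literally.
            out.append(text[0])
            text = text[1:]
            continue
        # Candidate tuples whose (non-empty) A side is a prefix of the remaining text.
        candidates = [p for p in pairs if p[0] and text.startswith(p[0])]
        if candidates:
            # The best match is the one covering the most characters.
            a, b = max(candidates, key=lambda p: len(p[0]))
            out.append(b)
            text = text[len(a):]
        else:
            # Assumption: copy one unmatched character and continue.
            out.append(text[0])
            text = text[1:]
    return "".join(out)

pairs = [("car", "auto"), ("red", "rood"), ("red car", "rode auto")]
result = translate_fragment("red car, car.", pairs)
```

Here `result` is "rode auto, auto.": the tuple ("red car","rode auto") wins over ("red","rood") because it matches more characters, and the comma, space and full stop are copied as they are.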