About the SEAlang Library Buginese Bitext Corpus
This bitext resource is based on material taken from Gene Ammarell's unpublished Buginese - English - Indonesian database.
Using the bitext corpus
Any bitext search can rely on either or both of two targets:
-- an SEA-language target (try engka),
-- an English target (try house).
Alternative targets can be given as shown; e.g. A|B means "match either A or B," while A/B means "match either AB or BA" (which can be very helpful in Southeast Asian languages). The "near" search, "~", is helpful in finding split and/or phrasal constructions, which may have intermediate words inserted.
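The three operators can be illustrated with a short Python sketch. This is not the code behind the SEAlang search page. The function names, the word-level tokenization, the reading of A/B as "adjacent words in either order", and the `max_gap` limit for the near search are all assumptions made for illustration. English words are used so the behavior is easy to check.

```python
import re

def tokenize(text):
    # Assumption: lower-cased word tokens; the real tool may segment differently.
    return re.findall(r"\w+", text.lower())

def match_either(text, a, b):
    """A|B: the text contains a or b (or both)."""
    words = tokenize(text)
    return a in words or b in words

def match_adjacent(text, a, b):
    """A/B: a and b occur next to each other, in either order."""
    words = tokenize(text)
    pairs = set(zip(words, words[1:]))
    return (a, b) in pairs or (b, a) in pairs

def match_near(text, a, b, max_gap=3):
    """A~B: a and b occur with at most max_gap words between them.

    max_gap is a hypothetical parameter; the source does not state a limit.
    """
    words = tokenize(text)
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(0 < abs(i - j) <= max_gap + 1 for i in pos_a for j in pos_b)
```

For example, `match_near("pick the book up", "pick", "up")` succeeds even though two words separate the parts of the phrasal verb, which is the kind of split construction the near search is meant to find.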
The SEA and English targets can also be made to require or exclude one another, using the various Match buttons. This is helpful for finding less-common translations in a natural context; for example, a word with a common surface translation of "house" may also have secondary meanings like "domestic, local, tame". The "and not" search can exclude "house" and reveal the others.
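As a rough sketch of how requiring or excluding a target might work, the Python below filters a list of aligned pairs. The function `search_pairs`, the mode names "and" and "and_not", and the sample data are all hypothetical; "xword" is a placeholder standing in for an SEA-language word, not real Buginese.

```python
import re

def has_word(text, target):
    # Assumption: simple whole-word match on lower-cased tokens.
    return target.lower() in re.findall(r"\w+", text.lower())

def search_pairs(pairs, sea_target, eng_target, mode="and"):
    """Filter (sea_text, english_text) pairs.

    mode="and":     both targets must match.
    mode="and_not": the SEA target must match and the English target must not.
    """
    results = []
    for sea, eng in pairs:
        sea_hit = has_word(sea, sea_target)
        eng_hit = has_word(eng, eng_target)
        if mode == "and" and sea_hit and eng_hit:
            results.append((sea, eng))
        elif mode == "and_not" and sea_hit and not eng_hit:
            results.append((sea, eng))
    return results

# Placeholder data for illustration only.
PAIRS = [
    ("xword aa", "a big house"),
    ("bb xword", "a tame animal"),
    ("cc dd", "the house burned"),
]

less_common = search_pairs(PAIRS, "xword", "house", mode="and_not")
```

Here `less_common` keeps only the pair where "xword" appears without "house" in the translation, which is how an excluding search surfaces secondary senses.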
Note that a return mode setting lets entries be returned in plain text format, which is intended to allow easy cutting and pasting of examples. All entries are returned in order of length.
About bitext corpora
A bitext corpus shows words, phrases, and sentences in translation. Insofar as possible, translated texts are aligned sentence-by-sentence. Bitext corpora have many applications:
- in education, bitexts can markedly increase student reading and comprehension in a second language. Because the raw volume of text they read jumps so dramatically, students are exposed to a much wider vocabulary. Moreover, when text is easier to read, students can begin to understand large-scale features of style and grammar. Bitexts have long been a mainstay of second-language education for European languages, and are equally valuable for students of English and Southeast Asian languages.
Bitext search tools are a cornerstone of data-driven learning. Calling up a dozen examples of a word, phrase, or construction helps students understand and retain subtle distinctions of meaning and usage. It is even more helpful in teaching writing than reading, because bitext searches let real-world experts - writers and translators - provide on-the-spot advice and examples.
- in research, bitexts are an essential part of research in translation, word-sense disambiguation, and lexicography. Because they let us leverage tools and techniques from other languages, particularly English, they are extremely important for learning how to build search engines, summarize documents, align texts, and so on for SEA languages.
Thanks
Thanks to Prof. Gene Ammarell (Ohio University) for sharing his database with the project.