TY - EJOU
AU - AnJian-CaiRang
AU - Song, Dawei
TI - Tibetan Sorting Method Based on Hash Function
T2 - Journal on Artificial Intelligence
PY - 2022
VL - 4
IS - 2
SN - 2579-003X
AB - Sorting Tibetan text quickly and accurately requires first identifying the component elements that make up Tibetan syllables and then sorting by the priority of those components. Based on a study of Tibetan text structure, grammatical rules and syllable structure, we present a structure-based Tibetan syllable recognition method that uses syllable structure instead of grammar. This method avoids complicated Tibetan grammar and recognizes the components of Tibetan syllables simply and quickly. On the basis of identifying the components of Tibetan syllables, a Tibetan syllable sorting algorithm that conforms to the language sorting rules is proposed. The core of the Tibetan syllable sorting algorithm is a hash function. Research has found that the sorting of all legal Tibetan syllables requires eight components of information. The hash function is based on this discovery and can be assigned corresponding weights according to different sorting requirements. To verify the effectiveness of the Tibetan sorting algorithm, we established an experimental corpus using the Tibetan sorting standard document recognized by the majority of Tibetan users, namely the New Tibetan Orthographic Dictionary. Experiments show that this method produces results completely consistent with standard reference works, with an accuracy of 100%, and with minimal computational time.
KW - Hash function; Tibetan; component element; priority
DO - 10.32604/jai.2022.029141