Xiaoyu1,2, Tian Zhenzhen2, Xin Zihao2, Liu Suolan2, Chen Fuhua3, Wang Hongyuan2,*
Journal of New Media, Vol.4, No.3, pp. 137-143, 2022, DOI:10.32604/jnm.2022.027890
Abstract: Recent advances in OCR show that end-to-end (E2E) training pipelines that combine text detection and recognition achieve the best results. However, most existing methods focus on case-insensitive English characters. In this paper, we apply an E2E approach, the Multiplexed Multilingual Mask TextSpotter, which performs script identification at the word level and routes each word to a script-specific recognition head while training under a unified loss, thereby optimizing script identification and the multiple recognition heads jointly. Experiments show that this method outperforms a single-head model with a similar number of parameters on end-to-end recognition tasks.
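The word-level routing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`route_word`, `unified_loss`), the script labels, the weighting factor `alpha`, and the exact form of the combined loss are all assumptions made for illustration. The sketch picks the recognition head of the most probable script for each word and combines a script-identification loss with a probability-weighted recognition loss into one scalar.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_word(script_logits, heads):
    """Choose the recognition head of the most probable script.

    script_logits: dict mapping script name -> raw score (hypothetical
                   output of a word-level script classifier).
    heads:         dict mapping script name -> callable recognizer.
    Returns the chosen script name and the per-script probabilities.
    """
    names = list(heads)
    probs = softmax([script_logits[n] for n in names])
    best = max(range(len(names)), key=lambda i: probs[i])
    return names[best], dict(zip(names, probs))


def unified_loss(script_probs, true_script, rec_losses, alpha=1.0):
    """One scalar loss covering script identification and all heads.

    Assumed form: alpha * cross-entropy on the script label plus the
    recognition losses weighted by the predicted script probabilities.
    """
    script_loss = -math.log(script_probs[true_script])
    expected_rec = sum(script_probs[n] * rec_losses[n] for n in rec_losses)
    return alpha * script_loss + expected_rec


# Hypothetical heads: each one just tags the word crop with its script.
heads = {
    "latin": lambda crop: "latin:" + crop,
    "arabic": lambda crop: "arabic:" + crop,
    "chinese": lambda crop: "chinese:" + crop,
}

logits = {"latin": 2.0, "arabic": 0.1, "chinese": -1.0}
script, probs = route_word(logits, heads)
print(script, heads[script]("word_crop"))
```

Because both terms are summed into one value, a single backward pass would update the script classifier and every recognition head together, which is the sense in which the heads and the script identification are optimized jointly.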