UM-CAT Development Team Shares Their Research Experiences

Chinese Text | Kelvin U & UM Reporter Lyra Qian
Photo | Jack Ho with some photos provided by the interviewee

Over the past two decades, a team of researchers has worked tirelessly to update and improve UM-CAT to meet the needs of the translation industry.

Putting the Users First

Ao Chi Hong, a research assistant in the NLP2CT, is one of the core members of the team behind UM-CAT. The neural machine translation system he helped develop won the top three awards at the 13th China Workshop on Machine Translation. The award-winning system is one of the core technologies used in developing UM-CAT. Currently, Ao is responsible for the setup and management of UM-CAT and for ensuring a good user experience.

UM-CAT Translation System

Ao joined the NLP2CT as a research assistant after completing his bachelor’s and master’s degrees in computer science at UM. He has been working in the lab for nearly eight years. He says, ‘After working on the system for so many years, I am most happy to see its launch into the market. We put great emphasis on the quality of Chinese-Portuguese translation and aim for a high degree of accuracy, especially with terms commonly used in Macao and Portuguese-speaking countries. We also make sure that it is easy to use. So it is a valuable product for our target markets. It can even create a tailor-made management mode for a translation company to streamline the workflow and improve the quality of translation. These functions are lacking in many other translation platforms currently available on the market.’

UM-CAT research team

Improving UM-CAT’s Machine Learning Algorithms

Liu Xuebo, a PhD student of computer science in FST, is also involved in the development of UM-CAT. He is mainly responsible for raising the translation quality of UM-CAT by improving its machine learning algorithms, which is also the subject of his PhD research. He explains, ‘Machine translation includes two steps: understanding the source text and generating a translation. My work is to study how to improve the machine’s ability to understand the text. Only by ensuring a high degree of accuracy will the system become a truly useful tool for the translators. And only then will they be happy to use it.’

According to Liu, like the majority of machine translation systems in the world, UM-CAT is based on the architecture developed by the University of Montreal in 2015. However, in 2017, Google launched a new architecture, which greatly improved the quality of translation. So UM-CAT must be updated to keep pace with the changing technology. Liu says, ‘In terms of improving the algorithms, only by using the latest architecture and method can we ensure that the quality of the translation meets translators’ requirements for accuracy and readability.’ In the same year, the team from the NLP2CT, led by Prof Wong, visited Tsinghua University. As a member of the team, Liu studied the latest architecture developed by Google with researchers from Tsinghua. ‘The biggest gain for me was that we thoroughly studied and mastered the architecture,’ he says.

In November 2018, the team returned to Macao and began to gradually integrate the latest technologies into UM-CAT to improve the system’s deep learning capability. Liu explains: ‘In the past, training machine translation systems relied heavily on parallel corpora. We are talking about millions of Chinese texts and corresponding Portuguese translations. But now, we constantly update the parameters of each neuron in the neural network. In other words, each time we develop a better neural network, the quality of the translation will greatly improve as a consequence, either by reducing the occurrence of overlooked words, or by improving the readability of the translation.’ The team in the NLP2CT and their counterparts from Tsinghua University have submitted two jointly authored papers to international journals. A patent application is also pending.

Liu Xuebo works with researchers at Tsinghua University

The Market Potential of Machine Translation Systems

The current version of UM-CAT was launched in December 2018, and the team is already working to develop a new generation of the system (G4). UM alumni who have been involved in the development of the system are shining in different professions. Some work in tech giants like Alibaba, Baidu, and Tencent, while others are CEOs of their own businesses, contributing to the development of Macao and the Greater Bay Area. These alumni all feel excited to see the fruit of their labour finally being put to use.

UM alumnus Tian Liang worked in the NLP2CT as a research assistant for one year after graduating from the master’s degree programme in computer science in 2012. He was involved in the development of various machine translation systems. In the process, he came to see the huge potential of applying artificial intelligence. In 2013, he left the lab and registered a science and technology company on the mainland that is focused on developing AI voice and image technologies. He thinks that as a product made in Macao, UM-CAT has local characteristics and advantages. More importantly, it gives priority to the growing demand in Portuguese-speaking countries for translation services in Portuguese, English, Cantonese, and Mandarin. In recent years, Tian started to concentrate on the Macao market. ‘This is a time when machine translation technology is developing rapidly, and I hope to launch the related technologies and products in Macao and Portuguese-speaking countries. Also, UM is my alma mater, so I hope to launch UM-CAT on the mainland market through collaboration with the university and companies.’

Tian Liang hopes to launch UM-CAT on the mainland market through collaboration

A Breeding Ground for Tech Talent

UM alumnus Zeng Xiaodong, who graduated from the master’s degree programme in e-commerce technology in 2012, was involved in some interdisciplinary projects in the NLP2CT while still studying at UM. In 2018, Zeng was named to the 35 Innovators Under 35 China list released by MIT Technology Review. Former honorees on this list include Google co-founders Larry Page and Sergey Brin, and Facebook founder Mark Zuckerberg.

Zeng Xiaodong was included in the 35 Innovators Under 35 China list by MIT Technology Review

Zeng says, ‘I’m very excited to see the project in which I was involved finally being put into use. UM-CAT removes language barriers in communication.’ Zeng notes that the innovative technologies he learned in the NLP2CT and his experience as an exchange student in Portugal gave a powerful boost to his self-confidence when he decided to start his own business after graduation. He learned from these experiences how important it is to honour one’s passion and pursue a career following that passion. Currently, Zeng is working at Ant Financial’s technology laboratory, where he devotes himself to exploring new technologies based on the Internet of Things that can transform the retail industry. He founded Tao Cafe, the first cashier-less shop in China, which enables customers to leave the shop with their items in hand without needing to head to a register. This achievement landed him on MIT Technology Review’s list. Zeng is grateful to Prof Wong Fai and Dr Chao Sam from FST for their guidance. ‘Studying at UM gave me the opportunity to meet people from different countries and regions. Such an open environment helps cultivate an open mind. In terms of educational resources accessible to every student, UM is arguably one of the best in China,’ he says.

ISSUE 20 | 2019
