Hidden properties identification and text diversity translation of people’s names
Advances in the ability to analyse online data on people’s names have provided breakthroughs for social research; however, several challenges remain. This thesis proposes several novel approaches for identifying hidden properties of people’s names and for text diversity translation of those names. We start by studying the hidden properties of people’s names and find that few existing methods can identify more than one hidden property of a name at a time. We therefore propose a ‘Hidden Property Bayes’ model that identifies more than one hidden property of people’s names in Kanji and Hanzi at the same time. In addition, our model performs better than an existing system on name origin identification. We then move on to text diversity translation and find that translating romanised names back to their original language is a challenge. We therefore propose two novel models to translate Pinyin names to Hanzi names. These two models perform better than Google Translate on Mandarin name translation. We next investigate gender prediction from people’s names and find that few existing tools can both predict and analyse the data in one process. We therefore propose a ‘Name-Gender’ tool that predicts the gender of people’s names and also directly provides a statistical graph. In addition, our tool performs better than an existing system at predicting the gender of people’s names in Latin and Hanzi characters. We also provide novel findings from a gender analysis of computer science using our ‘Name-Gender’ tool. Overall, our contributions provide effective novel approaches to support social researchers in analysing online data sources of people’s names, helping them to understand real-world events.