Extreme multi-label classification of scientific text is an important and challenging task in research management and technology transfer. Motivated by practical needs in research and innovation, we have proposed an artificial intelligence (AI) approach to assist users in classifying scientific text (e.g., research projects, papers and patents) into an extremely large label set (e.g., 1,000 discipline area codes).
As shown in the figure above, the proposed AI approach combines a research knowledge graph, natural language processing (NLP) and deep learning methods to classify scientific text with extreme multi-labels in a sequence of steps.
Using over 100,000 scientific texts (e.g., projects and publications) labeled with discipline codes, we have conducted experiments to evaluate classification performance on three metrics: precision, recall, and F1-score. The experimental results show that the proposed approach outperforms existing methods such as TF-IDF, KNN, SVM and LSTM. The proposed method can find wide application in research management and technology transfer.
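To make the three metrics concrete for the multi-label setting, the following is a minimal sketch of how precision, recall and F1-score can be computed when each document carries a set of discipline codes. It assumes micro-averaging over all documents, which the description above does not specify; the function name `micro_prf` and the discipline codes in the example are hypothetical and used for illustration only.

```python
def micro_prf(true_labels, predicted_labels):
    """Micro-averaged precision, recall and F1 for multi-label predictions.

    Each argument is a list of label sets, one set per document.
    Micro-averaging is an assumption made for this sketch.
    """
    tp = fp = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        tp += len(truth & predicted)   # labels correctly predicted
        fp += len(predicted - truth)   # labels predicted but not in the truth
        fn += len(truth - predicted)   # true labels that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical discipline codes for three documents (illustrative only).
truth = [{"F0601", "F0605"}, {"A0101"}, {"G0112", "G0114"}]
pred = [{"F0601"}, {"A0101", "A0102"}, {"G0112", "G0114"}]
precision, recall, f1 = micro_prf(truth, pred)
```

In this toy example four labels are predicted correctly, one is predicted wrongly and one is missed, so all three metrics come out at roughly 0.8. With over 1,000 codes, macro-averaging (averaging per-code scores) may rank methods differently, because rare codes then weigh as much as frequent ones.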
Funded by the National Natural Science Foundation of China (NSFC), the project team has applied the proposed AI approach to the peer reviewer assignment problem, in which both research proposals and reviewers are classified into detailed research discipline codes (over 1,000 discipline codes in NSFC) so as to maximise the research similarity between proposals and reviewers within the same discipline area.
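As a rough illustration of matching on discipline codes, the sketch below ranks reviewers by how much their predicted codes overlap with those of a proposal. The description above does not state which similarity measure or assignment procedure the project uses; Jaccard similarity is swapped in here as an assumption, and the function names, reviewer names and codes are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of discipline codes."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_reviewers(proposal_codes, reviewers, top_k=2):
    """Return the top_k reviewer names ordered by code overlap with the proposal.

    Ties are broken alphabetically by reviewer name so the result is stable.
    """
    scored = sorted(
        reviewers.items(),
        key=lambda item: (-jaccard(proposal_codes, item[1]), item[0]),
    )
    return [name for name, _ in scored[:top_k]]


# Hypothetical codes and reviewer names (illustrative only).
proposal = {"F0601", "F0605"}
reviewers = {
    "reviewer_a": {"F0601", "F0605", "F0607"},
    "reviewer_b": {"A0101"},
    "reviewer_c": {"F0601"},
}
ranking = rank_reviewers(proposal, reviewers)
```

A real assignment would also have to respect constraints such as reviewer workload and conflicts of interest, which this ranking ignores.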
Funded by the Shenzhen Hong Kong Innovation and Technology Fund, the project team is also applying the proposed AI approach to classify patents and companies into detailed industry sectors so that their expertise can be matched for technology transfer.