Mining Time Information in Digital Data
By Prof. Jonathan ZHU
Department of Media and Communication
City University of Hong Kong
Time is the single common denominator across all types of digital data from the internet, mobile networks, and the IoT (internet of things). However, the rich information embedded in the timestamps of digital data has often been taken for granted. As a result, this gold mine has yet to be adequately exploited. Conceptually, there is a lack of well-developed social science theory to guide the mining of time information. Operationally, there exists a wide range of difficulties and challenges to be solved. To illustrate the complexities, challenges, and opportunities of time data mining, let us focus on a simple question: the unit of time. Time can be measured in many different units (up to several hundred), such as fixed-interval units (e.g., hourly, daily, etc.), variable-interval units (time segments, clusters of segments, etc.), frequency-domain units (e.g., cycles), and so on. What is the “right” unit for your time data, and why? Unfortunately, there has been limited (if any) systematic research on this question. Data science should address such simple but not trivial issues.
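To make the unit-of-time question concrete, the following is a minimal sketch in Python, not the speaker's method. It aggregates the same timestamps in two fixed-interval units (hourly and daily) and in one variable-interval unit (segments separated by idle gaps). The function names, the sample timestamps, and the 30-minute gap threshold are all assumptions chosen for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_by_fixed_unit(timestamps, unit):
    """Count events per fixed-interval bucket; unit is 'hour' or 'day'."""
    fmt = {"hour": "%Y-%m-%d %H:00", "day": "%Y-%m-%d"}[unit]
    return Counter(ts.strftime(fmt) for ts in timestamps)


def split_into_segments(timestamps, max_gap):
    """Group timestamps into variable-length segments.

    A new segment starts whenever the gap from the previous
    timestamp exceeds max_gap (an assumed, illustrative rule).
    """
    segments = []
    for ts in sorted(timestamps):
        if segments and ts - segments[-1][-1] <= max_gap:
            segments[-1].append(ts)
        else:
            segments.append([ts])
    return segments


# Hypothetical event timestamps, invented for this example.
events = [
    datetime(2024, 1, 1, 9, 5),
    datetime(2024, 1, 1, 9, 20),
    datetime(2024, 1, 1, 9, 50),
    datetime(2024, 1, 1, 14, 0),
    datetime(2024, 1, 2, 9, 10),
]

hourly = count_by_fixed_unit(events, "hour")
daily = count_by_fixed_unit(events, "day")
segments = split_into_segments(events, timedelta(minutes=30))

print("hourly:", dict(hourly))
print("daily:", dict(daily))
print("segment sizes:", [len(s) for s in segments])
```

The same five events yield three different pictures depending on the unit chosen, which is the point of asking which unit is "right" for a given dataset.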
Jonathan Zhu (Ph.D. in Mass Communication, Indiana University, 1990) is a Chair Professor of Computational Social Science in the Department of Media and Communication, where he teaches communication theories, quantitative methods, social networks, and data mining, and directs an interdisciplinary research group (Web Mining Lab, http://weblab.com.cityu.edu.hk) focusing on the structure, content, use, and impact of internet and mobile media. He has published in major journals across communication, economics, computer science, physics, engineering, and medical informatics. He has trained one of the largest groups of Ph.D. students in the world in computational communication research, most of whom are currently faculty members at major universities in the United States, Hong Kong, mainland China, and elsewhere.