In search of alternative data

By Dr Alvin Leung
Associate Professor
Department of Information Systems

Dr Alvin Leung is Associate Professor at the Department of Information Systems, where his research interests lie at the intersection of IS and Finance. Alvin's excellence in research and teaching was recently recognised with the Innovative CityU-Learning Award 2021, the Dean's Research Excellence Award, and The President's Award. Here he charts the way in which alternative data is giving us ever more precise views of corporate performance.

Conventional corporate financial data has historically provided investors with insights into past performance and prospects for the future. In the old days, such data was useful to investors, particularly when business growth was relatively steady. Accounting statements showed the momentum of growth and to some extent foreshadowed future trends. However, in the modern era markets are more dynamic. The development of new technologies such as artificial intelligence and the advent of pandemics mean business environments are full of uncertainty. Investors have begun to doubt whether records of past activities are an adequate guide to the future.

Alternative data for turbulent times

Fortunately, alternative data is emerging to give investors a fuller picture of corporate performance. Such data goes beyond corporate filings, earnings releases, and fundamental datasets. It includes various sources of non‐financial data that give investors timely new signals for understanding corporate performance. Such sources may include web‐scraped data, geolocation data from mobile phones, and even weather forecasts¹. In times of turbulence, alternative data is more useful than conventional financial data.

From dot-com to data analytics

In the late 1990s investors were crazy for internet stocks. Companies recruited many IT people and established websites in the hope of jumping onto the dot‐com bandwagon. Investors who relied solely on conventional financial data found it hard to distinguish peaches from lemons among the numerous dot‐com companies, because stock markets were irrational and financial statements could not predict future trends. The failure to evaluate dot‐com companies properly contributed to the dot‐com bubble. The bursting of the bubble made people think more carefully about whether alternative data was needed. As a complement to conventional financial data such as quarterly sales, number of offline stores, and number of employees, alternative data such as web visits, number of unique visitors, length of stay on e‐commerce websites, and click‐through rates of online advertisements were seen as increasingly useful for evaluating firm performance². With the emergence of data analytics, such data became easily accessible on web servers, and business intelligence tools made data analysis much easier and faster.

Variety, volume, velocity

Big data is released not just by firms but also comes from various sources such as third‐party e‐commerce websites, suppliers, social media, and the Internet of Things. Such data is characterised by the 3Vs of variety, volume, and velocity, and is abundant these days. The critical question is which data are useful for investors evaluating corporate performance, and how they can be used.

Exploiting search data

As a part of FinTech research, one research objective is to identify such alternative data and demonstrate their business value. We were among the first to explore alternative data to understand investors' stock preferences³. The intuition came from the fact that people search before they transact. Search volume may reflect investors' interest in a stock⁴. By analysing co‐search data from infomediaries such as Yahoo! Finance, it was possible to identify investors' collective preferences. The co‐search data was more granular than traditional transactional data from brokerage firms. Its timeliness also gave infomediaries and brokerage firms opportunities to make suitable recommendations to investors based on their past search preferences.
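The idea of co‐search can be made concrete with a small sketch. It assumes, purely for illustration, that an infomediary's logs have already been reduced to lists of tickers viewed together in one session; the function name `cosearch_counts` and the sample sessions are hypothetical and are not taken from the underlying papers.

```python
from collections import Counter
from itertools import combinations


def cosearch_counts(sessions):
    """Count how often each pair of tickers appears in the same session."""
    counts = Counter()
    for tickers in sessions:
        # Deduplicate and sort so that each pair has one canonical key.
        for pair in combinations(sorted(set(tickers)), 2):
            counts[pair] += 1
    return counts


# Hypothetical search sessions (illustrative data only).
sessions = [
    ["INTC", "AMD", "NVDA"],
    ["INTC", "AMD"],
    ["AAPL", "MSFT"],
    ["AMD", "INTC", "INTC"],
]

counts = cosearch_counts(sessions)
for pair, n in counts.most_common():
    print(pair, n)
```

Pairs with high counts can be read as edges in a co‐search network, which is one simple way to represent investors' collective preferences.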

The significance of co-movement

We also showed that co‐search data could reveal investment habitats that demonstrate returns co‐movement, that is, stock returns moving in the same direction. Such habitats were first discovered by using conventional historical stock data. Current research shows that alternative data can achieve a more timely, precise, and granular analysis. Co‐attention data also reveals new business opportunities. By observing the intensity of co‐attention among listed companies with economic linkages such as supplier‐buyer relationships, we show that it is possible to use current partner stock returns to predict future returns of focal stocks if there is a lack of co‐attention. The intuition comes from the slow propagation of information among stocks with a low volume of co‐attention⁵. Positive (or negative) events therefore diffuse slowly among economically linked stocks, giving rise to opportunities for using partner stock returns to predict future stock returns of focal firms.
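The logic of this prediction can be sketched in a few lines. This is a simplified illustration, not the model estimated in the paper: the function `partner_signal`, the firm names, the co‐attention counts, and the threshold are all hypothetical. For each focal firm, the sketch averages the current returns of those partners that attract little co‐attention, on the assumption that news about them has not yet reached the focal stock's price.

```python
def partner_signal(links, returns_pct, co_attention, threshold):
    """Average the current returns of low co-attention partners per focal firm."""
    signals = {}
    for focal, partners in links.items():
        # Keep only partners whose co-attention with the focal firm is low.
        slow = [p for p in partners
                if co_attention.get((focal, p), 0) < threshold]
        if slow:
            signals[focal] = sum(returns_pct[p] for p in slow) / len(slow)
    return signals


# Hypothetical supply-chain links: focal firm -> economically linked partners.
links = {"FOCAL_A": ["SUPPLIER_X", "SUPPLIER_Y"], "FOCAL_B": ["SUPPLIER_Z"]}
# Hypothetical current partner returns, in percent.
returns_pct = {"SUPPLIER_X": 4.0, "SUPPLIER_Y": 2.0, "SUPPLIER_Z": -1.0}
# Hypothetical co-search counts between focal firms and partners.
co_attention = {
    ("FOCAL_A", "SUPPLIER_X"): 5,
    ("FOCAL_A", "SUPPLIER_Y"): 8,
    ("FOCAL_B", "SUPPLIER_Z"): 500,
}

signals = partner_signal(links, returns_pct, co_attention, threshold=50)
print(signals)
```

In this toy data FOCAL_B receives no signal because its co‐attention with its supplier is high, so the supplier's news is assumed to be priced in already.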

Social media increasingly important

So, knowing the economic linkages between two companies allows us to develop a prediction model. If Intel announces huge sales in processors, we can foresee that Lenovo is likely to have positive earnings news because Intel is a major supplier of its processors. If Lenovo investors generally lack co‐attention to Intel, Lenovo's stock returns may not rise immediately. Instead, the rise appears some time later, when Lenovo also makes positive earnings announcements. Co‐attention volume is one important element of stock predictions. We also show that other alternative data such as reads, shares, comments, and news from social media play equally important roles⁶. Using the concept of eigenvector centrality, we develop a composite measure called Eigen Attention Centrality (EAC) to predict future stock returns. Note that eigenvector centrality is an important concept behind PageRank, the patented algorithm used by Google Search. We find that EAC outperforms conventional financial data in prediction accuracy. The critical success factors lie in the diversity of data sources, not in co‐attention volume alone. Furthermore, we show that social media play a more important role in information diffusion than conventional sources such as financial news. Through discussion on social media, information flows become faster and the collective wisdom on social media makes investors more judicious in investing.
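Eigenvector centrality itself can be illustrated briefly. The sketch below is not the EAC measure from the paper, which combines several data sources; it only shows the underlying concept on a hypothetical attention network, computed by power iteration. The function name, the firm labels, and the edge weights are assumptions made for illustration.

```python
def eigenvector_centrality(weights, iterations=100, tol=1e-9):
    """Power iteration on (A + I), where A is a symmetric weighted adjacency map."""
    nodes = sorted(weights)
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Adding the node's own score (the "+ I" shift) avoids oscillation
        # on bipartite graphs without changing the leading eigenvector.
        new = {
            n: score[n] + sum(weights[m].get(n, 0.0) * score[m] for m in nodes)
            for n in nodes
        }
        top = max(new.values())
        new = {n: v / top for n, v in new.items()}  # scale so the maximum is 1
        if max(abs(new[n] - score[n]) for n in nodes) < tol:
            return new
        score = new
    return score


# Hypothetical co-attention volumes between four firms (symmetric).
weights = {
    "A": {"B": 3.0, "C": 3.0, "D": 1.0},
    "B": {"A": 3.0, "C": 1.0},
    "C": {"A": 3.0, "B": 1.0},
    "D": {"A": 1.0},
}

scores = eigenvector_centrality(weights)
for firm in sorted(scores, key=scores.get, reverse=True):
    print(firm, round(scores[firm], 3))
```

A firm scores highly when it is connected to other high‐scoring firms, which is why the measure captures a firm's position in the flow of information rather than its raw attention volume alone.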

The wisdom of crowds

Over time, we have observed exponential growth in financial social media such as Seeking Alpha and StockTwits. They are important channels for investors to exchange ideas and develop collective intelligence, which is also known as the “wisdom of crowds.” In the era of big data, there are numerous sources of alternative data. The implementation of 5G networks, the Open API (Application Programming Interface) Framework launched by the Hong Kong Monetary Authority, and the development of Hong Kong as a Smart City are all providing academic researchers and investors with opportunities to exploit new sources of alternative data. In the near future, we expect more alternative data to come into existence and transform the business world. As the co‐founder of PayPal, Max Levchin, said, “The world is now awash in data and we can see consumers in a lot clearer ways.”

Note: The article is based on the papers "Network Analysis of Search Dynamics: The Case of Stock Habitats" published in Management Science, "Cosearch Attention and Stock Return Predictability in Supply Chains" published in Information Systems Research, and "Developing a Composite Measure to Represent Information Flows in Networks: Evidence from a Stock Market" published in Information Systems Research.