About me

I am currently the Dean and a Professor at the School of Data Science and Engineering (DaSE), East China Normal University. I was a Lecturer at the Department of Computer Science and Engineering, Fudan University, from 2004 to 2006. I received my Ph.D. from Fudan University.

My research interests include scalable transaction processing and the management and mining of massive datasets.

You may contact me at wnqian AT dase DOT ecnu DOT edu DOT cn.

I also have an official homepage here, which contains little beyond my contact information.

Posted in about

A demonstration for online collective behavior analysis

We have developed a demonstration system for interactive analysis of online collective behavior. The demonstration is based on about 200 hot spots in Sina Weibo from 2009 to 2013. You can interactively analyze their popularity, information diffusion, and the mood of users.

The demonstration can be accessed here:


It is still at an early stage, and we are continuously polishing it. Suggestions and advice are welcome, and I’m happy to hear from you. If you find any bugs, please let me know as well.

Posted in research

A follow-up page on WISE 2012 Challenge

I’ve created a page with follow-up information on the WISE 2012 Challenge. It includes information on the testing data for T2, links to updated documents (with some errors corrected), and a link to the slides of a summary report.

Any comments are welcome!

Posted in research

Social media data management and analysis reading list

(To Be Revised)


  • Eytan Bakshy, Jake M. Hofman, Winter A. Mason, Duncan J. Watts. Everyone’s an influencer: quantifying influence on twitter. WSDM 2011 (influence, model)
  • Haewoon Kwak, Changhyun Lee, Hosung Park, Sue B. Moon. What is Twitter, a social network or a news media? WWW 2010
  • Neil Savage. Twitter as medium and message. Commun. ACM 2011
  • Shaomei Wu, Jake M. Hofman, Winter A. Mason, Duncan J. Watts. Who says what to whom on twitter. WWW 2011
  • Fred Douglis. It’s All About the (Social) Network. IEEE Internet Computing 2010
  • Kevin Makice. Twitter API – Up and Running: Learn How to Build Twitter Applications. 2009
  • Bernardo A. Huberman, Daniel M. Romero, Fang Wu. Social networks that matter: Twitter under the microscope. CoRR 2008

Data management (storage, indexing, query, retrieval, etc.)

  • Adam Silberstein, Jeff Terrace, Brian F. Cooper, Raghu Ramakrishnan. Feeding frenzy: selectively materializing users’ event feeds. SIGMOD Conference 2010 (materialized view, optimization)
  • Chun Chen, Feng Li, Beng Chin Ooi, Sai Wu. TI: an efficient indexing mechanism for real-time search on tweets. SIGMOD Conference 2011 (centralized, indexing)
  • Tianyin Xu, Yang Chen, Xiaoming Fu, Pan Hui. Twittering by cuckoo: decentralized and socio-aware online microblogging services. SIGCOMM 2010
  • Michael Mathioudakis, Nick Koudas. TwitterMonitor: trend detection over the twitter stream. SIGMOD Conference 2010
  • Miles Efron. Hashtag retrieval in a microblogging environment. SIGIR 2010
  • Jan Pöschko. Exploring Twitter Hashtags. CoRR 2011
  • Josep M. Pujol, Vijay Erramilli, Georgos Siganos, Xiaoyuan Yang, Nikolaos Laoutaris, Parminder Chhabra, Pablo Rodriguez. The little engine(s) that could: scaling online social networks. SIGCOMM 2010
  • Yi Fang. Entity information management in complex networks. SIGIR 2010
  • Polychronis Ypodimatopoulos, Andrew Lippman. ‘Follow me’: a web-based, location-sharing architecture for large, indoor environments. WWW 2010
  • Johan Bollen, Alberto Pepe, Huina Mao. Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena. CoRR 2009
  • Jon Crowcroft. The internet of ideas. SIGCOMM 2009

Mining and analysis

  • Changhyun Lee, Haewoon Kwak, Hosung Park, Sue B. Moon. Finding influentials based on the temporal order of information adoption in twitter. WWW 2010
  • Takeshi Sakaki, Makoto Okazaki, Yutaka Matsuo. Earthquake shakes Twitter users: real-time event detection by social sensors. WWW 2010
  • Bharath Sriram, Dave Fuhry, Engin Demir, Hakan Ferhatosmanoglu, Murat Demirbas. Short text classification in twitter to improve information filtering. SIGIR 2010
  • Owen Phelan, Kevin McCarthy, Barry Smyth. Using twitter to recommend real-time topical news. RecSys 2009
  • Anlei Dong, Ruiqiang Zhang, Pranam Kolari, Jing Bai, Fernando Diaz, Yi Chang, Zhaohui Zheng, Hongyuan Zha. Time is of the essence: improving recency ranking using Twitter data. WWW 2010
  • Anish Das Sarma, Atish Das Sarma, Sreenivas Gollapudi, Rina Panigrahy. Ranking mechanisms in twitter-like forums. WSDM 2010
  • Ravi Kumar, Mohammad Mahdian, Mary McGlohon. Dynamics of conversations. KDD 2010
  • Jianshu Weng, Ee-Peng Lim, Jing Jiang, Qi He. TwitterRank: finding topic-sensitive influential twitterers. WSDM 2010
  • Maxim N. Grinev, Maria P. Grineva, Alexander Boldakov, Leonid Novak, Andrey Syssoev, Dmitry Lizorkin. Sifting micro-blogging stream for events of user interest. SIGIR 2009
  • Sebastien Ardon, Amitabha Bagchi, Anirban Mahanti, Amit Ruhela, Aaditeshwar Seth, Rudra M. Tripathy, Sipat Triukose. Spatio-Temporal Analysis of Topic Popularity in Twitter. CoRR 2011
  • Lars Kai Hansen, Adam Arvidsson, Finn Årup Nielsen, Elanor Colleoni, Michael Etter. Good Friends, Bad News – Affect and Virality in Twitter. CoRR 2011
  • Nigel Collier, Son Doan. Syndromic classification of Twitter messages. CoRR 2011
  • Roberto Gonzalez, Rubén Cuevas Rumín, Ángel Cuevas, Carmen Guerrero. Where are my followers? Understanding the Locality Effect in Twitter. CoRR 2011
  • Wayne Xin Zhao, Jing Jiang, Jing He, Yang Song, Palakorn Achananuparp, Ee-Peng Lim, Xiaoming Li. Topical Keyphrase Extraction from Twitter. ACL 2011
  • Bo Han, Timothy Baldwin. Lexical Normalisation of Short Text Messages: Makn Sens a #twitter. ACL 2011
  • Long Jiang, Mo Yu, Ming Zhou, Xiaohua Liu, Tiejun Zhao. Target-dependent Twitter Sentiment Classification. ACL 2011
  • Saptarshi Ghosh, Gautam Korlam, Niloy Ganguly. Spammers’ networks within online social networks: a case-study on Twitter. WWW (Companion Volume) 2011
  • Liangjie Hong, Ovidiu Dan, Brian D. Davison. Predicting popular messages in Twitter. WWW (Companion Volume) 2011
  • Brendan Meeder, Brian Karrer, Amin Sayedi, R. Ravi, Christian Borgs, Jennifer T. Chayes. We know who you followed last summer: inferring social link creation times in twitter. WWW 2011
  • Ana-Maria Popescu, Marco Pennacchiotti, Deepa Paranjpe. Extracting events and event descriptions from Twitter. WWW (Companion Volume) 2011
  • Carlos Castillo, Marcelo Mendoza, Barbara Poblete. Information credibility on twitter. WWW 2011
  • Daniel M. Romero, Brendan Meeder, Jon M. Kleinberg. Differences in the mechanics of information diffusion across topics: idioms, political hashtags, and complex contagion on twitter. WWW 2011
  • Michael J. Welch, Uri Schonfeld, Dan He, Junghoo Cho. Topical semantics of twitter links. WSDM 2011
  • Baichuan Li, Xiance Si, Michael R. Lyu, Irwin King, Edward Y. Chang. Question identification on twitter. CIKM 2011
  • Barbara Poblete, Ruth Garcia, Marcelo Mendoza, Alejandro Jaimes. Do all birds tweet the same?: characterizing twitter around the world. CIKM 2011
  • Xiaolong Wang, Furu Wei, Xiaohua Liu, Ming Zhou, Ming Zhang. Topic sentiment analysis in twitter: a graph-based hashtag sentiment classification approach. CIKM 2011
  • Arkaitz Zubiaga, Damiano Spina, Víctor Fresno, Raquel Martínez. Classifying trending topics: a typology of conversation triggers on Twitter. CIKM 2011
  • Marco Pennacchiotti, Ana-Maria Popescu. Democrats, republicans and starbucks afficionados: user classification in twitter. KDD 2011
  • Aron Culotta. Detecting influenza outbreaks by analyzing Twitter messages. CoRR 2010
  • Jianshu Weng, Ee-Peng Lim, Qi He, Cane Wing-ki Leung. What Do People Want in Microblogs? Measuring Interestingness of Hashtags in Twitter. ICDM 2010
  • John Hannon, Mike Bennett, Barry Smyth. Recommending twitter users to follow using content and collaborative filtering approaches. RecSys 2010
  • Cindy Xide Lin, Bo Zhao, Qiaozhu Mei, Jiawei Han. PET: a statistical model for popular events tracking in social communities. KDD 2010
  • Tianyi Wang, Yang Chen, Zengbin Zhang, Peng Sun, Beixing Deng, Xing Li. Unbiased sampling in directed social graph. SIGCOMM 2010
  • Kyumin Lee, James Caverlee, Steve Webb. Uncovering social spammers: social honeypots + machine learning. SIGIR 2010
  • Kyumin Lee, James Caverlee, Steve Webb. The social honeypot project: protecting online communities from spammers. WWW 2010
  • Vivek K. Singh, Ramesh Jain. Structural analysis of the emerging event-web. WWW 2010
  • Zhijun Yin, Manish Gupta, Tim Weninger, Jiawei Han. LINKREC: a unified framework for link recommendation with user attributes and graph structure. WWW 2010
  • Nilanjan Banerjee, Dipanjan Chakraborty, Koustuv Dasgupta, Sumit Mittal, Anupam Joshi, Seema Nagar, Angshu Rai, Sameer Madan. User interests in social media sites: an exploration with micro-blogs. CIKM 2009
  • Lei Tang, Huan Liu. Scalable learning of collective behavior based on sparse social dimensions. CIKM 2009


  • Gunther Eysenbach. Can Tweets Predict Citations? Metrics of Social Impact Based on Twitter and Correlation with Traditional Metrics of Scientific Impact. J Med Internet Res 2011;13(4):e123
  • Vasileios Kandylas, Ali Dasdan. The utility of tweeted URLs for web search. WWW 2010
  • Tom Rowlands, David Hawking, Ramesh Sankaranarayana. New-web search with microblog annotations. WWW 2010
  • Vivek K. Singh, Mingyan Gao, Ramesh Jain. Situation detection and control using spatio-temporal analysis of microblogs. WWW 2010
  • Onook Oh, Manish Agrawal, H. Raghav Rao. Information control and terrorism: Tracking the Mumbai terrorist attack through twitter. Information Systems Frontiers 2011
  • Marc Cheong, Vincent C. S. Lee. A microblogging-based approach to terrorism informatics: Exploration and chronicling civilian sentiment and response to terrorism events via Twitter. Information Systems Frontiers 2011
  • Johan Bollen, Huina Mao, Xiao-Jun Zeng. Twitter mood predicts the stock market. J. Comput. Science 2011
  • Huina Mao, Scott Counts, Johan Bollen. Predicting Financial Markets: Comparing Survey, News, Twitter and Search Engine Data. CoRR 2011
  • Kristen Lovejoy, Richard Waters, Gregory D. Saxton. Engaging Stakeholders through Twitter: How Nonprofit Organizations are Getting More Out of 140 Characters or Less. CoRR 2011
  • Son Doan, Bao-Khanh Ho Vo, Nigel Collier. An analysis of Twitter messages in the 2011 Tohoku Earthquake. CoRR 2011
  • Clodoveu Augusto Davis Jr., Gisele Lobo Pappa, Diogo Rennó Rocha de Oliveira, Filipe de Lima Arcanjo. Inferring the Location of Twitter Messages Based on User Relationships. T. GIS 2011
  • Ana-Maria Popescu, Alpa Jain. Understanding the functions of business accounts on Twitter. WWW (Companion Volume) 2011
  • Jessica Elan Chung, Eni Mustafaraj. Can Collective Sentiment Expressed on Twitter Predict Political Elections? AAAI 2011
  • Kristina Lerman, Rumi Ghosh. Information Contagion: an Empirical Study of the Spread of News on Digg and Twitter Social Networks. CoRR 2010
  • Ed H. Chi. Augmented social cognition: using social web technology to enhance the ability of groups to remember, think, and reason. SIGMOD Conference 2009

Posted in research

Database benchmark links and references

The Benchmark Handbook, 1993:


The OO7 Benchmark:


XMark: An XML Benchmark Project:


Technical report (the link to the CWI FTP site is broken):


XBench: A Family of Benchmarks for XML DBMSs:


TPC series:


Sort Benchmark:


Posted in research

On reviewing final reports of courses

input: D
output: score
begin
  A := random_pick_a_sentence(D);
  B := random_pick_another_sentence(D);
  // is_copy_wo_ack is based on a search engine; both sentences are
  // checked up front so that neither check is skipped by short-circuiting
  Ac := is_copy_wo_ack(A);
  Bc := is_copy_wo_ack(B);
  if (Ac && Bc) score := 60;        // both sentences copied
  else if (Ac || Bc) score := 75;   // exactly one sentence copied
  else score := 80;                 // neither sentence copied

  if (!related(D)) {score := score - 10; return score;}
  if (interesting(D)) {score := score + 10; return score;}
  return score;
end
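
The procedure above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the report D is taken to be a list of at least two sentences, and the three predicates (`is_copy_wo_ack`, `related`, `interesting`) are hypothetical callables supplied by the caller, since the post does not say how they are implemented beyond the copy check being based on a search engine.

```python
import random
from typing import Callable, Optional, Sequence


def review_report(
    sentences: Sequence[str],
    is_copy_wo_ack: Callable[[str], bool],
    related: Callable[[Sequence[str]], bool],
    interesting: Callable[[Sequence[str]], bool],
    rng: Optional[random.Random] = None,
) -> int:
    """Score a report given as a list of sentences (at least two are needed)."""
    rng = rng or random.Random()
    # Pick two distinct sentences at random.
    a, b = rng.sample(list(sentences), 2)
    # Check both sentences up front so neither check is skipped.
    a_copied = is_copy_wo_ack(a)
    b_copied = is_copy_wo_ack(b)

    if a_copied and b_copied:
        score = 60
    elif a_copied or b_copied:
        score = 75
    else:
        score = 80

    # An unrelated report loses 10 points and is never checked for interest.
    if not related(sentences):
        return score - 10
    if interesting(sentences):
        return score + 10
    return score
```

Passing the predicates in as arguments keeps the sketch testable without a real search engine; in practice `is_copy_wo_ack` would wrap whatever lookup the reviewer actually uses.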

Posted in teaching