Wasim Ahmed is a regular guest blogger, and his personal research blog has received interest from the mainstream media. He also advises security research teams and projects looking to analyze Twitter data, and answers queries from Masters and Ph.D. students. He recommends DiscoverText and has cited it in two conference publications, one comparing Twitter APIs and the other on the challenges of research on Twitter. DiscoverText also receives regular mentions in his guest blog posts as the go-to tool for academic research.
1. Tell us a little about yourself.
I am studying for a PhD at the Information School at the University of Sheffield. My research focuses on examining user-generated content on platforms such as Twitter to gain a better understanding of how users respond to epidemics and pandemics. Currently I am examining how health, more generally, is communicated on Twitter. My research also identifies free open source software, as well as industry-specific software, that researchers can use to gather and analyze data from Twitter. I am an active tweeter, and my research blog has proven to be very popular. Some posts have appeared in Google Scholar, and the mainstream media has picked up others. I am the Twitter manager for NatCen Social Research’s New Social Media New Social Science network (@NSMNSS). I was featured in a White Rose DTC newsletter in the PhD Researchers In The News section, which highlighted a prize awarded by DiscoverText. I receive regular invitations to academic and industry events and have tutored students across continents on retrieving and analyzing Twitter data. I regularly analyze emerging hashtags and trending topics for my research blog. I have provided advice to security research teams, and my results are likely to be of interest to the World Health Organization (WHO) and the United Nations (UN), as well as to the wider public.
3. What was your first impression of DiscoverText?
I thought that it might take a while to learn how to use. However, by watching the video tutorials by Stuart Shulman, and with his excellent user support, I was off and running in a matter of hours. For social media data and for social scientists, DiscoverText is one of a kind. It really allows you to drill into large datasets with no prior data skills.
4. How do you use DiscoverText?
I use all the features of DiscoverText, and regularly upload my existing Twitter datasets for further analysis. Having looked into machine learning methods for Twitter, I have found DiscoverText to be the simplest to use for automatically classifying a large number of tweets. So I use DiscoverText to make sense of my data: I either capture or upload a set of Twitter data, and then I filter, de-duplicate, cluster, search, human code and machine classify large numbers of tweets in a matter of hours. The entire process is streamlined.
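Editor's note: for readers curious about what the filtering and de-duplication steps involve, here is a minimal Python sketch. It is not DiscoverText's implementation and does not use any DiscoverText API; the function names, the normalization rules, and the sample tweets are all made up for illustration.

```python
import re


def normalize(text):
    """Lowercase, drop a leading 'RT @user:' prefix and any URLs, collapse whitespace.

    These rules are an illustrative assumption, not DiscoverText's own logic.
    """
    text = text.lower()
    text = re.sub(r"^rt @\w+:\s*", "", text)
    text = re.sub(r"https?://\S+", "", text)
    return " ".join(text.split())


def filter_tweets(tweets, keyword):
    """Keep only tweets that contain the keyword (case-insensitive)."""
    return [t for t in tweets if keyword.lower() in t.lower()]


def deduplicate(tweets):
    """Keep the first tweet for each normalized form, preserving order."""
    seen = set()
    unique = []
    for t in tweets:
        key = normalize(t)
        if key not in seen:
            seen.add(key)
            unique.append(t)
    return unique


# Hypothetical sample data, invented for this sketch.
sample = [
    "Ebola cases rising in the region http://example.com/a",
    "RT @someone: Ebola cases rising in the region http://example.com/b",
    "Norovirus outbreak reported at a school",
    "WHO issues new Ebola guidance",
]

ebola_unique = deduplicate(filter_tweets(sample, "ebola"))
for tweet in ebola_unique:
    print(tweet)
```

The retweet in the sample collapses into the original tweet because both share the same normalized text. Real near-duplicate clustering would typically need a fuzzier similarity measure than this exact match on normalized text.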
5. Tell me about some of the projects you have used DiscoverText for.
I have used DiscoverText for my PhD work, which looks at pandemics and epidemics. One of my largest case studies is based on the Ebola outbreak of 2014, and I have used DiscoverText to make sense of my data. Examining the types of information shared during infectious disease outbreaks allows health organizations to disseminate accurate information and to monitor public concerns appropriately. Health researchers, and those from health informatics, may not necessarily hold advanced data science or machine learning skills. DiscoverText is therefore a valuable tool, as it allows non-data scientists to run complex data science queries on sets of Twitter data. I have collaborated internationally on various projects analysing health-related data on Twitter for the real-time monitoring of public views and opinions. For example, my analyses have ranged from targeted health campaigns such as World Autism Awareness Day to more specific health-related conditions such as Norovirus. I have also worked on more general social science projects attempting to understand public views and opinions on key events, such as the Sydney Siege.
6. What has surprised you most about working with DiscoverText?
The machine learning capabilities and the ease of training classifiers were really surprising. Overall, my first impression was that DiscoverText offers a powerful set of features. The ability to cluster duplicates and near-duplicates, to search for tweets to allow for list-wise coding, and to use keystroke coding really improves the speed of coding compared to traditional social science software.
7. What are two of your favorite aspects of DiscoverText?
Keystroke Coding and Machine Learning Capabilities
Research Blog: https://wasimahmed1.wordpress.com/
Ahmed, W. (2015). Using Twitter as a data source: An overview of current social media research tools. LSE Impact of Social Sciences blog. http://blogs.lse.ac.uk/impactofsocialsciences/2012/04/19/blog-tweeting-papers-worth-it/ 10 July 2015.
Ahmed, W. (2015). Challenges of using Twitter as a data source: An overview of current resources. LSE Impact of Social Sciences blog. http://blogs.lse.ac.uk/impactofsocialsciences/2015/09/28/challenges-of-using-twitter-as-a-data-source-resources/
Report on the Studies in Social Data at Twitter HQ London published on the Information School’s blog. http://information-studies.blogspot.co.uk/2015/04/report-on-studies-in-social-data-event.html?utm_source=twitterfeed&utm_medium=twitter%20
Ahmed, W., & Bath, PA. (2015). The Ebola epidemic on Twitter: challenges for health informatics. The Seventeenth International Symposium for Health Information Management (pp 289-289). York, 24 June 2015 – 26 June 2015.
Ahmed, W., & Bath, PA. (2015). Comparison of Twitter APIs and tools for analysing Tweets related to the Ebola Virus Disease. iFutures. Sheffield, 07 July 2015.