Senior Research Scientist, ISI Foundation, Turin, Italy
Yelena Mejova is a Senior Research Scientist at the ISI Foundation in Turin, Italy, working in the area of Data Science for Social Impact and Sustainability. Her research concerns the use of social media in health informatics, especially in lifestyle diseases and mental health, as well as for tracking political speech and other cultural phenomena. Since 2023, she has been a co-Editor-in-Chief of EPJ Data Science. Previously, as a scientist at the Qatar Computing Research Institute, Yelena was a part of the Social Computing Group working on computational social science, especially as applied to tracking real-life health signals. As a post-doctoral researcher at Yahoo Research Barcelona, she was a part of the Web Mining and User Engagement groups.
Improved Geolocation in Humanitarian Documents
Enrico Belliardo, Kyriaki Kalimeri
ACM GoodIT [Best Paper Award]
We develop annotated resources to fine-tune the popular Named Entity Recognition (NER) tools spaCy and RoBERTa to perform geotagging of humanitarian texts. We then propose a geocoding method, FeatureRank, which links the candidate locations to the GeoNames database. We find that not only does the humanitarian-domain data improve the performance of the classifiers (up to F1 = 0.92), but it also alleviates some of the bias of the existing tools, which erroneously favor locations in Western countries.
Populist Advertising on Facebook in Europe
Arthur Capozzi (UniTo), Gianmarco De Francisci Morales (CENTAI), Yelena Mejova, Corrado Monti (CENTAI), Andre Panisson (CENTAI)
This study compares how populist parties advertised on Facebook during the 2019 European Parliamentary election. Using data from the Meta (previously Facebook) Ad Library, we analyze 45k ad campaigns by 39 parties, both populist and mainstream, in Germany, the United Kingdom, Italy, Spain, and Poland. While populist parties represent just over 20% of the total expenditure on political ads, they account for 40% of the total impressions, most of which come from Eurosceptic and far-right parties.
Global misinformation spillovers in the online vaccination debate before and during COVID-19
Jacopo Lenti, Kyriaki Kalimeri, André Panisson, Daniela Paolotti, Michele Tizzani, Yelena Mejova, and Michele Starnini
We leverage 316 million vaccine-related Twitter messages in 18 languages, from October 2019 to March 2021, to quantify misinformation flows between users exposed to anti-vaccination (no-vax) content. We find that, during the pandemic, no-vax communities became more central in the country-specific debates and their cross-border connections strengthened, revealing a global Twitter anti-vaccination network. U.S. users are central in this network, while Russian users also become net exporters of misinformation during vaccination roll-out.
Mapping Urban Socioeconomic Inequalities in Developing Countries through Facebook Ads
Simone Piaggesi, Serena Giurgola, Márton Karsai (Central European University), André Panisson, and Michele Tizzoni
We collect advertising audience estimates from Facebook to predict socioeconomic conditions of urban residents, at a fine spatial granularity, in four large urban areas: Atlanta (USA), Bogotá (Colombia), Santiago (Chile), and Casablanca (Morocco). We find that behavioral attributes inferred from the Facebook marketing platform can accurately map the socioeconomic status of residential areas within cities, and that predictive performance is comparable in both high- and low-resource settings.
Effects of Social Isolation on Self-disclosure of Loneliness
Anya Hommadova Lu (Sam Houston State University)
This study explores the effect of unprecedented mass isolation during COVID-19 lockdowns through the lens of self-disclosure of loneliness on Twitter. Using a dataset of 30 million public tweets, we use machine learning to identify tweets that contain self-disclosure of loneliness. We propose an updated typology of loneliness that captures the possibilities offered by the affordances of social media. In a second study, we examine the quality of response to such loneliness disclosures.
Modeling Political Activism around Gun Debate via Social Media
Gianmarco De Francisci Morales (ISI Foundation), Jisun An & Haewoon Kwak (Singapore Management University)
In this study, we employ social media signals to examine the predictors of offline political activism, at both the population and individual levels. We show that it is possible to classify the stance of users on the gun issue, with especially high accuracy when network information is available.
Tracking Echo Chambers around Vaccination during COVID-19
Giuseppe Crupi, Michele Tizzani, Daniela Paolotti, Andre Panisson (ISI Foundation)
In this study we ask, how has the Italian discussion around vaccination changed during the COVID-19 pandemic, and have the unprecedented events of 2020-2021 been able to break the echo chamber around this topic? We find the stark division between vaccine supporters and hesitant individuals to continue throughout the vaccination campaign. However, we find an increasing commonality in the topical focus of the vaccine supporters and vaccine hesitant, pointing to possible avenues of engagement.
Search Engine Mediation of Abortion Accessibility in the United States
Yelena Mejova, Tatiana Gracyk (Cleveland State University), Ronald E. Robertson (Northeastern University)
We ask to what degree Google Search provides quality responses to users searching for an abortion provider, specifically in terms of directing them to abortion clinics or crisis pregnancy centers (which aim to dissuade women from abortion). To answer this question, we considered the scenario of a woman searching for abortion services online, and conducted 10 abortion-related queries from 467 locations across the United States.
Media Sharing Behavior Change as Proxy for Mobility during COVID-19
Nicolas Kourtellis (Telefonica Research, Spain)
We examine the usefulness of video sharing on social media as a proxy of the amount of time Internet users spend at home. In particular, we focus on the number of people sharing YouTube videos on Twitter before and during the COVID-19 lockdown measures imposed by 109 countries. We find that the media sharing behavior differs widely between countries, with some responding immediately to the lockdown decrees, mostly by increasing the sharing volume dramatically, and others showing a substantial lag.
Vaccine hesitancy on a parenting forum BabyCenter
Lorenzo Betti, Gianmarco De Francisci Morales, Laetitia Gauvin, Kyriaki Kalimeri, Daniela Paolotti, Michele Starnini (ISI Foundation)
We leverage more than one million comments of 201,986 users posted from March 2008 to April 2019 on a public online forum BabyCenter US to learn more about parents who choose to modify the vaccination schedule of their children. We find that such users have distinctly different interests than those who choose to follow the recommended schedule, and that they are more likely to comment on posts by users who also chose to change their schedules.
Developing Annotated Resources for Internal Displacement Monitoring
Fabio Poletto, Andre Panisson, Daniela Paolotti (ISI Foundation), Yunbai Zhang (Columbia University), Sylvain Ponserre (Internal Displacement Monitoring Centre)
We design an annotation framework for Internal Displacement and create a benchmark dataset for information extraction, as the outcome of a collaboration with the Internal Displacement Monitoring Centre, aimed at improving the accuracy of their monitoring platform IDETECT. The schema includes a multi-faceted description of the events, including cause, quantity of people displaced, location, and date.
Politics of Migration in Italy via Facebook Ads Library
Arthur Capozzi, Gianmarco De Francisci Morales, Corrado Monti, Andre Panisson, Daniela Paolotti (ISI Foundation)
We employ the Facebook Ads Library to examine advertising concerning the issue of immigration in Italy. We find evidence of targeting by the parties both in terms of geography and demographics (age and gender). For instance, the Five Star Movement reaches a younger audience when advertising about immigration, while other parties’ ads have a more male audience.
Competing Agendas around a Public Health Crisis of COVID-19
Kyriaki Kalimeri (ISI Foundation)
In the midst of the coronavirus epidemic of 2019-2020, or COVID-19, the World Health Organization is warning of a possible "infodemic" of fake news. In this study, we examine the alternative narratives around the coronavirus outbreak through advertisements promoted on Facebook. Using the new Facebook Ads Library, we discover advertisers from public health and non-profit sectors, alongside those from news media, politics, and business, incorporating coronavirus into their messaging and agenda. Among these, we find several instances of possible misinformation, ranging from bioweapons conspiracy theories to unverifiable claims by politicians.
Falling into the Echo Chamber: the Italian Vaccination Debate on Twitter
Alessandro Cossard, Gianmarco De Francisci Morales, Kyriaki Kalimeri, Daniela Paolotti, Michele Starnini (ISI Foundation)
In this study we examine the extent to which the vaccination debate on Twitter is conducive to potential outreach to the vaccination hesitant. We focus on Italy, one of the countries most affected by the latest measles outbreaks. We discover that the vaccination skeptics, as well as the advocates, reside in their own distinct "echo chambers". The structure of these communities differs as well, with skeptics arranged in a tightly connected cluster, and advocates organizing themselves around a few authoritative hubs. At the center of these echo chambers we find the ardent supporters, for whom we build highly accurate network- and content-based classifiers (attaining 95% cross-validated accuracy).
Facebook Ads as a Demographic Tool to Measure the Urban-Rural Divide
Daniele Rama, Kyriaki Kalimeri, Michele Tizzoni (ISI Foundation), Ingmar Weber (QCRI)
We examine the usefulness of the Facebook Advertising platform, which offers a digital "census" of over two billion of its users, in measuring potential rural-urban inequalities. We focus on Italy, a country where about 30% of the population lives in rural areas. First, we show that the population statistics that Facebook produces suffer from instability across time and incomplete coverage of sparsely populated municipalities. To overcome this limitation, we propose an alternative methodology for estimating Facebook Ads audiences that nearly triples the coverage of the rural municipalities. The findings of this study illustrate the necessity of improving existing tools and methodologies to include under-represented populations in digital demographic studies.
Effect of Values and Technology Use on Exercise
Kyriaki Kalimeri (ISI Foundation)
UMAP'19 [Best Paper]
In this study, we present a unique demographically representative dataset of 15k US residents that combines technology use logs with surveys on moral views, human values, and emotional contagion. Combining these data, we provide a holistic view of individuals to model their physical exercise behavior. We show which values determine the adoption of Health & Fitness mobile applications. We then achieve a weighted AUROC of .673 in predicting whether an individual exercises, and find a strong link between exercise and respondent socioeconomic status, as well as the value of happiness. Informed by these findings, we propose actionable design guidelines for persuasive technologies targeting health behavior modification.
Fake Cures: User-centric Modeling of Health Misinformation in Social Media
Amira Ghenai (University of Waterloo)
This work examines the individuals on social media that are posting questionable health-related information, and in particular promoting cancer treatments which have been shown to be ineffective (making it a kind of misinformation, willful or not). Using a multi-stage user selection process, we study 4,212 Twitter users who have posted about one of 139 such "treatments", and compare them to a baseline of users generally interested in cancer. Considering features capturing user attributes, writing style, and sentiment, we build a classifier to identify users prone to propagate such misinformation, providing a potential tool for public health officials to identify such individuals for preventive intervention.
Information Sources and Needs in the Obesity and Diabetes Twitter Discourse
We examine 1.5 million tweets mentioning obesity and diabetes in order to assess (1) the quality of information circulating in this conversation, as well as (2) the behavior and information needs of the users engaged in it. The analysis of top cited domains shows a strong presence of health information sources which are not affiliated with a governmental or academic institution. On the user side, we estimate over a quarter of non-informational obesity discourse to contain fat-shaming. We also find a great diversity in questions asked in these datasets, spanning the definition of obesity as a disease, social norms, and governmental policies.
Online Health Monitoring using Facebook Advertisement Audience Estimates in the United States
Ingmar Weber, Luis Fernandez-Luque (QCRI)
We use the Facebook Marketing API to correlate estimated sizes of audiences having health-related interests with public health data. Using several case studies, we identify both potential benefits and challenges in using this tool. We introduce the use of placebo interest estimates to control for the background level of user activity on the platform. Some Facebook interests such as plus-size clothing show encouraging levels of correlation (r=.74) across the 50 US states; however, we also sometimes find substantial correlations with the placebo interests, such as r=.68 between interest in Technology and Obesity prevalence.
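The placebo comparison described above can be illustrated with a minimal sketch. The function names `pearson_r` and `compare_with_placebo` are hypothetical, the inputs are assumed to be plain per-state lists supplied by the reader, and nothing here calls the Facebook Marketing API or reproduces the study's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


def compare_with_placebo(prevalence, marker_audience, placebo_audience):
    """Correlate a health 'marker' interest and a 'placebo' interest
    with the same public health statistic (hypothetical helper).

    A marker correlation is only informative if it clearly exceeds
    the correlation obtained with the unrelated placebo interest.
    """
    return {
        "marker_r": pearson_r(marker_audience, prevalence),
        "placebo_r": pearson_r(placebo_audience, prevalence),
    }
```

Reporting both coefficients side by side makes it visible when a seemingly strong marker correlation is matched by an unrelated interest, which is the situation the abstract warns about.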
#Halal on Social Media: Religion, Commerce, Health
Benkhedda Youcef (Ecole Nationale Supérieure d’Informatique, Algeria), Khairani (University of Indonesia)
Halal is a religious term, a cultural staple, and a huge market. Here, we investigate the meaning of halal in three global communities speaking Arabic, Bahasa Indonesia, and English. All three have a unique perception of this concept, identifying it more with trade, food, or cosmetics. Showing a complicated relationship with both religious and governmental authority, the concept of halal has its own life on social media, redefining its traditional and market space.
Tracking Health Misinformation on Twitter: case of Zika
Amira Ghenai (University of Waterloo)
Misinformation and rumors in the health domain may not only cause inconvenience, but may increase medical care costs and even lead to the loss of life. Here, we build a pipeline for tracking Zika misinformation during the first half of 2016 when its incidence spiked in South America, incorporating crowdsourcing with machine learning.
Using Facebook Ads Audiences for Global Lifestyle Disease Surveillance
Ingmar Weber (QCRI), Matheus Araújo (Federal University of Minas Gerais), Fabricio Benevenuto (Federal University of Minas Gerais)
In this series of studies we explore the use of demographically rich Facebook Ads audience estimates for tracking non-communicable diseases around the world, and especially in the Middle East. We compute the audiences of health "marker" interests, evaluate their potential in tracking lifestyle-related health conditions associated with tobacco use, obesity, and diabetes, and compare these to the performance of "placebo" interests.
Revisiting the American Voter
Huyen Le, Bob Boynton, Zubair Shafiq, Padmini Srinivasan (University of Iowa)
The American Voter – a seminal work in political science – uncovered the multifaceted nature of voting behavior which has been corroborated in electoral research for decades since. In this work, we leverage The American Voter as an analysis framework in the realm of computational political science, employing the factors of party, personality, and policy to structure the analysis of public discourse on online social media.
Kissing Cuisines: Exploring Worldwide Culinary Habits on the Web
Sina Sajadmanesh, Sina Jafarzadeh, Seyed Ali Ossia, Hamid R. Rabiee, Hamed Haddadi, Yelena Mejova, Mirco Musolesi, Emiliano De Cristofaro, Gianluca Stringhini
A large-scale study of recipes published on the Web and their content. Using a database of more than 157K recipes from over 200 different cuisines, we analyze ingredients, flavors, and nutritional values which distinguish dishes from different regions, and use this knowledge to assess the predictability of recipes from different cuisines. We then use country health statistics to understand the relation between these factors and health indicators of different nations, such as obesity, diabetes, migration, and health expenditure.
Cultural Pluralism: case of Charlie Hebdo
Jisun An (QCRI), Haewoon Kwak (QCRI), Sonia Alonso Saenz De Oger (Georgetown University, Qatar), Braulio Gomez Fortes (Deusto University)
We ask whether the stances on the issue of freedom of speech can be modeled using established sociological theories, including Huntington’s culturalist Clash of Civilizations, and those taking into consideration social context, including Density and Interdependence theories. At an individual level, we find social context to play a significant role, with non-Arabs living in Arab countries using #JeSuisAhmed (“I am Ahmed”) five times more often when they are embedded in a mixed Arab/non-Arab (mention) network.
Privacy and Twitter in Qatar: Traditional Values in the Digital World
Norah Abokhodair (University of Washington), Sofiane Abbar (QCRI), Sarah Vieweg (QCRI)
We explore the meaning of "privacy" from the perspective of Qatari nationals as it manifests in digital environments. Our mixed-methods analysis of 18K Twitter posts that mention "privacy" focuses on the online and offline contexts in which privacy is mentioned, and how those contexts lead to varied ideologies regarding privacy.
Crowdsourcing Health Labels
Ingmar Weber (QCRI)
Is it feasible to use profile pictures to infer a user's health, such as weight? We show that this is indeed possible and further show that the fraction of labeled-as-overweight users is higher in U.S. counties with higher obesity rates. As obesity-related conditions such as diabetes, heart disease, osteoarthritis, and even cancer are on the rise, this obese-or-not label could be one of the most useful for studies in public health.
#Foodporn and Health Around the World
Sofiane Abbar (QCRI), Hamed Haddadi (Queen Mary University of London)
How is food redefined in social media? Is food fetishizing via a plethora of images shifting our understanding of food? Our international study of the #foodporn hashtag on Instagram shows the obsession with chocolate and sweets, but also reveals a tendency of the communities to like and comment more on healthier content, suggesting a new avenue for healthy lifestyle interventions.
Health in Qatar
Hamed Haddadi (Queen Mary University of London), Ingmar Weber (QCRI), Sofiane Abbar (QCRI), Azadeh Ghahghaei (Freie Universität Berlin)
Using a near-complete dataset of Instagram checkins in Qatar, we examine the behavior of Arabic- and English-speaking populations. We find behavior changes around major religious holidays, including Ramadan, which affects the dietary patterns of this highly diverse country.
View on Obesity through Instagram
Ingmar Weber (QCRI), Hamed Haddadi (Queen Mary University of London), Anastasios Noulas (University of Cambridge)
Using millions of Instagram posts in locations all over the US, we examine the social media signals surrounding obesity. Our analysis reveals a relationship between small businesses and local foods with better dietary health, yet a tendency of social media users to reinforce unhealthy dietary habits through likes and comments, with donuts and cupcakes being the most "liked" foods.
Twitter: A Digital Socioscope
Ingmar Weber (QCRI), Michael Macy (Cornell)
Cambridge University Press Book (editing)
This book surveys how to use Twitter data to study human behavior and social interaction on a global scale. It is a reference for behavioral and social scientists who want to explore the use of online data in their research, and for non-professionals who follow the social impact of new technologies.
Relating Social Media Users to Little-known Content
Ingmar Weber, Javier Borge-Holthoefer (QCRI)
Long-tail content -- news stories, music, even people -- may need a little more help in order for people to notice it. We combine the notions of serendipity and explainability to build "bridges" between content and users, utilizing high-quality knowledge bases as well as users' interest profiles, as estimated using their social media presence.
Controversy and Sentiment in News
Carlos Castillo (QCRI), Nicholas Diakopoulos (University of Maryland), Amy X. Zhang (MIT CSAIL)
Resource: Controversial topic list
Using lexical resources, such as those on sentiment and bias, we explore the use of emotional language around controversial topics by mainstream news agencies. Our aim is to eventually detect these controversial topics and to automatically find the sides of the discussion.
Monitoring Dietary Health via Social Media
Ingmar Weber, Sofiane Abbar, Hamed Haddadi (QCRI)
Using online check-ins and posts related to food, we track diet-related diseases like obesity and diabetes, and relate the perceptions of food to the demographics of the individuals. Social media also gives us an opportunity to explore the relationship between social connections and dietary habits -- indeed, there seems to be a connection between your social-media-detected diet and that of your friends.
Linguistically Motivated Semantic Aggregation Engines (LiMoSINe)
Ilaria Bordino (Yahoo Labs Barcelona), Mounia Lalmas (Yahoo London), Olivier Van Laere (Yahoo Labs Barcelona), Byungkyu Kang (UC Santa Barbara)
As a part of the LiMoSINe EU project, we are building search engines based on semantics found in large document collections. These semantics include entities, sentiment expressed about them, the quality of writing about them, their topical categorization, and, of course, the relationships between these entities. Our faceted search prototype allows the user to explore the web of entities, adjusting the data views along various metadata attributes.
Understanding Donation Behavior through Email
Ingmar Weber (Qatar Computing Research Institute), Venkata Rama Kiran Garimella (Aalto University), Michael C Dougal (UC Berkeley)
We analyze a two-month anonymized email log from several perspectives motivated by past studies on charitable giving: (i) demographics, (ii) user interest, (iii) external time-related factors, and (iv) social network influence. We show that email captures the demographic peculiarities of different interest groups, for instance, predicting demographic distributions found in the US 2012 Presidential Election exit polls. We show the importance of the social connection in predicting whether an individual donates, showing that, although annoying to most of us, email campaigns can be effective.
Economic, Social, and Cultural Boundaries in International Communication
Ruth Garcia (Universitat Pompeu Fabra Barcelona), Daniele Quercia (Yahoo Labs Barcelona)
In this study we show that the international Twitter communication landscape is not only still largely predetermined by physical distance, but that it also depends on countries' social, economic, and cultural attributes. This communication, as measured using @mentions, is correlated (r = 0.68) with the Gravity Model, which hypothesizes that the flow between two areas is proportional to their masses and inversely proportional to the distance between them. Our final model, which takes into consideration income, trade share, migration, language, and Hofstede's cultural variables, achieves an Adjusted R2 of 0.80.
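The Gravity Model relation described above can be written out as a small sketch. The function name `gravity_flow`, the scaling constant `k`, and the sample numbers are illustrative assumptions, not values from the study; the final model's additional covariates (income, trade share, migration, language, cultural variables) are not reproduced here.

```python
def gravity_flow(mass_a: float, mass_b: float, distance: float, k: float = 1.0) -> float:
    """Predicted flow between two areas under a basic gravity model.

    The flow is proportional to the product of the two masses and
    inversely proportional to the distance between them. `k` is a
    hypothetical scaling constant that would normally be fitted to data.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * mass_a * mass_b / distance


# Made-up example: two countries with "masses" (for instance, user
# counts) of 10 and 20 units, located 4 distance units apart.
predicted = gravity_flow(10, 20, 4)
```

In a setting like the one described, such predicted flows would be compared against observed @mention counts between country pairs, for example through a correlation coefficient.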
Political Sentiment Classification & Tracking
Padmini Srinivasan (Computer Science, University of Iowa), Bob Boynton (Political Science, University of Iowa)
Understanding the nature of political discourse on social media allows us to gauge the motivations of its constituent voices and representativeness of its message. A thorough evaluation of sentiment classification algorithms applied to political writings shows this to be a non-trivial task.
Content Reuse in an Organization
Klaar De Schepper (Columbia University), Lawrence Bergman (IBM T.J. Watson Research Center), Jie Lu (IBM T.J. Watson Research Center)
CHI'11 [Honorable Mention]
How important is it for an organization to keep track of the content it generates? Turns out very much so. We track content reuse across a collection of slideshow presentations, modeling the flow of information within the organization. Our ethnographic survey and interviewing effort resulted in a set of guidelines for a content management system that supports modular content search, team management, and provenance tracking.
Event Tracking in Social Media
Viet Ha-Thuc (University of Iowa), Padmini Srinivasan (University of Iowa)
Large-scale analysis of social media such as blogs allows us a glimpse into the mind of a large segment of Earth's population. This record of people's thoughts can be leveraged to track significant events and discussion about them. Using Viet Ha-Thuc's new topic modeling approach, we were able to track major news events over a period of time.