New! The Proceedings of the RAMSS 2013 Workshop are online.


Real-time User Modeling and Prediction: Examples from YouTube
Dr. Ramesh Rangarajan Sarukkai
(Senior Management, Google Inc.)

Abstract. Real-time analysis and modeling of users for improved engagement, interaction and social activity is a burgeoning area of interest with applications to web sites, social networks and mobile apps. Apart from scalability issues, this domain poses a number of modeling and algorithmic challenges. In this talk, as an illustrative example, we present DAL, a system that leverages real-time user activity/signals for dynamic ad loads and is designed to improve the overall user experience on YouTube. This system uses machine learning to optimize for user activity during a visit and helps decide on real-time advertising policies dynamically for the user. We conclude the talk with challenges and opportunities in this important area of real-time user analysis and social modeling.

Bio. Dr. Ramesh Rangarajan Sarukkai currently heads up the YouTube video monetization platform/formats group at Google, with the charter of delivering exciting and optimal advertising experiences for YouTube on desktop, mobile as well as TV. He also bootstrapped YouTube’s efforts on TV (called YouTube Leanback), in addition to leading the Watch Player team. Prior to this, he was a Director at Yahoo, where he drove many initiatives including mobile ads, video/multimedia search and e-commerce. Dr. Sarukkai also presents periodically at various conferences as a keynote speaker, presenter or panelist (ACM Multimedia Conferences, World Wide Web Conferences, ACM/SPIE Conference on Multimedia Computing & Networking and so on). He has published many papers in leading journals and one book entitled “Foundations of Web Technology” (Kluwer/Springer), holds over 30 patents and has participated in working groups such as W3C. His current interests lie in the confluence of mobile/TV, video, advertising and social.


SAMOA: A Platform for Mining Big Data Streams
Dr. Gianmarco De Francisci Morales
(Yahoo! Research)

Abstract. Social media and user generated content are causing an ever growing data deluge. The rate at which we produce data is growing steadily, thus creating larger and larger streams of continuously evolving data. Online news, micro-blogs and search queries are just a few examples of these continuous streams of user activities. The value of these streams lies in their freshness and relatedness to ongoing events. However, current (de facto standard) solutions for big data analysis are not designed to deal with evolving streams. In this talk, we offer a sneak preview of SAMOA, an upcoming platform for mining big data streams. SAMOA is a platform for online mining in a cluster/cloud environment. It features a pluggable architecture that allows it to run on several distributed stream processing engines such as S4 and Storm. SAMOA includes algorithms for the most common machine learning tasks such as classification and clustering. Finally, SAMOA will soon be released as open source software to foster collaboration and research on big data stream mining.

Bio. Gianmarco De Francisci Morales is a postdoctoral researcher at Yahoo! Research Barcelona. He received his Ph.D. in Computer Science and Engineering from the IMT Institute for Advanced Studies of Lucca in 2012. His research focuses on large scale data mining and big data, with a particular emphasis on web mining and Data Intensive Scalable Computing systems. He is an active member of the open source community of the Apache Software Foundation working on the Hadoop ecosystem (Giraph, S4), and a committer for the Apache Pig project. He is a co-organizer of the First workshop on Social News on the Web (SNOW) co-located with the WWW’13 conference.

RAMSS 2013 Workshop Program

Tuesday, May 14

   13:00-13:30     Welcome and Introduction 
   13:30-14:30     Keynote talk by Dr. Ramesh Rangarajan Sarukkai (Google Inc.)
Coffee break
Session #1
   15:00-15:30    Towards Real-time Collaborative Filtering for Big Fast Data.
Ernesto Diaz-Aviles, Wolfgang Nejdl, Lucas Drumond and Lars Schmidt-Thieme.
   15:30-15:50    Detecting Real-time Burst Topics in Microblog Streams: How Sentiment can Help.
Lumin Zhang, Yan Jia, Yi Han and Binxing Fang.
   15:50-16:10    Real-time Discussion Retrieval from Twitter. [Slides]
Dmitrijs Milajevs and Gosse Bouma.
   16:10-16:30    MJ no more: Using Concurrent Wikipedia Edit Spikes with Social Network Plausibility Checks for Breaking News Detection. [Slides]
Thomas Steiner, Seth Van Hooland and Ed Summers.
Coffee break
Session #2
   17:00-17:30    MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd. [Slides]
Raphaël Troncy, Vuk Milicic, Giuseppe Rizzo and José Luis Redondo García.
   17:30-18:30     Keynote talk by Dr. Gianmarco De Francisci Morales (Yahoo! Research) [Slides]
   18:30-19:30    Discussion and Wrap-up