netsci 2010 in boston was a success! on friday at 4pm, we’ll have a discussion session in the computer lab of cambridge university (all invited! slides are here). i’ve blogged about the satellite events on monday (round table at harvard) and on tuesday (mobile network workshop). here are the notes i’ve taken during the actual conference on wednesday, thursday, and friday:
Archive for May, 2010
notes taken during the round table:
This year I’ll be presenting a paper “Temporal Diversity in Recommender Systems” at SIGIR in Geneva, which I wrote with my supervisors (Steve Hailes, Licia Capra) and Xavier Amatriain from Telefonica Research in Barcelona. As you can expect from the paper’s title, we’ll be appearing in the session on Filtering and Recommendation.
The paper is one of the latest publications based on the work I was doing for my PhD (by the way, I submitted my thesis earlier this month! You can see the abstract, and eventually download it, from here). The elevator pitch for nearly four years’ work would be something like this:
Collaborative filtering has always been studied from a static viewpoint: by designing algorithms that can predict hidden test set ratings really well (the Netflix prize is the perfect example of that). However, when recommender systems are deployed, they will face a dynamic environment: users will keep rating things over time, and the system will need to be updated to take into account people’s latest ratings. My thesis examined this rift, by looking at how various dimensions of recommendations change as recommender systems are updated.
When I give this sort of pitch to anyone who has an interest in recommender systems, they usually say: ah, but wait, what about that awesome KDD paper by Koren on the temporal dynamics of recommendations? And I say: yes, that is an awesome paper, and (from what I gathered) he is dealing with shifting customer preferences. In my work, I take a more system-oriented perspective; I look at how things change as you re-train your CF algorithm every week.
So, a bit of background. Leading up to the paper that this post is about, we had a few other contributions: at RecSys 2008 we looked at how similarity between users evolves over time, as they rate more stuff; at SIGIR 2009 we looked at the effect of updating a recommender system on temporal accuracy.
The next step in the story was to look at how the ranking of recommendations changes over time. And here, we hit upon the question: what happens if, as users keep rating stuff, their recommendations do not change at all? We decided that the best way to find out was to run a user survey. Actually, we designed three. Each survey simulated a “popular” movie recommender system over five “weeks.” In other words, respondents would see week 1’s recommendations, rate how much they liked them, and then click through buffer screens to get to the following week’s recommendations. In the first survey, we gave them the most popular movies of all time, every week. Nothing changed. The second survey, instead, gave popular movies, but slightly changed them each week: it introduced different popular movies into the top list. The last survey just threw up random movies.
There were two interesting outcomes from the survey. The first, which we report in the paper, relates to how people rated recommendations as they did (or did not) change. Popular recommendations that changed were consistently rated highly. Random movies were consistently rated low. But popular movies that did not change were rated lower and lower each week; by week 5 they were rated as badly as the random movies. An interesting point about this is that when people rate, they aren’t just inputting their preference for a movie; their rating also reflects how much they like the recommendation that they are being served. The second interesting outcome was the angry emails I got: when recommendations didn’t change, people wrote to me to tell me that my system “sucked” or had a bug in it. At the broadest level, one of the conclusions we drew from the survey is that temporal change to recommendations is important: people don’t like being recommended the same things over and over again.
Based on this result, we performed a large-scale analysis of state-of-the-art algorithms: how much change do they offer? What impacts the amount of change? What can be done to promote diversity? All the results, and (hopefully) some food for thought, are in the paper here.
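To make “how much change” a little more concrete, here is a minimal sketch of one way to quantify it: the fraction of this week’s top-N list that did not appear in last week’s list. This is an illustrative assumption rather than a restatement of the paper’s exact definition; the function name `temporal_change` and the toy movie lists are made up for the example.

```python
def temporal_change(previous, current):
    """Fraction of items in `current` that were not in `previous`.

    Illustrative measure only (an assumption, not necessarily the
    paper's metric): 0.0 means the list is unchanged, 1.0 means every
    recommended item is new.
    """
    if not current:
        return 0.0
    seen = set(previous)
    new_items = [item for item in current if item not in seen]
    return len(new_items) / len(current)


# Hypothetical top-5 lists for two consecutive "weeks".
week1 = ["A", "B", "C", "D", "E"]
week2_static = list(week1)                  # nothing changes (first survey)
week2_rotated = ["A", "B", "C", "F", "G"]   # two new popular items (second survey)

print(temporal_change(week1, week2_static))   # 0.0
print(temporal_change(week1, week2_rotated))  # 0.4
```

With a measure like this, the first survey’s lists would score 0 every week, while the second survey’s lists would score somewhere above 0, depending on how many movies were swapped in.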
Xavier and I will be at SIGIR 2010: we are looking forward to seeing you there too!
i’ve just finished attending a chi presentation/report from cambridge folks. two fantastic videos at the end. now, some papers of interest:
An unobtrusive behavioral model of “gross national happiness” - this could catch the attention of people working on happiness/mood/emotions; check out the facebook application
Blogging in a region of conflict: supporting transition to recovery - one interesting finding: “blogs enable people experiencing a conflict to engage in dialogue with people outside their borders to discuss their situation”
Pensieve: supporting everyday reminiscence - “everyday reminiscence by emailing memory triggers to people that contain either social media content they previously created on third-party websites or text prompts about common life experiences.”
Useful junk?: the effects of visual embellishment on comprehension and memorability of charts (best paper) - how to present a graph? cool info-graphics in the paper and an interesting analysis of the precision of retrieving charts vs. recall after 3 weeks
Crowdsourcing graphical perception: using mechanical turk to assess visualization design - they repeated some info visualization experiments on mechturk