Archive for the ‘recommendations’ Category

Enhancing Mobile Recommender Systems with Activity Inference

Thursday, July 2nd, 2009

Daniele had briefly blogged here about this interesting paper by Kurt Partridge and Bob Price, for which I will give a longer review. Some of the techniques used in the paper could be useful for further research, and even its limitations are an interesting subject of analysis.

Given that today’s Mobile Leisure Guide Systems require a large amount of user interaction (for configuration and preferences), this paper proposes to integrate current sensor data, models built from historical sensor data, and user studies into a framework able to infer high-level user activities, in order to improve recommendations and reduce the number of user tasks.

The authors claim to address the lack of situational user preferences by interpreting multidimensional contextual data through a categorical variable that represents high-level user activities such as “EAT”, “SHOP”, “SEE”, “DO”, and “READ”. A prediction is of course a probability distribution over the possible activity types.

Recommendations are provided through a thin client supported by a backend server. The following techniques are employed to produce a prediction:

  • Static prior models
    • PopulationPriorModel: based on time of day, day of week, and current weather, drawing on typical-activity studies from the Japan Statistics Bureau.
    • PlaceTimeModel: based on time and location, using hand-constructed data collected from a user study.
    • UserCalendarModel: provides a likely activity based on the user’s appointment calendar.
  • Learning models
    • LearnedVisitModel: tries to predict the user’s intended activities from the time of day, learning from observations of his/her contextual data history. A Bayesian network is used to compute the activity probability given location and time.
    • LearnedInteractionModel: constructs a model of the user’s typical activities at specific times by looking for patterns in the user’s interactions with his/her mobile device.

Activity inferences are made by combining the predictions from all five models, using a geometric combination of the probability distributions.
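To make the combination step concrete, here is a minimal sketch of a geometric combination (a per-activity geometric mean, renormalised to sum to one). The class and variable names are mine, not the paper’s, and the paper may well weight the individual models differently:

```java
import java.util.Arrays;

/** Sketch: fuse per-model activity distributions by geometric mean. */
public class ActivityFusion {

    /** Per-activity geometric mean across models, renormalised to sum to 1. */
    static double[] combine(double[][] modelDistributions) {
        int nActivities = modelDistributions[0].length;
        double[] combined = new double[nActivities];
        Arrays.fill(combined, 1.0);
        for (double[] dist : modelDistributions) {
            for (int a = 0; a < nActivities; a++) {
                combined[a] *= Math.pow(dist[a], 1.0 / modelDistributions.length);
            }
        }
        double sum = 0.0;
        for (double p : combined) sum += p;
        for (int a = 0; a < nActivities; a++) combined[a] /= sum;
        return combined;
    }

    public static void main(String[] args) {
        // Toy distributions over {EAT, SHOP, SEE, DO, READ} from two models.
        double[][] models = {
            {0.50, 0.20, 0.10, 0.10, 0.10},  // e.g. a PopulationPriorModel
            {0.40, 0.30, 0.10, 0.10, 0.10},  // e.g. a PlaceTimeModel
        };
        System.out.println(Arrays.toString(combine(models)));
    }
}
```

One property worth noting about product-based combinations: any single confident model can effectively veto an activity, since a near-zero probability from one model keeps the combined probability near zero regardless of the others.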

A query context module feeds the activity prediction module with a prediction of the context the user is actually interested in. For example, the user could be at work when searching for a restaurant, while his/her actual context is the downtown area where he/she plans to go for dinner.

The authors carried out a user study, evaluating the capability of each model to provide accurate predictions. Eleven participants carried the device for two days, and were rewarded with cash discounts for leisure activities they engaged in while using the device. The Query Context Prediction module was not enabled because of the study’s short duration. Results show high accuracy (62% for the baseline “always predict EAT”, 77% for the PlaceTimeModel).

Some good points and some problems with this paper:

  • the prediction techniques used are interesting and could be applied to other domains; moreover, I think it is useful to combine data from user studies with learning techniques, as user profiling helps developers (and apps) understand users in general - before applying this knowledge to a specific user
  • the sample size makes the user study flawed: 11 participants carrying devices for 2 days is close to statistical insignificance; the weekday/weekend difference is the first issue that comes to mind, to mention just one
  • offering cash discounts for leisure activities is presumably not the right form of reward for this kind of study, as it makes users more willing to engage in activities that require spending money over free ones (e.g. EAT vs. SEE)
  • the authors admit that their recommender system base consists mostly of restaurants, which I think is not taken sufficiently into account when claiming high accuracy. Given that the baseline predictor reaches 62% accuracy by always predicting EAT, a deeper analysis would have made the paper more convincing
  • one of the most interesting contributions of the paper is the definition of the query context module, which is unfortunately not employed in the study for usability reasons related to its short duration. Again, a better-designed user study would have avoided this problem. I question whether it is worth carrying out user studies at all when resources are so limited that statistical significance becomes questionable. However, there is some attempt to discuss expected context vs. actual context, which is potentially very interesting: e.g., a user wants to SHOP but the shops are closed, so he/she EATs instead. It would be interesting to discuss how a RS should react to such situations
  • user-interaction issues: the goal of the presented system is to reduce user tasks on the mobile; yet interaction is needed to tune the system and correct its mistakes, and one of the predictors uses exactly the user’s interaction with the mobile as a parameter. There seems to be some confusion about the role of user interaction in this kind of system (imho, an HCI approach could improve RS usability and, consequently, accuracy)
  • the system is not well suited to multi-purpose trips (e.g. one EATs whilst DOing, or alternately SHOPs and EATs), and in such cases predictions are mostly incorrect.

Discussing the Netflix Prize

Tuesday, June 30th, 2009

After my last blog post, I was contacted by a journalist who wanted to discuss the effects of the Netflix prize. Now that the competition is winding down, one of the real questions that emerges is whether it was worth it. Below, I’m pasting part of my side of the dialogue; other blogs are posting similar discussions, and I’m curious as to what you fellow researchers may have to say.

(more…)

Research is the New Music

Monday, February 23rd, 2009

I’ve started trying out a new service, called Mendeley. The quickest way to describe it is a “last.fm for research;” they have a desktop client that can monitor the pdf files that you are reading, and an online presence where each user has a profile. (Read about them on their blog; my profile is here). So far, it seems that they are at a very early stage. However, the basic functionality (seeing/tagging/searching papers you read) seems quite nice. On the other hand, an obvious difficulty is that of extracting accurate meta-data from research pdf files.

The similarity between research papers and songs is quite striking. Think of it this way: songs (research papers) are made by musicians (authored by researchers), have a name (title), and are collected in albums (journals/conference proceedings). Both have a time of release; both can be tagged/described/loved/hated; both are blogged and talked about. Sometimes artists make music videos, sometimes researchers make presentations or demos. (more…)

Question/Answers @ RecSys Doctoral Symposium 2008

Friday, November 28th, 2008

I came across an interesting blog post by @HDrachsler, who I started following on twitter after this year’s RecSys conference. The post contains a recording of the question/answer time at the RecSys doctoral symposium (which I unfortunately did not attend). The clearest voice in the recording is Prof. Joseph Konstan, who (obviously, I know) has some very interesting things to say about collaborative filtering, recommender system research, and the state of the field. Here are some notes that I jotted down while I was listening: (more…)

A Pitch on Future Recommender Systems

Thursday, November 27th, 2008

Yesterday I attended a workshop that was aimed at fostering research collaboration between our department and BSkyB. After a short introduction by the head of the department, a number of members of staff gave short (10 minute) pitches about their past and current research, and areas they are interested in for potential collaboration. The range of work being done in the department is huge- perhaps this deserves a post of its own.

(more…)

Tutorials at RecSys 2008

Friday, October 24th, 2008

Yesterday was the first day of RecSys 2008, and was dedicated to three very interesting tutorials:

1. Robust Recommender Systems. Robin Burke introduced the wide range of attacks that typical collaborative filtering algorithms are vulnerable to; scenarios that arise when people attempt to force, rather than express, opinions. An attack was strictly defined as a set of profiles intended to obtain excessive influence over others, which can be aimed at pushing items (making their recommendation more likely) or nuking them (making their recommendation less likely). His talk was an interesting blend of attack strategies, the knowledge that attackers need to have, and a high-level description of approaches for preventing attacks or fixing the system once attacked. Of course, there are strong overlaps between this work and work in other areas (p2p trust, adversarial information retrieval, search engine spam..); I particularly like this area as it pushes the point that recommender systems are about people and dynamic datasets, not just prediction.

2. Recent Progress in Collaborative Filtering. Yehuda Koren (who has recently moved from AT&T to Yahoo! Research) gave a tutorial about the leading approaches in the Netflix prize competition. The techniques he described blend matrix factorisation and neighbourhood models, and include a number of other factors (such as user biases and time), resulting in techniques with billions of parameters (and in team BellKor’s ranking in the competition); a sketch of the kind of predictor at the core of these blends follows this list. His work is remarkable and worthy of the progress prizes he has been awarded thus far. He also explored alternative techniques for evaluating recommender systems, explaining his take on evaluating top-N recommendation lists.

3. Context-Aware Recommendations. Gedas Adomavicius and Alex Tuzhilin introduced their work on incorporating context into recommender systems, including pre-, post-, and hybrid-filtering of recommendation algorithm results based on user context (the pre-filtering variant is sketched after this list). A running example repeated throughout the tutorial was going to the theatre with your girlfriend on the weekend: if you always watch comedy, then your recommendations can be filtered to match what you did in previous instances of the same context (i.e. you can be recommended comedy). They have done a lot of cool stuff on multi-dimensional recommenders, extending the common rating scales into cubes of ratings, and stressed more than once that this is virgin territory. Their work is also impressive, but raised a few questions. For example, should context be described by a well-enumerated taxonomy? Moreover, if you always watch comedy at the theatre with your girlfriend on weekends, then why would you need a recommender system in the first place (especially a collaborative one- what happened to serendipity or diversity)? They have a number of papers that are worth reading before trying to answer these questions!
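As a concrete illustration of the second tutorial, here is a minimal sketch of a biased matrix-factorisation predictor, the building block that the blended Netflix models extend with time effects and neighbourhood terms. This is my simplification, not Koren’s code; the names and dimensions are illustrative:

```java
/** Sketch: biased matrix-factorisation prediction,
 *  rHat(u,i) = mu + bu[u] + bi[i] + pu[u] . qi[i]. */
public class BiasedMF {
    final double mu;          // global mean rating
    final double[] bu, bi;    // per-user and per-item rating biases
    final double[][] pu, qi;  // user and item latent-factor vectors

    BiasedMF(double mu, double[] bu, double[] bi,
             double[][] pu, double[][] qi) {
        this.mu = mu; this.bu = bu; this.bi = bi; this.pu = pu; this.qi = qi;
    }

    double predict(int u, int i) {
        double dot = 0.0;
        for (int f = 0; f < pu[u].length; f++) dot += pu[u][f] * qi[i][f];
        return mu + bu[u] + bi[i] + dot;
    }

    public static void main(String[] args) {
        // One user, one item, two latent factors; values are illustrative.
        BiasedMF model = new BiasedMF(3.6,
            new double[]{0.1}, new double[]{-0.2},
            new double[][]{{0.3, -0.1}}, new double[][]{{0.5, 0.2}});
        System.out.println(model.predict(0, 0)); // 3.6 + 0.1 - 0.2 + 0.13
    }
}
```

The parameter counts explode once every bias and factor is additionally allowed to drift over time and neighbourhood terms are layered on top, which is roughly where the billions come from.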
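And for the third tutorial, a minimal sketch of contextual pre-filtering under my own toy assumptions (two context dimensions, exact matching): the ratings cube is sliced down to the entries matching the query context, and any ordinary two-dimensional recommender is then run on the slice:

```java
import java.util.List;

/** Sketch: contextual pre-filtering. Slice the ratings to the query
 *  context, then hand the slice to any standard 2D recommender. */
public class PreFilter {

    record Rating(String user, String item, double value,
                  String companion, String timeOfWeek) {}

    static List<Rating> preFilter(List<Rating> all,
                                  String companion, String timeOfWeek) {
        return all.stream()
                  .filter(r -> r.companion().equals(companion)
                            && r.timeOfWeek().equals(timeOfWeek))
                  .toList();
    }

    public static void main(String[] args) {
        List<Rating> ratings = List.of(
            new Rating("u1", "comedy-show", 5.0, "girlfriend", "weekend"),
            new Rating("u1", "opera", 2.0, "alone", "weekday"));
        // Only the weekend-with-girlfriend rating survives the filter.
        System.out.println(preFilter(ratings, "girlfriend", "weekend"));
    }
}
```

Post-filtering instead runs the recommender on all ratings and adjusts the resulting list using context; the trade-off is data sparsity (pre-filtering shrinks the training set) against context dilution.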

Redefining Information Overload

Thursday, December 27th, 2007

The other day I was sitting at Gatwick airport waiting for my flight home to Italy to spend Christmas with my family. I got my flight with Easyjet- and when I bought the ticket online I was also able to sign up to one of their new, free text-messaging services:

  • Some of the texts were very helpful: the morning of my flight I received a text with my flight details and confirmation number, information that I would usually scribble on a piece of paper or the back of my hand. Result: no paper, and clean hands (happier parents?)
  • Some of the texts could have made use of some location information: a text said (in a nice way) “go to your gate” … umm, should I reply to the computer and tell it I’m already there?
  • Other texts were interesting, but I didn’t need them: “Use this text to get 0% commission on currency exchange.” I have some Euros in my pocket. Can you send me this text again when I do need Euros? (Maybe I’ll tell you when?)
  • Other texts were just useless. “Go to shop X and get Y% discount with this text.” I won’t say what the shop is; let’s just leave it at the fact that its contents don’t quite fit my profile (specifically gender). Why do you keep interrupting me from the book I was reading to give me this useless advertisement? My only current solution is to unsubscribe- but then I’d lose all the texts I did like! (more…)

Netflix Prize dataset de-anonymised

Wednesday, December 19th, 2007

Two researchers at the University of Texas have de-anonymised (re-nymised? nymified?) the Netflix Prize dataset.

Netflix: Winners and Losers

Friday, December 14th, 2007

By now, the news has spread around that team BellKor has won this year’s Netflix progress prize ($50,000) by achieving an 8.43% improvement over Cinematch (the grand prize is less than 2% away). Their current solution is described here; and perhaps the most interesting thing about it is the first sentence. “Our final solution (RMSE=0.8712) consists of blending 107 individual results.” It is also interesting to note that the second place on the leader board is team KorBell; I assume that this is because Netflix has restricted each team to one submission per day.

A natural question to ask, therefore (other than how many teams may have multiple names and can thus try to infer what the qualifying ratings actually are), is whether this race for accuracy is developing methods that are perfectly suited to the qualifying data but not necessarily to the rest: a problem of overfitting. To quote Wikipedia, a brute-force approach aimed at accuracy could develop a method that “reduces or destroys the ability of the model to generalize beyond the fitting data”! In other words, once they unleash the winning algorithm on the rest of their data, will they maintain the 10% improvement over Cinematch?

My work in recent weeks has been following up on a previous paper, by exploring the (lack of) information that a lone RMSE or MAE can give us about how well collaborative filtering is performing: we know nothing about how much the predictions are dispersed around the mean, or how error evolves over time, and we are not considering a number of other aspects that should be close to our heart. More on that soon. In the meantime, I made my first submission to the Netflix prize site to see how well the Java random number generator would perform. My incredible predictions were made using nextInt(5)+1. I achieved an RMSE of 1.93, and hopefully no team has performed worse than me.

Just out of curiosity, I got RMSE 1.92 on the probe set using the same technique; I haven’t read anywhere about the extent to which the probe set is a good indicator of qualifying performance. Further probe-set predictions based on a random number between 3 and 5, i.e. (nextDouble()*2) + 3 (since the rating distribution is skewed towards the positive end in these datasets), improved my losing streak to RMSE 1.31. Lastly, simply returning the average rating for each movie gets RMSE 1.13. So if anyone out there is doing this well with crazy matrix operations, you might want to rethink your strategy :)
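For anyone who wants to reproduce the exercise, here is a minimal sketch of the two random baselines. The rating data is simulated here (uniform over 1..5, unlike Netflix’s positively skewed distribution), so the printed numbers will not match the ones above, but the computation is the same:

```java
import java.util.Random;

/** Sketch: the "how bad can it get" random baselines from the post,
 *  assuming actual ratings are integers in 1..5. */
public class RandomBaselines {

    static double rmse(double[] actual, double[] predicted) {
        double sse = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double e = actual[i] - predicted[i];
            sse += e * e;
        }
        return Math.sqrt(sse / actual.length);
    }

    public static void main(String[] args) {
        Random rng = new Random();
        int n = 1_000_000;
        double[] actual = new double[n];
        double[] uniform = new double[n];
        double[] skewed = new double[n];
        for (int i = 0; i < n; i++) {
            actual[i] = rng.nextInt(5) + 1;       // stand-in for real ratings
            uniform[i] = rng.nextInt(5) + 1;      // uniform 1..5 guess
            skewed[i] = rng.nextDouble() * 2 + 3; // guess in [3,5)
        }
        System.out.printf("uniform guess RMSE: %.2f%n", rmse(actual, uniform));
        System.out.printf("skewed guess RMSE:  %.2f%n", rmse(actual, skewed));
    }
}
```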

Recommendation or Spam?!

Thursday, November 22nd, 2007

As part of my PhD I am interested in investigating the effect of spam in pub-sub systems, and in how to use social networks to minimize the amount of spam delivered in MANETs. Recently I came across an article about Facebook starting their ‘Social Advertising’. The idea is to put your face on advertisements for products that you like.

For example, a Facebook user who rents a movie on Blockbuster.com will be asked if he would like to have his movie choice broadcast out to all his friends on Facebook. And those friends would have no choice but to receive that movie message, along with an ad from Blockbuster.

Facebook says that many of its 50 million active users already tell friends about particular products or brands they like, and the only change will be that those communications might start to carry ad messages from the companies that sell them. Facebook is letting advertisers set up their own profile pages at no charge and encouraging companies like Blockbuster, Conde Nast and Coca-Cola to share information with Facebook about the actions of Facebook members on their sites.

Facebook users will not be able to avoid these personally recommended ads if they are friends with participating people. Participation can involve joining a fan club for a brand, recommending a product or sharing information about their purchases from external Web sites.

Although I agree that the idea of sharing information with your friends is very useful, at the same time it can potentially create too much spam. So the question is: are recommendations going to become a new label for spamming?

You can find the main article by Louise Story here.

Lightweight Distributed Trust Propagation

Wednesday, October 31st, 2007

I have just finished presenting our work at ICDM. Here are the slides (also in ppt).

Facebook and the Long Tail

Monday, October 8th, 2007

In previous posts we discussed the long-tailed characteristics of music, books, and other physical items. It now seems that this characteristic also applies to Facebook apps.

Artwork Recommenders

Thursday, October 4th, 2007

A recent email on the user-modeling mailing list announced the Artwork Recommender that you can find here. After answering a short questionnaire, you can rate artwork from 1-5 stars, corresponding to “I hate this artwork” to “I like this artwork very much.” Being no artwork expert, I found my ratings to be quite biased towards the high end (3/4 stars) – it’s much more difficult to rate artwork than it is to rate movies or songs! There is even the option of saying that you are not interested in a piece of artwork at all, but I never felt like clicking it. Maybe this highlights the difficulty of finding good recommendations: understanding a rating process that users themselves don’t understand. You can also give an overall rating for an item and rate it in terms of its attributes (artist/material/style).

The website, though, has a neat interface (watch the paintings you rate “fly” into your profile history), and they are looking for participants to add to their dataset; so if you have 5 minutes, go in and rate 20 or so paintings, help them out, and see what comes out…

Mobile Social Shopping

Wednesday, July 4th, 2007

The Utiforo project that some of us here are involved in (see previous posts) is sub-titled “pervasive computing support for market trading;” the broad goal of the project is to bridge the gap between online and offline commerce by researching the applicability of trust to this scenario. One of the sub-goals of our partners at Sussex is to develop a navigation kiosk application, to capture user policies as they roam shopping centres.

However, I’ve read a couple of recent articles showing that commercial applications are most likely one step ahead: Wishpot, for example, allows users to upload and share items of interest (via mobile phone text messages or photographs) on their online profile. They can then use their profile, along with various social-network features, to research prices, view user ratings, and receive recommendations. Other commercial applications include Kaboodle, Stylehive, Zlio, and MyPickList. One site even claimed that these services will bring about the end of impulse buying (by giving users quick access to price comparisons and product quality assessments)!

Recommender systems and The Long Tail

Friday, June 15th, 2007

The introduction of The Long Tail by Chris Anderson is available online! Anderson describes how recommender systems help you ‘navigate’ the long tail. He then lists problems (context-related and time-related) of music recommender systems:

1) They tend to run out of suggestions pretty quickly as you dig deeper into a niche, where there may be few other people whose taste and preferences can be measured. Plus, many kinds of recommendations tend to be better for one genre than for another—rock recommendations aren’t useful for classical and vice versa. In the old hit-driven model, one size fit all. In this new model, where niches and sub-niches are abundant, there’s a need for specialization.

2) Even where a service can provide good suggestions and encourage you to explore a genre new to you, the advice often stays the same over time. Come back a month later, after you’ve heard all the recommendations, and they’re probably pretty much as they were. … You’ll need another kind of filter to take you to your next stop on your exploration.

Points worthy of further research ;-)