way of placing nodes in a k-dimensional layout, so that the closer two nodes are in the layout, the more likely they are to be linked. The algorithm has to be decentralized, and I accomplished this by requiring each node to exchange information only with its “friends”. Once we have such a layout, we can use it in various settings. The most obvious application is routing in “darknets” such as Freenet (networks where connections exist only between trusted pairs of friends): we use a node's position in the layout as its “address”, and we try to reach the destination node by forwarding the query, at each hop, to the friend closest to that address. This way we obtain efficient anonymous routing. Other applications relate to network analysis (e.g., we can recommend to a user some content that is ‘affine’, that is, closer on the network map), or to trust metrics (e.g., finding paths in social networks that connect nodes in order to assess trust). I did this by adapting an algorithm for graph drawing based on spectral graph theory to decentralized systems. Results are very encouraging. I think it could be interesting to see if the same idea can be applied to opportunistic networks, in order to analyze social structure and/or do routing.
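To make the routing idea concrete, here is a minimal sketch of greedy forwarding over such a layout. It assumes the positions have already been computed, and the names (`greedy_route`, `positions`, `friends`, `max_hops`) are made up for this illustration; a real darknet would need backtracking and more careful loop handling than this.

```python
import math

def greedy_route(positions, friends, source, target, max_hops=50):
    """Forward a query hop by hop to the unvisited friend closest to the target's position.

    positions: node -> tuple of k coordinates (the node's "address")
    friends:   node -> list of trusted neighbours
    Returns the path as a list of nodes, or None if the query gets stuck.
    """
    path = [source]
    visited = {source}
    current = source
    while current != target:
        if len(path) > max_hops:
            return None  # give up: too many hops
        candidates = [n for n in friends[current] if n not in visited]
        if not candidates:
            return None  # dead end: no friend left to try
        current = min(candidates,
                      key=lambda n: math.dist(positions[n], positions[target]))
        visited.add(current)
        path.append(current)
    return path

# Toy 2-dimensional layout (hypothetical data).
positions = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (1, 1), "e": (3, 0), "x": (5, 5)}
friends = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "e"],
           "d": ["a"], "e": ["c"], "x": []}

print(greedy_route(positions, friends, "a", "e"))  # ['a', 'b', 'c', 'e']
```

Each node only needs to know its own friends and their addresses, which is what makes the scheme fit a friend-to-friend network.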
Archive for June, 2007
The introduction to The Long Tail by Chris Anderson is available online! Anderson describes how recommender systems help you ‘navigate’ the long tail. He then lists two (context-related and time-related) problems of music recommender systems:
1) They tend to run out of suggestions pretty quickly as you dig deeper into a niche, where there may be few other people whose taste and preferences can be measured. Plus, many kinds of recommendations tend to be better for one genre than for another; rock recommendations aren’t useful for classical and vice versa. In the old hit-driven model, one size fit all. In this new model, where niches and sub-niches are abundant, there’s a need for specialization.
2) Even where a service can provide good suggestions and encourage you to explore a genre new to you, the advice often stays the same over time. Come back a month later, after you’ve heard all the recommendations, and they’re probably pretty much as they were. … You’ll need another kind of filter to take you to your next stop on your exploration.
Points worthy of further research
A first step towards participatory sensing (pdf). The Power of Information, a report commissioned by the Cabinet Office, aimed to find out more about Web 2.0 tools and communities and to see how the government can get involved to help Britons make the most of this “new pattern of information creation and use”. (bbc news)
Recently, I submitted my first paper to RecSys 07, “Private Distributed Collaborative Filtering using Estimated Concordance Measures.” Even though it is not particularly about mobile stuff, here’s a quick run-through of the main ideas:
Collaborative filtering uses a community’s behavior within a certain domain (movies, music) to reduce the amount of information that each individual needs to look through to find items of interest. It is the dominant method behind recommender systems (such as Amazon’s), and is based on a simple idea: people who shared interests in the past will most likely share likes and dislikes in the future. So, to predict how much I will like a certain item, the system first compares my rating history to that of every other user to produce similarity measures, and then uses these similarity measures to compute a weighted average of how much they enjoyed the item in question.
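As a rough illustration of that two-step recipe, here is a minimal sketch in Python. The function names and toy ratings are invented for this example, and it assumes the common mean-centred form of the weighted average (so that negatively correlated users still contribute sensibly); actual systems differ in the details.

```python
import math

def pearson(a, b):
    """Pearson correlation over the items both users rated (0.0 if undefined)."""
    common = [i for i in a if i in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = math.sqrt(sum((a[i] - mean_a) ** 2 for i in common) *
                    sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

def predict(user, others, item):
    """Similarity-weighted average of the neighbours' mean-centred ratings of `item`."""
    user_mean = sum(user.values()) / len(user)
    num = den = 0.0
    for other in others:
        if item not in other:
            continue  # this neighbour has no opinion on the item
        w = pearson(user, other)
        other_mean = sum(other.values()) / len(other)
        num += w * (other[item] - other_mean)
        den += abs(w)
    return user_mean if den == 0 else user_mean + num / den

# Toy profiles: item -> rating on a 1-5 scale (hypothetical data).
me = {"A": 5, "B": 3, "C": 1}
alice = {"A": 5, "B": 3, "C": 1, "D": 5}   # agrees with me on everything
bob = {"A": 1, "B": 3, "C": 5, "D": 1}     # my exact opposite

print(predict(me, [alice, bob], "D"))
```

Note that `pearson` needs both full rating dictionaries in one place, which is exactly the disclosure problem discussed next.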
The problem is that this method allows for no privacy. Particularly in a distributed environment, where I do not know how much to trust unknown neighbors, I do not want to have to share my entire rating history (i.e. my profile) with them just to find a similarity measure. This discourages cooperation, which is harmful to collaborative filtering. The standard similarity measures (the most famous being the Pearson correlation coefficient) simply cannot be computed without full profile disclosure.
Therefore, in the paper we proposed a new similarity measure, based on concordance. You and I rate a movie concordantly if we both rate it above our respective mean ratings, or both rate it below (in other words, we agree about whether to give it a thumbs up or a thumbs down). If you hated the movie (and rated it below your mean), and I loved it (rating it above my mean), then we disagree: the ratings are discordant. If one of us has no opinion about the movie, we just say it is tied. The new similarity measure is derived from the number of concordant (C), discordant (D), and tied (T) pairs between our rating sets, and the size of the set N. It works just as effectively (in terms of generating recommendations) as the Pearson correlation coefficient.
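Counting C, D, and T is easy to sketch. The post does not spell out the exact formula, so the normalisation below, (C - D) / (N - T), is only an assumption made for this example, as is the choice to treat a rating exactly at the mean as tied; all names are hypothetical.

```python
def sign_vs_mean(profile, item):
    """+1 if rated above the user's mean, -1 if below, 0 if unrated or exactly at the mean."""
    if item not in profile:
        return 0
    mean = sum(profile.values()) / len(profile)
    rating = profile[item]
    return (rating > mean) - (rating < mean)

def concordance_counts(a, b, items):
    """Return (C, D, T): concordant, discordant and tied items between two profiles."""
    c = d = t = 0
    for item in items:
        sa, sb = sign_vs_mean(a, item), sign_vs_mean(b, item)
        if sa == 0 or sb == 0:
            t += 1      # at least one of us has no opinion
        elif sa == sb:
            c += 1      # both thumbs up, or both thumbs down
        else:
            d += 1      # we disagree
    return c, d, t

def concordance_similarity(a, b, items):
    """ASSUMED normalisation (C - D) / (N - T), ranging from -1 to 1."""
    c, d, t = concordance_counts(a, b, items)
    n = len(items)
    return (c - d) / (n - t) if n != t else 0.0

items = ["A", "B", "C", "D"]
me = {"A": 5, "B": 1, "C": 4}             # mean 3.33: A up, B down, C up, D unrated
you = {"A": 4, "B": 2, "C": 1, "D": 5}    # mean 3.0:  A up, B down, C down, D up

print(concordance_counts(me, you, items))  # (2, 1, 1)
```

Only the sign of each rating relative to the user's own mean matters here, which is what makes the measure amenable to the estimation trick described next.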
So how can this be used to include privacy in collaborative filtering? If you and I share a common randomly-generated set, and report to each other the number of C, D, and T pairs we each have with that random set, these values can be used to place bounds on the actual numbers of C, D, and T pairs we have with each other. We can thus estimate our similarity without ever sharing any profile-specific information, only abstracted profile information derived from a comparison with a random set. Privacy is not breached, and, along with an incremental learning technique (future work) for evolving the similarity measure between recommenders, we can start collaborating with each other!
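The exchange can be sketched as follows. This is not the estimator from the paper: the bounds below are simple counting bounds worked out for this illustration (valid, but not necessarily tight), and they assume the shared random set gives every item a definite thumbs up or thumbs down. All function names are hypothetical.

```python
import random

def signs(profile, items):
    """Per item: +1 above the user's mean, -1 below, 0 if unrated or at the mean."""
    mean = sum(profile.values()) / len(profile)
    return [0 if i not in profile else (profile[i] > mean) - (profile[i] < mean)
            for i in items]

def counts(sa, sb):
    """(C, D, T) between two sign vectors of equal length."""
    c = sum(1 for x, y in zip(sa, sb) if x and y and x == y)
    d = sum(1 for x, y in zip(sa, sb) if x and y and x != y)
    return c, d, len(sa) - c - d

def shared_random_signs(n_items, seed):
    """Random thumbs-up/down vector that both peers regenerate from a shared seed."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(n_items)]

def bounds(mine, theirs):
    """Bound our true C, D, T using only the two (C, D, T) triples reported
    against the random set. Agreeing with the random set on the same item
    implies agreeing with each other, hence the min() terms."""
    (ca, da, ta), (cb, db, tb) = mine, theirs
    n = ca + da + ta
    c_max = min(ca, cb) + min(da, db)   # both agree, or both disagree, with the random set
    d_max = min(ca, db) + min(da, cb)   # one agrees with it, the other disagrees
    t_range = (max(ta, tb), min(n, ta + tb))
    return c_max, d_max, t_range

items = ["A", "B", "C", "D", "E", "F"]
alice = {"A": 5, "B": 1, "C": 4, "E": 2}
bob = {"A": 4, "B": 2, "C": 1, "D": 5, "F": 3}

r = shared_random_signs(len(items), seed=42)
report_alice = counts(signs(alice, items), r)   # the only thing Alice discloses
report_bob = counts(signs(bob, items), r)       # the only thing Bob discloses
print(bounds(report_alice, report_bob))
```

Each peer sends three integers rather than a rating history; how tightly those integers pin down the real similarity is the question the paper's evaluation addresses.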
If you’re interested in the details (and the evaluation), I’ll post the paper on my web site soon (when I hear back from RecSys!)
Nokia enters Social Sharing World with Mosh. MOSH is a content sharing site where community members upload, distribute and manage content to be viewed and enjoyed on mobile devices. With MOSH, anything from applications like mobile games to videos, blogs, songs or photos is now accessible and distributable on your mobile device.
There are three key elements to MOSH:
1. A website
2. A mobile website
3. An application for mobile devices (available for download on Nokia devices only)
The website is your main source for accessing the wide range of content available through MOSH. It is here where you can create your profile, upload content, manage your collections and specify which selects to send to your mobile device as mobile feeds.
At USENIX Security, a paper will show how three consumer devices leak personal information.
We analyze three new consumer electronic gadgets in order to gauge the privacy and security trends in mass-market UbiComp devices. Our study of the Slingbox Pro uncovers a new information leakage vector for encrypted streaming multimedia. By exploiting properties of variable bitrate encoding schemes, we show that a passive adversary can determine with high probability the movie that a user is watching via her Slingbox, even when the Slingbox uses encryption. We experimentally evaluated our method against a database of over 100 hours of network traces for 26 distinct movies.
Despite an opportunity to provide significantly more location privacy than existing devices, like RFIDs, we find that an attacker can trivially exploit the Nike+iPod Sport Kit’s design to track users; we demonstrate this with a GoogleMaps-based distributed surveillance system. We also uncover security issues with the way Microsoft Zunes manage their social relationships.
We show how these products’ designers could have significantly raised the bar against some of our attacks. We also use some of our attacks to motivate fundamental security and privacy challenges for future UbiComp devices.