Archive for the ‘p2p’ Category

Papers at RecSys 2008

Monday, October 27th, 2008

A lot of interesting papers have been discussed during RecSys conference. Here are a few list of interesting ones.

(more…)

Are you downloading copyrighted files? You are out!

Tuesday, November 27th, 2007

A proposal(*) backed by French president Nicolas Sarkozy: those people who download copyrighted material (using, for example, P2P networks) will get disconnected from the Internet by their ISPs.

“Like it or not, the total cost of Internet service will rise because French ISPs have signed on to the plan. They will now spend time and (tax) money enforcing copyright on their networks via expensive deep packet inspection (DPI) software that will monitor traffic on their networks and look for copyrighted content. Subscribers detected illicitly sharing or downloading copyrighted material will receive warnings, requiring additional administrative overhead. If the behavior continues, then Internet access would be guillotined. Most of this will be carried out by a government-funded independent authority overseen by a judge” (full post).

(*) I do not describe the proposal as being insane just for this reason: “Do not be hectoring or arrogant. Those who disagree with you are not necessarily stupid or insane. Nobody needs to be described as silly: let your analysis show that he is. When you express opinions, do not simply make assertions. The aim is not just to tell readers what you think, but to persuade them; if you use arguments, reasoning and evidence, you may succeed. Go easy on the oughts and shoulds. ” You may well understand to which category Mr. Sarkozy qualifies to belong.

Trust propagation and the origins of PageRank

Wednesday, February 7th, 2007

Since Matteo’s seminar about neighbourhood maps a couple of months ago I’ve been wondering whether PageRank could be applied to a local view of a social network to calculate trust scores. (This might be useful in the new darknet version of Freenet, for example.) One of the Freenet developers pointed out that PageRank is patented, but Wikipedia showed that using eigenvector centrality to calculate the importance of nodes isn’t a new idea.

After following a few references it turns out that the idea of propagating trust/status/etc across a graph dates back to at least 1953 [1]. Pinski and Narin [2] suggested normalising each node’s output by dividing the output on each outgoing edge by the node’s outdegree. Geller [3] pointed out that their model was equivalent to a Markov chain: the scores assigned to the nodes followed the Markov chain’s stationary distribution. In other words, propagating trust/status/etc with normalisation at each node is equivalent to taking random walks from random starting points and counting how many times you end up at each node.

The only difference between Geller’s model and PageRank is the damping factor: in PageRank you continue your random walk with probability d or jump to a random node with probability 1-d. (Incidentally, when the algorithm’s described this way rather than in terms of a transition matrix, it’s easy to see how you could implement it on a web spider.)

[1] L. Katz, “A new status index derived from sociometric analysis,” Psychometrika 18 (1953), pp. 39-43. (PDF)
[2] G. Pinski and F. Narin, “Citation influence for journal aggregates of scientific publications: Theory, with application to the literature of physics,” Information Processing and Management 12 (1976), pp. 297-312. (PDF)
[3] N. Geller, “On the citation influence method of Pinski and Narin,” Information Processing and Management 14 (1978), pp. 93-95. (PDF)