Archive for October, 2009

ACM RecSys 2009 Keynote (in 140 character chunks)

Friday, October 23rd, 2009

The third ACM RecSys conference started today in New York; unfortunately I could not make it. However, a number of people whom I follow on Twitter are there (@xamat, @danielequercia, @barrysmyth) and are tweeting away as the conference unfolds. You can follow the stream of #recsys09 tweets here. Although I’m sure that many details do not make it into the 140-character tweets, they provide a real-time snapshot of what is going on at the conference.

For example, the first keynote has just ended. Francisco Martin, Founder/CEO of Strands, gave a talk about the “Top 10 Lessons Learned Developing, Deploying, and Operating Real-World Recommender Systems.” Here’s the Twitter summary (note: copy/pasted and lightly edited to merge similar tweets).

Lesson 1 – Make sure a recommender is really needed! Do you have lots of recommendable items? Many diverse customers?… also think Return-on-Investment… a more sophisticated recommender may not deliver a better ROI.

Lesson 2 – Make sure the recommendations make strategic sense. Is the best recommendation for the customer also the best for the business? What is the difference between a good recommendation and a useful one? Obvious recommendations may not be useful; risky recs may deliver better long-term value.

Lesson 3 – Choose the right partner! Select the right rec vendor vs hiring some #recsys09 students. If you are a big company, the best you can do is to organize a contest.

Lesson 4 – Forget about cold-start problems (!) …. just be creative. The internet has the data you need (somewhere…)

Lesson 5 – Get the right balance between data and algorithms. 70% of the success of a #recsys is on the data, the other 30% on the algorithm

Lesson 6 – Finding correlated items is easy, but deciding what, how, and when to present to the user is hard… or don’t just recommend for the sake of it. Remember user attention is a scarce and valuable resource. Use it wisely! … don’t make recommendations to a customer who is just about to pay for items at the checkout! The user interface should get at least 50% of your attention.

Lesson 7 – Don’t waste time computing nearest neighbours (use social connections)… just mine the social graph. Might miss useful connections??

Lesson 8 – Don’t wait to scale

Lesson 9 – Choose the right feedback mechanism. Stars vs thumbs …. the YouTube problem. More research on implicit and other feedback mechanisms is needed. The perfect rating system is no rating system! … focus on the interface. Seems to me this is one of the gaps in current research… algorithms > data > interface

Lesson 10 – Measure Everything! … business control and analytics is a big opportunity here.

Keynote Takeaway – Think about application context; Focus on interface as much as algs; Be creative with startup data. … the UI needs to get the lion’s share of the effort (50%) compared to algorithms (5%), knowledge (20%), analytics (25%)
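
To make Lesson 6's claim that "finding correlated items is easy" concrete, here is a minimal sketch of item-to-item similarity computed from co-occurrence counts. It is not taken from the keynote or from Strands; the function name `item_similarities` and the sample listening histories are made up for illustration, and cosine similarity is just one common choice of measure.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarities(baskets):
    """Cosine similarity between items, based on how often they co-occur."""
    item_counts = defaultdict(int)  # number of baskets containing each item
    pair_counts = defaultdict(int)  # number of baskets containing both items
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate and fix the pair order
        for item in items:
            item_counts[item] += 1
        for a, b in combinations(items, 2):
            pair_counts[(a, b)] += 1
    return {
        (a, b): count / sqrt(item_counts[a] * item_counts[b])
        for (a, b), count in pair_counts.items()
    }


# Hypothetical listening histories, one list per user (illustration only).
baskets = [
    ["jazz", "blues", "soul"],
    ["jazz", "blues"],
    ["rock", "metal"],
    ["jazz", "soul"],
]
sims = item_similarities(baskets)
for pair, score in sorted(sims.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

The few lines above are the easy part. Deciding which of these pairs to show, where, and when is the hard part the lesson points to, and nothing in this sketch addresses it.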

Code and other laws of urban space

Friday, October 23rd, 2009

Mobile phones offer more radical possibilities than ‘PC + internet’ in terms of bringing information into the real spatial environment, argues The City Project – which means architects and urban planners need to start engaging with the way space is experienced and manipulated through mobile software. Map-tagging and location-tracking could help planners to understand how space is used, reducing the tension between the ideal space of architecture and the real space of inhabitation.

So if the prophets of user-generated-everything need to learn that space matters, do those who dream of clean, Cartesian space also need to learn that use matters? No doubt – but to reduce location-aware software to a feedback channel from users to developers (in either sense), or to see it as another element in an architectural programme, would be to miss its truly radical potential, which would lie – if sufficiently open platforms could be developed – in enabling the unplanned, disorganised and ever-changing use of space, without architects.

The fallacy of web 2.0 utopians – motivational infoporn

Saturday, October 17th, 2009

Chris Anderson said that the future of business is free. We are still waiting for this revolution (email Chris for a detailed revolutionary plan, he’ll be happy to answer).

Now comes Clay Shirky with the next revolution: “innovation can happen everywhere”, he said yesterday (1:39′ of his talk). OK, let’s buy a few more copies of Shirky’s books and wait for the next Microsoft or Google to come from Tanzania. Meanwhile, let me tell you why I think speeches on web 2.0 revolutions are motivational infoporn.


Notes on Engaging Data

Tuesday, October 13th, 2009

The Engaging Data conference went very well. Thanks to Caitlin and Francisca for their fantastic job! A few notes I’ve quickly put together:

Day 1

  • Peter Hirshberg gave a great keynote talk! He introduced interesting applications using real-time data: NYTE (use of phone calls from NY to cities around the world), City Sense (tracking where people are right now), and City Sourced (taking geocoded pictures and uploading them directly to an official who can do something about it). Great quotes in his talk:
    • “with every augmentation comes amputations”. Along those lines, LBS turns us to staring at the screen instead of what’s around us.
    • “Privacy is often eroded one convenience at a time” (by Chris Hughes of City Sourced)
  • A couple of researchers from SkyHook followed. They described how, by aggregating data from GPS, cell phone towers, and wifi networks, they extract:
    • emergence bursts – lots of people come out at once
    • impedance clustering – accidents people want to get around
    • social affinity – large group of similar people

    Interestingly, they find spikes or dips with respect to a baseline level of activity (the normal level of activity in a specific area).

  • Glen Urban of MIT Sloan introduced Ad Morphing. This system matches online ads to individual cognitive style (e.g., deliberative/impulsive, analytical/holistic, verbal/visual). He concluded by introducing the recent migration of Ad Morphing to mobile phones (Concierge). The application would be to serve apps and ads that are useful on a phone’s screen.
  • Deborah Estrin of the Center for Embedded Networked Sensing introduced a few projects.
  • Eric Paulos of CMU gave a beautiful talk that revolved around his research goals: improve science literacy, provide professional scientists with better data, develop new usage models for phones, enable grassroots activism, and foster greater public understanding. From the same research group, Ian Li proposed powerful ways to improve self-awareness of physical activity.
  • Based on mobile phone calling data, Nathan Eagle is studying sex workers in Kenya (with Eduard Sanders), 150 undergrad smokers/recent quitters (with Yuelin Li), and slum inhabitants (30% of people in slums carry mobile phones!). He also touched on a spatial dynamic Bayesian anomaly detection method he developed with Eric Horvitz to answer questions including:
    • How do peoples’ movements and communications change when they get sick?
    • How can regional deviations from normal use be calculated to triangulate the epicenter of disasters (e.g., tsunamis, earthquakes)?

    Great stuff!!!

  • Anmol Madan reported on his cool research on how things (e.g., political ideas, diseases) spread within face-to-face nets. He ran an extensive study in one of the MIT dorms.
  • Michael Siegel of the MIT Sloan School mentioned that Japanese doctors created a system to capture EVERY piece of data in their hospital – every activity by every person (bar-code, RFID, EHR, test data). Check here & here.
  • Michiel Van Meeteren, Ate Poorthuis, and Elenna Dugundji gave an engaging talk on mapping communities in large virtual social networks. They used Twitter data to identify the indie Mac community. They started from a central node and found the communities this node speaks to. The implications of this method are powerful – it may have jeopardized the protests during the recent Iranian elections. Very interesting work!!! Check the last abstract on this page.
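
The idea in the last talk, starting from a central node and expanding to the accounts it speaks to, can be illustrated with a breadth-first "snowball" expansion. This is only a sketch under my own assumptions, not the authors' actual method: the function name `snowball`, the `max_hops` cut-off, and the toy "who talks to whom" graph are all hypothetical.

```python
from collections import deque


def snowball(graph, seed, max_hops):
    """Return {node: hops from seed} for every node within max_hops of seed."""
    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the cut-off
        for neighbour in graph.get(node, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops


# Hypothetical directed graph: account -> accounts it addresses.
graph = {
    "seed": ["a", "b"],
    "a": ["c"],
    "b": ["a"],
    "c": ["d"],
    "d": [],
}
reached = snowball(graph, "seed", max_hops=2)
print(reached)
```

With a two-hop limit the expansion stops before reaching "d". On real Twitter data one would still need a community-detection step on the collected subgraph, which this sketch leaves out.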

Day 2

I was busy having meetings with my lab’s sponsors and followed just the panel session. Surprisingly, during the session, David Lazer pointed out that I’m the most central node in the Twitter networks of the conference participants in win (pic done using NodeXL) and engaging data (pic). Wow! :-)
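
For readers wondering what "most central node" can mean in practice, here is a small sketch of degree centrality, one of the simplest centrality measures. The post does not say which measure David Lazer used, so treat this purely as an illustration; the function name `degree_centrality` is my own (NodeXL is not used here) and the participant names and edges are invented.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Fraction of the other nodes that each node is directly connected to."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbours[a].add(b)
            neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical undirected "follows or mentions" ties between participants.
edges = [("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("bob", "cat")]
centrality = degree_centrality(edges)
most_central = max(centrality, key=centrality.get)
print(most_central, centrality[most_central])
```

Other measures, such as betweenness or eigenvector centrality, can rank the same network differently, so "most central" always depends on the measure chosen.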

New Data

Thursday, October 1st, 2009

  • 30 Resources to Find the Data You Need
  • New Reality Mining Data Available. From Nathan Eagle: “I am currently releasing the full Reality Mining dataset. It’s got loads of additional information – especially related to survey responses (friendships, recent illness, satisfaction, etc). The new ReadMe has a complete description. If you’d like access, just drop me an email. As I’m now involved in other projects, I haven’t had much time to look at this new data – so have at it.”