Evaluating our smart algorithms

Many of us design smart algorithms, and we are often expected to evaluate them by carrying out well-designed user studies.

Problem: such studies are expensive, so we end up trading off sample size against time requirements and monetary cost.

Proposal (by PARC researchers): collect user measurements from micro-task markets such as Amazon’s Mechanical Turk. Here is their blog post (which comments on their upcoming HCI paper titled “Crowdsourcing User Studies With Mechanical Turk”).
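
To make the idea concrete, here is a minimal sketch of how one might post an evaluation task to Mechanical Turk using boto3. This is not the PARC authors’ setup; the title, reward, assignment count, and the external question URL are all placeholders.

```python
# Minimal sketch: posting an evaluation task (HIT) to Mechanical Turk via boto3.
# Title, reward, and the external question URL below are hypothetical placeholders.
import boto3

mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    # Point at the sandbox while testing; drop endpoint_url for production.
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

# ExternalQuestion: workers see our own evaluation page inside an iframe.
question_xml = """
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://example.org/eval-task</ExternalURL>
  <FrameHeight>600</FrameHeight>
</ExternalQuestion>
"""

hit = mturk.create_hit(
    Title="Rate the output of a recommendation algorithm",
    Description="Compare two ranked lists and pick the more useful one.",
    Keywords="evaluation, ranking, user study",
    Reward="0.05",                    # dollars, passed as a string
    MaxAssignments=30,                # independent judgments per item
    AssignmentDurationInSeconds=600,
    LifetimeInSeconds=3 * 24 * 3600,
    Question=question_xml,
)
print("HIT id:", hit["HIT"]["HITId"])
```

The micro-payments and large worker pool are what make the sample-size/cost trade-off mentioned above much less painful than a lab study.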

Note: Often, users are irrational.
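
One standard precaution against noisy or careless respondents (my addition, not something the post prescribes) is to collect several redundant judgments per item and aggregate them, for example by majority vote. A small sketch with made-up worker answers:

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common answer among redundant worker judgments,
    together with its agreement rate."""
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / len(answers)

# Hypothetical judgments for one item, from five different workers.
judgments = ["list_A", "list_A", "list_B", "list_A", "list_B"]
label, agreement = majority_vote(judgments)
print(label, agreement)  # list_A 0.6 -- low agreement flags a noisy item
```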

2 Responses to “Evaluating our smart algorithms”

  1. [...] to come to you. A new competition is adding its name to the Netflix prize, this previous post on evaluating algorithms with the masses, and the above competition: semantihacker is offering $1 million to anyone who can put their [...]

  2. [...] of the Prisoner’s dilemma, which was very interesting; the only question that arises, as Daniele mentioned, is whether this is appropriate when users may very well behave [...]