Evaluating our smart algorithms

Many of us design smart algorithms and are expected to evaluate them with well-designed user studies.

Problem: Those studies are expensive, so we end up trading off sample size against time and monetary cost.

Proposal (by PARC researchers): Collect user measurements from micro-task markets such as Amazon’s Mechanical Turk. Here is their blog post, which comments on their upcoming HCI paper titled “Crowdsourcing User Studies With Mechanical Turk”.

Note: Often, users are irrational.

