Using Data Mining and Recommender Systems to Scale up the Requirements Process

Paper by Jane Cleland-Huang and Bamshad Mobasher


Ultra-large-scale (ULS) software projects involve hundreds or even thousands of stakeholders. Their requirements may not be fully knowable upfront; instead, they emerge over time as stakeholders interact with the system. As a result, the requirements process needs to scale to a large number of stakeholders and be conducted in increments so that it can respond quickly to changing needs.

Existing requirements engineering methods are not designed to scale for ULS projects:

  • Waterfall and iterative approaches assume requirements are knowable upfront and are elicited during the early phases of the project
  • Agile processes are suited to small-scale projects
  • Stakeholder identification methods only identify a subset of stakeholders

This position paper makes two proposals:

  • Using data-mining techniques (i.e., unsupervised clustering) to identify themes in stakeholders’ statements of needs
  • Using recommender systems to facilitate broad stakeholder participation in the requirements elicitation and prioritisation process
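
To make the first proposal concrete, here is a minimal sketch of clustering stakeholder statements into themes. It is an illustration only, not the authors' implementation: the TF-IDF weighting, the farthest-first seeding, the plain K-means loop, the sample statements and every function name are assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase, split on whitespace, drop very short words.
    return [w for w in text.lower().split() if len(w) > 2]


def normalize(vec):
    # Scale a sparse vector to unit length so that a dot product equals cosine similarity.
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else dict(vec)


def tfidf_vectors(docs):
    # Build unit-length TF-IDF vectors with a smoothed inverse document frequency.
    tokenized = [tokenize(d) for d in docs]
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    return [
        normalize({t: c * (math.log((1 + n) / (1 + df[t])) + 1)
                   for t, c in Counter(toks).items()})
        for toks in tokenized
    ]


def cosine(a, b):
    # Both inputs are unit length, so the dot product is the cosine similarity.
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def centroid(members):
    # Sum the member vectors and rescale to unit length.
    total = Counter()
    for vec in members:
        for t, w in vec.items():
            total[t] += w
    return normalize(total)


def pick_seeds(vectors, k):
    # Deterministic farthest-first seeding: start from the first statement, then
    # repeatedly add the statement least similar to the seeds chosen so far.
    seeds = [0]
    while len(seeds) < k:
        candidates = [i for i in range(len(vectors)) if i not in seeds]
        seeds.append(min(
            candidates,
            key=lambda i: max(cosine(vectors[i], vectors[s]) for s in seeds)))
    return [vectors[i] for i in seeds]


def kmeans(vectors, k, iterations=10):
    # Plain K-means with cosine similarity; returns one cluster label per statement.
    centroids = pick_seeds(vectors, k)
    labels = []
    for _ in range(iterations):
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = centroid(members)
    return labels


# Hypothetical stakeholder statements of need.
statements = [
    "users must login with password authentication",
    "password reset via email for login",
    "export reports to pdf format",
    "generate monthly reports in pdf",
]
labels = kmeans(tfidf_vectors(statements), k=2)
print(labels)
```

In this toy data the two login-related statements share a label and the two reporting statements share the other. Real statements of need are far noisier, which is presumably why cluster quality becomes a concern at scale.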

Early evaluations show promise in the proposals:

  • Clustering algorithms (e.g., bisecting, K-means) generated reasonably cohesive requirements clusters. However, a significant number of clusters contained requirements that were only loosely related. The Probabilistic Latent Semantic Analysis (PLSA) method was then applied, and early results showed an improvement in cluster quality.
  • Their prototype recommender systems generated discussion forums that were more cohesive than ad-hoc ones created by users, and were able to recommend a significant number of relevant forums to stakeholders.
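
As a rough illustration of how forums might be recommended to a stakeholder, the sketch below ranks forums by the textual similarity between a stakeholder's stated needs and the posts in each forum. This is a simple content-based approach chosen for brevity; the post does not say how the prototype computes its recommendations, and the forum names, sample text, `top_n` and `min_score` parameters are all hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    # Term counts for a piece of text; very short words are dropped (assumption).
    return Counter(w for w in text.lower().split() if len(w) > 2)


def cosine(a, b):
    # Cosine similarity between two term-count vectors.
    dot = sum(c * b[t] for t, c in a.items())
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0


def recommend_forums(stakeholder_needs, forums, top_n=2, min_score=0.1):
    # Score every forum against the stakeholder's combined needs, then return
    # the names of the best-matching forums that clear the similarity threshold.
    profile = bag_of_words(" ".join(stakeholder_needs))
    scored = [(cosine(profile, bag_of_words(" ".join(posts))), name)
              for name, posts in forums.items()]
    ranked = sorted((s for s in scored if s[0] >= min_score), reverse=True)
    return [name for _, name in ranked[:top_n]]


# Hypothetical forums, each holding the statements already posted to it.
forums = {
    "authentication": ["users must login with password",
                       "password reset via email"],
    "reporting": ["export reports to pdf",
                  "monthly reports for managers"],
    "performance": ["pages load within two seconds"],
}
needs = ["managers need weekly reports",
         "reports should export to spreadsheet"]

recommended = recommend_forums(needs, forums)
print(recommended)
```

The threshold keeps unrelated forums out of the list even when fewer than `top_n` forums match, so a stakeholder is not pointed at discussions that share no vocabulary with their needs.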

The proposed open elicitation framework introduces some challenges:

  • unsupervised group collaborations
  • decentralised prioritisation of requirements
  • malicious stakeholders manipulating the system for their personal gains

One Response to “Using Data Mining and Recommender Systems to Scale up the Requirements Process”

  1. Soo Ling, that’s a great post. It fits quite well with what you have been doing on using social networks to prioritize stakeholders of big software projects ;-) It’s also exciting that our ICSE paper (about which we’ll shortly blog) is starting to look good!