Archive for the ‘taxonomy’ Category

Intelligence-gathering by sneakernet

Tuesday, January 5th, 2010

A new report by senior US intelligence officers recommends sweeping changes to intelligence-gathering practices in Afghanistan. The two most interesting recommendations:

  • Intelligence work should be divided along geographic, rather than functional, lines. “The alternative – having all analysts study an entire province or region through the lens of a narrow, functional line (e.g. one analyst covers governance, another studies narcotics trafficking, a third looks at insurgent networks, etc) – isn’t working.” (p4)
  • Analysts should aggregate intelligence by regularly travelling to visit those who collect it. “Information essential to the successful conduct of a counterinsurgency is ripe for retrieval, but analysts that remain confined to restricted-access buildings in Kabul or on Bagram and Kandahar Airfields cannot access it.” (p17) The internet is not suitable for this purpose because “vital information piles up in obscure SharePoint sites, inaccessible hard drives, and other digital junkyards.”

The first point interests me because it suggests that problem-solving doesn’t always scale through specialisation, as tends to be assumed in academia: when the flow of information is constricted, a geographically-organised hierarchy of generalists may be more effective than a taxonomically-organised hierarchy of specialists.

The second point bears more directly on mobblog’s research interests (though I’m not suggesting we should design communication systems for the US military): manual aggregation and curation of information are still necessary, even when that information is in digital form. More surprisingly, the oldest method of aggregation – sneakernet – remains the most reliable.

The issues discussed in the report might seem specific to the chaotic and poorly connected environment of Afghanistan, but I want to argue that the fundamental problem – finding relevant information in a shifting sea of circumstances, practices, organisational structures and data formats – exists everywhere, and is not solved by better connectivity, nor by making everything digital.

David Weinberger has suggested that in the digital realm tags will replace taxonomies, and that it will no longer be necessary to separate the organisation of information from its retrieval. The notion of a ‘hierarchy of generalists’ does cast doubt on the usefulness of a priori taxonomies, but the recommendation of manual data collection and curation is directly opposed to Weinberger’s ‘tag soup’ approach.

Does this simply reflect a lack of tools (or, God help us, standards), or is the complexity of real-world information as irreducible to tags as it is to taxonomies? Anyone who’s used Google Images will recognise the difficulty of applying tags to non-textual data. And if the sea never stops shifting, will the extraction of relevant knowledge from information always be a matter of – well – intelligence?

Studying Social Tagging and Folksonomy: A Review and Framework

Tuesday, April 14th, 2009

Paper (PDF) by J. Trant, University of Toronto

Abstract: This paper reviews research into social tagging and folksonomy (as reflected in about 180 sources published through December 2007). Methods of researching the contribution of social tagging and folksonomy are described, and outstanding research questions are presented. This is a new area of research, where theoretical perspectives and relevant research methods are only now being defined. This paper provides a framework for the study of folksonomy, tagging and social tagging systems. Three broad approaches are identified, focusing first, on the folksonomy itself (and the role of tags in indexing and retrieval); secondly, on tagging (and the behaviour of users); and thirdly, on the nature of social tagging systems (as socio-technical frameworks).

Mr Taggy: Searching Tag Taxonomy & Coping with Noise

Monday, March 16th, 2009

“Think of MrTaggy as a cross between a search engine and a recommendation engine: it’s a web browsing guide constructed from social tagging data. … The problem with using social tags is that they contain a lot of noise, because people often use different words to mean the same thing or the same words to mean different things. The TagSearch algorithm is part of our ongoing research to reduce the noise while amplifying the information signal from social tags.” (from here)
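
To make the 'noise' concrete, here is a minimal Python sketch of the first half of the problem, different words meaning the same thing. It is not the TagSearch algorithm, which the quote does not describe. The synonym table, the function names and the sample tags are all invented for illustration.

```python
from collections import Counter

# Hypothetical synonym table, written by hand for this illustration only.
SYNONYMS = {
    "nyc": "new-york",
    "newyork": "new-york",
    "new-york-city": "new-york",
}


def normalise(tag: str) -> str:
    """Lowercase a tag, unify its separators, then map known synonyms to one form."""
    words = tag.lower().replace("_", " ").replace("-", " ").split()
    cleaned = "-".join(words)
    return SYNONYMS.get(cleaned, cleaned)


def count_tags(tags):
    """Count tags after normalisation, so that spelling variants are merged."""
    return Counter(normalise(t) for t in tags)


# Made-up sample of raw tags as different people might type them.
raw = ["NYC", "new york", "New_York", "newyork", "jaguar", "Jaguar"]
counts = count_tags(raw)
```

Here the four spellings of New York collapse into a single tag, which removes some of the noise. The two 'jaguar' tags also merge, but nothing in the sketch can tell whether each tagger meant the cat or the car: the same word meaning different things is the harder half of the problem, and string normalisation alone does not touch it.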