Stackoverflow.com

There is a good article on the principles driving the development of stackoverflow.com, a site where programmers get help with their coding problems on ReadWriteWeb.

I was particularly struck by the design points where Spolsky highlights the frustration created wrong answers and obsolete results.

I can remember when I was able to circumnavigate the web through a search engine for the topic of history of photography. It was that small. I could see everything there was to see about history of photography online in a week, a week of drudgery wading through duplicate results page after duplicate results page, until I had made sure I had seen everything about my topic. Although filled with a fair amount of junk and duplicates, I was still able to find a single web page if it contained sufficiently unique keywords, until about a year before Google emerged, I had relied on AltaVista to take me back to a web page in one go, when I could not remember where I had found a code solution on some obscure personal page, for example. Then the search engines began to fail me, and single pages I had found before became nearly impossible to find, but eventually, search engine technology improved and with Google, you could find that one blog page with the coding. That was one the solution to the problem of finding things.

Spolsky is right to observe the problem now is that search is failing to distinguish between correct and incorrect answers; between current and obsolete answers to technical questions.

When I first started programming using Microsoft Visual C++ (I was just a dabbler), I had a question about how to render bitmap graphics. I turned to the library of articles and code intended to help developers. I was happy when search quickly turned up an article on how to introduce bitmaps into your application. After an hour or two of reading, it slowly dawned on me the author was not talking about what I was familiar with, Microsoft Foundation Class applications. I was seeing unfamiliar code and unfamiliar techniques. I glanced up at the date. The article was from the mid 1990s. It was about coding C under Windows before MFC was introduced. The first, supposedly most relevant, documents search had brought up from MSDN was completely obsolete and about coding without an application framework. I had wasted hours reading the wrong articles.

Stackoverflow.com is an example of a great site. It is well designed, the developers learned the lessons of the last fifteen years of web technology and applied them. It is clean, beautifully presented and well organized site. I have to admit they did right what I failed to do with phphelp.com, which started by envisioning many of the same goals. They had to courage to go ahead with "soft security," collaborative editing, and content surfacing and valuing through a user voting system. Of course, with the volume of content and edits, such tools are necessary. What two humans could watch and police such a flow of content while doing their day job? User contributed and curated content is the only rational answer.

(By the way, it would probably be better to describe their principles as being informed by behavioral economics or an evolutionary branch of the field, than anthropology or social psychology, I feel the way people use voting systems to surface content, how "soft" social engineering strategies are employed on wikis, etc. to be close to the phenomena studied by behavioral economics, not just financial choices.)

Labels: , , , , , ,

SpeakUp: A Transcript Markup Language

What is SpeakUp?

A simple text markup language for transcripts of moving pictures or video including a markup language for annotation.

Overview

When the Folkstreams project required a way for filmmakers and academic contributors to create and maintain transcripts for films archived and presented through the Folkstreams website, I decided a simple text markup language would be the best way to store and edit transcripts.

A transcript markup language defines a series of conventions for formatting text (like wiki text) that is translated into HTML for display. SpeakUp was designed to contain as much content as possible and preserve meaning for possible later conversion into XML or database form.

Speakup is implemented as a module extending the PEAR Text_Wiki library text translation module and is a requirement for use.

Although development and documentation of Speakup is not complete, it is in use on the Folkstreams website.

Speakup, including all markup, code and documentation is open source and released under a GPL license. I apologize for the brevity of this document, but the best way to learn SpeakUp is to download the package and experiment with it. Download.

Some Background

Some background on why transcripts are important. As the Folkstreams project was developed, project director Tom Davenport and developer Steve Knoblock, in a series of discussions, arrived at the conclusion that transcripts are essential to searching, finding and understanding films online. Two points emerged: that transcripts are a rich source of indexable text that help make media searchable and that more importantly, transcripts are a rich source of conversation and debate.

Frequently notes are more informative and interesting than the work they annotate. We discovered this was true for film transcripts (see Sadobabies for an example of a conversation going on in the notes about the nature of folklore). Although there are sophisticated means to capture the dialog of a moving picture and render it to text, these transcripts are inadequate. They lack annotation. They lack expressive quality of a transcript edited by a knowledgeable person. They are in a sense, a travesty, like an OCR'ed copy of Dickens left uncorrected.

Labels: , , , ,

Simple Software: Is it viable?

I love the idea simple software. Of software without cruft and bloat. When I create an application I want to make it simple, but that is not the way of the world. Most applications end up being complex. I have not entirely given up on the idea of fighting creeping featurism, featuritis and bloat, but it occurred to me that it may be a losing battle, after switching from SimplePHPBlog to Wordpress (not for this site) and Blogger. It seems that most projects to create plain vanilla or simple versions of software with a reduced feature set nearly always fail to gain popularity. If you look at Windows, it's sold on features, when you look at products in the store, they are sold on features. When you think of the natural world, it seems true that:

All significantly interesting things will necessarily become complex and paradoxical given enough time.

Labels: , , ,