All the Web's a Stage

Timothy Lee has an excellent post at Freedom To Tinker. Here is some of it:

[T]alking about "free riding" as a problem the Wikipedia community needs to solve doesn't make any sense. The overwhelming majority of Wikipedia users "free ride," and far from being a drag on Wikipedia's growth, this large audience acts as a powerful motivator for continued contribution to the site. People like to contribute to an encyclopedia with a large readership; indeed, the enormous number of "free-riders"—a.k.a. users—is one of the most appealing things about being a Wikipedia editor.

This is more than a semantic point. Unfortunately, the "free riding" frame is one of the most common ways people discuss the economics of online content creation, and I think it has been an obstacle to clear thinking.

The idea of "free riding" is based on a couple of key 20th-century assumptions that just don't apply to the online world. The first assumption is that the production of content is a net cost that must either be borne by the producer or compensated by consumers. This is obviously true for some categories of content—no one has yet figured out how to peer-produce Hollywood-quality motion pictures, for example—but it's far from universal.
Moreover, the real world abounds in counterexamples. No one loses sleep over the fact that people "free ride" off of watching company softball games, community orchestras, or amateur poetry readings. To the contrary, it's understood that the vast majority of musicians, poets, and athletes find these activities intrinsically enjoyable, and they're grateful to have an audience "free ride" off of their effort.

The same principle applies to Wikipedia. Participating in Wikipedia is a net positive experience for both readers and editors. We don't need to "solve" the free rider problem because there are more than enough people out there for whom the act of contributing is its own reward.

You can see how we got into this problem. Britannica is an encyclopedia and it is an economic enterprise. Wikipedia ends in pedia so it must be an economic enterprise too. But it is not, and we should look elsewhere for our analogies. Once we stop thinking of Wikipedia in economic terms the supposed paradox disappears. Small wonder that, as Lee says, economists are the ones who have the hardest time understanding it.

Amateurs have performed on other world-stages before now. Sports provides prominent examples: rugby union, Wimbledon and the Olympics were all amateur-only for many years. What happened, of course, is that although these stages were not economic for their participants, the hosts ended up with a lot of money once the world of television and advertising intruded. And once the hosts got money the performers started acting like economic agents too and demanded their share. Giving an amateur performance on a non-profit stage is not a paradox, but giving an amateur performance on a profitable stage is being a sucker.

Wikipedia, for historical reasons, has so far kept advertising and profit out of the picture. But other "platforms of mass collaboration" have not - Amazon, YouTube, IMDB, Bebo have all shown that the owners of the stage can make a lot of money. And once they do, it's only so long before the actors start to demand a share, and the whole dynamic changes. 

The indiscriminate gushing over the age of mass collaboration has obscured these differences for now, but it won't for long. And then, I would guess, amateur production may go the way of the amateur Olympics. But maybe not - let's hope that non-profit stages stay that way so that the economists' misunderstanding of Wikipedia can continue.

YouTube Becomes Big Business

It's one of those hoary old sayings - it's been around for maybe two whole years now - that while the Geezer Generation of passive consumers watched network TV, the Net Generation of cool participators go on YouTube and do their creative teenage thing. But it's no longer either/or. A few recent milestones highlight how YouTube is changing.


First, Avril Lavigne pipped a homegrown video to be first to 100 million viewers and now "six of the 10 most-watched videos of all time are straight music videos."

Second, CBS has reached an agreement with Google to show full-length TV shows on YouTube.

Third, Tina Fey's Sarah Palin sketches for Saturday Night Live have been watched more times on the Internet than on TV. Says the Associated Press:

There were 10.2 million people watching the season-opening "Saturday Night Live" when Fey first appeared as Palin, with Amy Poehler portraying Hillary Clinton, according to Nielsen Media Research. These days, that's a good-sized audience for prime-time, let alone late-night, TV.
Another 1.2 million people captured the episode on their DVRs and watched within the week. Through the middle of last week, NBC estimated that it had streamed the skit online more than 13 million times. Those are just the numbers NBC can keep track of; the skit was undoubtedly captured and posted or e-mailed many more times.
NBC perfected "widget" technology only a few months ago, allowing video of its material to be captured across the Internet while retaining a tie to the network's Web site. It has aggressively marketed the Fey skits to political and comedy blogs...
...There's also the chance for even more revenue. Only in the past few weeks has NBC Universal perfected the technology to place a movie studio advertisement at the end of the clip it distributes online. Pre-clip advertising would add even more value.


It's not actually YouTube - as the article says, NBC now posts its own videos and removes them from the YouTube site when viewers post there - but these developments show that Internet viewing can be complementary to, not competitive with, mainstream TV. And "going viral" is no longer reserved for amateur guitar players. In fact, in an example of the centripetal web, the Internet now lets US network productions go where they could not go before - you can now watch Fey/Palin in the UK on the Guardian and the BBC web sites.

Whether TV networks will end up hosting their own material a la NBC, or whether there will be more CBS-style Google/network deals that see the networks outsourcing the hosting to YouTube in return for a slice of the advertising money, who knows? But the trend is clear; the confrontation between Internet and TV is coming to an end. The old enemies were perhaps never really at each other's throats, and now they will cohabit happily. And while YouTube will continue to host amateur videos (why not?) it will make money from the music videos and the TV networks as it moves to higher-quality images and longer shows. Google's expanded YouTube advertising initiatives will help these deals along.

The Carr-Benkler wager is looking more and more like a win for Carr.

Money Ruins Everything, but we have to talk about it anyway

In Money Ruins Everything (blog post, complete article), John Quiggin and Dan Hunter look out at the new forms of creative expression introduced by the Internet (blogs, wikis, citizen journalism, and to some extent open-source software) and conclude that today's most important innovations are driven by the collaboration of amateurs with non-economic motives. They give this development progressive political overtones by labelling it "the 'amateur collaborative content' movement" [p216] and explicitly identify it as a non-commercial alternative to the market. I don't know Dan Hunter's other writings but Quiggin, at least, is a social democrat whose views I usually agree with and whose writings I read often and respect, so I don't disagree lightly with him. But the picture they paint is a distorted one so I have to.

[Note to regular readers: you may want to move right along - much of what's here is stuff I've written earlier in other contexts. It's just more of the same, but it needs to be said.]

There are three major things wrong with the paper.

  1. A portion of the activity they describe as non-commercial is in fact commercial.
  2. They exaggerate the rise of the amateur in the Internet age.
  3. They neglect the other side of the coin, which is the spread of commercialism into areas that were previously non-commercial.

I thought I'd talk about all three, but it's taking too long to write and I need to go and cook some dinner as part of my contribution to our household's creative production so I'll just address the first. If I were to say something about the other two it would be

(2) in assessing the rise of the new amateurism they neglect instances of amateur production that existed before the Internet and may be driven out by the rise of the Internet. And

(3) Facebook represents the commercialization of conversation, not the amateurization of collaborative content production.

The distinguishing technological feature of the collaborative web ("web 2.0") is the shift from peer-to-peer networks to a "platform" architecture that is built on top of the lower-level protocols of the Internet.

Wikipedia is a platform: it defines the ways in which you can interact with it, stores the changes you make, and provides security and authentication mechanisms among other things. Facebook is a platform. Amazon is a platform. YouTube is a platform. Blogging takes place on platforms. So the authors are wrong when they say that "As a result of various forces—notably the ascension of the general purpose computer, peer-to-peer technologies, and the internet—all manner of established verities in the content industry are falling." [p215] None of the platforms mentioned above operate in a peer-to-peer manner.

The distinction matters because when content is built on a platform, it is in some important senses owned by the platform-owner or aggregator. Private ownership is present, even if the content (videos, book reviews etc) is explicitly shared by its amateur producer. The suggestion that "Users can modify open source software as they see fit, and can choose whether to make their modifications publicly available, but cannot charge for the use of software derived from an open source program." [p218] is incorrect for open source software itself (it applies only to GPL'd software and not software produced under other open source licenses such as the BSD license, and even then copyright owners - such as MySQL AB, now a part of Sun Microsystems - can and do charge for some versions of the software). More importantly for this paper, it is also misleading when it comes to collaborative content. Chad and Steve cannot sell an individual video produced by an amateur, but they can sell the entire collection of videos lock, stock and barrel to Google for about a third of a billion dollars each; Michael Birch can sell Bebo and all its content (including Billy Bragg's songs) to AOL and pocket $600 million. Now that's commercial. Or you can get a seat at Davos, of course, which now seems to be the Cannes red carpet equivalent for our youthful webby leaders.

Amateur production on a non-commercial platform such as Wikipedia is non-commercial. And it's important. No argument there. But amateur production on a commercial platform is commercial activity. With one party in each transaction motivated by money, it's the sound of one hand shaking. Back in 2003 Tim Bray's dinner companion Robb Beal introduced the phrase "digital sharecropping" to distinguish building software "for any platform that is owned and operated by a company" from building software for the web. But now that dichotomy has faded: the web includes many platforms owned and operated by companies and sharecropping has moved online along with it. Nicholas Carr has been particularly pointed about the movement of digital sharecropping onto the web, and Seth Finkelstein has pointed to several examples including citizen journalism. It's a phrase that should be front and centre of everyone's mind when they see the phrase "networked production".

The distinction can often be seen in who is sharing what. On a non-commercial platform, the amateurs share and the platform owners share as well. At least, my understanding is that on Wikipedia not only the content but also the software and large amounts of data mining derived from it about users and so on is shared publicly. In contrast, on a commercial platform only the amateur material is shared. The contributions of the commercial part of the transaction (Amazon's sales data for example; data on user habits; the software that runs the platform itself) are not shared -- in fact this half of the story is hidden with as much zealousness as the source code of any closed-source company. One of the reasons that "The Long Tail" was such a flop of a book is that the data Chris Anderson relies on is unavailable for anyone else to inspect, and much of it was given to him in filtered form by the content owners.

A second failure is most obviously present in the following paragraph (page 245).

Copyright thus provides incentives to the intermediaries of the content industries—publishers, agents, movie studios, retail stores, etc.— where the processes of moving content from creator to user are expensive or capital-intensive. These “content processes” include the creation of the content, the selection of the content for commercial publication, its production and dissemination, its marketing, and its eventual use. Until recently each of these processes has been too expensive, too difficult,  or too specialized for amateurs to undertake. So until now, highly capitalized intermediaries were necessary to ensure that content moved into society.

Amazon, Google and so on are obviously highly capitalized. More so, in fact, than many small distributors of pre-web content. The web is seeing the movement of content into society consolidated into a handful of "highly capitalized intermediaries": very large and wealthy commercially driven web site owners, including such old-school outfits as News Corporation, the owner of MySpace. It does no good to ignore this trend.

The paper also elides the distinction between amateur and professional motivations, with the temptation to caricature employed work as factory-line production of alienated labour, while amateur work is driven by love. To take a relevant example, what is the overlap between the motivations of John Quiggin (blogger) and John Quiggin (employee)? Is his work for the University of Queensland driven by his "first rule" which is "never to give more than you get?" [p231] And if not, why does he think that software engineers for commercial software companies writing closed-source programs are driven by different motivations? There is a possible argument (nicely made by Geof in a comment on a previous post) about alienation and the relationship between the developer and the code, but this does not cover production in sharecropping platforms. I'm more inclined to go with another comment by Dipper, that participation in many cases can be covered by standard utilitarian calculus.

One argument made by the authors is that the incorporation of money into production would drive out amateur efforts in a blood-donation kind of way. I'm happy to help push someone's car out from a snow drift for free, but I wouldn't do it for a dollar. But there are two groups of people with an incentive to keep money out of the equation: one is the promoters of real community-driven, shared production (the authors' camp, and one I'm quite happy with) and the other is the turbocapitalist platform owners such as Amazon (would you review a book for a nickel? would you trust the review of someone who did?) It is this shared interest in suppressing the role of money for very different reasons that makes me most queasy in contemplating the future of the Internet. We need to be clear in distinguishing public goods from privately owned plantations. Unfortunately, in this paper, John Quiggin and Dan Hunter fail to make the distinction and the result is a distorted picture of web-based creative work. Money may ruin everything, but turning a blind eye to money doesn't get around that problem.

I feel like I'm being harsh on the authors here - as I say, I agree with their political sympathies, and the one I have read (Quiggin) is obviously smart and well informed on many issues. But serious social scientists need to do serious work on the nature of production on the Internet and not just adopt the turbocapitalist line. Me, all I can do is raise a few questions. But then, when it comes to social science, I'm just an amateur.

Don't be Pompous

Let's get this straight - I don't have anything against Google any more than I do any other company. But there are times when it just goes out of its way to be a pretentious prat. Today is such a day.

Here is Google's unbelievable official response to the Microsoft attempt to buy Yahoo!  How about this for a paragraph:

Could Microsoft now attempt to exert the same sort of inappropriate and illegal influence over the Internet that it did with the PC? While the Internet rewards competitive innovation, Microsoft has frequently sought to establish proprietary monopolies -- and then leverage its dominance into new, adjacent markets.

We take Internet openness, choice and innovation seriously. They are the core of our culture.

The beauty of this stance is that you can play the openness and innovation off against each other. Google's important software is just as proprietary, closed source, and hidden as that of Microsoft - in fact more so because M$ has shared source agreements with many companies while Google's core technologies are not shared with anyone. Google does not disclose information about things like water consumption at its server stations because it's a "competitive matter"; Google buys properties under other names because it can get a better deal. In cases like these where Google wants to be secretive it pulls the innovation card and talks about innovation and the need to prosper in a competitive market.

When Google wants to promote technologies that are complementary to its own, it makes them open source (for example, in its sponsorship of Firefox browser development) and talks in idealistic terms about "the community". The overlap of interests between Google and its "communities" is partial at best, however, and such talk is cheeky, to say the least, coming from some of the world's richest people. And if making its advertising-driven wealth isn't leveraging its dominance in search into new, adjacent markets then I don't know what is.

In the end, Google and Microsoft are both profit-maximizing companies. No matter how much they couch their goals in suitably vague idealistic terms ("don't be evil" and the "freedom to innovate" respectively) they respond to the incentives faced by all such institutions. I just wish they'd be up front about it and not put out the kind of drivel that Google did today.

The Netflix Prize: 300 Days Later

Today the Netflix Prize Competition has been running for 300 days.

Online DVD rental outfit Netflix caused a real buzz last October when it announced the competition. If anyone can come up with a recommender system for predicting customer DVD preferences that beats its own algorithm (Cinematch) by a certain amount, Netflix will hand over $1 million. The prize got a lot of attention because it exemplifies the idea of crowdsourcing. Not only does Netflix rely on crowdsourcing of DVD ratings (user ratings of DVD titles) but the competition itself is an attempt to use crowdsourcing to develop the algorithms to make the most of those ratings. Instead of doing the work itself, or hiring specialists, Netflix lets anyone enter its competition and pays the winner. The competition is still in progress: Netflix says it will run until at least 2011. So now the initial buzz has died down, what can we learn from the Netflix Prize?

First, the competition details (see here (PDF) for a short paper by two Netflix employees). Netflix made public a database of customer DVD ratings (tweaked to ensure privacy) that included over 100 million individual ratings of 17,770 titles by 480,189 people. If you sign up for the prize, you can download these ratings. Each rating involves one customer giving an integer number from one star (very bad) to five stars (very good) for a given title. For example, customer 296452 gave title 234 ("Animation Legend: Winsor McCay") a rating of 1 (very bad).

The idea is that competition entrants develop an algorithm using the training set (the set of 100 million plus ratings), try it out on a probe set of test data that Netflix also provides and, once they think they have a good algorithm, create a set of predictions for a qualifying set of users and titles and upload it to Netflix. Netflix tests these predictions against the actual ratings (which it keeps private) for that qualifying set, and posts the leading entries on a leaderboard.

The quality of any algorithm is determined by its root mean squared error (RMSE). To calculate the RMSE you take the difference between the rating the algorithm predicts and the actual rating, and square it so it can never be negative. Then you take the average of all these over the set of data to get the mean squared error. Finally, taking the square root gives the RMSE, which is roughly the size of a typical error.
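For anyone who prefers to see the recipe as code, here is a minimal sketch of the RMSE calculation described above. The function name and the four sample ratings are made up for illustration; they are not from the Netflix data.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    # Square each difference so positive and negative errors cannot cancel.
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    # Average the squares to get the mean squared error, then take the root.
    mean_squared_error = sum(squared_errors) / len(squared_errors)
    return math.sqrt(mean_squared_error)

# Hypothetical predictions and the ratings the customers actually gave.
predicted = [3.5, 4.0, 2.0, 5.0]
actual = [4, 4, 1, 3]
score = rmse(predicted, actual)
print(f"RMSE = {score:.4f}")
```

Note how the single two-star miss dominates the score: squaring penalizes large errors more heavily than small ones.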

A perfect algorithm would predict exactly what rating every user would give to every title and would have an RMSE of zero. A random set of predictions has an RMSE of 1.95. But the actual range of action is much narrower than this 1.95 range. A simple algorithm that uses the average rating for each title as the prediction - "let's see, the average rating for the 104,000 customers who rated Mean Girls was 3.514, so I predict you will give it a rating of 3.514" - gets an RMSE of 1.0540. Netflix's Cinematch algorithm has an RMSE of 0.9525. Netflix set the prize target at a 10% improvement over that, which is an RMSE of 0.8563. So the range that recommendation systems can realistically cover - from naively simple to cutting-edge research - seems to be the narrow band between the middle three lines in the following diagram.
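The "average rating for each title" approach is simple enough to sketch in a few lines. What follows is only an illustration on a handful of invented ratings: the (customer, title, rating) layout, the title names "A" and "B" and the fallback to the overall mean for unseen titles are all my assumptions, not anything Netflix specifies.

```python
import math
from collections import defaultdict

def title_means(training):
    """Average rating per title from (customer, title, rating) triples."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _customer, title, rating in training:
        totals[title] += rating
        counts[title] += 1
    return {title: totals[title] / counts[title] for title in totals}

def rmse(predicted, actual):
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Invented training ratings and a tiny held-out "probe" set.
training = [
    (1, "A", 4), (2, "A", 2), (3, "A", 3),
    (1, "B", 5), (2, "B", 5),
]
probe = [(4, "A", 4), (4, "B", 4)]

means = title_means(training)
# Assumed fallback for titles with no training ratings: the overall mean.
overall_mean = sum(r for _c, _t, r in training) / len(training)

# The prediction ignores who is asking: everyone gets the title's average.
predictions = [means.get(title, overall_mean) for _c, title, _r in probe]
actuals = [r for _c, _t, r in probe]
baseline_rmse = rmse(predictions, actuals)
print(f"title-average baseline RMSE = {baseline_rmse:.4f}")
```

The point of the sketch is that this baseline knows nothing about the individual customer, which is what makes its 1.0540 on the real data a useful floor for comparison.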

In the days and weeks after the prize was announced, progress was rapid. The Cinematch score was matched within a week. Within a month the leaders were half way to the winning prize with a 5% improvement. But getting further improvement has proved more and more difficult. It took another month to get to a 6% improvement, about 5 more months to get to 7%, and the current (July 29 2007) leader is at 7.8% improvement and has been unchanged for a month. Here is a graph of the progress, showing the three lines above and the prize leader progress:


At this stage it is not clear if the prize is winnable: the existing algorithms use a lot of linear algebra and some pretty fancy machine learning ideas (see a description by a leading participant here and some sample code for a similar approach here), the leading groups include university research labs from around the world, and many of the more obvious approaches have been explored. Certainly media and blog interest - huge in the early days - has dropped off in recent months. This New York Times article is one of the few from the last month or two.

Let's not get into the computer science of recommender systems - there's a good review from 2004 called Evaluating Collaborative Filtering Recommender Systems here if you want to know more. Instead, let's step back a bit and ask what this prize tells us so far, and look at a couple of things we can learn by poking around the massive data set that Netflix provided.

One question is: how good is an algorithm with an RMSE of 1, and is an algorithm with an RMSE of 0.8563 much better for the average customer? Actually, I guess that's two questions. Anyway, if the errors followed a normal distribution (which they don't, but we're talking back-of-envelope here) then if a customer actually rated a title as 2 (poor), an algorithm with an RMSE of 1.0 would predict somewhere between 1 and 3 about 70% of the time. Not bad, but not startling. If the algorithm gave ten recommended movies, then it would get on average seven out of ten within one unit of the customer's actual rating. Meanwhile, the RMSE=0.8563 algorithm would get 7.6 out of ten. While this is an improvement, and while it may be a remarkable technical accomplishment, it does not seem to be exactly a revolutionary leap compared to the really simple algorithms as far as customers go.
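The back-of-envelope figures above can be checked with a few lines of code. This sketch makes the same simplifying assumption as the text, namely that errors are normally distributed with zero mean and a standard deviation equal to the RMSE (which, as noted, they are not). The function name is my own.

```python
import math

def share_within(tolerance, rmse):
    """Probability that a zero-mean normal error with standard deviation
    `rmse` lands within +/- `tolerance` of the true rating."""
    return math.erf(tolerance / (rmse * math.sqrt(2)))

for rmse_value in (1.0, 0.8563):
    p = share_within(1.0, rmse_value)
    print(f"RMSE {rmse_value}: about {10 * p:.1f} of ten predictions "
          f"within one star of the actual rating")
```

Under that assumption the two results come out close to the seven and 7.6 out of ten quoted above.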

[Update, December 25, 2007: Yehuda Koren of leading team KorBell approaches the recommendation problem a different way, looking at ordering of recommendations rather than at matching them. His way is more appropriate, and gives much more encouraging results. See here.]

As soon as you start looking at the data set it becomes obvious why it is so difficult to get good results. Databases don't have the linear algebra and other mathematical tools for taking a run at the prize but they are convenient for exploring data sets, so I loaded the data into a SQL Anywhere database (the developer edition is a free download, and I'll provide a Perl script to load the data if you really want it) and started poking around. Here are a few of the more obvious oddities (all these observations have been posted elsewhere - see the Netflix prize forum for more):
  • Customer 2170930 has rated 1963 titles and given each and every one a rating of one (very bad). You would think they would have cancelled their subscription by now.
  • Five customers have rated over 10,000 of the 17,770 titles selected - and presumably they also have rated some of the others among the 60,000 or so titles Netflix had available when they released the ratings. Are these real people?
  • Customer 305344 had rated 17654 titles. Even though Netflix make it easy to rate titles that you have not rented from them (so they can get a handle on your preferences) can this be real?
  • Customer 1664010 rated 5446 titles in a single day (October 12, 2005).
  • Customer 2270619 has rated 1975 titles. 1931 were given a 5, 31 were given a 4, 10 given a 3, 2 given a 2 (Grumpy Old Men and Sex In Chains) and a single title was given a 1. That title? Gandhi, which has an average rating of over 4 and which less than 2% of those who watch it give a 1.
  • The most often rated movie? Miss Congeniality with ratings by over 232,000 of the 480,000 customers. And which title is most similar to it in terms of ratings (using a slightly weighted Pearson formula)? Bloodfist 5: Human Target.
  • Most highly rated - Lord of the Rings: Return of the King (Extended Edition), with 4.7.
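For the curious, here is roughly what a title-to-title similarity looks like in code. This is the plain, unweighted Pearson correlation computed over the customers who rated both titles; the slight weighting mentioned in the list is not described here, so it is not reproduced. The dictionaries of ratings are invented for illustration.

```python
import math

def pearson(ratings_a, ratings_b):
    """Plain Pearson correlation between two titles.

    Each argument maps customer id -> rating. Only customers who rated
    both titles are used. Returns None when the correlation is undefined
    (fewer than two common raters, or no variation in one title's ratings).
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    n = len(common)
    if n < 2:
        return None
    xs = [ratings_a[c] for c in common]
    ys = [ratings_b[c] for c in common]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return covariance / math.sqrt(var_x * var_y)

# Hypothetical ratings: customers 1, 2 and 3 rated all three titles.
title_x = {1: 5, 2: 3, 3: 1, 4: 4}
title_y = {1: 5, 2: 3, 3: 1, 5: 2}
title_z = {1: 1, 2: 3, 3: 5}

print(pearson(title_x, title_y))  # customers agree on both titles
print(pearson(title_x, title_z))  # customers who like one dislike the other
```

With only three customers in common, the first pair looks perfectly similar, which hints at how a little-rated title can end up as the closest match to a hugely popular one.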

Some of the more bizarre facts above may be artifacts of whatever tweaking process Netflix put the data set through (although they claim not to have materially affected the statistics). While odd, bizarre users are not always difficult to deal with: if you have rated each of the last 1963 titles you've watched as 1 it is pretty easy to predict what you will rate the next title. But others are more tricky.
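Oddities like the customer who gives every title the same rating are easy to find with a grouping query. The sketch below uses Python's built-in sqlite3 module as a stand-in for the SQL Anywhere database I actually used; the table name, column names and sample rows are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rating (customer_id INTEGER, title_id INTEGER, stars INTEGER)"
)

# Invented rows: customer 10 gives everything one star, customer 11 varies,
# customer 12 has rated only a single title.
rows = [
    (10, 1, 1), (10, 2, 1), (10, 3, 1),
    (11, 1, 5), (11, 2, 3), (11, 3, 4),
    (12, 1, 5),
]
conn.executemany("INSERT INTO rating VALUES (?, ?, ?)", rows)

# Customers with at least `min_rated` ratings who never varied their rating.
min_rated = 3
constant_raters = conn.execute(
    """
    SELECT customer_id, COUNT(*) AS n_rated, MIN(stars) AS the_rating
    FROM rating
    GROUP BY customer_id
    HAVING COUNT(*) >= ? AND MIN(stars) = MAX(stars)
    ORDER BY customer_id
    """,
    (min_rated,),
).fetchall()

for customer_id, n_rated, the_rating in constant_raters:
    print(f"customer {customer_id}: {n_rated} titles, all rated {the_rating}")
```

The minimum-count threshold is there so that customers with only one or two ratings are not flagged just for having had no chance to vary.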

One reason for these oddities is one of the things that Evaluating Collaborative Filtering Recommender Systems identifies. They note that (on other, smaller data sets) even the best algorithms don't seem to get beyond an RMSE of 0.73 on a five-point scale, and speculate that the cause may be "natural variability". We users provide inconsistent ratings - sometimes we'd rate a movie a 3 and sometimes a 4, with no consistency. It may depend on our mood when we watched the movie - we may give a romantic movie a higher rating if we watched it on a first date than if we watched it a week later after being left broken hearted, or a demanding movie a low rating because we were tired and out of sorts when we watched it - or it may depend on our mood when we actually provide the rating.

There are other, more obvious reasons which, for reasons I don't understand, don't seem to get discussed much. Netflix itself and most competitors talk about the data in terms of "movies" and "users". But the "movies" in the list are not all movies: a lot are TV series or music video collections. The variability among the episodes of a series (Do you think Lost Season 1 deserves a 3 or a 4?) must make single-number ranking even more variable and these composite DVDs figure prominently among those titles that have the biggest variance in ratings.

Then there's the fact that a customer might not really be a single person. It might be a household with several viewers in it. So perhaps one person likes Terminator, one likes Bridget Jones, and one likes Spongebob Squarepants. Once we realize that the "user" might be a collection of people there is nothing strange about giving high ratings to each of these, but you can see how, depending on which household member entered the rating, the values may be quite different (perhaps this is why titles like 'N Sync: Making of the Tour, Pokemon Vol 9, and Boston Red Sox 2004 World Series Collectors' Edition have high variance - the person rating may not always have been the person who wanted to watch it). If the data set contains these inevitable variations (in addition to the plain kookiness on show in the Netflix set) then it may be that even the cleverest algorithm can make little progress in untangling all the intrinsic vagueness of the data.

So what I get from the Netflix prize is that there are probably significant limits to recommender systems. Even the smartest don't do a whole lot better than the simple approaches, and a lot of work is required to eke out even a little more actual information from the morass of data. It seems surprisingly difficult to get reliable, factual information on this important question of how useful they can be. Part of the reason is that they are new - Amazon has only been in business for about ten years after all - and part of the reason is that the behaviour of these systems is often a closely guarded secret despite the aura of openness that web companies cultivate.

This matters because there is a surprising amount riding on the effectiveness of recommender systems. Silicon Valley's new-economy enthusiasts see them as the key to developing a new level of cultural democracy: they see recommender systems as a trebuchet hurling rocks at the castles of the old elite of mainstream media, big publishers with big marketing departments, big-chain book stores and Hollywood sequels. Recommender systems are claimed to embody the "wisdom of crowds". The idea is that everyone just publishes stuff (blogs, wikipedia entries and so on) and amateur readers or viewers decide what has merit by their actions (rating stories, buying and rating books and DVDs and so on). The work of critics is "crowdsourced" to customers, but it is the recommender system that distills these ratings to yield the aforementioned wisdom.

If faith in recommender systems is misplaced, then the new boss may look much like the old boss only with more computer hardware. There is a danger that recommender systems may simply magnify the popularity of whatever is currently hot - that they may just amplify the voice of marketing machines rather than reveal previously-hidden gems. Even worse, their presence may drive out other sources of cultural diversity (small bookstores, independent music labels, libraries) concentrating the rewards of cultural production in fewer hands than ever and leading us to a more homogeneous, winner-take-all culture.

I'm no futurist, but I see little evidence from the first 300 days of the Netflix Prize that recommender systems are the magic ingredient that will reveal the wisdom of crowds.

Reputations

Regarding reputation-building on the Internet, Clive Thompson writes approvingly that

network algorithms do not favor the cagey or secretive. They favor the prolific, the outgoing, the shameless.

Said another way, network algorithms do not favour the quiet or the reflective. They favour the loud-mouthed, the self-promoting, the flashy.

O brave new world that has such algorithms in it.

Believe the Opposite: Radical Opacity

I'm afraid Clive Thompson has jumped the shark. From being a witty journalist at the interesting This Magazine he now fits right in at the boring Wired Magazine. On the way he seems to have lost his sense of irony (maybe they don't let you bring irony into Silicon Valley?) and his cynicism. As a result, he has also lost the plot. Come back Mr. Thompson!

His March 2007 article in Wired Magazine called The See-Through CEO coined the phrase Radical Transparency. Like other Silicon Valley catch phrases, it has that air of youthful rebellion, it is self-consciously ignorant of history (who needs history when all the interesting things are happening right now), and - most important of all - it imparts a feel-good sense of anti-corporate attitude to your next venture funding proposal or business plan. Because like other Silicon Valley catch phrases, Radical Transparency has about as much to do with rebellion as riding a mountain bike.

Here are some snatches from the article, and some recent events in the real world, mainly as reported by The Register - which has thankfully managed to keep its senses of both irony and cynicism - and mainly about Web 2.0 poster-offspring Google and its growing Google-hoard of companies.

"You can't hide anything anymore," Don Tapscott says. Coauthor of The Naked Corporation, a book about corporate transparency, and Wikinomics, Tapscott is explaining a core truth of the see-through age: If you engage in corporate flimflam, people will find out.

Meanwhile, Google plays cat and mouse with regulators. Leif Aanensen, deputy director general of the Norwegian Office of the Data Inspectorate, has been investigating Google's data retention policy:

   "We are not satisfied," he said. "We didn't get the proper answers."

   "Our main issue was their data retention policy and the use of the data they stored. We asked them what they were doing with the personal data - are you creating profiles - they didn't answer," he said.

Thompson writes: "You can't go halfway naked. It's all or nothing. Executives who promise they'll be open have to stay open."

Meanwhile, Google - who make repeated references to their own "radical transparency" - are closed-mouthed about the introduction of new programs. Paying a select few video producers, for example:

YouTube says anyone who wants to get paid can let it know by registering an interest, but provided no timescale for when it will cough up, or what the carve-up will be.

Or will there be advertising on the iGoogle front pages?

   The company has not made any noises about placing personalised ads on the new iGoogle personalised homepage, but industry observers are fairly confident it is only a matter of time.

When it comes to openness, Thompson writes "there's no use trying to resist. You're already naked." How naked? Hard to tell, because it is not easy to find out what information Google keeps about you.

   "Upon arriving at the Google homepage, a Google user is not informed of Google's data collection practices until he or she clicks through four links," says the section of the complaint which details Google's alleged deceptive trade practices. "Most users will not reach this page. In truth and in fact, Google collects user search terms in connection with his or her IP address without adequate notice to the user. Therefore, Google's representations concerning its data retention practices were, and are, deceptive practices.

   "As a result of Google's failure to detail its data retention policies until four levels down within its website, its users are unaware that their activities are being monitored," says the complaint in the section alleging unfair trade practices.

Thompson writes:

Secrecy is dying. It's probably already dead.

Meanwhile, here's Google being radically opaque:

Word broke this month that Google has purchased 800 acres of land in Pryor, Oklahoma. The company has yet to confirm plans for the site, but I'm betting on a new data center rather than an amusement park (in all fairness, you can never tell with this bunch – Ed).

Oklahoma proves a handy spot to have a data center since the state's Governor signed a new law that affords the largest corporate energy users the right to keep their power consumption figures a secret.

Governor Brad Henry signed the energy law (House Bill 1038) just a couple of days after news of Google's land purchase reached the local newspapers. Coincidence? Sure.

The lawmakers behind the bill denied having chats with Google around any legislation. People familiar with the matter, however, did note that the law proves convenient for an entity such as Google that likes to keep as much information secret as possible.

If you're a demanding type who needs evidence of Google's secret ways, have a listen to head of strategic development Rhett Weiss. He presided over a party celebrating yet another Google data center in South Carolina. When asked about Google's water and power usage, Weiss confessed: "We're in a highly competitive industry and, frankly, one or two little pieces of information like that in the hands of our competitors can do us considerable damage. So we can't discuss it."

What else does Google not tell us? Here's Nicholas Carr:

“We never,” says a Google representative, “comment on who we’re talking to, who we’ve considered, who we’ve rejected. We feel that when we come to an agreement, that’s the time to make an announcement.”

So please, Mr. Thompson - exercise some scepticism. Even a little would go a long way.

The liberation mythology of the internet

Nicholas Carr of Rough Type has been reading David Weinberger's Everything is Miscellaneous, and is disappointed. But in his disappointment he coins a phrase I really like: "the liberation mythology of the Internet".

I only reached the bottom of page nine, at which point I crashed into this passage about music:

For decades we've been buying albums. We thought it was for artistic reasons, but it was really because the economics of the physical world required it: Bundling songs into long-playing albums lowered the production, marketing, and distribution costs because there were fewer records to make, ship, shelve, categorize, alphabetize, and inventory. As soon as music went digital, we learned that the natural unit of music is the track. Thus was iTunes born, a miscellaneous pile of 3.5 million songs from a thousand record labels. Anyone can offer music there without first having to get the permission of a record executive.

"... the natural unit of music is the track"? Well, roll over, Beethoven, and tell Tchaikovsky the news.

There's a lot going on in that brief passage, and almost all of it is wrong. Weinberger does do a good job, though, of condensing into a few sentences what might be called the liberation mythology of the internet. This mythology is founded on a sweeping historical revisionism that conjures up an imaginary predigital world - a world of profound physical and economic constraints - from which the web is now liberating us. We were enslaved, and now we are saved. In a bizarrely fanciful twist, made explicit in Weinberger's words, the digital world is presented as a "natural" counterpoint to the supposed artificiality of the physical world.


There's much more at Rough Type, as Carr demolishes Weinberger's claim.

Trackbacks Are Dead

I hand over a few dollars each month to Six Apart, who own TypePad, for this blog. There are free ones, of course (Blogger for one) but when I started off I decided that trackbacks were worth paying for. If you make a post about an entry on someone's blog, then you can add a trackback, which is a link from their blog post to yours. It looked to me like a valuable part of the conversational aspect of blogs.

But it seems that trackbacks are doomed. I don't think Blogger ever supported them. And now many blogs have disabled them, because all you get is trackback spam, which is a pain in the neck to deal with. I've noticed that they seem to be going the way of the dodo, and now I see that Trackbacks Are Dead. It's a shame - one more victim of the plague of spam.

But this little corner of the web is quiet enough that I still have trackbacks enabled.

Bureaucracy: it ain't just the government

A glimpse inside the world of that old efficient, lean and mean, innovative private industry, Microsoft style, from someone who spent a year working on the shutdown menu.

The scary thing about the story is that you can imagine how it happens, one step at a time, with a good reason for each step. This is not a "what's wrong with Microsoft" story, this is a "what happens in big organizations" story. Read and weep.

Link: moblog: The Windows Shutdown crapfest.

So just on my team, these are the people who came to every single planning meeting about this feature [the shutdown menu]:

  • 1 program manager
  • 1 developer
  • 1 developer lead
  • 2 testers
  • 1 test lead
  • 1 UI designer
  • 1 user experience expert
  --
  8 people total

These planning meetings happened every week, for the entire year I worked on Windows.

In addition to the above, we had dependencies on the shell team (the guys who wrote, designed and tested the rest of the Start menu), and on the kernel team (who promised to deliver functionality to make our shutdown UI as clean and simple as we wanted it). The relevant part of the shell team was about the same size as our team, as was the relevant part of kernel team.

So that nets us a conservative estimate of 24 people involved in this feature. Also each team of 8 was separated by 6 layers of management from the leads, so let's add them in too, giving us 24 + (6 * 3) - 1 (the shared manager) = 41 total people with a voice in this feature. Twenty-four of them were connected sorta closely to the code, and of those twenty four there were exactly zero with final say in how the feature worked. Somewhere in those other 17 was somebody who did have final say but who that was I have no idea since when I left the team -- after a year -- there was still no decision about exactly how this feature would work.

By the way "feature" is much too strong a word; a better description would be "menu". Really. By the time I left the team the total code that I'd written for this "feature" was a couple hundred lines, tops.

Update. The original post was down for a while, leading to a flurry of readers coming here instead, but is now back up. Any Joel readers who end up here anyway may want to read what I have to say about the question of choice in software. Or not, of course.
