Techmeme: News Automation Doesn’t Work

Posted on December 4th, 2008
By Todd Zeigler in Other

<Cross post from our ImpactWatch blog>

I am a big fan of the tech news aggregator Techmeme as well as its politically focused sister site Memeorandum (not so much the gossip-focused WeSmirch). These sites use complex algorithms to discover and group new content in real time. I read Techmeme and Memeorandum daily as a sort of Cliffs Notes summary of what is going on in the tech and political blogospheres on a given day.

Anyone who reads these sites frequently is all too aware of the limits of their automation algorithms. Content that is only tenuously related is often grouped together. The lead story on a given topic is sometimes the least important story on the topic. The point of the news that is being broken is often missed. The example below, which shows a story about Anna Nicole Smith being hospitalized as the lead story instead of one about her death, shows just one of the kinds of problems automation can bring.

[Image: WeSmirch screenshot]

Techmeme founder Gabe Rivera explained the gist of the problem in a blog post yesterday:

Any competent developer who tries to automate the selection of news headlines will inevitably discover that this approach always comes up a bit short. Automation does indeed bring a lot to the table — humans can’t possibly discover and organize news as fast as computers can. But too often the lack of real intelligence leads to really unintelligent results.

In an effort to provide better results, Rivera has hired a human editor to augment his site's algorithm. About the decision, he writes:

Early on, when our system was less technically refined, the clearest path toward improvement involved simply iterating algorithmic development. Later, as the automation reached a certain degree of maturity, we recognized that direct editing could now improve news results by leaps and bounds. Though our roadmap contains a number of novel future algorithmic enhancements, introducing editing now appears to be a no-brainer.

Through our ImpactWatch media monitoring platform, we've done a lot of work on automation versus human review. Indeed, in many ways the challenges we face are more difficult than Techmeme's, since ImpactWatch does sentiment analysis as well as categorization. After a great deal of trial and error on ImpactWatch, we've come to the same conclusion as Rivera: the best way to analyze and organize news is through a combination of human editing and automation. Automation can get you part of the way, but ultimately, if you really care about the quality of the analysis, some sort of human editing is necessary.

To learn more about ImpactWatch, please sign up for our demo.


about this blog

The Bivings Report (TBR) is a source of news, insight, research and analysis on the web-based communications industry. TBR content is posted, created and managed by internet strategists, media/communications analysts, web developers, designers and programmers, all of whom are employees of The Bivings Group.
