I’ve been thinking a lot about extensions over the last couple of months because we’re slowly working on getting our extensions site up and running. Thanks to a wealth of great feedback on a blog post I wrote last week, I feel reasonably certain we’re going to proceed with some sort of hybrid of several of the listing strategies I proposed. That is, we’ll list pretty much all the Flock extensions we know about, the Flock crew will single some out as especially worth your time, and we’ll devise some way of incorporating community feedback as well.
The big problem with community feedback and ratings is that people have a tendency to try to game the system. Some have suggested that click counts are a more accurate measure of an extension’s popularity than user ratings, but click counts seem just as easily gameable. You can click the download link repeatedly, or write a script to inflate click counts, just as easily as you can pad ratings. And whereas you can limit ratings to one per user per extension by requiring a login to rate, you can’t reasonably limit extension download clicks. So I’m not convinced that click counting is the best metric for quality. Nor am I particularly convinced that a simple ratings system is the best route.
I wonder if a more complex rating system might be useful. Consider that in the original feedback request, one of the concerns I mentioned was that we want to find ways of helping improve the quality of extensions. That was in fact part of the reason we considered featuring only the best of the best: doing so raises the bar (theoretically, though I’ve always suspected that in practice it would just tick people off) and stands to benefit the community as a whole. So what if the rating system helped us gauge quality in different areas?
Suppose that each extension lets you rate it along several axes. I’ll tentatively propose “works well,” “looks good,” and “has good user interaction” as three such areas, each rated on something like a five-point scale from very poor to very good (or strongly disagree to strongly agree). This lets us do a few things. First, it lets us quantify, based on user feedback, how extensions measure up in the different areas. That’s useful to both users and developers, though it may not always be an ego saver.

More importantly, it gives us a basis for helping match developers with graphics or usability gurus. Perhaps there’s an extension whose core functionality is outstanding but that is a beast to use. These metrics help the developer figure that out from the feedback of people actually using the extension. In principle, design volunteers can go looking for high-functionality/low-design extensions and offer their help, and developers can likewise watch for high-design/low-functionality extensions.

This setup also makes the system a little harder to game because the rating isn’t strictly a one-dimensional “good or bad.” If an extension developer inflates his numbers by gaming the system, his extension perhaps gets more visibility from a download perspective, but it gets less visibility from a QA perspective and never improves. By elevating all of your numbers, in other words, you reduce the likelihood of getting the feedback that would help you improve your extension. And I think that most extension developers have honest intentions and relish feedback on their work.
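To make the idea concrete, here’s a minimal sketch in Python of how per-axis averages could surface extensions that function well but could use design help. The axis names come from the proposal above; the thresholds, sample data, and function names are all made up for illustration:

```python
# Sketch of multi-axis extension ratings on a 1-5 scale.
# Axis names follow the proposal; thresholds and data are hypothetical.
from statistics import mean

AXES = ("works well", "looks good", "has good user interaction")

def axis_averages(ratings):
    """ratings: a list of dicts mapping each axis to a 1-5 score."""
    return {axis: mean(r[axis] for r in ratings) for axis in AXES}

def needs_design_help(avg, high=4.0, low=2.5):
    """Flag an extension whose functionality rates high but whose looks
    rate low, so design volunteers can find it."""
    return avg["works well"] >= high and avg["looks good"] <= low

# Two hypothetical user ratings for one extension.
ratings = [
    {"works well": 5, "looks good": 2, "has good user interaction": 3},
    {"works well": 4, "looks good": 2, "has good user interaction": 3},
]
avg = axis_averages(ratings)
print(needs_design_help(avg))  # True: great function, weak design
```

The same per-axis averages could be queried in the other direction (high design, low functionality) to pair designers’ extensions with interested developers.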
So, these are just a few quick thoughts on one possible way of handling ratings on an extensions site. I’d love to have lots of feedback on what’s right and wrong about this approach. I’d also love to hear ideas for other innovative approaches, not only to helping float the best extensions but also to letting the process drive quality and community (i.e., pairing designers with developers). Speak up, peanut gallery!