Popular Articles on Win-Vector


John has just put up an article on the Win-Vector blog, highlighting some of our popular series of articles, as well as our more popular posts. If you like the articles that I point to on this blog, check out some of the other posts written by John, too.

As readers have surely noticed, the Win-Vector LLC blog isn’t a stream of short notes, but instead a collection of long technical articles. It is the only way we can properly treat topics of consequence.

What not everybody may have noticed is that a number of these articles are organized into series for deeper comprehension.

Our series include:

Check out the original article for more details about these series, and for a pointer to our page of popular posts.

We’ve also updated the company website, so please do visit that, too.

Recent post on Win-Vector blog, plus some musings on Audience



I put a new post up on Win-Vector a couple of days ago called “The Geometry of Classifiers”, a follow-up post to a recent paper by Fernandez-Delgado, et al. that investigates several classifiers against a body of data sets, mostly from the UCI Machine Learning Repository. Our article follows up the study with seven additional classifier implementations from scikit-learn and an interactive Shiny app to explore the results.

As you might guess, we did our little study not only because we were interested in the questions of classifier performance and classifier similarity, but because we wanted an excuse to play with scikit-learn and Shiny. We’re proud of the results (the app is cool!), but we didn’t consider this an especially ground-breaking post. Much to our surprise, this article got over 2000 views the day we posted it (a huge number, for us), up to nearly 3000 as I write this. It’s already our eighth most popular post of this year (an earlier post by John on the Fernandez-Delgado paper, a comment about some of their data treatment, is also doing quite well: #2 for the month and #21 for the year).


Follow me via RSS!

I went back to using RSS to follow blogs and other websites recently; I don’t know why I ever stopped. My email doesn’t get clogged by notifications anymore, and I don’t lose blog updates in the ever-flowing stream of Twitter or Facebook or the WordPress reader. I can follow any blog on any platform as long as it has an RSS feed, and I don’t need to have accounts on every possible platform, either, just Feedly (and not even that, if I didn’t want to sync between devices).

It also occurred to me that RSS is really the only reliable medium for following an irregular blog like this one. Since I don’t blog on a regular schedule (or all that often), my posts tend to get lost in the WordPress reader, as do tweets and Facebook/Google+ updates.

So I’ve added a “Follow me on Feedly” button to the side of my blog; if you use another RSS reader, like Bloglovin or NetNewsWire, there is a generic RSS widget, as well. Even if you follow me on WordPress, or follow Win-Vector on Twitter, please do consider also following me (and other bloggers you love) via RSS, so you will be sure to never miss my blog updates. I promise, they will not all be about the book.


Popularity and Social Networks: Life is still like high school


I remember setting up the Multo blog a few years ago: my first blog explicitly meant for public consumption. On the “Follow” widget — the button that allows readers to follow a blog via email notifications — there is an option to show the count of the blog’s followers.

My first reaction: why would I want to do that?

It’s an insecurity reflex, of course, one left over from high school. I was never one of the popular or cool kids, though I was lucky enough not to be one of the pariahs, either. Like most of us, I flitted on the edges of the cool circle — the very outer edges, in my case — once in a while being noticed, mostly not. As my life, so will be my blog, my mind said. Why would I want to advertise my obscurity to the world?


Book Update, and Thoughts on Topical versus Archival Blogging

We are sending substantive drafts of the first four chapters of our data science book out for review. Manning, our publisher, hopes to launch the book in their Early Access Program (MEAP) by early May. Crossing our fingers!

In the meantime, we have been preparing for the marketing push. One small thing we’ve done is to finally give Win-Vector a Twitter presence. You can follow us through the link on my sidebar, or on the sidebar at the Win-Vector blog. We never felt a strong need for a Twitter presence before, because we have always thought of the Win-Vector blog as a source of archival, reference material, rather than topical commentary. In other words, our readers find our posts when they need them, which might not be the same moment that we write them.

That got me thinking about the two kinds of blogs out there: those that (like Win-Vector) lean to the archival, and those that lean to the topical. This inspired a post on my personal blog, Multo:

Do you blog for today, or for someday?

In other words, do you sit down and write about what inspires you in the moment, to an audience of right now? Do you imagine your readers reading the posts today or tomorrow, first thing in the morning? Do you care at all whether a surfer who trips on your site a year from now will connect or care in any way about the post, or do you write for a community of followers and commenters who will have a conversation, with you and each other, in something close to real time?

Or do you write carefully thought out pieces of prose that you just know are exactly the right answer to someone’s need, somewhere, somewhen — not necessarily now? Perhaps you imagine that your readers find you by searching on aswang, chupacabra, or whatever your subject is, and discovering your work; maybe this happens tomorrow, maybe next year. But whenever it happens, your readers think “Aha! This is exactly what I was looking for!” Or so you hope.

You can read the rest of the post here.

All of our blogs lean to the archival. We do the necessary amount of social network promotion (LinkedIn, Facebook, Google+, Hacker News for the appropriate articles), and Win-Vector Blog is syndicated through the Statsblogs and R-bloggers aggregation sites, but much of our traffic comes from web search on statistical terms, and from word-of-mouth on specific articles. And we like it that way. It will be interesting to see what Twitter adds to the mix.