nPost Blog

Instrumenting the offline world

In the last decade there have been major advances in storing, analyzing, and acting upon extremely large data sets.  Data sets that were previously left dormant are now being put to (mostly) constructive use. But the vast majority of information in the world isn’t available for analysis because it isn’t being electronically collected.

This is changing rapidly as new data collection mechanisms are implemented – what engineers refer to as instrumentation. Common examples of instrumentation include thermometers, public safety cameras, and heart rate monitors.

Smartphones are one obvious new source of potential instrumentation. A person’s location, activities, audio and visual environment – and probably many more things that haven’t been thought of yet – can now be monitored. This of course raises privacy issues. Hopefully these privacy issues will be solved by requiring explicit user opt-in. If so, services will need to create incentives for people to opt in.

Foursquare instruments location in an opt-in way through the check-in. The incentives are social and game-like, but the data produced could be useful for many more “serious” purposes. Fitbit instruments a person’s health-related activity. The immediate incentive is to measure and improve your own health, but the aggregate data could be analyzed by medical researchers to benefit others.

In manufacturing, there has been a lot of interesting innovation around monitoring machinery, for example by using loosely joined, inexpensive mesh networks. In homes, protocols like ZigBee allow devices to communicate with one another, which enables, for example, automation of tedious tasks and improved energy efficiency.

In the next decade, there will be a massive amount of innovation and opportunity around the big data stack. Instrumentation will be the foundational layer of that stack.

Web services should be both federated and extensible

One of the most important developments of the web 2.0 era is the proliferation of full-featured, bidirectional APIs. APIs provide a way to “federate” web services from a single website to a distributed network of third-party sites. Another important web 2.0 development is the proliferation of web Apps (e.g. Facebook Apps). Apps provide a way to make websites “extensible.”

The next step in this evolution is to create web services that are both federated (APIs) and extensible (Apps).

In my ideal world, the social graph would not be controlled by a private company. That said, Facebook, to its credit, has aggressively promoted a fairly open API through Facebook Connect. Facebook has also been a leader in promoting Apps. For Facebook, creating extensible, federated services would mean providing a framework for Facebook Connect Apps – apps that extend Facebook functionality but reside on non-Facebook.com websites.

Consider the following scenario.  Imagine that in the future a geolocation data/algorithm provider like SimpleGeo takes Facebook Places check-in data and, using algorithms and non-Facebook data, produces new data sets, for example: map directions, venue recommendations, and location-based coupons. The combination of Facebook’s data (social graph and check-ins) and SimpleGeo data/algorithms would create much more advanced feature possibilities than either service acting alone.

With today’s APIs, if, say, Gowalla wanted to integrate Facebook plus SimpleGeo into their app*, they would basically have 3 choices:

1) Embed Facebook widgets in Gowalla.  These are simple iframes (effectively separate little websites) that don’t interact with SimpleGeo.  Gowalla would just have to sit and wait and hope that Facebook decided to bake in SimpleGeo-like functionality.

2) Pre-import SimpleGeo data. This significantly limits the size and dynamism of the SimpleGeo data sets and doesn’t incorporate SimpleGeo algorithms, thus severely limiting functionality.

3) Host an instance of SimpleGeo’s servers internally.  This requires heavy technical integration, undermining the main benefit of APIs.

In a world of extensible APIs (or “API Apps”), Gowalla could instead send Facebook data back to SimpleGeo.  The data flow would look something like this:

(Note how there are three parties involved – @peretti calls this a “data threesome”). This configuration is much simpler to integrate – and potentially much more powerful and dynamic – than the other configurations listed above. You could implement this today, but it would create user experience challenges. For example, Gowalla would be sending Facebook data to a third party (step 3), which might (depending on the data sent) require explicit user opt-in. Things would become more onerous if SimpleGeo wanted to share its own user data with Gowalla. That would require an additional OAuth authorization with SimpleGeo (authorizing step 4).
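As a rough illustration of the opt-in problem, here is a minimal Python sketch of the three-party flow. Everything in it is hypothetical: the function names, the scope strings, and the sample check-in are invented for illustration and are not the actual Facebook, Gowalla, or SimpleGeo APIs. Only "step 3" (the app forwarding social data to the geo provider) and "step 4" (the geo provider sharing its own user data back) come from the description above; the initial fetch is an assumed earlier step.

```python
from dataclasses import dataclass, field


class ConsentError(Exception):
    """Raised when a data hand-off lacks the user's explicit opt-in."""


@dataclass
class User:
    name: str
    # Hypothetical scope strings the user has explicitly granted.
    grants: set = field(default_factory=set)


def require(user, scope):
    """Refuse to proceed unless the user has opted in to this scope."""
    if scope not in user.grants:
        raise ConsentError(f"{user.name} has not granted {scope}")


def fetch_checkins(user):
    # Assumed earlier step: the app reads check-ins from the social service.
    require(user, "social:checkins")
    return [{"venue": "Cafe A", "lat": 40.73, "lon": -73.99}]  # made-up data


def enrich(user, checkins, include_provider_data=False):
    # Step 3: the app forwards social data to the geo provider.
    require(user, "share:social->geo")
    results = [{"venue": c["venue"], "suggestion": "nearby venues"} for c in checkins]
    if include_provider_data:
        # Step 4: the geo provider shares its own user data back,
        # which needs a separate authorization.
        require(user, "geo:userdata")
        for r in results:
            r["provider_history"] = []
    return results


def combined_consent(user, scopes):
    """One screen that grants several scopes at once (the 'ideal' case)."""
    user.grants.update(scopes)
```

With separate authorizations, the user is interrupted once per scope. A combined screen would amount to a single call such as `combined_consent(user, {"social:checkins", "share:social->geo", "geo:userdata"})`, after which `enrich(user, fetch_checkins(user), include_provider_data=True)` succeeds without further prompts.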

Allowing websites to be federated and extensible will open up a whole new wave of innovation. Ideally some spec like OAuth could include the multiple authorizations in a single authorization screen. Facebook could also do this by allowing third parties to be part of the Facebook Connect authorization process. Inasmuch as Facebook seems to be trying to embed its social graph as deeply as possible into the core experiences of other websites, allowing extensible APIs would seem to be a smart move.

* I have no connection to any of these companies (Facebook, Gowalla, SimpleGeo) and have no knowledge of their product plans beyond their public websites.  I am imagining functionality that Gowalla and SimpleGeo might include someday but for all I know they have no interest in these features – I just picked them somewhat arbitrarily as examples.

The bowling pin strategy

A huge challenge for user-generated websites is overcoming the chicken-and-egg problem: attracting users and contributors when you are starting with zero content. One way to approach this challenge is to use what Geoffrey Moore calls the bowling pin strategy: find a niche where the chicken-and-egg problem is more easily overcome and then find ways to hop from that niche to other niches and eventually to the broader market.

Facebook executed the bowling pin strategy brilliantly by starting at Harvard and then spreading out to other colleges and eventually the general public. If Facebook had started out with, say, 1000 users spread randomly across the world, it wouldn’t have been very useful to anyone. But having the first 1000 users at Harvard made it extremely useful to Harvard students. Those students in turn had friends at other colleges, allowing Facebook to hop from one school to another.

Yelp also used a bowling pin strategy by focusing first on getting critical mass in one location – San Francisco – and then expanding out from there. They also focused on activities that (at the time) social networking users favored: dining out, clubbing and shopping. Contrast this with their direct competitors, which were started around the same time and were equally well funded, yet have been far less successful.

How do you identify a good initial niche? First, it has to be a true community – people who have shared interests and frequently interact with one another. Its members should also need your product strongly enough to be willing to put up with an initial lack of content. Stack Overflow chose programmers as their first niche, presumably because that’s a community where the Stack Overflow founders were influential and where the competing websites weren’t satisfying demand. Quora chose technology investors and entrepreneurs, presumably also because that’s where the founders were influential and well connected. People in both of these niches tend to be very active online and are likely to have many other interests, hence the spillover potential into other niches is high. (Stack Overflow’s cooking site is growing nicely – many of the initial users are programmers who crossed over).

Location-based services like Foursquare started out focused primarily on dense cities like New York City where users are more likely to serendipitously bump into friends or use tips to discover new things. Facebook has such massive scale that it is able to roll out its LBS product (Places) to 500M users at once and not bother with a niche strategy. Presumably certain groups are more likely to use Facebook check-ins than others, but with Facebook’s scale they can let the users figure this out instead of having to plan it deliberately. That said, history suggests that big companies that rely on this “carpet bombing strategy” are often upended by focused startups that take over one niche at a time.
