nPost Blog

Some well worn advice for any startup

by John Dietz of Adometry

When we started our company last year, we got a fair amount of advice, and most of it we even asked for.  Among the usual stuff like “Build a business, not a product” and “Everything takes longer than you think it will”, we also heard the standard “Get to market as fast as you can, even with a limited feature set.” We knew that we didn’t have all the answers, so we’ve appreciated the advice and help we’ve gotten from a number of sources, notably adjusting our development schedule to get to market as fast as we could.

Based on the idea of getting to market quickly, we completed a UI prototype and showed it to as many people as possible: potential customers (advertisers and agencies), related companies (various publishers), other technology companies, and perhaps even some competitors.  Our goal here was to get feedback to validate or evolve the direction of the product and business we were building.  We followed that quickly with a functional beta product while continually getting feedback from our customer base.

Of course to get to this point quickly, like a lot of startups we took shortcuts. We chose what our priorities would be (scale, validity of data), but left some of the detail work for later.  The architecture isn’t necessarily what we want to end up with, but we wanted to spend our early development efforts proving we could collect and analyze the data we were pitching.

Then one day it happened.  We’d been chugging along happily for a few months with some small beta customers, when we finally got the signed contract from a very large Fortune 100 company that was very interested in using our system to show the value and efficiency of their online advertising purchase. The same afternoon we received the signed contract we had the email outlining the campaign they wanted to run. It wasn’t a huge campaign in the overall scheme of things (when I was at Disney we would easily serve a billion ads per day), but this campaign would represent an order of magnitude more traffic than we were seeing with our earlier betas. It was go time, or perhaps go go time, since a couple of days later we got agreement from our next major customer, who would generate another large volume of requests for us.

As everyone knows, success is not a bad problem to have, but our engineers and I worked hard and late for the next several days scaling out our Amazon EC2 tiers and doing some extra load testing to make sure we could handle the traffic.  Because we have a good relationship with these customers, we convinced them to do incremental campaign analysis, just as an extra precaution.  Our focus at this point was entirely on our customer experience. Whatever happened on the back end, or whatever extra effort we put in, our customers needed to see and trust the data we were providing.

To prepare for the traffic we took several steps:

  • Moved key files to a CDN (Content Delivery Network) for quick delivery
  • Validated multi-tiered environment with load balancers and failover for most important services
  • Generated traffic and data sizing projections
  • Performed load testing
  • Set up server resource monitoring

At 2 PM on a recent Monday, the fire hose opened and we watched carefully.  Things looked good and traffic was climbing. Our servers were running fine and data was showing up in our UI with about a 3 minute lag from real time.  We were watching error logs, server utilization, and log sizes. At 2:32 one of our logging servers failed, unfortunately because of some processes left running on it from when we ran the entire system on that one box, but the failover process worked perfectly and we lost no data (I would rather have not had the failure, but it was a nice production test of our failover ability).

We quickly found another bug in the data parsing and were able to resolve it quickly, again with no data loss. In all, we spent the next several hours watching, tweaking, and on edge. This time may be the most exciting time for a startup.

The system is still running just fine, and we are projecting traffic based on this data for bringing on the rest of this campaign, and for the start of our second big customer, scheduled to go online shortly.

A couple of last thoughts and learnings:

Advice about getting to market fast is right on. The feedback we got from early customers was fantastic and has helped us build a better product.  Had we gone heads down for 12 months to build what we thought was the perfect product, we would likely have missed the mark.

It’s okay to take shortcuts, but understand your core value, and don’t skimp in those areas.  Had we not built for scale, we would have had more problems and might have lost data.

Design for system failure. Although we didn’t expect things to fail, we planned for it and I’m glad we did when one of our logging servers locked up.

We were fortunate to have some very good data to use when forecasting traffic, and we spent a lot of time forecasting optimistically and pessimistically to make sure we understood what would happen in each case.
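As a rough illustration of that kind of forecasting, here is a minimal Python sketch that turns a daily request estimate into a peak rate and a server count for low, expected, and high scenarios. Every number and name in it is a made-up assumption for illustration (the daily volumes, the 3x peak-to-average ratio, the 500 requests per second per server, the 50% headroom); none of them are our real figures.

```python
import math

SECONDS_PER_DAY = 86_400

def peak_requests_per_second(daily_requests, peak_to_average=3.0):
    """Estimate peak load from a daily total, assuming the busiest period
    runs at `peak_to_average` times the average rate (hypothetical ratio)."""
    average_rps = daily_requests / SECONDS_PER_DAY
    return average_rps * peak_to_average

def servers_needed(peak_rps, capacity_per_server_rps, headroom=0.5):
    """Servers required if each one should stay at or below `headroom`
    of the capacity measured in load testing (hypothetical values)."""
    usable_rps = capacity_per_server_rps * headroom
    return math.ceil(peak_rps / usable_rps)

# Hypothetical daily request volumes for each forecast scenario.
scenarios = {"low": 5_000_000, "expected": 20_000_000, "high": 50_000_000}

for name, daily in scenarios.items():
    peak = peak_requests_per_second(daily)
    count = servers_needed(peak, capacity_per_server_rps=500)
    print(f"{name}: about {peak:.0f} req/s at peak, {count} server(s)")
```

The point of running all three scenarios is to see where the plan breaks: if the high case needs more servers than you can bring up quickly, that is worth knowing before the fire hose opens.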

Don’t be afraid of success. When we first got that contract back, quickly followed by the size of the initial campaign, I’ll admit to a short period of panic. I still wanted to call this a beta, because I knew there were going to be problems. Fortunately we had planned well and focused on what we knew was important.

John Dietz (LinkedIn profile) is a co-founder of Adometry, a startup focused on online advertising metrics and writes about online advertising metrics at blog.adometry.com.

Baseball, Cherry-picking, Sample-size, and Startups

On May 5th 2009, the Seattle Mariners stood in first place in the American League West, with 15 wins and 11 losses. How excited should we be about the success of the M’s in Seattle? I suppose I temper some of my own excitement by looking at the data. One of the reasons that I love baseball is that we store very detailed situational data for baseball games, with some historical data going back 100 years. One of the easy things to look at is OPS (On-base Plus Slugging), a nice simple number for a team that correlates fairly nicely with a team’s ability to score runs over a long period of time.  The M’s have a team OPS of .707 (as of May 5th), ranking them 13th out of 14 teams in the American League, not very good. Looking at the numbers more closely, perhaps there’s another reason (besides great pitching) that the M’s are winning.  When you look at the numbers with runners in scoring position (runners on 2nd and/or 3rd), the M’s rank 5th in the league with an OPS of .847. The question that obviously follows is, can the M’s maintain that performance over the course of the year?  And are your early successes and failures for your startup likely to continue? But first, a little more baseball…

Baseball is a great forum for statistics. Have you ever watched a baseball game on TV and heard the announcer say about a batter that this guy is hitting .400 on Tuesdays, or is hitting .350 against a certain pitcher? I certainly have, and that’s the beauty and curse of baseball. Baseball people, and announcers in particular, frequently fall into two of the biggest pitfalls of statistics: cherry-picking and small sample sizes. In 2008, Ichiro had a .367 batting average against teams in the AL Central Division (his overall average was just .310 last year).  Does this mean that Ichiro should never take a day off against the AL Central? Adrian Beltre (Mariners third baseman) batted .316 when batting 3rd in the lineup, but only .258 in other spots. Should Beltre always bat 3rd for the M’s? In both of these cases, I can find this detailed information (thanks to ESPN.com), but I’m specifically picking data points that make a point and have relatively small sample sizes. If I look back over several years, these trends tend to level out as the sample size gets larger. Going back to how the Mariners are doing this year, they are very unlikely to maintain that big an improvement on batting with runners on base for the entire season, and I can look at a hundred years of historical data to back up my assertion.
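To put a rough number on the sample-size problem, here is a minimal Python sketch that asks how surprising a .367 split would be for a .310 hitter at different numbers of at-bats. It treats each at-bat as an independent coin flip with a fixed chance of a hit, which is a simplifying assumption, and the at-bat counts are hypothetical since I have not looked up the real ones.

```python
import math

def batting_average_standard_error(true_average, at_bats):
    """Standard error of an observed average over `at_bats` independent
    at-bats, each with hit probability `true_average` (binomial model)."""
    return math.sqrt(true_average * (1 - true_average) / at_bats)

def z_score(observed_average, true_average, at_bats):
    """How many standard errors the observed split sits from the
    hitter's overall average."""
    se = batting_average_standard_error(true_average, at_bats)
    return (observed_average - true_average) / se

# Hypothetical at-bat counts: a small split, a larger split, a full season.
for at_bats in (50, 150, 600):
    z = z_score(0.367, 0.310, at_bats)
    print(f"{at_bats} at-bats: .367 is {z:.2f} standard errors above .310")
```

Under this model, the same .367 is unremarkable over a few dozen at-bats and only starts to look like a real effect over something close to a full season, which is why small splits tend to level out.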

Like with baseball, startups can fall into these same traps, with access to detailed data and the urge to use that data to drive strategic decisions. The key is to recognize the data that really indicates a trend and data that is an anomaly due to small sample size.  Unfortunately most of the data we collect doesn’t fall nicely into a sample size calculation that assumes a random sample of data (our data always has multiple variables).  If you are looking at some sales data, conversion data, web growth, etc., here are some ideas for identifying real trends:

  • Look for external causes – If my web site suddenly sees a lift in new registrations on Tuesdays, is it because the last two Tuesdays my site happened to get some press coverage? Perhaps there was some Twitter buzz growing that traffic.
  • Check for segmentation – If my sales are disproportionately high in Nevada, I can try to further segment my customers to see if there is a trend that makes sense.
  • Increase your sample size – If my conversion rate from 2-6 PM is double other times of day, I will likely try to pull more historical data to see if there is a significant and consistent historical lift.
  • More advanced trending – If there really is a trend here, I don’t want to ignore it. If you have the chops for it, you can apply some statistical trending models (it’s actually not too hard, Excel has some built in).
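As one possible way to check the conversion-rate example above, here is a minimal Python sketch of a two-proportion z-test comparing the 2-6 PM rate with the rest of the day. The visit and conversion counts are invented for illustration, and the test assumes independent visits and a normal approximation, so it ignores the other variables that real data always carries; treat it as a first sanity check, not a verdict.

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z statistic for the difference between two conversion rates,
    using the pooled rate for the standard error."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

def two_sided_p_value(z):
    """Two-sided p-value for a z statistic under the normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical counts: 4% conversion from 2-6 PM versus 2% at other times.
one_week = two_proportion_z(8, 200, 16, 800)
ten_weeks = two_proportion_z(80, 2000, 160, 8000)

for label, z in (("one week", one_week), ("ten weeks", ten_weeks)):
    print(f"{label}: z = {z:.2f}, p = {two_sided_p_value(z):.4f}")
```

With the invented one-week counts, a doubled conversion rate is still within what chance could plausibly produce; the same rates over ten times the data are not, which is the argument for pulling more history before acting.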

Even if you don’t have the time to do mathematical models, you can get a benefit from trying to understand the variables that affect your metrics, and if people really are more receptive to your message on Thursday from 4-7 PM you might want to think about advertising during those times.

As for the Mariners, I’ll still root for them and look for signs of real success.  I don’t think Junior is going to hit .190 all season; he’s due.

John Dietz (LinkedIn profile) is a co-founder of Adometry, a startup focused on online advertising metrics and writes about online advertising metrics at blog.adometry.com.

Don’t Get Distracted By Your Data

From John Dietz at Adometry

As a technology-based startup, we have access to a lot of data. Thanks to Google Analytics, I can see how many visitors I’ve had to my website from Norway in the last month (1), or that 4% of my visitors are using Google’s Chrome browser. But unless I’m building a Chrome-targeted web app in Norwegian, this isn’t a great help to me. There’s a lot more than just web traffic that I can look at. Here are some examples:

  • Site traffic – At a minimum everyone should have some idea about the traffic on their web site.  This can come from server logs, Google Analytics, Web Trends, Omniture, etc.  Understanding how people use your website or application is key to most new businesses.
  • Sales data – Everyone should keep basic metrics about their sales pipeline, how long it takes to make a sale for various industries, who are making the decisions, what are the price points, where are sales contacts coming in etc.
  • Advertising data – If you are advertising (search, display, traditional, etc.), understand who you are hitting with your ads and what kinds of responses you are getting.
  • Search engine data – Pay attention to your search rankings and the kinds of traffic it is generating (which you should be able to get from your site traffic data).
  • User registration data – You may be collecting some basic demographic data from your users if they have to register. This kind of data can be very valuable in understanding what kinds of users you have, and what kinds of users end up paying for your service.
  • Operational data – Your application has databases, application servers, web servers, message queues, etc. Your servers can probably report on their resource usage as well: disk space, RAM, CPU, etc.

It can be easy to get sucked into tracking all of this data, somehow believing it will help my business. Here’s what I do:

  1. Know what data is available – Start with what you have or can easily get
  2. Which metrics reflect my business – What metrics do I have that tell me how well I’m doing? This depends on the business, for me it’s primarily my sales numbers and retention numbers.
  3. What metrics affect my business – What metrics are early indicators or drivers of my business? My advertising data and lead conversion data drive my sales, my operational data drives my customer retention (when combined with the right functionality), etc.
  4. Which metrics distract me from my business – Everything else might be interesting, but doesn’t contribute to your business, so limit your efforts in these areas.
  5. Track and monitor my reflective and affective metrics, ignore the distractive metrics – Now that we know what really matters, we can monitor (in some cases automatically, like operational metrics) my company’s performance and the leading indicators that affect that performance.  These are the ones to focus on, and it’s important to know the difference.

John Dietz (LinkedIn profile) is a co-founder of Adometry, a startup focused on online advertising metrics and writes about online advertising metrics at blog.adometry.com.
