A firehose API provides a raw stream of content updates, published in (near) real time. Serving this stream can be faster and cheaper for websites like Twitter or Status.net than having a crawler index the content.
One objection that websites often raise is that hosting raw, real-time access to their content would be prohibitively expensive. As we will show, this is simply not the case.
At the time of this writing, Twitter is indexing about 300M tweets per day. To leave headroom, the estimate below assumes 600M posts per day. Broken down, the costs to serve this API would be:
|Posts per day:||600M|
|Bytes per day [1]:||2.5TB|
|Bytes per day, compressed [2]:||830GB|
|Price per month [3]:||$1,159|
1. 150 bytes for the maximum post length plus, say, 4000 bytes for metadata, multiplied by the number of posts per day. The posts-per-day figure we selected is a bit liberal.
2. Compression (gzip) is often used to transfer the content, and a 3x compression ratio is reasonable.
3. Compressed bytes per day multiplied by 30 days in a month and by $0.05 per GB, based on Softlayer's price for 2TB of bandwidth.
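The bandwidth arithmetic can be checked with a short Python sketch. The constant and function names are illustrative, not part of any API. Note that 4,150 bytes times 600M posts gives about 2.5TB for a single day, and the monthly price follows from 30 such days. The sketch also assumes a GB is 2^30 bytes, which is an inference: it is the interpretation under which the $1,159 figure comes out.

```python
# Assumed inputs, taken from the table and footnotes above.
POSTS_PER_DAY = 600_000_000   # deliberately liberal estimate
POST_BYTES = 150              # maximum post length
METADATA_BYTES = 4000         # per-post metadata allowance
COMPRESSION_RATIO = 3         # gzip, assumed 3x
PRICE_PER_GB = 0.05           # USD per GB transferred
DAYS_PER_MONTH = 30
BYTES_PER_GB = 2**30          # assumption: binary gigabytes


def bandwidth_cost_per_month(posts_per_day=POSTS_PER_DAY):
    """Return (raw bytes/day, compressed bytes/day, USD/month)."""
    raw = posts_per_day * (POST_BYTES + METADATA_BYTES)
    compressed = raw / COMPRESSION_RATIO
    gb_per_month = compressed * DAYS_PER_MONTH / BYTES_PER_GB
    return raw, compressed, gb_per_month * PRICE_PER_GB


raw, compressed, cost = bandwidth_cost_per_month()
print(f"raw: {raw / 1e12:.2f} TB/day")            # raw: 2.49 TB/day
print(f"compressed: {compressed / 1e9:.0f} GB/day")  # compressed: 830 GB/day
print(f"bandwidth: ${cost:,.0f}/month")           # bandwidth: $1,159/month
```

Calling `bandwidth_cost_per_month(300_000_000)` with the current tweet volume instead of the liberal estimate halves the result to roughly $580 per month.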
The above numbers are just for bandwidth. Hardware is another important cost. Generally speaking, one server can handle multiple API customers.
|Content transfer rate per client [1]:||76Mbps|
|Clients per server [2]:||13|
|Typical server cost per month [3]:||$500 USD|
|Cost per client [4]:||$38.43 USD|
1. Compressed bytes transferred divided by the number of seconds in the same period, multiplied by 8 (since there are 8 bits per byte).
2. A typical commodity server can transfer 1000Mbit/s (gigabit). This is simply 1000 divided by the content transfer rate per client.
3. $500 per month is a reasonable cost for Amazon, Softlayer, or Rackspace hosting.
4. This is simply the $500 server price divided by the clients per server.
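The same kind of check works for the hardware numbers. This is a minimal sketch with illustrative names, assuming the 830GB compressed figure covers one day of traffic (86,400 seconds), which is the reading that yields about 76Mbps.

```python
# Assumed inputs, taken from the tables and footnotes above.
COMPRESSED_BYTES_PER_DAY = 830e9   # compressed stream size
SECONDS_PER_DAY = 86_400
SERVER_LINK_MBPS = 1000            # gigabit link on a commodity server
SERVER_COST_PER_MONTH = 500.0      # USD


def server_cost_per_client(bytes_per_day=COMPRESSED_BYTES_PER_DAY):
    """Return (Mbps per client, clients per server, USD per client)."""
    mbps = bytes_per_day * 8 / SECONDS_PER_DAY / 1e6
    clients = SERVER_LINK_MBPS / mbps   # fractional, not rounded down
    return mbps, clients, SERVER_COST_PER_MONTH / clients


mbps, clients, per_client = server_cost_per_client()
print(f"{mbps:.2f} Mbps per client")        # 76.85 Mbps per client
print(f"{clients:.2f} clients per server")  # 13.01 clients per server
print(f"${per_client:.2f} per client")      # $38.43 per client
```

The $38.43 figure comes from the unrounded 13.01 clients per server. Rounding down to 13 whole clients would give about $38.46 instead, so the difference is negligible either way.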
The monthly price per connection is about $1,159 for transfer plus $38 for a share of a server. $1,198 seems a reasonable API cost for basic transport. We're not talking about thousands of clients here. Further, we are more than fine with hosting providers paying for their transfer costs to provide fair and reasonable access to third parties.
Pricing for a blogging service would be higher, since blog posts are usually longer. About 5-10x is a reasonable multiplier.
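To make the totals concrete, here is a small sketch that adds the two monthly costs and applies the size multiplier. It assumes, as a simplification of ours, that both the transfer cost and the server share scale linearly with content size. The function name is illustrative.

```python
# Monthly figures taken from the two tables above (USD).
BANDWIDTH_PER_MONTH = 1159.00
SERVER_SHARE_PER_MONTH = 38.43


def cost_per_connection(size_multiplier=1.0):
    """Monthly USD per client, scaled linearly by content size (assumption)."""
    return (BANDWIDTH_PER_MONTH + SERVER_SHARE_PER_MONTH) * size_multiplier


for label, multiplier in [("microblog", 1), ("blog, 5x", 5), ("blog, 10x", 10)]:
    print(f"{label}: ${cost_per_connection(multiplier):,.2f}")
```

With the rounded table values the base case comes to $1,197.43, within a dollar of the $1,198 quoted above. Under the linear-scaling assumption, a blogging service would land between roughly $6,000 and $12,000 per connection per month.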