RFC 2616, Section 8.1.4 (http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html) says:
A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy. A proxy SHOULD use up to 2*N connections to another server or proxy, where N is the number of simultaneously active users.
While this spec was written quite a long time ago, it’s still a spec, and you as a client developer should respect it. Until IE 8, most browsers did not allow more than two concurrent connections to the same server. IE 8 changed the trend. Steve Souders has an excellent write-up on this here. Even with IE 8, Microsoft did limit the number of connections (to 6). The cap still exists; it’s not “unlimited”, and in 2012 most browsers don’t allow more than 6 or 8 concurrent connections to the same server. (We are dealing with HTTP here, not torrents or other protocols.)
In my opinion, an iOS client that acts as a native client for your website should behave like a browser (minus the layout engine), adhering to every possible HTTP standard (caching headers, authorization headers, Keep-Alive and others). This is exactly what my iOS networking framework, MKNetworkKit, tries to achieve.
Effect of Multiple Connections
Making too many HTTP requests concurrently hurts performance. To understand why, we should first briefly understand how web servers are architected. Firstly, a server (whether it’s a physical box in your office, a VPS or a cloud instance) can handle only a given number of requests per second, depending on the hardware. In fact, most servers strive to handle 10,000 concurrent connections. (For the eager beaver, read about the C10k problem, which covers how to handle 10,000 concurrent connections on one server.) When a server cannot handle the load, it’s free to drop connections (again, this is perfectly fine and expected behaviour as per the RFC). When a connection is dropped, a client retries the connection using a binary exponential backoff algorithm (as per RFC 2616). So, achieving 10,000 concurrent connections per machine is hard. So how do web servers handle millions of users? With proper load balancing, and by sending a low Keep-Alive value to the client, a web server tries to limit the number of concurrent connections.
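To make the retry behaviour concrete, here is a minimal sketch of the wait times a binary exponential backoff produces. RFC 2616 Section 8.2.4 describes the wait before a retry as T = R * 2**N, where R is the estimated round-trip time and N is the number of previous retries. The function name and the sample values (a 0.5 second round trip, 5 retries) are assumptions chosen for illustration only.

```python
def backoff_delays(round_trip_s, max_retries):
    """Wait time before each retry: T = R * 2**N,
    where N is the number of previous retries (0, 1, 2, ...)."""
    return [round_trip_s * (2 ** n) for n in range(max_retries)]

# Hypothetical values: 0.5 s round trip, up to 5 retries.
delays = backoff_delays(round_trip_s=0.5, max_retries=5)
print(delays)  # [0.5, 1.0, 2.0, 4.0, 8.0]
```

Each dropped connection doubles the wait, so a client that keeps getting dropped quickly becomes a slow client.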
Most HTTP servers, including Apache and nginx, allow sysadmins to set a value called “Keep-Alive” (the keep-alive timeout). I believe you might have edited this value too. Keep-Alive dictates how long an idle connection will be kept open before the server forcefully closes it. A lower value means the server can serve more users, and a higher value means less latency for already connected users. Apache’s default KeepAliveTimeout is 5 seconds in version 2.2 and later. (It was 15 seconds in earlier releases.) What that means is, if you make a request to your server (running Apache) to get a list of items, and the user then taps on an item to get the details, your second request will probably not use the same TCP connection (assuming the second request starts after those 5 seconds have passed). Initiating a new TCP connection is slow compared to fetching data over HTTP on an existing TCP connection. But unfortunately, servers are designed to cater to as many users as possible and not just your client, and you just have to live with it.
This limitation has a side effect: it increases the number of concurrent connections opened by a client to a server. To work around this, the RFC encourages clients not to maintain more than 2 connections per server. And let’s face it: IE breaks standards. (But Microsoft may have broken this one because the standard was old and written before the era of XMLHttpRequest.) IE 8 changed this limit to 6, and other browsers followed.
When you open a higher number of concurrent connections (more than 6), say for a photo library app on an iPad that downloads 100 images to display on the first screen, and your app is being used by 100 others at the same time (quite common; 100 users is not a huge number), you hit the 10,000 concurrent connection limit easily and your server will start closing connections. Without proper load balancing, your server will not be able to handle even 100 concurrent users, which is very poor engineering. (Throwing more hardware at the problem is, again, poor engineering.)
So, what can you do as an iOS developer? Limit the number of parallel connections in your app to somewhere around 6 or 8. Now, won’t throttling my download operations affect client-side performance? No. Limiting the number of operations does more good than harm; it actually improves your performance. Let me explain.
Let’s assume that you are making a photo library iPad app (something similar to Cooliris) that downloads 100 images to be displayed on the first screen. Let’s assume that every image (thumbnail) you download is around 50 KB. Downloading 100 images is going to be around 5 MB. If your 3G bandwidth is around 1 Mbps (a fair assumption), it would take about 40 seconds to download. If you have 100 parallel connections sharing the 1 Mbps bandwidth, all 100 images will appear after 40 seconds. If you have 50 parallel connections, 50 images will appear after 20 seconds and the next 50 after 40 seconds. If you have 10 parallel connections, 10 images will appear progressively every 4 seconds, a much better user experience.
So where exactly do you stop? The RFC says to stop at 2 parallel connections. Most browsers, however, stop at 6 or 8 parallel connections. Mobile Safari, in my testing using this website, uses 2 to 6 based on the current network speed. Since our own app cannot determine the speed of a network beforehand (it’s too painful to download a fat file, time it and calculate the speed, and it’s just not worth it), it’s better to use 6 connections on WiFi and 2 on a mobile data network. Of course, this limit might change in the future when LTE adoption becomes widespread, but restricting the number of connections goes a much longer way towards building a robust product (both client and server) than opening an unlimited number of connections.
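The arithmetic in the photo library example can be checked with a short sketch. It assumes the link is fully used and shared equally between connections, and uses decimal units (1 KB = 1,000 bytes, 1 Mbps = 1,000 kbps) as the example above does. The function name is hypothetical.

```python
def batch_arrival_times(num_images, image_kb, bandwidth_mbps, parallel):
    """Seconds at which each batch of up to `parallel` images finishes,
    assuming equal-sized images share the full bandwidth equally."""
    kbits_per_image = image_kb * 8
    kbits_per_second = bandwidth_mbps * 1000
    times = []
    done = 0
    while done < num_images:
        done = min(done + parallel, num_images)
        # Total data transferred so far divided by the link speed.
        times.append(done * kbits_per_image / kbits_per_second)
    return times

for parallel in (100, 50, 10):
    times = batch_arrival_times(100, 50, 1, parallel)
    print(parallel, "connections: first images after", times[0],
          "s, all images after", times[-1], "s")
```

The total time is 40 seconds in every case; what changes is how soon the user sees the first images.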
It doesn’t stop there. On the server side, if every client opens 100 connections at once, and a single machine can handle 10,000 concurrent connections, you limit the number of concurrent users to 100, which is bad. You have to throw in more hardware for every 100 users. That is not a scalable service. By limiting the concurrent connections on the client side to 5, you allow 2,000 concurrent users (per machine/node), and this limit will actually do more good than harm, as we saw in the previous example.
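The server-side numbers follow from a single division. This sketch assumes, as the text does, a node that tops out at 10,000 concurrent connections; the function name is hypothetical.

```python
def max_concurrent_users(server_connection_limit, connections_per_client):
    """How many clients fit on one node if each holds the given number of connections."""
    return server_connection_limit // connections_per_client

print(max_concurrent_users(10_000, 100))  # 100
print(max_concurrent_users(10_000, 5))    # 2000
```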
Additionally, if someone else, such as an ISP or an intermediate proxy server, limits the number of connections to, say, 6, your first 6 connections will go through. But the 7th, 8th and nth operations will start and enter the “isExecuting” state within the NSOperationQueue. The NSURLConnection will start the connection and wait, but connection:didReceiveResponse: will never be called on the delegate until one of the 6 operations that you previously started completes. After 240 seconds, the operation will just time out. You can avoid this scenario easily by preventing the 7th and later NSURLConnections from being started. While 240 seconds is a long time, you will easily hit this wait time with 6 concurrent connections each trying to download a 50 KB thumbnail over a 10 Kbps EDGE/GPRS network.
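The EDGE/GPRS claim can be checked the same way. This is a rough sketch that assumes the 6 active downloads share the 10 Kbps link equally and so all finish at about the same moment; the function name is hypothetical.

```python
def seconds_until_slot_frees(active, image_kb, bandwidth_kbps):
    """Time for `active` equal-sized downloads sharing one link to finish."""
    return active * image_kb * 8 / bandwidth_kbps

wait = seconds_until_slot_frees(active=6, image_kb=50, bandwidth_kbps=10)
print(wait)  # 240.0
```

Under these assumptions a 7th request that was started early would wait the full 240 seconds before a slot frees up, which is exactly the timeout mentioned above.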
Moreover, even in an ideal world where there are no proxies or ISPs to throttle (gasp!), if you don’t limit the number of connections, the server can close them when load is high and again, you will get timeouts for some operations.
A timeout in an image-heavy app like this means only a few images will go through, and your photo library app will display empty images/placeholders for the rest. Again, not a great user experience. User experience is not just slapping a beautiful UI onto your iOS app. The underlying engineering should be good as well.
What do other “industry leading” companies recommend?
Lastly, limiting the number of connections is not throttling. It’s optimization. It’s an optimization recommended by the RFC and implemented by most browsers. You should implement it too. Throttling is limiting the bandwidth.
MKNetworkKit has some nifty features to help you manage scenarios like this. MKNetworkKit uses a single shared queue and optimizes the number of concurrent connections based on the available network. As of this writing, it is 6 on WiFi and 2 on mobile networks. While browsers implement 6 concurrent connections per server, I do it per application. This is because an iOS app normally won’t have 6 concurrent connections talking to each of several different servers. In most cases, your implementation will contain one API fetch and multiple CDN fetches. Limiting it per application will just average out.
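The idea of one shared, bounded queue can be sketched in a few lines. This is not MKNetworkKit’s implementation, only a language-neutral illustration in Python: the limits (6 and 2) come from the text, while the network-type labels, the function names and the simulated download are assumptions. On iOS, the equivalent knob is the maxConcurrentOperationCount property of NSOperationQueue.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Limits taken from the text; the keys are hypothetical labels.
MAX_CONNECTIONS = {"wifi": 6, "cellular": 2}

def run_downloads(urls, network_type):
    """Run simulated downloads with a bounded number in flight.
    Returns the peak number of simultaneous downloads observed."""
    limit = MAX_CONNECTIONS[network_type]
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def fake_download(url):
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.01)  # stand-in for the real network transfer
        with lock:
            state["active"] -= 1
        return url

    # The pool size is the concurrency cap: extra requests wait in the queue.
    with ThreadPoolExecutor(max_workers=limit) as pool:
        list(pool.map(fake_download, urls))
    return state["peak"]

urls = ["https://example.com/thumb/%d.jpg" % i for i in range(20)]
print(run_downloads(urls, "wifi"))      # never more than 6
print(run_downloads(urls, "cellular"))  # never more than 2
```

Requests beyond the cap are never started early; they simply wait their turn, which is what keeps them from sitting on an open connection until they time out.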
MKNetworkKit also coalesces multiple requests. Assume that you are a Twitter client and you download the last 200 tweets in a user’s timeline. While the stream might contain 200 tweets, they would (statistically) come from far fewer authors, typically around 20 friends. As the user scrolls through the timeline, you shouldn’t create operations to download 200 profile pictures. You should in fact create one unique download operation per profile picture, so 20 requests is all that you need to make. Fret not. MKNetworkKit takes care of this operation coalescing automatically. Even if you enqueue an image download operation twice, the actual URL fetch is performed exactly once.
Any networking app can and should implement such performance-related fine-tuning measures and profile the performance. In fact, one of the hardest aspects of iOS programming is getting the performance right, and in the upcoming edition of our book, iOS 6 Programming: Pushing the Limits, we added a whole new chapter that explains memory and CPU optimisation using Instruments. You might want to check that out.
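Here is a minimal sketch of the coalescing idea, keyed by URL. It is a synchronous simplification and not MKNetworkKit’s code: a real implementation would attach extra completion handlers to an operation that is still in flight, whereas this version just remembers finished results. The class name, the example URLs and the fake fetch function are all hypothetical.

```python
class CoalescingFetcher:
    """Performs at most one real fetch per distinct URL."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable that performs the real download
        self._results = {}       # url -> downloaded data
        self.fetch_count = 0     # number of real fetches performed

    def get(self, url):
        if url not in self._results:
            self.fetch_count += 1
            self._results[url] = self._fetch(url)
        return self._results[url]

# 200 tweets written by 20 distinct authors (hypothetical avatar URLs).
avatar_urls = ["https://example.com/avatar/%d.png" % (i % 20) for i in range(200)]

fetcher = CoalescingFetcher(fetch=lambda url: b"image bytes for " + url.encode())
for url in avatar_urls:
    fetcher.get(url)

print(len(avatar_urls))     # 200 requests enqueued
print(fetcher.fetch_count)  # 20 real fetches
```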
Comments are welcome.