When framework designers outsmart themselves [How to: Perform streaming HTTP uploads with .NET]
As part of a personal project, I had a scenario where I expected to be doing large HTTP uploads (ex: PUT) over a slow network connection. The typical user experience here is to show a progress bar, and that's exactly what I wanted to do. So I wrote some code to start the upload and then write to the resulting NetworkStream in small chunks, updating the progress bar UI after each chunk was sent. In theory (and my test harness), this approach worked perfectly; in practice, it did not...
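The chunked-write pattern I had in mind looks roughly like this - a sketch against a generic Stream, where the 4 KB chunk size, the helper name, and the progress-callback shape are my own illustrative choices, not code from the actual app:

```csharp
using System;
using System.IO;

static class ChunkedUploadSketch
{
    // Write data to a stream in small chunks, reporting fractional progress
    // after each chunk; the 4 KB chunk size is an arbitrary choice
    public static void WriteChunked(Stream stream, byte[] data, Action<double> onProgress)
    {
        const int chunkSize = 4096;
        for (int offset = 0; offset < data.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, data.Length - offset);
            stream.Write(data, offset, count);
            onProgress((double)(offset + count) / data.Length);
        }
    }

    static void Main()
    {
        // MemoryStream stands in for the HTTP request stream here; a real
        // app would update the progress bar UI inside the callback
        using (var stream = new MemoryStream())
        {
            WriteChunked(stream, new byte[10000],
                p => Console.WriteLine("Progress: {0:P0}", p));
        }
    }
}
```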
What I saw instead was that the progress bar would quickly go from 0% to 100% - then the application would stall for a long time before completing the upload. Which is not a good user experience, I'm afraid. I'll show what I did wrong in a bit, but first let's take a step back to look at the sample application I've written for this post.
The core of the sample app is a simple HttpListener that logs a message whenever it begins an operation, reads an uploaded byte, and finishes reading a request:
// Create a simple HTTP listener
using (var listener = new HttpListener())
{
    listener.Prefixes.Add(_uri);
    listener.Start();

    // ...

    // Trivially handle each action's incoming request
    for (int i = 0; i < actions.Length; i++)
    {
        var context = listener.GetContext();
        var request = context.Request;
        Log('S', "Got " + request.HttpMethod + " request");
        using (var stream = request.InputStream)
        {
            while (-1 != stream.ReadByte())
            {
                Log('S', "Read request byte");
            }
            Log('S', "Request complete");
        }
        context.Response.Close();
    }
}
Aside: The code is straightforward, but it's important to note that HttpListener is only able to start listening when run with Administrator privileges (otherwise it throws "HttpListenerException: Access is denied"). So if you're going to try the sample yourself, please remember to run it from an elevated Visual Studio or Command Prompt instance.
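Alternatively, you can register the URL reservation once from an elevated prompt and then run the sample unelevated; the URL and account below are placeholders you'd replace with whatever prefix the sample actually uses:

```shell
netsh http add urlacl url=http://+:8080/test/ user=Everyone
```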
With our test harness in place, let's start with the simplest possible code to upload some data:
/// <summary>
/// Test action that uses WebClient's UploadData to do the PUT.
/// </summary>
private static void PutWithWebClient()
{
    using (var client = new WebClient())
    {
        Log('C', "Start WebClient.UploadData");
        client.UploadData(_uri, "PUT", _data);
        Log('C', "End WebClient.UploadData");
    }
}
Here's the resulting output (from the Client and Server pieces):
09:27:07.72 <C> Start WebClient.UploadData
09:27:07.76 <S> Got PUT request
09:27:07.76 <S> Read request byte
09:27:07.76 <S> Read request byte
09:27:07.76 <S> Read request byte
09:27:07.76 <S> Read request byte
09:27:07.76 <S> Read request byte
09:27:07.76 <S> Request complete
09:27:07.76 <C> End WebClient.UploadData
WebClient's UploadData method offers a super-simple way of performing an upload that's a great choice when it works for your scenario. However, all the upload data must be passed as a parameter to the method call, and that's not always desirable (especially for large amounts of data like I was dealing with). Furthermore, it's all sent to the server in arbitrarily large chunks, so our attempt at frequent progress updates isn't likely to work out very well. And while there's the UploadProgressChanged event for getting status information about an upload, WebClient doesn't offer the granular level of control that's often nice to have.
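For reference, that coarser-grained reporting is available from WebClient's asynchronous upload methods via the UploadProgressChanged event. A minimal sketch, assuming a server like the one above is listening at the placeholder URI:

```csharp
using System;
using System.Net;
using System.Threading;

static class WebClientProgressSketch
{
    static void Main()
    {
        var done = new ManualResetEvent(false);
        using (var client = new WebClient())
        {
            // Raised periodically as the buffered request data is sent
            client.UploadProgressChanged += (s, e) =>
                Console.WriteLine("Sent {0} of {1} bytes", e.BytesSent, e.TotalBytesToSend);
            client.UploadDataCompleted += (s, e) => done.Set();
            // Placeholder URI - point this at a listening server
            client.UploadDataAsync(new Uri("http://localhost:8080/test/"), "PUT", new byte[1000]);
            done.WaitOne();
        }
    }
}
```

Note that the framework decides when the event fires, which is exactly the lack of control being discussed.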
So WebClient is a great entry-level API for uploading - but if you're looking for more control, you probably want to upgrade to HttpWebRequest:
/// <summary>
/// Test action that uses a normal HttpWebRequest to do the PUT.
/// </summary>
private static void PutWithNormalHttpWebRequest()
{
    var request = (HttpWebRequest)(WebRequest.Create(_uri));
    request.Method = "PUT";
    Log('C', "Start normal HttpWebRequest");
    using (var stream = request.GetRequestStream())
    {
        foreach (var b in _data)
        {
            Thread.Sleep(1000);
            Log('C', "Writing byte");
            stream.WriteByte(b);
        }
    }
    Log('C', "End normal HttpWebRequest");
    ((IDisposable)(request.GetResponse())).Dispose();
}
Aside from the Sleep call I've added to simulate client-side processing delays, this is quite similar to the code I wrote for my original scenario. Here's the output:
09:27:08.78 <C> Start normal HttpWebRequest
09:27:09.79 <C> Writing byte
09:27:10.81 <C> Writing byte
09:27:11.82 <C> Writing byte
09:27:12.83 <C> Writing byte
09:27:13.85 <C> Writing byte
09:27:13.85 <C> End normal HttpWebRequest
09:27:13.85 <S> Got PUT request
09:27:13.85 <S> Read request byte
09:27:13.85 <S> Read request byte
09:27:13.85 <S> Read request byte
09:27:13.85 <S> Read request byte
09:27:13.85 <S> Read request byte
09:27:13.85 <S> Request complete
Although I've foreshadowed this unsatisfactory result, maybe you can try to act a little surprised that it didn't work the way we wanted...
But what in the world is going on here? Why is that data sitting around on the client for so long?
The answer lies in the documentation for the AllowWriteStreamBuffering property (default value: True):

Remarks

When AllowWriteStreamBuffering is true, the data is buffered in memory so it is ready to be resent in the event of redirections or authentication requests.

Notes to Implementers:

Setting AllowWriteStreamBuffering to true might cause performance problems when uploading large datasets because the data buffer could use all available memory.
In trying to save me from the hassle of redirects and authentication requests, HttpWebRequest has broken my cool streaming scenario. So the fix is as simple as setting AllowWriteStreamBuffering to False, right?

Wrong; that'll get you one of these:
ProtocolViolationException: When performing a write operation with AllowWriteStreamBuffering set to false, you must either set ContentLength to a non-negative number or set SendChunked to true.
Okay, so we need to set one more property before we're done. Fortunately, the choice was easy for me - my target server didn't support chunked transfer encoding, so setting ContentLength was a no-brainer. The only catch is that you need to know how much data you're going to upload before you start - but that's probably true most of the time anyway! And I think ContentLength is a better choice in general, because the average web server is more likely to support it than chunked encoding.
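For completeness, here's roughly what the chunked alternative would look like - a sketch only (with a placeholder URI), since my server didn't support it:

```csharp
using System;
using System.Net;

static class ChunkedRequestSketch
{
    static void Main()
    {
        // Placeholder URI - substitute your own endpoint
        var request = (HttpWebRequest)WebRequest.Create("http://localhost:8080/test/");
        request.Method = "PUT";
        request.AllowWriteStreamBuffering = false;
        // With SendChunked, each write can go out as its own chunk and the
        // total request length need not be known in advance
        request.SendChunked = true;
        using (var stream = request.GetRequestStream())
        {
            stream.WriteByte(42);
        }
        using (request.GetResponse()) { }
    }
}
```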
Making the changes below (disabling buffering and setting the content length) gives the streaming upload behavior we've been working toward:
/// <summary>
/// Test action that uses an unbuffered HttpWebRequest to do the PUT.
/// </summary>
private static void PutWithUnbufferedHttpWebRequest()
{
    var request = (HttpWebRequest)(WebRequest.Create(_uri));
    request.Method = "PUT";
    // Disabling AllowWriteStreamBuffering allows the request bytes to be sent immediately
    request.AllowWriteStreamBuffering = false;
    // Doing nothing else will result in "ProtocolViolationException: When performing
    // a write operation with AllowWriteStreamBuffering set to false, you must either
    // set ContentLength to a non-negative number or set SendChunked to true."
    // The most widely supported approach is to set the ContentLength property
    request.ContentLength = _data.Length;
    Log('C', "Start unbuffered HttpWebRequest");
    using (var stream = request.GetRequestStream())
    {
        foreach (var b in _data)
        {
            Thread.Sleep(1000);
            Log('C', "Writing byte");
            stream.WriteByte(b);
        }
    }
    Log('C', "End unbuffered HttpWebRequest");
    ((IDisposable)(request.GetResponse())).Dispose();
}
Here's the proof - note how each byte gets uploaded to the server as soon as it's written:
09:27:14.86 <C> Start unbuffered HttpWebRequest
09:27:14.86 <S> Got PUT request
09:27:15.88 <C> Writing byte
09:27:15.88 <S> Read request byte
09:27:16.89 <C> Writing byte
09:27:16.89 <S> Read request byte
09:27:17.90 <C> Writing byte
09:27:17.90 <S> Read request byte
09:27:18.92 <C> Writing byte
09:27:18.92 <S> Read request byte
09:27:19.93 <C> Writing byte
09:27:19.93 <C> End unbuffered HttpWebRequest
09:27:19.93 <S> Read request byte
09:27:19.93 <S> Request complete
Like many things in life, it's easy once you know the answer! So if your HTTP uploads aren't hitting the network as soon as you expect, take a look at the AllowWriteStreamBuffering property and see if maybe that's the cause of your problems.
[Please click here to download a sample application demonstrating everything shown here.]