ADODB.Stream 'format error: not a pdf or corrupt' only on large file
Iporter 15 April 2008 08:35:57
I use the code below to authorise the download of certain files. Thus, instead of linking to the file in a wwwroot directory, I link to this code with the filename as a parameter, and the script streams the file if the user is authorised.
This has worked fine on PDFs, DOCs, XLS, etc. until today, when an 18MB file produced the error message 'format error: not a pdf or corrupt'.
Is there a file size limit, or a default that needs to be overridden? Any thoughts?
It has the advantage of using memory more efficiently on the server by turning off buffering and chunking the file to the response. There is no need to modify the buffering limit from the default 4MB when using this function.
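The code this reply refers to is not preserved in the thread. As a language-neutral illustration of the chunking idea only, here is a minimal Python sketch that copies a file-like source to an output one fixed-size piece at a time, so the loop never holds more than one chunk. The 1MB chunk size is taken from the figure mentioned later in the thread; `stream_in_chunks` and its parameters are hypothetical names, and this is not the ASP function itself. In the ASP version the corresponding calls would be Stream.Read(bytes) and Response.BinaryWrite.

```python
import io

CHUNK_SIZE = 1024 * 1024  # 1MB per chunk (assumed, from the figure used later in the thread)


def stream_in_chunks(source, sink, chunk_size=CHUNK_SIZE):
    """Copy source to sink one chunk at a time; return the number of chunks written."""
    chunks = 0
    while True:
        data = source.read(chunk_size)  # at most chunk_size bytes are held here
        if not data:                    # an empty read means end of stream
            break
        sink.write(data)
        chunks += 1
    return chunks


if __name__ == "__main__":
    payload = bytes(18 * 1024 * 1024)   # stand-in for the 18MB file
    out = io.BytesIO()
    print(stream_in_chunks(io.BytesIO(payload), out))
```

Because each write carries at most one chunk, nothing in this loop depends on a response buffer large enough for the whole file.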
It's not a fact by observation but neither is it speculation.
Response.BinaryWrite doesn't return until all the buffer contents have been sent.
Doesn't return what? From what? To what? This does not make any sense to me.
I'll let you do the math.
OK. In either case, the same amount of content has to be put into the buffer, and the same amount written from the buffer. You don't even bother to speculate about the cost of repeatedly using Stream.Read(bytes) over Stream.Read(adReadAll). So absent the Urim and Thummim necessary for scrying the Response Object, there does not seem to be any "math" to do. You are *clearly* speculating.
I will grant this -- when combined with Response.isClientConnected [1], using the chunked method allows you to abort the process before the entire Stream is dealt with, which seems at first glance like a decent idea for sending very large files. But since your example ignores isClientConnected, you have not made any sort of "performance" case whatsoever.
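To make the point about aborting early concrete, here is a rough Python sketch of a chunk loop that checks a connection flag before every read. It is a model only: `send_while_connected` and the `is_connected` callback are hypothetical stand-ins for the loop and for Response.isClientConnected, not ASP code.

```python
import io


def send_while_connected(source, sink, is_connected, chunk_size=1024 * 1024):
    """Write chunks until the source is exhausted or the client has gone away.

    Returns the number of bytes written before stopping.
    """
    sent = 0
    while is_connected():               # checked once per chunk
        data = source.read(chunk_size)
        if not data:                    # end of stream
            break
        sink.write(data)
        sent += len(data)
    return sent


if __name__ == "__main__":
    checks = {"n": 0}

    def drops_after_two_checks():
        # Simulated client that disconnects after two chunks.
        checks["n"] += 1
        return checks["n"] <= 2

    out = io.BytesIO()
    print(send_while_connected(io.BytesIO(b"a" * 50), out, drops_after_two_checks, chunk_size=10))
```

With a single read-everything, write-everything call there is no point in the loop at which such a check could stop the work early.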
"Dave Anderson" <NYRUMTPELVWH@spammotel.com> wrote in message news:%23HBL%23FtcHHA.4260@TK2MSFTNGP02.phx.gbl...
Dave,
My apologies, I seem to have managed to irk you once again.
What I meant by 'sent' is sent and acknowledged by the client.
With a buffered response the entire content is held in memory twice at the same time: once in the array read from the stream and once in the buffer. How much of the file is also duplicated inside the stream object I'm not certain; possibly all of it.
With the 'unbuffered' response only 1MB of the file is duplicated in memory at a time, plus whatever is in the ADODB stream.
Of course in the buffered solution the stream and the array are released when the client starts to receive bytes. At this time the script context will also be torn down. Ultimately the average memory requirement for the buffered solution is likely to be less than that of my unbuffered one. This would depend on details of the ADODB stream internals, how much the script context uses and how big the file is. However, its peak memory requirement is significantly higher.
Another implementation detail that I don't know is whether IIS releases parts of the buffer as they are sent, while the rest is still in use. It seems an obvious thing to do, and it would mean the buffered technique's average memory usage would be somewhere approaching half the file size. The slower the link, the closer to half the file size it will be.
OTOH the 'unbuffered' solution will keep the ADODB.Stream, and therefore potentially a duplicate of the file, in memory for the duration of the send. Hence its peak requirement is at worst the file size + 2MB. However, that is also its average requirement.
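The worst-case figures in this comparison can be put into numbers with a small Python sketch. It rests on assumptions that the post itself flags as uncertain: that the stream holds a full copy of the file, and that the "2MB" is one 1MB array plus one 1MB chunk held by the response. `peak_memory_estimates` is a hypothetical helper for the arithmetic, not a measurement of IIS.

```python
MB = 1024 * 1024


def peak_memory_estimates(file_size, chunk_size=1 * MB):
    """Return (buffered_peak, unbuffered_peak) in bytes under the stated assumptions."""
    stream_copy = file_size  # assumption: the stream holds the whole file

    # Buffered: stream copy + full array read from the stream + full response buffer.
    buffered_peak = stream_copy + file_size + file_size

    # Unbuffered: stream copy + one chunk in the array + one chunk being sent.
    unbuffered_peak = stream_copy + chunk_size + chunk_size

    return buffered_peak, unbuffered_peak


if __name__ == "__main__":
    buffered, unbuffered = peak_memory_estimates(18 * MB)
    print(buffered // MB, unbuffered // MB)
```

For the 18MB file in the original question this gives a worst-case peak of 54MB for the buffered approach against 20MB for the chunked one. If the stream does not in fact keep a full copy, both figures drop by the file size.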
Overall, then, in the real world I'm probably wrong, unless the peak memory requirement of the buffered solution is a problem. Although, at the risk of your ire once again, I do believe ADODB.Stream uses a .tmp file when memory usage becomes excessive, but I could be wrong ... again.
You could try to recover the corrupt PDF file with Advanced PDF Repair (http://www.datanumen.com/apdfr/). This tool is useful for salvaging damaged PDF documents.