Hacker News

I do realise it. I have little interest in curl. It is far too complicated with too many features, IMO. Anyway, libcurl used to support "attempted pipelining". See https://web.archive.org/web/20101015021914if_/http://curl.ha... But the number of people writing programs using CURLMOPT_PIPELINING always seemed small. Thus, I started using netcat instead. No need for libcurl. No need for the complexity.

HTTP/1.1 pipelining never was, and is not now, "problematic" for me. Thus I cannot relate to statements suggesting it is "problematic", especially when they do not provide a single example website.

I am not looking at graphical webpages. I am not trying to pull resources from multiple hosts. I am retrieving text from a single host, a continuous stream of text. I do not want parallel processing. I do not want asynchronous. I want synchronous. I want responses in the same order as requests. This allows me to use simple methods for verifying responses and processing them. HTTP/2 is far more complicated. It also has bad ideas like "server push" which is absolutely not what I am looking for.

There are some sites that disable HTTP/1.1 pipelining; they will send a Connection: close header. These are not the majority. There are also some sites where there is a noticeable delay before the first response or between responses when using HTTP/1.1 pipelining. That is also a small minority of sites. Most have no noticeable delay. Most are "blazingly fast". "HOB blocking" is not important to me if it is so small that I cannot notice it. If HTTP/1.1 pipelining is "blazingly fast" 98% of the time for me, I am going to use it where I can, which, as it happens, is almost everywhere.
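For what it's worth, the mechanics are easy to demonstrate. A minimal sketch in Python, using a local http.server as a stand-in for the remote text host (the paths and port here are illustrative, not from any real site): write several requests back-to-back on one socket before reading anything, and the responses come back in request order.

```python
# Sketch: HTTP/1.1 pipelining over a raw socket, against a local server.
import socket
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # keep-alive, so one connection serves both
    def do_GET(self):
        body = ("you asked for " + self.path).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):   # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

s = socket.create_connection(server.server_address)
# Both requests are written before any response is read: that is pipelining.
s.sendall(b"GET /first HTTP/1.1\r\nHost: localhost\r\n\r\n"
          b"GET /second HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")
data = b""
while chunk := s.recv(4096):
    data += chunk
s.close()
server.shutdown()

# Responses arrive in request order, so simple ordered verification works.
assert data.index(b"you asked for /first") < data.index(b"you asked for /second")
```

The same bytes can of course be fed to netcat from a shell; the point is only that nothing beyond a socket is required.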

Even with the worst possible delay I have ever experienced, HTTP/1.1 pipelining is still faster than curl/wget/lftp/etc. That is in practice, not theory. People who profess expertise in www technical details today struggle to even agree on what "pipelining" means. For example, see https://en.wikipedia.org/wiki/Talk:HTTP_Pipelining. Trying to explain that socket reuse differs from HTTP/1.1 pipelining is not worth the effort and I am not the one qualified to do it. But I am qualified to state that for text retrieval from a single host, HTTP/1.1 pipelining works on most websites and is faster than curl. Is it slower than nghttp2? If so, how much slower? We know the theory. We also know we cannot believe everything we read. Test it.



One more thing: cURL only supported pipelining GET and HEAD requests. And on www forums where people try to sound like experts, I have read assertions that POST requests cannot be pipelined. That makes sense in theory, but I know I tried pipelining POST requests before and, surprisingly, it worked on at least one site. Using curl's limited "attempted pipelining", one could never discover this, which is another example of how programs with dozens of features can still be very inflexible. If anyone doubts this is true, I can try to remember the site that answered pipelined POST requests.
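Whether any given public site answers pipelined POSTs is its own business, but the wire mechanics are the same as for GET, provided each request carries a correct Content-Length. A local sketch (server and paths are stand-ins, not a claim about any real site):

```python
# Sketch: pipelining two POST requests on one connection.
import socket
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"
    def do_POST(self):
        n = int(self.headers["Content-Length"])
        body = self.rfile.read(n)          # must consume the body exactly
        reply = b"echo:" + body
        self.send_response(200)
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)
    def log_message(self, *args):
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

def post(path, payload, last=False):
    conn = b"Connection: close\r\n" if last else b""
    return (b"POST " + path + b" HTTP/1.1\r\nHost: localhost\r\n" + conn +
            b"Content-Length: " + str(len(payload)).encode() + b"\r\n\r\n" + payload)

s = socket.create_connection(server.server_address)
s.sendall(post(b"/a", b"one") + post(b"/b", b"two", last=True))
data = b""
while chunk := s.recv(4096):
    data += chunk
s.close()
server.shutdown()

assert data.index(b"echo:one") < data.index(b"echo:two")
```

The spec-side caveat is about safety, not possibility: POST is not idempotent, so if the connection drops mid-pipeline you cannot tell which of the queued POSTs were processed.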


I think the limitation on methods is related to error handling and reporting, and to ambiguity about whether the "extra" requests on the wire have been processed or not. It's the developers of the tools, who read the specifications, who find the topic "problematic." For whatever reason, there is a mildly paternalistic culture among web plumbing developers. More than in some other disciplines, they seem to worry about shipping unsafe tools, focusing on easy-to-use "happy paths" rather than offering flexibility that requires very careful use.

Going back to pipelining, the textbook definition is all about concurrent use of every resource along the execution path in order to hit maximum throughput: the "pipe" from client application to server and back to client stays full of content/work from the start of the first input in the stream until the end of the last output. That was rarely achieved with HTTP/1.1 because of the way most of the middleware and servers were designed. Even if you could manage to pipeline your request inputs to keep the socket full from client to server, the server usually did not pipeline its processing and responses. Instead, the server alternated between bursts of work to process a request and idle wait periods while responses were sent back to the client. How much this matters in practice depends on the relative throughput and latency of all the various parts in your system.
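The latency side of that argument fits in a two-line model. The numbers below are illustrative assumptions, not measurements: serial request-response pays the round trip once per request, while a full pipeline pays it roughly once for the whole batch.

```python
# Back-of-envelope model of serial vs. pipelined transfer time.
rtt = 0.100       # assumed round-trip latency, seconds
service = 0.005   # assumed server work per request, seconds
n = 50            # number of requests

serial = n * (rtt + service)     # one request in flight at a time
pipelined = rtt + n * service    # socket kept full; latency paid once

print(f"serial: {serial:.2f}s, pipelined: {pipelined:.2f}s")
# serial: 5.25s, pipelined: 0.35s
```

The server-side burst/idle behavior described above erodes the pipelined figure toward the serial one, which is the gap the measurements in the next paragraph are about.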

I measured this myself in the past, using libcurl's partial pipelining with regular servers like Apache. I could get much faster upload speeds with pipelined PUT requests, really hitting the full bandwidth by sending back-to-back message payloads that kept the TCP path full. But pipelined GET requests did not produce pipelined GET responses, so the download rate always had lower throughput, with measurable spikes and delays as the socket idled briefly between each response payload. In our high-bandwidth, high-latency environment, the actual path measured as having symmetric capacity for TCP/TLS. The pipelined uploads got within a few percent of that, while the non-pipelined downloads lost almost 50% of throughput.

If I were in your position and continued to care about streaming requests and responses from a scripting environment, I might consider writing my own client-side tool. Something like curl to bridge between scripts and network, using an HTTP/2 client library with an async programming model hooked up to CLI/stdio/file handling conventions that suit my recurring usage. However, I have found that the rest of the client side becomes just as important for performance and error handling if I am trying to process large numbers of URLs/files. So, I would probably stop thinking of it as a conventional scripting task and instead think of it more like a custom application. I might write the whole thing in Python and worry about async error handling, state tracking, and restart/recovery to identify the work items, handle retries as appropriate, and be able to confidently tell when I have finished a whole set of work...
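To make the last point concrete, here is a rough skeleton of the tracking/retry layer I mean, with the network part stubbed out (the fetch() below is a placeholder; in a real tool it would be a call into an HTTP/2 client library, and the example.test URLs are invented):

```python
# Sketch: async work tracking with retries and a definite "all done" point.
import asyncio
import random

async def fetch(url):
    # Stand-in for the real transfer; fails randomly to exercise retries.
    if random.random() < 0.3:
        raise ConnectionError(url)
    return f"body of {url}"

async def worker(queue, results, retries=3):
    while True:
        url, attempt = await queue.get()
        try:
            results[url] = await fetch(url)
        except ConnectionError:
            if attempt < retries:
                await queue.put((url, attempt + 1))   # requeue for retry
            else:
                results[url] = None                   # give up, but record it
        finally:
            queue.task_done()

async def main(urls, concurrency=4):
    queue, results = asyncio.Queue(), {}
    for u in urls:
        queue.put_nowait((u, 1))
    workers = [asyncio.create_task(worker(queue, results))
               for _ in range(concurrency)]
    await queue.join()   # every work item settled: this is the "finished" signal
    for w in workers:
        w.cancel()
    return results

urls = [f"https://example.test/{i}" for i in range(10)]
results = asyncio.run(main(urls))
assert set(results) == set(urls)   # every URL accounted for, success or not
```

The point of the queue.join() discipline is exactly the "confidently tell when I have finished a whole set of work" requirement: nothing finishes silently and nothing is lost on failure.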


s/HOB/HOL/




