The widely used requests package lets you make HTTP requests from Python. To handle large responses better, we can stream the data by passing stream=True (e.g. requests.get("http://server.abc/somebigfile", stream=True)). With this set, the library fetches only the HTTP headers and hands control back to you, leaving you to retrieve the response body through a handful of methods and properties. One of them is response.iter_lines(), a Python generator that yields the lines of the response. Combined with stream=True, this means we don't have to load the entire response into memory at once and can instead process it line by line.
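As a rough sketch of that pattern, the generator below streams a response and yields its lines one at a time. The name stream_lines, the 10-second timeout, and the UTF-8 decoding are my own assumptions for illustration, not something requests prescribes.

```python
import requests


def stream_lines(url):
    """Yield the non-empty lines of the response body at `url`, decoded as UTF-8."""
    # stream=True: only the headers are fetched up front; the body is
    # read lazily as we iterate over it.
    with requests.get(url, stream=True, timeout=10) as response:
        response.raise_for_status()
        for line in response.iter_lines():
            if line:  # skip empty lines, e.g. keep-alive newlines
                yield line.decode("utf-8")
```

You would then loop over it with something like for line in stream_lines("http://server.abc/somebigfile"): print(line). Because the function is itself a generator, the connection stays open until the loop finishes or the generator is closed.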

But it has a small quirk in its implementation that tripped me up when testing the behaviour with netcat. iter_lines() is built on another, similar method, iter_content(), which takes a chunk_size argument that sets how large the chunks are in which we iterate over the content. iter_lines() uses a chunk_size of 512 bytes by default (it used to be 10*1024!), which means a line shorter than that won't be yielded as soon as it is sent in the HTTP response. Only once the chunk is “filled” will all the lines that fit into it be yielded, all at once.
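To see why short lines come out in batches, here is a simplified model of the buffering. It is not the actual requests source: chunks_of and batches are hypothetical helpers, where the first stands in for iter_content() by slicing a byte string into fixed-size chunks and the second groups the lines by the chunk whose arrival released them.

```python
def chunks_of(data, chunk_size):
    """Hypothetical stand-in for iter_content(): yield fixed-size chunks."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def batches(data, chunk_size):
    """Return one list of lines per chunk: the lines that chunk completed."""
    released = []
    pending = b""  # trailing partial line carried over to the next chunk
    for chunk in chunks_of(data, chunk_size):
        # Everything before the last newline is complete; the rest waits.
        *complete, pending = (pending + chunk).split(b"\n")
        released.append(complete)
    if pending:
        # Stream ended without a final newline: flush what is left.
        released.append([pending])
    return released


payload = b"one\ntwo\nthree\n"

# One 512-byte chunk holds the whole payload, so all lines appear together.
print(batches(payload, 512))  # [[b'one', b'two', b'three']]

# Tiny chunks release each line almost as soon as its newline arrives.
print(batches(payload, 4))    # [[b'one'], [b'two'], [], [b'three']]
```

On a real socket the difference is that the read for each chunk waits for data, which is where the delay comes from. iter_lines() accepts chunk_size as well, so something like response.iter_lines(chunk_size=1) should hand each line over as soon as it arrives, at the cost of many small reads.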

Relevant links: