Sequential downloading is bad

From VuzeWiki

Since people who do not understand how BitTorrent works often request that files, or even pieces, be downloaded sequentially, here is an explanation of why this is a very bad idea:

Quick outline

  1. It threatens to kill smaller swarms through piece starvation
  2. It severely limits the set of peers that are interested in each other, and thus degrades both swarm-wide and local performance
  3. It puts the client into endgame-mode-like conditions towards the end of each file, which slows down the download

Detailed explanation

Note: All of the following arguments are about sequential file downloading, but they apply to sequential piece downloading too, and even more strongly.

People already abuse the Do Not Download priority to fetch files in order, one at a time. This is especially visible on torrents with episodic content that can or should be consumed in a specific order. If one looks at the piece availability distribution of such a torrent, one can easily see that it is skewed towards the first files in the "consuming order". If one looks at the piece lists of individual peers, one sees that currently only a few peers cause this already noticeable skew, while other clients, whose users have not set any priorities, even try to compensate for it through their rarest-first piece selection method.
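The difference between the two selection methods can be sketched in a few lines. This is a simplified illustration, not the code of any actual client: the function names and the six-piece swarm are made up for the example, and real clients typically pick randomly among equally rare pieces instead of taking the lowest index.

```python
from collections import Counter


def availability(peer_pieces, num_pieces):
    """Count how many peers hold each piece index."""
    counts = Counter()
    for pieces in peer_pieces:
        counts.update(pieces)
    return [counts[i] for i in range(num_pieces)]


def pick_rarest_first(have, avail):
    """Pick the missing piece held by the fewest peers (ties go to the lowest index)."""
    candidates = [i for i, n in enumerate(avail) if n > 0 and i not in have]
    return min(candidates, key=lambda i: (avail[i], i)) if candidates else None


def pick_sequential(have, avail):
    """Pick the lowest-numbered missing piece that anyone can offer."""
    candidates = [i for i, n in enumerate(avail) if n > 0 and i not in have]
    return min(candidates) if candidates else None


# Hypothetical swarm of 6 pieces: two peers that only fetched the first
# pieces, plus one seed that holds everything.
swarm = [{0, 1, 2}, {0, 1}, {0, 1, 2, 3, 4, 5}]
avail = availability(swarm, 6)
newcomer = set()

print(avail)                                # [3, 3, 2, 1, 1, 1]
print(pick_sequential(newcomer, avail))     # 0
print(pick_rarest_first(newcomer, avail))   # 3
```

In this toy swarm the sequential newcomer asks for piece 0, which three peers already hold, while the rarest-first newcomer asks for piece 3, which only the seed can provide.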

This generally leads to a situation in which the peers that try to compensate for the availability skew are not very interested in what the prioritizing peers have to offer, and the prioritizing peers are not interested at all in the compensating ones, especially towards the end of each file, when they are only looking for a very limited set of pieces. The relationship between multiple prioritizing peers is even worse. The older peers have already completed the first few files and thus are not interested in younger peers that are still downloading the first file exclusively. No mutually beneficial relationship can be established between different "generations" of prioritizing peers, which effectively splits the swarm into sparsely connected sets.
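The one-sided interest between "generations" can be illustrated with plain set arithmetic. This is a minimal sketch under assumed conditions: a hypothetical torrent with two files of four pieces each, and an `interested` helper invented for this example that models only which pieces a peer wants, not any client's real logic.

```python
def interested(wanted, have, other_have):
    """True if the other peer holds at least one piece we want and still lack."""
    return bool((other_have & wanted) - have)


# Hypothetical torrent: file 1 is pieces 0-3, file 2 is pieces 4-7.
FILE_1 = set(range(0, 4))
FILE_2 = set(range(4, 8))
ALL = FILE_1 | FILE_2

# Two "generations" of prioritizing peers: the old one has finished file 1
# and now wants only file 2; the young one is still working on file 1.
old_have, old_wanted = set(FILE_1), FILE_2
young_have, young_wanted = {0, 1}, FILE_1

old_wants_young = interested(old_wanted, old_have, young_have)
young_wants_old = interested(young_wanted, young_have, old_have)

# Two peers without priorities that hold scattered pieces.
a_have, b_have = {0, 5}, {2, 7}
mutual = interested(ALL, a_have, b_have) and interested(ALL, b_have, a_have)

print(old_wants_young, young_wants_old, mutual)  # False True True
```

The young peer wants data from the old one but has nothing to give back, so there is no basis for a tit-for-tat exchange. The two unprioritized peers each hold something the other lacks.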

On a small swarm this behavior can cause pieces to drop to 0 availability: some peers concentrate on the first few files, while the last peer or seed that holds a rare piece in one of the later files quits after doing its fair share. That peer uploaded only data for the first few files, because the prioritizing peers were interested in nothing else.
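A toy simulation makes the starvation scenario concrete. It is a deliberately crude model and all of its parameters are assumptions: a single seed uploads one piece per step, serves the leechers in round-robin order, and quits after a fixed upload budget; leechers do not trade with each other; ties in rarest-first go to the lowest index.

```python
def simulate(num_pieces, num_leechers, seed_budget, policy):
    """Return the pieces nobody holds once the seed has left the swarm."""
    leechers = [set() for _ in range(num_leechers)]
    for step in range(seed_budget):
        peer = leechers[step % num_leechers]  # seed serves leechers in turn
        missing = [i for i in range(num_pieces) if i not in peer]
        if not missing:
            continue
        if policy == "sequential":
            choice = min(missing)
        else:
            # Rarest-first: count how many leechers already hold each piece.
            avail = [sum(i in p for p in leechers) for i in range(num_pieces)]
            choice = min(missing, key=lambda i: (avail[i], i))
        peer.add(choice)
    # The seed is gone now; a piece survives only if some leecher has it.
    return [i for i in range(num_pieces) if not any(i in p for p in leechers)]


# 8 pieces, 4 leechers, and a seed that uploads exactly one full copy.
print(simulate(8, 4, 8, "sequential"))  # [2, 3, 4, 5, 6, 7]
print(simulate(8, 4, 8, "rarest"))      # []
```

With the same upload budget, the sequential leechers end up with four copies each of pieces 0 and 1 and nothing else, so the torrent is dead once the seed leaves. The rarest-first leechers together hold one complete copy and can finish among themselves.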

One might argue that this behavior is not problematic on healthy, large swarms. But if it were implemented as an official, automatic feature, it would be used more often and would thus make the situation worse.

Example

The following picture shows how randomly the pieces are distributed among peers, and how that leads to a nice, even availability curve.



Read the Azureus FAQ.