A scrape, also known as a "tracker scrape", is a request that a BitTorrent client sends to a tracker. The client opens a connection to the tracker, the two exchange information, and the connection is closed. Every BitTorrent client (such as Azureus) scrapes each tracker that hosts a .torrent loaded into the client. The request makes something like a "wipe" or "pass" over the tracker, and the tracker then sends information back to the client.
Please note that some trackers don't respond to scrape requests; you will still be able to download the torrent. The returned information can include whether the tracker is OK or offline, the reason it is offline (unknown host exception, hash missing, etc.), and the number of peers and seeds.
Every BitTorrent client scrapes the tracker many times during the course of a download to update the swarm information. So you can imagine that the tracker is scraped many thousands of times for that torrent alone, even if the swarm is not very big. The tracker can usually handle this number of requests. However, more requests than strictly necessary can destabilize the tracker and take it offline.
While a torrent is incomplete, Azureus scrapes in order to determine whether or not to send an announcement requesting more peers. Sending a list of peers is usually more bandwidth-consuming than sending a scrape result.
When a torrent is complete, Azureus periodically scrapes in order to determine which torrents are the neediest. Without scraping, Azureus would never know which torrents have no seeds and require assistance.
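As a rough illustration of how scrape results could be used to pick out the neediest torrents, the sketch below sorts completed torrents so that those with the fewest seeds come first, with ties broken in favour of the torrent with more peers. The ordering rule and the names `ScrapeStats` and `neediest_first` are assumptions made for this example; the ranking Azureus actually applies is not described here.

```python
from dataclasses import dataclass


@dataclass
class ScrapeStats:
    """Hypothetical container for the counts a scrape returns for one torrent."""
    name: str
    seeds: int  # the "complete" value in a scrape result
    peers: int  # the "incomplete" value in a scrape result


def neediest_first(torrents):
    """Order torrents by need: fewest seeds first, then most peers.

    This rule is an illustrative assumption, not the Azureus algorithm.
    """
    return sorted(torrents, key=lambda t: (t.seeds, -t.peers))
```

With this rule, a torrent that has no seeds and many waiting peers is listed ahead of one that is already well seeded.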
Azureus calculates the interval to scrape based on:
- min_request_interval flag sent by the tracker
- # of clients known to be seeding
Azureus will never knowingly scrape a tracker more than once every 15 minutes. (It could unknowingly scrape more often if the user restarts Azureus more than once within 15 minutes.) If min_request_interval is specified, Azureus won't scrape again for the torrents returned until the specified time has passed.
Azureus will scrape at least once every 3 hours. Even on torrents with a large number of clients, the dynamics can change dramatically in 3 hours. For example, a new torrent may have thousands of clients, but within 3 hours all of those clients could finish downloading and leave, leaving the torrent with a poor seed:peer ratio.
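The limits above can be pictured as a simple clamp. The sketch below is only an illustration under stated assumptions: the 15 minute floor and 3 hour ceiling come from the text, but the way the seed count stretches the interval (`seconds_per_seed`) is a made-up placeholder, and letting a tracker-supplied min_request_interval override the 3 hour ceiling is an assumption as well. The function name `next_scrape_delay` is hypothetical.

```python
from typing import Optional

MIN_SCRAPE_INTERVAL = 15 * 60       # never knowingly scrape more often than this
MAX_SCRAPE_INTERVAL = 3 * 60 * 60   # scrape at least this often


def next_scrape_delay(seeds: int,
                      min_request_interval: Optional[int] = None,
                      seconds_per_seed: int = 10) -> int:
    """Return the number of seconds to wait before the next scrape.

    seconds_per_seed is a hypothetical scaling factor: more known seeds
    means the torrent needs less attention, so the delay grows.
    """
    delay = MIN_SCRAPE_INTERVAL + seeds * seconds_per_seed
    # Keep the delay between the 15 minute floor and the 3 hour ceiling.
    delay = max(MIN_SCRAPE_INTERVAL, min(delay, MAX_SCRAPE_INTERVAL))
    # Assumption: a tracker-supplied minimum always wins if it is longer.
    if min_request_interval is not None:
        delay = max(delay, min_request_interval)
    return delay
```

For example, `next_scrape_delay(0)` gives the 15 minute floor of 900 seconds, while a torrent with a very large number of seeds is capped at 10800 seconds.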
Tracker Scrape Convention
Aside from the convention described at the BitTorrent Specification Wiki, Azureus supports additional unofficial specs. These specs are used on a variety of trackers and are considered de facto standards.
Scraping is the act of sending a scrape request to the tracker.
The scrape convention specifies that passing an info_hash as a URL parameter limits the scrape results to information about that hash only. When Azureus has multiple torrents from a single tracker, it sends multiple info_hash parameters in its URL query string, hoping that the tracker will send back information on each hash.
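A minimal sketch of building such a query string is shown below. Each info_hash is a raw 20-byte value, so it has to be percent-encoded before it goes into the URL. The helper name `build_scrape_url` and the tracker address used in the usage note are placeholders for illustration, not anything Azureus defines.

```python
from urllib.parse import quote


def build_scrape_url(scrape_url: str, info_hashes: list) -> str:
    """Append one info_hash parameter per torrent to a tracker scrape URL.

    info_hashes is a list of raw 20-byte hashes (bytes objects).
    """
    # quote() accepts bytes and percent-encodes anything that is not URL-safe.
    params = "&".join("info_hash=" + quote(h, safe="") for h in info_hashes)
    separator = "&" if "?" in scrape_url else "?"
    return scrape_url + separator + params
```

Calling `build_scrape_url("http://tracker.example/scrape", [hash_a, hash_b])` yields a URL of the form `http://tracker.example/scrape?info_hash=...&info_hash=...`. Whether the tracker answers for every hash is up to the tracker.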
Azureus supports the following dictionary/keys.
- files
  - infohash
    - complete
    - incomplete
    - downloaded
    - name
- failure reason
- flags
  - min_request_interval
Non-official keys are optional and are explained below.
failure reason: Similar to the failure reason returned in an announce result. The value is a human-readable error string explaining why the scrape failed.
min_request_interval: The value for this key is an integer specifying the minimum number of seconds for Azureus to wait before scraping the tracker again.
Please note that flags is a key of the main dictionary, and its value is itself a dictionary. Inside that dictionary is a key/value pair for min_request_interval.
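To make the nesting concrete, here is a small sketch of reading min_request_interval out of an already decoded scrape result. It assumes the bencoded response has been decoded into Python dictionaries with byte-string keys, which is a common choice for bencode decoders but still an assumption; the function name `read_min_request_interval` is hypothetical.

```python
def read_min_request_interval(result: dict):
    """Return min_request_interval in seconds, or None if the tracker sent none.

    Both "flags" and "min_request_interval" are optional, unofficial keys,
    so every lookup has to tolerate a missing entry.
    """
    flags = result.get(b"flags", {})
    return flags.get(b"min_request_interval")


# A decoded result shaped as described above: flags sits in the main
# dictionary, and min_request_interval sits inside flags.
example_result = {
    b"files": {},
    b"flags": {b"min_request_interval": 3600},
}
```

Here `read_min_request_interval(example_result)` returns 3600, and a result without a flags key returns None.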
Note: Line breaks in the requests/results are not part of the request/result. They exist only to help with formatting in the wiki.
This tells us that the torrent with hash 'xxxxxxxxxxxxxxxxxxxx' has 2 seeders and 4 leechers. The torrent has been downloaded 0 times, and its name is xxxxxxxxxxxx. Another scrape will not occur for at least 3600 seconds, or 60 minutes.
d5:filesd20:xxxxxxxxxxxxxxxxxxxxd8:completei19e10:downloadedi23896e
10:incompletei21e4:name6:Name Xe20:yyyyyyyyyyyyyyyyyyyyd8:completei23e
10:downloadedi24026e10:incompletei21e4:name6:Name Ye20:zzzzzzzzzzzzzzzzzzzz
d8:completei15e10:downloadedi26171e10:incompletei23e4:name6:Name Zee5:flags
d20:min_request_intervali18000eee
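The result above is bencoded. As an aid to reading it, the following sketch decodes the same response (with the layout whitespace removed) using a small hand-written decoder. The decoder is a minimal illustration written for this page, not the code Azureus uses, and the names `bdecode` and `RESPONSE` are hypothetical.

```python
def _decode(data: bytes, pos: int):
    """Decode one bencoded value starting at pos; return (value, next_pos)."""
    c = data[pos:pos + 1]
    if c == b"i":                                  # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if c == b"l":                                  # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = _decode(data, pos)
            items.append(item)
        return items, pos + 1
    if c == b"d":                                  # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = _decode(data, pos)
            value, pos = _decode(data, pos)
            result[key] = value
        return result, pos + 1
    if c.isdigit():                                # byte string: <length>:<bytes>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError("invalid bencoded data at offset %d" % pos)


def bdecode(data: bytes):
    """Decode a complete bencoded document."""
    value, pos = _decode(data, 0)
    if pos != len(data):
        raise ValueError("unexpected trailing data")
    return value


# The multi-torrent result shown above, joined back into a single string.
RESPONSE = (
    b"d5:filesd"
    b"20:" + b"x" * 20 + b"d8:completei19e10:downloadedi23896e10:incompletei21e4:name6:Name Xe"
    b"20:" + b"y" * 20 + b"d8:completei23e10:downloadedi24026e10:incompletei21e4:name6:Name Ye"
    b"20:" + b"z" * 20 + b"d8:completei15e10:downloadedi26171e10:incompletei23e4:name6:Name Ze"
    b"e5:flagsd20:min_request_intervali18000eee"
)

decoded = bdecode(RESPONSE)
for info_hash, stats in decoded[b"files"].items():
    print(stats[b"name"].decode(), stats[b"complete"], stats[b"incomplete"])
print(decoded[b"flags"][b"min_request_interval"])
```

Decoded this way, the result describes three torrents (Name X, Name Y and Name Z) with 19, 23 and 15 seeds respectively, and asks the client to wait at least 18000 seconds before scraping again.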