Network Working Group                                         T. Arcieri
Internet-Draft                                                 A. Mysore
Expires: May 4, 2008                                           G. Pahlke
                                                       ClickCaster, Inc.
                                                              J. Sanders
                                                            T. Stapleton
                                                  University of Colorado
                                                           November 2007


            Peer Distributed Transfer Protocol Specification
                   draft-distribustream-pdtp-rfcs-01

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 4, 2008.

Copyright Notice

   Copyright (C) The Internet Society (2007).

Abstract

   Peer Distributed Transfer Protocol is a high performance peer-to-peer
   file distribution system that provides streaming downloads of files
   originating from a central server.  The files are then shared over a
   peer network, allowing the aggregate bandwidth of the network to
   scale with the number of clients.

Table of Contents

   1.  Introduction
   2.  Message Format
   3.  Server to Client Messages
     3.1.  Overview
     3.2.  TELLINFO
     3.3.  TRANSFER
     3.4.  TELLVERIFY
     3.5.  HASHVERIFY
     3.6.  PROTOCOLERROR
   4.  Client to Server Messages
     4.1.  Overview
     4.2.  REGISTER
     4.3.  REQUEST and UNREQUEST
     4.4.  PROVIDE and UNPROVIDE
     4.5.  ASKINFO
     4.6.  ASKVERIFY
     4.7.  COMPLETED
   5.  Client to Client Communication
     5.1.  Overview
   6.  References
   Appendix A.  Glossary
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   This document describes version 2 of PDTP, an enhancement of PDTP
   version 1 [1].  PDTP currently only allows for operation on IPv4
   networks, though support for IPv6 networks is planned.

   PDTP is based on an extensible set of asynchronous messages.  Each
   message consists of a type and a collection of key-value pairs.  The
   possible data types are described in Appendix A.

   Due to the structure of PDTP, protocol messages can be grouped into
   three distinct categories:

   o  Server to client messages (Section 3) manage the file transfer
      process and provide information about files requested by the
      client.

   o  Client to server messages (Section 4) handle requests for files by
      the client, inform the server of files the client can provide, and
      report the status of transfers.

   o  Client to client communication (Section 5) makes use of the
      standard HTTP 1.1 [2] protocol.

2.  Message Format

   All client to server (Section 4) and server to client (Section 3)
   communication uses a lightweight form of asynchronous messaging over
   TCP, with JavaScript Object Notation (JSON) [3] providing the
   underlying data serialization format.  Messages are length-prefix
   framed with a JSON message body.  The IANA approved TCP port for all
   PDTP client/server communication is 6086.

   Each frame consists of a 16-bit unsigned integer in network byte
   order representing the length of the message body, followed by the
   message body itself.  Because the length prefix is 16 bits wide, the
   maximum allowed length of the message body is 65,535 bytes, giving a
   maximum frame size of 65,537 bytes including the 2-byte prefix.
   Messages are sent in succession over a persistent TCP connection.

   Message bodies consist of JSON messages in the following format:

   ["type", {"arg1": "value1", "arg2": "value2", ..., "argN": "valueN"}]

   Each message is composed of an outer JSON array with two members.
   The first member is a string which represents the message type.  The
   second member is a JSON object which contains a message-specific
   collection of arguments.  Unless otherwise specified, all members are
   represented as JSON strings.

   To improve the readability of the protocol on the wire, it is
   recommended that CRLF be appended to the end of the JSON message.
   This is ignored as whitespace by JSON parsers, but is meaningful to
   humans who may be reading wire dumps.  However, this is not an
   explicit requirement of the protocol, nor should its presence be
   required by any implementation.
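
   The framing described in this section can be sketched in Python as
   follows.  This is only an illustration, not a reference
   implementation: the names encode_frame and decode_frames are
   hypothetical, and the sketch assumes that the optional trailing CRLF
   is counted as part of the message body length, which this document
   does not state explicitly.

```python
import json
import struct

MAX_BODY = 0xFFFF  # largest length a 16-bit prefix can express (65,535)


def encode_frame(msg_type, args):
    """Serialize one message as a length-prefixed JSON frame."""
    # Body is the two-member JSON array ["type", {args}].
    # The trailing CRLF is optional; counting it in the length is an assumption.
    body = json.dumps([msg_type, args]).encode("ascii") + b"\r\n"
    if len(body) > MAX_BODY:
        raise ValueError("message body exceeds 65,535 bytes")
    # "!H" packs a 16-bit unsigned integer in network byte order.
    return struct.pack("!H", len(body)) + body


def decode_frames(buffer):
    """Return (complete messages, unconsumed trailing bytes)."""
    messages = []
    offset = 0
    while len(buffer) - offset >= 2:
        (length,) = struct.unpack_from("!H", buffer, offset)
        if len(buffer) - offset - 2 < length:
            break  # the body has not fully arrived yet
        body = buffer[offset + 2 : offset + 2 + length]
        msg_type, args = json.loads(body)  # surrounding whitespace is ignored
        messages.append((msg_type, args))
        offset += 2 + length
    return messages, buffer[offset:]
```

   Because messages arrive in succession on one persistent TCP
   connection, a receiver would typically append incoming bytes to a
   buffer, call decode_frames on it, and keep the returned remainder
   until more data arrives.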

3.  Server to Client Messages

3.1.  Overview

   These messages are sent from the server to the client to respond to
   information requests by the client and to manage the file transfer
   process.

   o  TELLINFO (Section 3.2) provides the client with information about
      the object located at the specified URL.

   o  TRANSFER (Section 3.3) tells the client to initiate a peer-to-peer
      data transfer.

   o  TELLVERIFY (Section 3.4) tells the client whether the specified
      transfer is authorized.

   o  HASHVERIFY (Section 3.5) informs the client whether a successfully
      completed transfer had the correct hash.

   o  PROTOCOLERROR (Section 3.6) informs the client that a protocol
      error has occurred.

3.2.  TELLINFO

   tell_info url [size] [chunkSize] [streaming]

   The tell_info message is the expected response to an ask_info
   (Section 4.5) message, and contains information that the client can
   use to determine its chunk request policy and the manner in which a
   file is handled.

   Since the server only arranges for chunks to be sent when they are
   explicitly requested, a client must know how many chunks are in a
   file so that it knows how many to request.  To let the client
   determine the number of chunks in a file, the size of each chunk and
   the total size of the file are sent.  The number of chunks is then
   'size' divided by 'chunkSize', rounded up to the nearest integer.
   For example, a file of 1,000,000 bytes with a 'chunkSize' of 262,144
   bytes has 4 chunks, the last of which holds only 213,568 bytes.  The
   'chunkSize' could also be used to determine a policy on caching or
   memory management, and the 'size' could be used to handle an
   incomplete final chunk, among other things.

   Fields:

   url: String  A unique file identifier.  The server uses this field to
      specify which file it is sending information about.

   size: Integer (Optional)  The exact size in bytes of the file.  Since
      the last chunk in the file might not be completely filled, it is
      necessary to know the total file size as well as the chunk size
      and the chunk count.

   chunkSize: Integer (Optional)  The size in bytes of each chunk in the
      file.

   streaming: Boolean (Optional)  If true, the file is assumed to be a
      streaming media file and chunks near the beginning will be sent
      first.  If false, chunks from all over the file will have equal
      priority.  Default is false.

3.3.  TRANSFER

   transfer peer port method url range peer_id

   The server controls all data flowing over the network.  It uses the
   transfer message to initiate a chunk transfer between two peers.  The
   transfer message is always sent to the peer that should initiate the
   connection to the other peer.  When a client receives a transfer
   message, it should connect to the specified peer, carry out the
   transfer as specified in Section 5, and reply to the server with a
   completed (Section 4.7) message upon success.

   Fields:

   peer: String  The network address of the peer with which the transfer
      should take place.

   port: Integer  The port on the peer to connect to.

   method: Enumeration  The HTTP method to be used for this transfer.
      Possible values are GET and PUT, with the same semantics as in
      HTTP.  That is, if method is GET, this client is receiving data
      from a peer, and if method is PUT, this client is sending data to
      a peer.

   url: String  A unique file identifier for the file to transfer.

   range: Range  The byte range being transferred.  This, combined with
      the url, provides a unique identifier for the data to transfer.

   peer_id: String  The Peer Id of the peer with which the transfer
      should take place, that is, the listening peer.

3.4.  TELLVERIFY

   tell_verify peer url range peer_id authorized

   The tell_verify message is sent to the client in response to an
   ask_verify (Section 4.6) message to inform it of the authorization
   status of a transfer.

   Fields:

   peer: IP Address  The address of the connecting peer.

   url: String  A unique file identifier.

   range: Range  The byte range to verify.

   peer_id: String  The unique identifier of the peer.

   authorized: Boolean  Specifies whether or not the transfer is
      authorized.

3.5.  HASHVERIFY

   hash_verify url range hash_ok

   The hash_verify message is sent to a client after a completed message
   with a hash has been received.  The value of hash_ok denotes whether
   or not the hash sent in the completed message matches the expected
   hash of the chunk containing the specified range.

   Fields:

   url: String  The url of the file being transferred.

   range: Range  The byte range of the file to which this message
      refers.

   hash_ok: Boolean  If true, the hash sent in the completed message
      matched the expected hash of the byte range it referred to.

3.6.  PROTOCOLERROR

   protocol_error message

   The protocol_error message is sent to a client whenever a protocol
   error has occurred.  The message field is meant to be read by a human
   to determine what went wrong.  In most cases, the error is due to a
   programming error and is fatal.  One notable exception is the case
   when a client generates a Peer Id that is not unique.

   Fields:

   message: String  An error message.  This error message is not meant
      to be parsed programmatically, but rather to be logged and read as
      an aid to debugging.

4.  Client to Server Messages

4.1.  Overview

   These messages are sent from the client to the server to request
   files, inform the server of completed transfers, and announce files
   the client can provide to the network.

   o  REGISTER (Section 4.2) informs the server that the client exists
      and provides the server with information about itself.

   o  REQUEST (Section 4.3) informs the server that the client wants the
      specified object range.

   o  UNREQUEST (Section 4.3) informs the server that the client no
      longer needs the specified object range.

   o  PROVIDE (Section 4.4) informs the server that the client has the
      specified object range and can provide it to peers.

   o  UNPROVIDE (Section 4.4) informs the server that the client no
      longer has the specified object range, and therefore cannot
      provide it.

   o  ASKINFO (Section 4.5) requests information from the server about
      the specified URL.

   o  ASKVERIFY (Section 4.6) asks the server whether the specified
      transfer is authorized.

   o  COMPLETED (Section 4.7) informs the server that a transfer has
      completed, whether successfully or not.

4.2.  REGISTER

   register client_id listen_port

   The register message is the first message a client sends to the
   server.  The client must send this message to alert the server of its
   presence so that it may be included in the server's actions.

   Fields:

   client_id: String  A unique identifier, which the client generates.
      The server will identify this client on the network based on this
      identifier.  The identifier must be a string less than 4 KB in
      length.  It is the client's responsibility to generate an
      identifier that is unique.  The server should respond with a
      protocol_error (Section 3.6) message if the identifier is too long
      or not unique.

   listen_port: Integer  The port on which the client will listen for
      incoming connections from other peers.  The server will pass this
      information along to other clients wishing to connect to this
      client for peer to peer chunk transfers.

4.3.  REQUEST and UNREQUEST

   request url [range]

   unrequest url [range]

   The request and unrequest messages are used by the client to indicate
   what data it needs to receive.  A request message indicates that the
   client needs the specified object bytes; an unrequest message
   indicates that the client no longer needs the specified bytes.  The
   completed and provide messages may also be used to cancel a request,
   though they carry additional semantics.
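
   As a rough sketch of the paragraph above, the message bodies for a
   request and a later unrequest of the same object might be built as
   shown below.  The helper name message_body and the URL are
   hypothetical.  The optional range argument is left out because this
   document does not say how a Range is serialized in JSON.

```python
import json


def message_body(msg_type, **args):
    """Build a JSON message body in the ["type", {args}] form of Section 2."""
    return json.dumps([msg_type, args])


url = "http://example.com/video.mpg"  # hypothetical object URL

# Ask the server to arrange delivery of the object.
request = message_body("request", url=url)

# Later, tell the server the object is no longer needed.
unrequest = message_body("unrequest", url=url)
```

   Each body would still need the 16-bit length prefix from Section 2
   before being written to the server connection.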

   If the range of a request or unrequest message isn't specified, it is
   assumed to include all bytes in the file.

   The server is expected to continuously arrange for the data in the
   client's request set to be delivered to the client through peer to
   peer transfer channels.  A request set is informally the set of
   chunks that a client wants.  Formally, we may say that a given unique
   chunk C with the URL L and the id K is in a client's request set if
   and only if:

   o  The client has sent at least one request message with a URL
      equivalent to L and a byte range containing chunk K.

   o  Since the last such message was sent, the client has not sent an
      unrequest message with a URL equivalent to L and a byte range
      containing chunk K,

   o  ...nor has it sent a completed message with a URL equivalent to L
      and a byte range containing chunk K,

   o  ...nor has it sent a provide message with a URL equivalent to L
      and a byte range containing chunk K.

   In short, requests are standing and additive; unrequests are
   transient and subtractive.  The server is expected to handle any
   possible combination of requests and unrequests that clients can
   send.  (For example, a client may request a URL in its entirety, and
   then later unrequest certain parts of the URL.  This is useful if,
   say, a client needs to complete a partial download, repair a damaged
   file, or optimize its network usage in response to user actions.)

   Fields:

   url: String  The object being requested.  The server may discard
      requests for URLs it does not understand.

   range: Range (Optional)  The range of bytes being requested,
      inclusive.  If unspecified, it is taken to be the complete range
      of bytes in the file.

4.4.  PROVIDE and UNPROVIDE

   provide url [range]

   unprovide url [range]

   The provide and unprovide messages are used by a client to tell the
   server which chunks of a file the client is able to provide to peers.
   Each client in the system can have a cache of file chunks that it has
   already downloaded and, depending on the client's individual
   capabilities, this cache may or may not be empty on startup.  If a
   client has data in a cache on startup, the provide message can be
   used to inform the server.  If the cache is empty on startup, the
   provide message is never needed because the server is expected to
   keep track of each chunk that has been transferred.  On the other
   hand, if a client evicts one or more chunks from its cache, it should
   immediately send an unprovide message.  Although there is no firm
   requirement on a client to send appropriate unprovide messages, it is
   in the client's best interest to do so, as it could lose standing in
   the network if it were asked to send a chunk it did not have.

   A client can send as many provide messages as necessary to inform the
   server of its entire chunk cache.  These messages should be
   interpreted by the server additively.  That is, if a client first
   sends a message specifying that it provides byte range A to B and
   then another specifying that it provides byte range C to D, the
   server should conclude that the client has both byte range A to B and
   byte range C to D.  Furthermore, a client can send provide and
   unprovide messages for multiple files, as specified by the url field.
   If the range in a provide or unprovide message isn't specified, it is
   taken to include the entire range of bytes in the file specified by
   the url field.

   Fields:

   url: String  A unique file identifier.  The client uses this field to
      specify a file in its cache.

   range: Range (Optional)  The range of relevant bytes, inclusive.  If
      unspecified, it is taken to be the complete range of bytes in the
      file.

4.5.  ASKINFO

   ask_info url

   The ask_info message is sent to the server when a client wants
   information about a file.
   Specifically, the client needs information about the file type, the
   file size, the number of chunks, and the size of each chunk in order
   to successfully receive a file.  This information is used to
   determine when a file has finished transferring and how to handle a
   file, among other things.  It is expected that the server will always
   respond to an ask_info message with a tell_info (Section 3.2)
   message.

   Fields:

   url: String  A unique file identifier.  The client uses this field to
      specify which file it would like information about.

4.6.  ASKVERIFY

   ask_verify peer url range peer_id

   The ask_verify message is sent to the server when a client wants to
   know whether a transfer is authorized.  It is sent upon receipt of a
   PUT or GET request from a peer.  It is expected that the server will
   always respond to an ask_verify message with a tell_verify
   (Section 3.4) message.

   Fields:

   peer: String  The network address of the connecting peer.

   url: String  A unique file identifier.  The client uses this field to
      specify which file it would like information about.

   range: Range  The range of bytes, in the file referred to by the url,
      for which verification is being requested.

   peer_id: String  The unique identifier of the peer.

4.7.  COMPLETED

   completed peer url range peer_id [hash]

   The completed message is used by the client to indicate that a
   transfer has completed, in either success or failure.  Upon success,
   a completed message will include a hash of the chunk that the client
   has received.  The lack of a hash field denotes transfer failure.

   The server may use this information to inform its network
   optimization, so the appropriate use of completed messages is highly
   recommended.  (Specifically, it is likely that the server will
   initiate another transfer after it has been informed that one has
   completed.)

   Fields:

   peer: String  The network address of the peer associated with this
      transfer.

   url: String  The url of the transferred chunk.

   range: Range  The byte range of the completed transfer.

   peer_id: String  A unique identifier of the peer associated with the
      transfer.

   hash: String (Optional)  A hash of the transferred chunk.  Denotes
      failure if absent or blank.

5.  Client to Client Communication

5.1.  Overview

   All client to client communication is done using the HTTP 1.1 [2]
   protocol.  Each client is both an HTTP client and an HTTP server.
   During a transfer, the connecting peer for that particular transfer
   acts as the client and the listening peer acts as the server.  While
   full HTTP functionality may be implemented, and may be desirable for
   a particular implementation, only the subset of HTTP that includes
   the GET and PUT requests and the full range of responses is strictly
   required by PDTP.

   The listening client is truly an HTTP server and the connecting
   client is truly an HTTP client.  By this we mean simply that not only
   the syntax but also the semantics of the GET and PUT requests and of
   any responses received are identical to those of HTTP.

   A connecting peer will include the following information in the
   request sent to a listening peer:

   o  A compliant HTTP [2] Request-Line for either a GET or PUT request.
      The method to use should be taken from the method parameter of the
      transfer (Section 3.3) message received from the server.
      Similarly, the Request-URI portion of the Request-Line is taken
      from the url field of the transfer message.

   o  A Host header, identifying the host in the url specified by the
      server.  This host should be recognized as a virtual host on the
      listening peer.  It is important to note that this field does not
      necessarily represent any identifier associated with the listening
      peer, but rather the host in the url field from the server.

   o  A Range header, identifying the byte range to be transferred,
      taken from the range field in the transfer message.

   o  An X-PDTP-Peer-Id header.  The value of this header is the Peer Id
      of the connecting client.  That is, it is the Peer Id of the
      client sending this request.  It is important to note that this is
      not the peer_id field from the transfer message, which represents
      the Peer Id of the listening peer and is necessary when notifying
      the server of transfer completion.

   o  If the method is PUT, a body containing the data is also sent.

   Upon receiving a request from a connecting peer, a listening peer
   processes the request in an appropriate manner and sends an
   appropriate response.  The listening client may choose to ask the
   server for verification of this transfer using the ask_verify
   (Section 4.6) message.  If verification fails, a 403 Forbidden
   response should be sent.  Similarly, if the file cannot be found or
   the range is unsatisfiable, a 404 Not Found or 416 Requested Range
   Not Satisfiable response, respectively, should be sent.  Upon
   success, either 206 Partial Content, 200 OK, or 201 Created should be
   sent, depending on the situation.

6.  References

   [1]  Arcieri, A., "Peer Distributed Transfer Protocol",
        November 2005.

   [2]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L.,
        Leach, P., and T. Berners-Lee, "Hypertext Transfer Protocol -
        HTTP/1.1", RFC 2616, June 1999.

   [3]  Crockford, D., "The application/json Media Type for JavaScript
        Object Notation (JSON)", RFC 4627, July 2006.

Appendix A.  Glossary

   o  Client: a system that accesses a service on a remote computer via
      a network, equivalent to a 'peer'.

   o  Connecting peer: the peer that initiates the connection over which
      a transfer takes place.

   o  Enumeration: a set of acceptable values.

   o  Integer: an unsigned integer field.

   o  Listening peer: the peer that accepts the connection over which a
      transfer takes place.

   o  Peer: a system that accesses a service on a remote computer via a
      network, equivalent to a 'client'.

   o  Peer Id: a unique identifier generated by a client and sent to the
      server, uniquely identifying the client on the network.

   o  Range: a representation of a sequence of indices implemented as a
      tuple containing Integer values for minimum and maximum values.
      All ranges referred to in this document are inclusive.

   o  Request Set: the set of all chunks that a client has requested and
      not subsequently unrequested.

   o  Server: the system that manages the network and distribution of
      information among clients.

   o  String: a string of ASCII characters.

Authors' Addresses

   Tony Arcieri
   ClickCaster, Inc.

   Email: tony@clickcaster.com


   Ashvin Mysore
   ClickCaster, Inc.

   Email: ashvin@clickcaster.com


   Galen Pahlke
   ClickCaster, Inc.

   Email: galen@clickcaster.com


   James Sanders
   University of Colorado

   Email: sanderjd@gmail.com


   Tom Stapleton
   University of Colorado

   Email: tstapleton@gmail.com

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2007).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.