Credence is a robust and decentralized system for evaluating the reputation of files in a peer-to-peer filesharing system. Our goal is to enable peers to confidently gauge file authenticity, the degree to which a file's contents match its advertised description.
At the most basic level, Credence employs a simple, network-wide voting scheme in which users contribute positive and negative evaluations of files. On top of this, each client uses statistical tests to weight the importance of votes from its peers. Finally, Credence allows clients to extend their horizon of information by selectively sharing information with their peers.
Authenticity and Pollution
We define pollution broadly as any file with content that does not match its description. An authentic file, by contrast, has content that is accurately described by its metadata. We find in practice that pollution in current networks can be easily identified by users without any special knowledge or expertise. As pollution becomes more sophisticated, more advanced detection techniques will need to be developed to help users safely identify malicious content.
The Credence system relies on individual users as the first line of defense against pollution. After a user downloads and uses a file, she is given a chance to submit a single vote to the Credence system: a positive (thumbs-up) vote for an authentic file, or a negative (thumbs-down) vote for a polluted file. Each vote is cryptographically signed and entered into the system.
Credence uses these votes collected in the network to determine the authenticity of content. Credence displays a rating for each file that appears in response to a user query.
To compute a rating, the client software first searches for votes on the file and downloads a number of votes randomly selected from the network. These votes are then aggregated into a single estimate of the authenticity of the file in question.
The votes collected from the network are not used directly, however, since some peers may accidentally vote incorrectly, or even lie intentionally about a file's authenticity. Therefore we assign to each peer a correlation coefficient, or weight, reflecting the historical usefulness of that peer's votes. In effect, this helps remove the incentive for an attacker to lie about the authenticity of files. A consistent liar is, after all, just as useful as an honest peer when it comes to distinguishing authentic files from pollution: its votes simply count in the opposite direction. And an inconsistent voter will come to be ignored by others in the network.
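The weighting described above can be illustrated with a minimal sketch. It assumes votes are encoded as +1 (thumbs-up) and -1 (thumbs-down), that a peer's weight is the Pearson correlation between the local user's votes and that peer's votes on files both have voted on, and that the final estimate is a weighted average. The function names, the vote encoding, and the choice of Pearson correlation are illustrative assumptions, not the statistical tests Credence actually uses.

```python
from math import sqrt


def vote_correlation(mine, theirs):
    """Correlation between two voters over the files both have voted on.

    mine, theirs: dict mapping a file identifier to +1 or -1 (assumed encoding).
    Returns 0.0 when the overlap is too small or either voter shows no variance.
    """
    shared = sorted(set(mine) & set(theirs))
    n = len(shared)
    if n < 2:
        return 0.0
    xs = [mine[f] for f in shared]
    ys = [theirs[f] for f in shared]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def estimate_authenticity(votes, weights):
    """Weighted average of votes on one file, in the range [-1, 1].

    votes:   dict mapping peer id to +1 or -1 for this file.
    weights: dict mapping peer id to that peer's correlation coefficient.
    Peers with unknown weight contribute nothing; returns 0.0 if no vote is usable.
    """
    total = sum(abs(weights.get(peer, 0.0)) for peer in votes)
    if total == 0:
        return 0.0
    return sum(weights.get(peer, 0.0) * v for peer, v in votes.items()) / total


# Hypothetical voting histories over four files.
my_votes = {"a": 1, "b": -1, "c": 1, "d": -1}
honest = dict(my_votes)                           # always agrees with me
liar = {f: -v for f, v in my_votes.items()}       # always disagrees with me
noisy = {"a": 1, "b": 1, "c": -1, "d": -1}        # agrees half of the time

weights = {
    "honest": vote_correlation(my_votes, honest),
    "liar": vote_correlation(my_votes, liar),
    "noisy": vote_correlation(my_votes, noisy),
}

# On a new file, the honest peer votes up while the liar and noisy peer vote down.
rating = estimate_authenticity({"honest": 1, "liar": -1, "noisy": -1}, weights)
```

In this sketch the consistent liar receives a negative weight, so its thumbs-down counts as evidence that the file is authentic, while the inconsistent voter receives a weight of zero and is ignored.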
Information Sharing and Transitive Correlation
Peer-to-peer networks can grow quite large, and many clients might participate rarely, sharing and voting on only a few files. This means that a client acting alone may have trouble quickly discovering peer correlations and other historical data. To alleviate this problem, Credence uses a technique called transitive correlation to quickly spread information among small groups of peers and help clients expand their horizon.
In Credence, a client periodically requests historical data from selected peers in the network. This data contains information on how the peer voted in the past (cryptographically signed, as before), and information about how the peer is related to other peers in the network. The client validates this information for authenticity and then integrates it into its local databases. In this way, the client not only takes advantage of the work other peers do in evaluating files for authenticity, but also gains insight into the behavior of peers in the network. All of this is done without user interaction and without any manually assigned peer trust values, which can be difficult for a user to determine accurately.
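One way to picture transitive correlation is the following minimal sketch. It assumes that if the local client has correlation r_ab with peer B, and B reports correlation r_bc with peer C, the client can estimate its own relationship to C as the product r_ab * r_bc, keeping the strongest estimate when several paths exist. The product rule, the tie-breaking choice, and all names here are assumptions made for illustration; the text above does not specify how Credence combines the data, and signature validation is omitted.

```python
def transitive_weights(direct, reported):
    """Estimate weights for peers the client has no direct correlation with.

    direct:   dict mapping peer id to the locally computed correlation.
    reported: dict mapping peer id to a dict of the correlations that peer
              reports with other peers (assumed to be already validated).
    Returns a dict mapping previously unknown peer ids to estimated weights.
    """
    estimates = {}
    for peer, r_ab in direct.items():
        for other, r_bc in reported.get(peer, {}).items():
            if other in direct:
                continue  # locally computed correlations take precedence
            candidate = r_ab * r_bc  # assumed combination rule
            if abs(candidate) > abs(estimates.get(other, 0.0)):
                estimates[other] = candidate
    return estimates


# Hypothetical data: the client knows B and D directly and learns about C and E.
direct = {"B": 0.9, "D": -0.5}
reported = {
    "B": {"C": 0.8, "D": 0.1},
    "D": {"C": -0.4, "E": 1.0},
}
learned = transitive_weights(direct, reported)
```

Under these assumptions, a peer that a known liar strongly agrees with inherits a negative estimated weight, which matches the idea that a consistent liar's information is still useful.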
Changes to the LimeWire Client and Gnutella Network
Credence is integrated into the LimeWire client, and works on top of the Gnutella network. The implementation is built entirely on top of existing primitives in the Gnutella protocol. It opens up no additional ports, does not require changes to the underlying protocol, and is backwards compatible with other Gnutella clients that are not using Credence.
Credence protects privacy by neither collecting nor using any personally identifiable information in the protocol. Each Credence-equipped client is supplied with a unique, randomly generated key pair for use in cryptographic operations. The keys are not logged and are not tied in any way to personal information. No record exists of which keys have been given out.