Existing structured Peer-to-Peer (P2P) systems based on Distributed Hash Tables (DHTs) have not focused on building a storage layer that can scale to millions of key-value pairs at each peer. This scenario occurs often in practice across several application classes. For instance, in semantic shared desktop, distributed software engineering, or Web data resources, each peer could hold millions of entries.
An obvious solution is to use a Relational Database Management System (RDBMS) at each peer to store all of that peer's key-value pairs (index entries). These index entries can be inserted into the DHT one by one. However, it turns out that an RDBMS is not sufficient for building this storage layer, mainly because of the nature of the operations required: searches over a large number of key-value pairs and bulk updates (when new peers join the P2P network, or when load balancing requires indexes to be updated). This agrees with the recent trend of researchers finding that RDBMSs are insufficient from a performance perspective for specific applications (see Stonebraker's interview). We are building a storage layer specific to P2P systems, and to DHTs in particular, that we expect will improve the performance of DHTs by orders of magnitude.
We are working on the first version of P-Store, an extension to P-Grid that addresses the above problem of building a scalable storage layer for DHTs. A key idea is to use flat files for the bulk operations and to update the RDBMS asynchronously. The RDBMS can still be used for answering queries. The main challenge is to keep the two, the flat file and the RDBMS, synchronized.
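The flat-file idea can be illustrated with a minimal sketch. This is not the P-Store implementation: the class name FlatFileStore, its methods, the JSON-lines file format, and the use of SQLite as a stand-in for the RDBMS are all assumptions made for illustration. Bulk updates are appended sequentially to a flat file, and a separate sync step (which a real system would presumably run in a background thread) replays the unsynced tail of the file into the database. The byte offset reached so far is stored in the same transaction as the replayed rows, which is one possible way to keep the file and the database consistent.

```python
import json
import os
import sqlite3


class FlatFileStore:
    """Hypothetical sketch: append-only flat file plus a lazily synced SQLite index."""

    def __init__(self, log_path, db_path):
        self.log_path = log_path
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS entries (key TEXT PRIMARY KEY, value TEXT)"
        )
        # Single-row table remembering how much of the flat file is already in the DB.
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS sync_state "
            "(id INTEGER PRIMARY KEY CHECK (id = 0), offset INTEGER)"
        )
        self.db.execute("INSERT OR IGNORE INTO sync_state VALUES (0, 0)")
        self.db.commit()
        open(log_path, "ab").close()  # make sure the flat file exists

    def bulk_put(self, pairs):
        """Bulk path: one sequential append, no database work."""
        with open(self.log_path, "ab") as log:
            for key, value in pairs:
                log.write(json.dumps([key, value]).encode("utf-8") + b"\n")

    def sync(self):
        """Replay the unsynced tail of the flat file into the database.

        Rows and the new offset are committed in one transaction, so a crash
        leaves the database at a well-defined position in the file.
        Simplification: assumes the file always ends on a complete line.
        """
        (offset,) = self.db.execute(
            "SELECT offset FROM sync_state WHERE id = 0"
        ).fetchone()
        with open(self.log_path, "rb") as log:
            log.seek(offset)
            data = log.read()
        rows = [tuple(json.loads(line)) for line in data.splitlines()]
        with self.db:  # commits on success, rolls back on error
            self.db.executemany("INSERT OR REPLACE INTO entries VALUES (?, ?)", rows)
            self.db.execute(
                "UPDATE sync_state SET offset = ? WHERE id = 0",
                (offset + len(data),),
            )
        return len(rows)

    def get(self, key):
        """Query path: answered from the database only."""
        row = self.db.execute(
            "SELECT value FROM entries WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None
```

In this sketch, entries written with bulk_put only become visible to get after the next sync, which is the synchronization gap described above. A real design would also have to decide whether queries should consult the unsynced part of the flat file.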
P-Store is an internal project managed by Dr. Vijay Srinivas Agneeswaran. Key members of the team include Roman Schmidt, Surender Reddy Yerva and Nicolas Bonvin. The corresponding implementation will be available soon (beginning of next year) in our download section.