Distributed Hashtable
Source Code on GitHub #
Hashtable is among the most useful data structures. A distributed hashtable is a hashtable served by a cluster of machines. It can be accessed through network. We demonstrate here how to implement a distributed hashtable on top of GE and how it can be accessed via user-defined interfaces.
Data Model #
A hashtable consists of a set of buckets, each of which is collection of key-value pairs. We use string to represent both the Key and Value in this demo application. In GE, we can use a cell as a hashtable bucket. We define BucketCell in TSL for this purpose:
struct KVPair
{
string Key;
string Value;
}
cell struct BucketCell
{
List<KVPair> KVList;
}
Given an arbitrary Key, we hash it to a 64-bit cell id. Then we use this id to reference the cell bucket.
Setting A Key-Value Pair #
Because we are working on a distributed hashtable, we need to define protocols to allow the clients to access and manipulate the hashtable. Let us look at the protocol for setting a key-value pair.
The Set protocol is straightforward: we just need to send a Key (string) and a Value (string) to the server. Here is the protocol specification in TSL:
struct SetMessage
{
string Key;
string Value;
}
protocol Set
{
Type:Syn;
Request:SetMessage;
Response:void;
}
The implementation of the server side Set operation is also straightforward:
public override void SetHandler(SetMessageReader request)
{
long cellId = HashHelper.HashString2Int64(request.Key);
using (var cell = Global.LocalStorage.UseBucketCell(cellId, CellAccessOptions.CreateNewOnCellNotFound))
{
int count = cell.KVList.Count;
int index = -1;
for (int i = 0; i < count; i++)
{
if (cell.KVList[i].Key == request.Key)
{
index = i;
break;
}
}
if (index != -1)
{
cell.KVList[index].Value = request.Value;
}
else
cell.KVList.Add(new KVPair(request.Key, request.Value));
}
}
In the cell bucket, we search the key against the key-value pair list. If we find a match, we update the value; otherwise, we add a new value.
Getting A Key-Value Pair #
Get operation is even simpler. It fetches the value by the user-specified key. This is the protocol specification:
struct GetMessage
{
string Key;
}
struct GetResponse
{
bool IsFound;
string Value;
}
protocol Get
{
Type:Syn;
Request:GetMessage;
Response:GetResponse;
}
This is the implementation:
public override void GetHandler(GetMessageReader request, GetResponseWriter response)
{
long cellId = HashHelper.HashString2Int64(request.Key);
response.IsFound = false;
using (var cell = Global.LocalStorage.UseBucketCell(cellId, CellAccessOptions.ReturnNullOnCellNotFound))
{
if (cell == null)
return;
int count = cell.KVList.Count;
for (int i = 0; i < count; i++)
{
if (cell.KVList[i].Key == request.Key)
{
response.IsFound = true;
response.Value = cell.KVList[i].Value;
break;
}
}
}
}
If we cannot find a matched bucket, we return immediately. Otherwise, we search the key against the key-value pair list in the matched bucket cell.