Robust Distributed System Nucleus (rDSN) is a framework for quickly building robust distributed systems. It has a microkernel for pluggable components, including applications, distributed frameworks, devops tools, and local runtime/resource providers, enabling their independent development and seamless integration. The project was originally developed for Microsoft Bing and has since been adopted in production both inside and outside Microsoft.
The core of rDSN is a service kernel. Through its Service API and Tool API, developers can build and plug in many different application, framework, tool, and local runtime modules so that they seamlessly benefit one another. Here is an incomplete list of the pluggable modules.
Pluggable modules | Description | Release |
---|---|---|
dsn.core | rDSN service kernel | 1.0.0 |
dsn.dist.service.stateless | scale-out and fail-over for stateless services (e.g., micro services) | 1.0.0 |
dsn.dist.service.stateful.type1 | scale-out, replicate, and fail-over for stateful services (e.g., storage) | 1.0.0 |
dsn.dist.service.meta_server | membership, load balance, and machine pool management for the above service frameworks | 1.0.0 |
dsn.dist.uri.resolver | a client-side helper module that resolves a service URL to the target machine | 1.0.0 |
dsn.dist.traffic.router | fine-grained RPC request routing/splitting/forking to multiple services (e.g., A/B test) | todo |
dsn.tools.common | deployment runtime (e.g., network, aio, lock, timer, perf counters, loggers) for both Windows and Linux; simple toollets, such as tracer, profiler, and fault-injector | 1.0.0 |
dsn.tools.nfs | an implementation of remote file copy based on rpc and aio | 1.0.0 |
dsn.tools.emulator | an emulation runtime for whole distributed system emulation with auto-test, replay, global state checking, etc. | 1.0.0 |
dsn.tools.hpc | high performance counterparts for the modules as implemented in tools.common | todo |
dsn.tools.explorer | extracts task-level dependencies automatically | 1.0.0 |
dsn.tools.log.monitor | collects critical logs (e.g., log-level >= WARNING) in a cluster | 1.0.0 |
dsn.app.simple_kv | an example application module | 1.0.0 |
rDSN provides flexible configuration so that developers can combine and configure the modules differently to enable different scenarios. All modules are loaded by dsn.svchost, a common process runner in rDSN, with the given configuration file. The following table lists some examples (dsn.core is always required and is therefore omitted from the Modules column).
Scenarios | Modules | Config | Demo |
---|---|---|---|
logic correctness development | dsn.app.simple_kv + dsn.tools.emulator + dsn.tools.common | config | todo |
logic correctness with failure | dsn.app.simple_kv + dsn.tools.emulator + dsn.tools.common | config | todo |
performance tuning | dsn.app.simple_kv + dsn.tools.common | config | todo |
progressive performance tuning | dsn.app.simple_kv + dsn.tools.common + dsn.tools.emulator | config | todo |
Paxos enabled stateful service | dsn.app.simple_kv + dsn.tools.common + dsn.tools.emulator + dsn.dist.uri.resolver + dsn.dist.service.meta_server + dsn.dist.service.stateful.type1 | config | todo |
There are many more possibilities. rDSN provides a web portal that enables quick deployment of these scenarios in a cluster and allows easy operation through simple clicks as well as rich visualization. Deployment scenarios are defined here, and developers can add more on demand.
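
To make the scenario table more concrete, the following is a minimal Python sketch of how a scenario can be thought of as a set of modules with dsn.core implicitly added. It is an illustration only: it is not rDSN's configuration format or API, and the names `SCENARIOS`, `modules_for`, and `shared_modules` are hypothetical helpers introduced here. Only the module names and scenario names come from the tables above.

```python
# Illustrative model of the scenario table; NOT rDSN's real config format.
# SCENARIOS, modules_for, and shared_modules are hypothetical names.

CORE = "dsn.core"  # always required, so the table omits it

SCENARIOS = {
    "logic correctness development": [
        "dsn.app.simple_kv", "dsn.tools.emulator", "dsn.tools.common",
    ],
    "performance tuning": [
        "dsn.app.simple_kv", "dsn.tools.common",
    ],
    "Paxos enabled stateful service": [
        "dsn.app.simple_kv", "dsn.tools.common", "dsn.tools.emulator",
        "dsn.dist.uri.resolver", "dsn.dist.service.meta_server",
        "dsn.dist.service.stateful.type1",
    ],
}


def modules_for(scenario):
    """Return the full module list for a scenario, with dsn.core first."""
    listed = SCENARIOS[scenario]
    return [CORE] + [m for m in listed if m != CORE]


def shared_modules(first, second):
    """Return the sorted modules that two scenarios have in common."""
    return sorted(set(modules_for(first)) & set(modules_for(second)))


if __name__ == "__main__":
    for name in SCENARIOS:
        print(name, "->", " + ".join(modules_for(name)))
```

The sketch shows the point of the table: the same application module (dsn.app.simple_kv) appears in every scenario, and only the surrounding tool and framework modules change.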
rDSN borrows ideas from many research works, both our own and others', and tries to make them real in production in a coherent way; we greatly appreciate the researchers who did this work.
rDSN is available on Windows and Linux under the MIT open source license. You can use the “issues” tab on GitHub to report bugs.