Modeling Large Data
TL;DR: When state includes large data like images or binary blobs, cloning becomes expensive. Use
[SharedState]to share large immutable data by reference across clones, or use seeded random data (store seed + length, regenerate bytes on demand) when content doesn't need to be meaningful.
The Problem: Cloning Copies Everything
Accordant clones state on every transition. When you call .ThenState(...), you get a fresh copy of the state object. This keeps the original state immutable and makes reasoning about transitions straightforward.
But here's the issue: everything gets copied. If your state includes large data — images, binary blobs, documents — each clone duplicates all of it.
[State]
public partial class ImageStoreState
{
public Dictionary<string, ImageState> Images { get; set; } = new();
}
[State]
public partial class ImageState
{
public string Name { get; set; }
public List<byte> Content { get; set; } // Could be megabytes!
}
Every time an operation transitions state, that entire Content list gets deep-cloned. With a few images, you're copying megabytes on every transition. The old copies eventually get garbage collected, but you're still churning through memory and slowing things down.
Solution 1: The [SharedState] Pattern
For large, immutable data, mark the property with [SharedState]. This tells Accordant: "Don't deep-clone this — just copy the reference."
using System.IO.Hashing;
[State]
public partial class ImageState
{
public string Name { get; set; }
[SharedState(Fingerprint = nameof(ContentFingerprint))]
public List<byte> Content { get; set; }
private string ContentFingerprint(List<byte> content) =>
content == null ? "" : XxHash64.HashToUInt64(content.ToArray()).ToString();
}
How It Works
Cloning by reference: When the state is cloned, the
Contentproperty is copied by reference, not deep-cloned. Both the old and new state point to the sameList<byte>in memory.Fingerprinting for equality: Accordant still needs to know when two states are "the same" for test generation. The
Fingerprintmethod computes a hash that represents the property's value. States with the same fingerprint are considered equal.Immutability assumption: This only works if the shared data is treated as immutable. If you modify the content in one state, all states sharing that reference will see the change — which breaks the model.
When to Use [SharedState]
- Large binary data: Images, files, documents, serialized blobs
- Immutable reference types: Data that's set once and never modified
- Expensive to clone: Anything where deep-copying is a performance problem
The Fingerprint Method
The Fingerprint parameter is required for types that Accordant can't automatically hash (like List<byte>). The method must:
- Take the property type as its single parameter
- Return a
string - Be deterministic (same input → same output)
Use State.ComputeHash64() — the same XxHash64-based function Accordant uses internally:
using System.IO.Hashing;
private string ContentFingerprint(byte[] content)
{
if (content == null) return "";
return XxHash64.HashToUInt64(content).ToString();
}
// Or use the State helper method for string-based hashing:
private string ContentFingerprint(List<byte> content)
{
if (content == null) return "";
return State.ComputeHash64(Convert.ToBase64String(content.ToArray())).ToString();
}
XxHash64 is fast and consistent with how Accordant computes state equality internally.
Solution 2: Seeded Random Data
When you control the data generation, you can avoid storing large data entirely. Instead, store just a seed and length — enough information to regenerate the exact same bytes on demand.
The Idea
A seeded random number generator produces deterministic output: given the same seed, you get the same sequence of bytes every time. So instead of storing megabytes of image data, store two numbers:
[State]
public partial class ImageState
{
public string Name { get; set; }
public int ContentSeed { get; set; } // Seed for random generator
public int ContentLength { get; set; } // How many bytes to generate
}
When you need the actual bytes — to send in a request or validate a response — regenerate them:
public static byte[] GenerateContent(int seed, int length)
{
var random = new Random(seed);
var bytes = new byte[length];
random.NextBytes(bytes);
return bytes;
}
Full Example
[State]
public partial class ImageStoreState
{
public Dictionary<string, ImageState> Images { get; set; } = new();
}
[State]
public partial class ImageState
{
public string Name { get; set; }
public int ContentSeed { get; set; }
public int ContentLength { get; set; }
public byte[] GetContent() => GenerateContent(ContentSeed, ContentLength);
public static byte[] GenerateContent(int seed, int length)
{
var random = new Random(seed);
var bytes = new byte[length];
random.NextBytes(bytes);
return bytes;
}
}
Upload Operation
spec.Operation<UploadRequest, ApiResult>("Upload", (request, state) =>
{
return Expect.That<ApiResult>(r => r.IsSuccess)
.ThenState<ImageStoreState>(s => s.Images[request.Name] = new ImageState
{
Name = request.Name,
ContentSeed = request.Seed,
ContentLength = request.Length
});
});
Download Operation
spec.Operation<string, ApiResult<byte[]>>("Download", (name, state) =>
{
if (!state.Images.TryGetValue(name, out var image))
return Expect.That<ApiResult<byte[]>>(r => r.IsNotFound).SameState();
var expectedContent = image.GetContent();
return Expect.That<ApiResult<byte[]>>(r =>
r.IsSuccess &&
r.Data.SequenceEqual(expectedContent))
.SameState();
});
Test Inputs
var inputs = new InputSet
{
// Upload with seed=42, 1KB of data
spec.GetOperation("Upload").With(new UploadRequest("img1", Seed: 42, Length: 1024)),
// Upload with seed=123, 10KB of data
spec.GetOperation("Upload").With(new UploadRequest("img2", Seed: 123, Length: 10240)),
spec.GetOperation("Download").With("img1"),
spec.GetOperation("Download").With("img2"),
spec.GetOperation("Download").With("nonexistent"),
};
When to Use Seeded Data
- Testing with arbitrary binary data: When the specific bytes don't matter, just that they're consistent
- Large payloads: When you want to test with megabytes of data without storing megabytes in state
- Reproducibility: Same seed always generates the same content — tests are deterministic
Comparing the Approaches
| Approach | When to Use | Trade-offs |
|---|---|---|
[SharedState] |
Real binary data from external sources; data must persist in state | Requires immutability discipline; fingerprint method needed |
| Seeded Random Data | Arbitrary test data; content doesn't need to be meaningful | Only works when you control data generation |
Example: Uploading Real Images
If you're testing with actual image files (PNG, JPEG) where the content matters, [SharedState] is the right choice:
using System.IO.Hashing;
[State]
public partial class ImageState
{
public string Name { get; set; }
[SharedState(Fingerprint = nameof(ContentFingerprint))]
public byte[] Content { get; set; }
private string ContentFingerprint(byte[] content) =>
content == null ? "" : XxHash64.HashToUInt64(content).ToString();
}
Example: Testing with Arbitrary Binary Data
If you just need "some bytes" to test upload/download behavior, seeded data keeps state tiny:
[State]
public partial class ImageState
{
public string Name { get; set; }
public int ContentSeed { get; set; } // 4 bytes instead of megabytes
public int ContentLength { get; set; } // 4 bytes
public byte[] GetContent()
{
var random = new Random(ContentSeed);
var bytes = new byte[ContentLength];
random.NextBytes(bytes);
return bytes;
}
}
Summary
When your spec needs to track large data:
First, ask if you need the actual bytes. If you just need "some data" for testing, use seeded random data — store
(seed, length)and regenerate bytes on demand.If you must store real content, use
[SharedState]. Mark large, immutable properties so they're shared by reference instead of deep-cloned.Provide a fingerprint method using XxHash64. Accordant needs to know when two states are "equal" for test generation — use the same hash function it uses internally.
Treat shared data as immutable. Never modify shared state in place — always create new instances.