Core Concepts
Understand how distributed locks work in SyncGuard.
Architecture & Design Decisions
Curious about why things work this way? See specs/adrs.md for architectural decision records explaining the rationale behind key design choices.
Lock Lifecycle
Every lock follows a three-phase lifecycle:
- Acquire: Request exclusive access with a unique key
- Execute: Run your critical section while holding the lock
- Release: Free the lock for others (or let the TTL expire)
// Automatic lifecycle management
await lock(
async () => {
// Execute phase (critical section)
},
{ key: "resource:123" },
);
// Manual lifecycle control with automatic cleanup (Node.js ≥20)
{
await using lock = await backend.acquire({
key: "resource:123",
ttlMs: 30000,
});
if (lock.ok) {
// Execute phase
// Lock automatically released on scope exit
}
}
Contention Handling
If another process holds the lock, acquisition returns { ok: false, reason: "locked" } (manual mode) or retries automatically (auto mode).
Crash Safety
Locks expire via TTL even if your process crashes. No manual cleanup required.
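The contention and crash-safety behavior described above can be illustrated with a toy in-memory lock table. This is only a sketch under stated assumptions: `InMemoryLocks` and its explicit `nowMs` clock argument are hypothetical and not part of SyncGuard, and a real backend performs these steps atomically in Redis, PostgreSQL, or Firestore.

```typescript
import { randomBytes } from "node:crypto";

type AcquireResult =
  | { ok: true; lockId: string; expiresAtMs: number }
  | { ok: false; reason: "locked" };

// Hypothetical toy store: one entry per key, guarded by an expiry time.
class InMemoryLocks {
  private held = new Map<string, { lockId: string; expiresAtMs: number }>();

  acquire(key: string, ttlMs: number, nowMs: number): AcquireResult {
    const current = this.held.get(key);
    // A live (unexpired) entry means another holder owns the lock.
    if (current && current.expiresAtMs > nowMs) {
      return { ok: false, reason: "locked" };
    }
    // No entry, or the previous holder's TTL elapsed: the key is free.
    const lockId = randomBytes(16).toString("base64url");
    const expiresAtMs = nowMs + ttlMs;
    this.held.set(key, { lockId, expiresAtMs });
    return { ok: true, lockId, expiresAtMs };
  }
}

const locks = new InMemoryLocks();
// t = 0s: the first caller gets the lock, then "crashes" without releasing it.
const first = locks.acquire("resource:123", 30_000, 0);
// t = 10s: the lock is still live, so a second caller is turned away.
const second = locks.acquire("resource:123", 30_000, 10_000);
// t = 30s: the TTL has elapsed, so the key can be acquired again.
const third = locks.acquire("resource:123", 30_000, 30_000);

console.log(first.ok, second.ok, third.ok); // true false true
```

Passing the clock in as an argument keeps the sketch deterministic; a real backend would rely on server-side time instead.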
Lock Keys
Keys identify the resource being locked. Choose keys that uniquely represent your protected operation:
Good keys:
`payment:${paymentId}` // Unique per payment
`job:daily-report:${date}` // Unique per day
`deploy:${environment}` // One deploy at a time per env
`webhook:${eventId}`; // Idempotent webhook handling
Bad keys:
"lock" // Too generic, serializes everything
`user:${userId}`; // Too broad, blocks all user operations
Math.random().toString(); // Random keys defeat the purpose
Key constraints:
- Maximum 512 bytes after UTF-8 encoding
- Automatically normalized to NFC form
- Hashed when prefixed keys exceed backend limits (transparent to you)
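The key constraints above can be sketched as two small helpers. Everything here is an illustration rather than SyncGuard's actual code: `normalizeKey` and `storageKey` are hypothetical names, the order of normalization and length check is an assumption, and SHA-256 with a caller-supplied backend limit stands in for whatever hashing scheme the library really uses.

```typescript
import { createHash } from "node:crypto";

const MAX_KEY_BYTES = 512;

// Hypothetical helper: normalize to NFC, then enforce the UTF-8 byte limit.
function normalizeKey(key: string): string {
  const normalized = key.normalize("NFC");
  const byteLength = Buffer.byteLength(normalized, "utf8");
  if (byteLength > MAX_KEY_BYTES) {
    throw new RangeError(`Key is ${byteLength} bytes; the limit is ${MAX_KEY_BYTES}`);
  }
  return normalized;
}

// Hypothetical helper: prefix the key, and hash the result only when it is
// too long for the backend. SHA-256 and the limit are illustrative choices.
function storageKey(prefix: string, key: string, backendMaxBytes: number): string {
  const full = `${prefix}:${normalizeKey(key)}`;
  if (Buffer.byteLength(full, "utf8") <= backendMaxBytes) return full;
  const digest = createHash("sha256").update(full, "utf8").digest("base64url");
  return `${prefix}:${digest}`;
}

// "e" + combining acute accent (U+0301) composes to the single code point U+00E9.
console.log(normalizeKey("cafe\u0301") === "caf\u00e9"); // true
console.log(storageKey("syncguard", "payment:123", 1000)); // syncguard:payment:123
// A 400-byte key exceeds the 64-byte limit, so a 43-character digest replaces it.
console.log(storageKey("syncguard", "k".repeat(400), 64).length); // 53
```

Normalizing before measuring matters because the same visible text can have different byte lengths in composed and decomposed form.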
Namespacing
Backend prefixes prevent cross-app collisions:
- Redis: "syncguard:payment:123"
- PostgreSQL: row in "syncguard_locks" table
- Firestore: document ID in "locks" collection
Ownership & Lock IDs
Every lock gets a unique lockId (22-character base64url string, 128 bits of entropy). Only the owner can release or extend the lock.
const result = await backend.acquire({ key: "resource:123", ttlMs: 30000 });
if (result.ok) {
const { lockId } = result;
// Only this lockId can release/extend this specific lock
await backend.extend({ lockId, ttlMs: 30000 });
await backend.release({ lockId });
}
Why lock IDs matter:
- Idempotency: Same lockId can't accidentally release someone else's lock
- Concurrent safety: Two processes acquiring locks on the same key get different lock IDs
- Explicit ownership: Operations require proof of ownership via lockId
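To make the "22-character base64url string, 128 bits of entropy" figure concrete, here is a minimal sketch of how such an identifier could be produced with Node's built-in crypto module. `newLockId` is a hypothetical name, and this is not necessarily how SyncGuard generates its IDs.

```typescript
import { randomBytes } from "node:crypto";

// 16 random bytes = 128 bits. Unpadded base64url encodes 6 bits per
// character, so 128 bits need ceil(128 / 6) = 22 characters.
function newLockId(): string {
  return randomBytes(16).toString("base64url");
}

const id = newLockId();
console.log(id.length); // 22
console.log(/^[A-Za-z0-9_-]{22}$/.test(id)); // true
```

Because each ID is drawn independently at random, two processes acquiring the same key will, for all practical purposes, never receive the same `lockId`.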
Checking ownership. Use the helper functions for better discoverability:
import { owns, getById } from "syncguard";
// Quick boolean check - simple and clear
if (await owns(backend, lockId)) {
console.log("Still own the lock");
}
// Detailed info - includes expiration and fence tokens
const info = await getById(backend, lockId);
if (info) {
console.log(`Expires in ${info.expiresAtMs - Date.now()}ms`);
console.log(`Fence token: ${info.fence}`);
}
TTL & Expiration
Locks expire automatically after ttlMs milliseconds. This prevents orphaned locks when processes crash.
Choosing TTL:
// Short critical sections (default: 30s)
await lock(
async () => {
// Your work
},
{ key: "quick-task", ttlMs: 30000 },
);
// Long-running batch jobs
await lock(
async () => {
// Your work
},
{ key: "daily-report", ttlMs: 300000 },
); // 5 minutes
Guidelines:
- Too short ⚠️: Lock expires mid-work → potential race conditions
- Too long ⚠️: Crashed processes block others for longer
- Sweet spot ✅: 2-3x your expected work duration
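The 2-3x guideline is simple arithmetic, sketched below. `suggestTtlMs` is a hypothetical helper, not a SyncGuard function, and the default factor of 2.5 is just the midpoint of the suggested range.

```typescript
// Hypothetical helper: scale the expected work duration by a safety factor.
function suggestTtlMs(expectedWorkMs: number, safetyFactor = 2.5): number {
  if (safetyFactor < 2 || safetyFactor > 3) {
    throw new RangeError("safetyFactor should stay within the 2-3x guideline");
  }
  return Math.ceil(expectedWorkMs * safetyFactor);
}

console.log(suggestTtlMs(10_000)); // 25000 (10s of work, 25s TTL)
console.log(suggestTtlMs(120_000, 2)); // 240000 (2min of work, 4min TTL)
```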
Extending locks (for work that takes longer than expected):
// With automatic cleanup (Node.js ≥20)
{
await using lock = await backend.acquire({
key: "batch:report",
ttlMs: 60000,
});
if (lock.ok) {
// TypeScript narrows lock to include handle methods after ok check
await processFirstBatch();
// Extend lock before it expires
await lock.extend(60000); // Reset to 60s from now
await processSecondBatch();
// Lock automatically released
}
}
TTL Replacement Behavior
extend() replaces the TTL entirely; it doesn't add to the remaining time. Extending with ttlMs: 60000 resets expiration to 60 seconds from now, not from the original acquisition.
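A short worked example of the replacement rule, using plain arithmetic rather than the library itself. The function name `expiryAfterExtend` and the timestamps are illustrative assumptions.

```typescript
// Replacement semantics: the new expiry counts from the moment extend() runs.
function expiryAfterExtend(nowMs: number, ttlMs: number): number {
  return nowMs + ttlMs;
}

const acquiredAtMs = 0;
const initialExpiryMs = acquiredAtMs + 60_000; // lock acquired with a 60s TTL
const extendCalledAtMs = 45_000; // extend(60000) is called 45s in

const replacedExpiryMs = expiryAfterExtend(extendCalledAtMs, 60_000);
const additiveExpiryMs = initialExpiryMs + 60_000; // what extend() does NOT do

console.log(initialExpiryMs); // 60000
console.log(replacedExpiryMs); // 105000
console.log(additiveExpiryMs); // 120000
```

In this scenario the lock expires at 105 seconds, not 120, which is why a heartbeat should fire well before the current TTL runs out.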
Heartbeat pattern (for very long-running work):
{
await using lock = await backend.acquire({ key: "long-task", ttlMs: 60000 });
if (!lock.ok) throw new Error("Failed to acquire lock");
// TypeScript narrows lock to include handle methods after ok check
// Extend every 30s (half the TTL)
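// Throwing inside a timer callback cannot interrupt the work below,
// so signal lock loss through an AbortController instead
const lostLock = new AbortController();
const heartbeat = setInterval(async () => {
const extended = await lock.extend(60000).catch(() => ({ ok: false }));
if (!extended.ok) {
clearInterval(heartbeat);
lostLock.abort(new Error("Lost lock ownership"));
}
}, 30000);
try {
// doLongRunningWork should check the signal and stop once it is aborted
await doLongRunningWork(lostLock.signal);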
} finally {
clearInterval(heartbeat);
// Lock automatically released
}
}
Retry Strategy
When locks are contended, the lock() helper retries automatically using exponential backoff with jitter.
Default retry behavior:
await lock(
async () => {
// Your work function
},
{
key: "resource:123",
acquisition: {
maxRetries: 10, // Try up to 10 times (default)
retryDelayMs: 100, // Start with 100ms delay (default)
timeoutMs: 5000, // Give up after 5s total (default)
},
},
);
How it works:
- First attempt fails → wait 100ms
- Second attempt fails → wait ~200ms (± jitter)
- Third attempt fails → wait ~400ms (± jitter)
- Continue doubling until success or timeout
Jitter Prevents Thundering Herd
Jitter (50% randomization) prevents all processes from retrying simultaneously:
- Without jitter: 10 processes retry at exactly 100ms, 200ms, 400ms...
- With jitter: processes spread out between 50-150ms, 100-300ms, 200-600ms...
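The delay ranges listed above follow from doubling a base delay and scaling it by a random factor between 0.5 and 1.5. The sketch below reproduces that arithmetic; `retryDelayMs` and its injectable `random` parameter are hypothetical, and the exact formula SyncGuard uses internally may differ.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, spread by ±50%.
function retryDelayMs(
  attempt: number,
  baseMs = 100,
  random: () => number = Math.random,
): number {
  const exponential = baseMs * 2 ** attempt;
  const jitterFactor = 0.5 + random(); // uniformly in [0.5, 1.5)
  return Math.round(exponential * jitterFactor);
}

// Fixing the random source shows the lower bound and the midpoint of each range.
const lowest = [0, 1, 2].map((n) => retryDelayMs(n, 100, () => 0));
const middle = [0, 1, 2].map((n) => retryDelayMs(n, 100, () => 0.5));

console.log(lowest); // [ 50, 100, 200 ]
console.log(middle); // [ 100, 200, 400 ]
```

Accepting the random source as a parameter is only a convenience for making the example deterministic.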
Custom retry strategies:
// More patient (higher contention tolerance)
await lock(
async () => {
// Your work
},
{
key: "hot-resource",
acquisition: {
maxRetries: 20,
timeoutMs: 10000,
},
},
);
// Less patient (fail fast)
await lock(
async () => {
// Your work
},
{
key: "quick-check",
acquisition: {
maxRetries: 3,
timeoutMs: 1000,
},
},
);
// No retries (single attempt)
const result = await backend.acquire({ key: "resource:123", ttlMs: 30000 });
if (!result.ok) {
// Handle contention immediately
}
When acquisition fails:
import { LockError } from "syncguard";
try {
await lock(
async () => {
// Your work
},
{ key: "resource:123" },
);
} catch (error) {
if (error instanceof LockError && error.code === "AcquisitionTimeout") {
// Exceeded timeoutMs after all retries
console.log("Resource too contended, try again later");
}
}
Ownership Checking
Diagnostic Use Only
Ownership checks are for diagnostics, UI, and monitoring, NOT correctness guards. Never use check → mutate patterns. Correctness relies on the atomic ownership verification built into the release() and extend() operations.
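To see why verification belongs inside the mutating operation, consider this toy store, where release() checks ownership and liveness in the same step as the delete. It is a sketch only: `ToyStore` is hypothetical, it takes the key as an extra argument for simplicity (SyncGuard's release() needs only the `lockId`), and a real backend would run the equivalent logic as a single atomic script or transaction.

```typescript
type Entry = { lockId: string; expiresAtMs: number };

// Hypothetical in-memory stand-in for a lock backend.
class ToyStore {
  private byKey = new Map<string, Entry>();

  put(key: string, lockId: string, expiresAtMs: number): void {
    this.byKey.set(key, { lockId, expiresAtMs });
  }

  // Ownership and liveness are verified in the same step as the delete,
  // so there is no window between "check" and "mutate".
  release(key: string, lockId: string, nowMs: number): { ok: boolean } {
    const entry = this.byKey.get(key);
    if (!entry || entry.lockId !== lockId || entry.expiresAtMs <= nowMs) {
      return { ok: false };
    }
    this.byKey.delete(key);
    return { ok: true };
  }
}

const store = new ToyStore();
// Process A holds the lock until t = 30s, then stalls past its TTL.
store.put("resource:123", "lock-A", 30_000);
// Process B acquires the same key after A's lock expired.
store.put("resource:123", "lock-B", 70_000);

// At t = 40s, A's stale release is rejected instead of deleting B's lock.
const staleRelease = store.release("resource:123", "lock-A", 40_000);
const ownerRelease = store.release("resource:123", "lock-B", 40_000);

console.log(staleRelease.ok, ownerRelease.ok); // false true
```

A separate ownership check before the release would have added nothing here, since the answer could change between the check and the release.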
Recommended approach: use the helper functions for clarity and discoverability.
Check if a resource is locked:
import { getByKey } from "syncguard";
const info = await getByKey(backend, "resource:123");
if (info) {
console.log(`Locked until ${new Date(info.expiresAtMs)}`);
console.log(`Fence token: ${info.fence}`);
} else {
console.log("Resource is available");
}
Check if you still own a lock:
import { owns, getById } from "syncguard";
// Simple boolean check
const stillOwned = await owns(backend, lockId);
if (!stillOwned) {
throw new Error("Lost lock ownership");
}
// Or get detailed information
const info = await getById(backend, lockId);
if (info) {
console.log(
`Still own the lock, expires in ${info.expiresAtMs - Date.now()}ms`,
);
}
Helper Functions vs Direct Method
The helpers (getByKey, getById, owns) provide better discoverability and clearer intent than calling backend.lookup() directly. They're the recommended approach for lock diagnostics. For advanced cases, you can still use backend.lookup({ key }) or backend.lookup({ lockId }) directly.
Security Note
Helpers return sanitized data with hashed keys/lockIds by default. For debugging with raw values, use getByKeyRaw() or getByIdRaw() helpers.
When to use ownership checking:
- ✅ Diagnostics: "Why is this resource locked?"
- ✅ Monitoring: Track lock expiration times
- ✅ Conditional logic: "Should I wait or skip?"
- ✅ UI display: Show lock status to users
When NOT to use ownership checking:
- ❌ Pre-checking before extend() or release() (operations are idempotent)
- ❌ Gating mutations (use fencing tokens instead, see Fencing Tokens)
- ❌ Polling for lock availability (use retry logic in lock() instead)