Data Generation & Realism Strategies

When mock payloads do not mirror production schemas, integration tests pass locally and break in staging — costing the engineers who discover the gap hours of archaeology. Frontend developers, QA engineers, and platform teams all share this pain: the further mock data drifts from the real API contract, the less value local development delivers.

Where data generation fits in the local-dev stack

Mock data generation sits between your API specification and every consumer that runs locally: developer workstations, ephemeral CI runners, end-to-end test suites, and Storybook component sandboxes. The diagram below shows the four layers and how data flows through them.

Understanding this stack matters because a gap at any layer cascades downstream. A type mismatch in the spec produces invalid fixtures; an unseeded generator produces irreproducible fixtures; a mock server without validation silently serves stale payloads to every consumer.

Core concept 1 — Schema-driven payload synthesis

Establishing type-safe, contract-compliant payloads begins with a formalised specification. Schema-driven data generation ensures every mock response respects required fields, data types, enum boundaries, and nested object relationships — eliminating drift during early development cycles before a backend exists.

The json-schema-faker CLI reads your OpenAPI file and writes fixtures to an output directory. A thin config file makes it environment-aware:

// schema-faker.config.js
module.exports = {
  schemaPath: process.env.OPENAPI_SPEC || './specs/api-v2.yaml',
  outputDir: './mocks/generated',
  strictMode: process.env.NODE_ENV !== 'development',
  options: {
    useDefaultValue: true,
    requiredOnly: false,
    ignoreMissingRefs: false
  }
};

Wire up npm scripts so developers and CI share the same command surface:

{
  "scripts": {
    "generate:mocks": "json-schema-faker-cli --config schema-faker.config.js",
    "generate:mocks:ci": "CI=true npm run generate:mocks"
  }
}

Why this matters for the request interception pattern: an interceptor that returns a hand-crafted payload is a liability; one that returns a spec-generated fixture is a contract enforcer. Running generate:mocks as a pre-commit hook or in CI on spec changes catches mismatches before they reach test suites.

Trade-offs for schema-driven generation vs. hand-authored fixtures:

Dimension	Schema-driven generation	Hand-authored fixtures
Schema alignment	Always current (re-run on spec change)	Manual — drifts silently
Edge-case control	Requires explicit `x-faker` annotations	Full control per file
Setup cost	One-time tool integration	Zero tooling
Maintenance burden	Low (automated)	High (grows with API surface)
CI integration	Script-friendly	Commit-required

Use schema-driven generation as the default for all response shapes, and supplement with hand-authored fixtures only for specific edge cases (empty arrays, maximum-length strings, error payloads) that require deliberate authorship.
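To make the split between generated defaults and hand-authored edge cases concrete, here is a minimal sketch. It is a tiny stand-in for a real generator, not json-schema-faker's API: `MiniSchema`, `mulberry32`, and `generateFromSchema` are hypothetical names, and only `type`, `enum`, `minimum`/`maximum`, `items`, `properties`, and `required` are handled.

```typescript
// Illustrative only: a very small schema-driven generator (hypothetical names).
type MiniSchema =
  | { type: 'string'; enum?: string[] }
  | { type: 'integer'; minimum?: number; maximum?: number }
  | { type: 'boolean' }
  | { type: 'array'; items: MiniSchema; maxItems?: number }
  | { type: 'object'; properties: Record<string, MiniSchema>; required?: string[] };

// Small seeded PRNG so the same seed always yields the same payload.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function generateFromSchema(schema: MiniSchema, rand: () => number): unknown {
  switch (schema.type) {
    case 'string':
      // Enum values are respected; free-form strings get a short random token.
      return schema.enum
        ? schema.enum[Math.floor(rand() * schema.enum.length)]
        : 'str-' + Math.floor(rand() * 1e6).toString(36);
    case 'integer': {
      const min = schema.minimum ?? 0;
      const max = schema.maximum ?? 1000;
      return min + Math.floor(rand() * (max - min + 1));
    }
    case 'boolean':
      return rand() < 0.5;
    case 'array': {
      const length = Math.floor(rand() * ((schema.maxItems ?? 3) + 1));
      return Array.from({ length }, () => generateFromSchema(schema.items, rand));
    }
    case 'object': {
      const out: Record<string, unknown> = {};
      for (const [key, child] of Object.entries(schema.properties)) {
        const isRequired = schema.required?.includes(key) ?? false;
        // Required fields are always present; optional ones about half the time.
        if (isRequired || rand() < 0.5) out[key] = generateFromSchema(child, rand);
      }
      return out;
    }
  }
}

const userSchema: MiniSchema = {
  type: 'object',
  required: ['id', 'role'],
  properties: {
    id: { type: 'integer', minimum: 1, maximum: 500 },
    role: { type: 'string', enum: ['admin', 'editor', 'viewer'] },
    nickname: { type: 'string' }
  }
};

// Generated base payload, then a hand-authored edge case layered on top.
const generatedUser = generateFromSchema(userSchema, mulberry32(42)) as Record<string, unknown>;
const maxLengthNickname = { ...generatedUser, nickname: 'x'.repeat(255) };
```

The generated object covers the routine shape, while the override carries the one value that needs deliberate authorship.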

Core concept 2 — Deterministic seed management

Flaky tests and unpredictable UI states often trace back to unseeded randomness in mock data. Deterministic seed management anchors every faker call to a reproducible sequence, producing identical datasets across developer laptops, ephemeral CI runners, and staging preview environments while preserving statistically realistic value distributions.

// utils/mock-seed.ts
import { faker } from '@faker-js/faker';

const SEED = parseInt(process.env.MOCK_SEED ?? '42', 10);
faker.seed(SEED);

export interface MockUser {
  id: string;
  username: string;
  email: string;
  role: 'admin' | 'editor' | 'viewer';
  createdAt: string;
}

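/**
 * Produces a deterministic user payload.
 * Output is identical on every run when MOCK_SEED is fixed and calls happen
 * in the same order. Values come from the shared seeded sequence; the index
 * argument is accepted for call-site clarity but does not select a record.
 */
export function generateUser(_index: number): MockUser {
  return {
    id: faker.string.uuid(),
    username: faker.internet.username(),
    email: faker.internet.email(),
    role: faker.helpers.arrayElement(['admin', 'editor', 'viewer'] as const),
    // Pin refDate (an arbitrary fixed date here): by default faker.date.past()
    // counts back from the current time, which would vary between runs.
    createdAt: faker.date
      .past({ years: 2, refDate: '2024-01-01T00:00:00.000Z' })
      .toISOString()
  };
}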

Configure the seed at the environment boundary — never inside application code:

# .env.local  (developer workstation — fixed seed for instant repro)
MOCK_SEED=123456789

# .github/workflows/integration.yml
env:
  MOCK_SEED: ${{ github.run_id }}   # Unique per run, fully traceable in logs

The two-seed strategy — fixed locally, per-run-ID in CI — gives you the best of both worlds: a developer can paste a run ID from a failed CI build into MOCK_SEED locally and reproduce the exact failure in seconds.

Seed propagation checklist:

faker.seed() is called exactly once, at module initialisation, not inside individual generator functions
MOCK_SEED is documented in .env.example with a comment explaining the two-seed strategy
CI pipeline logs the active MOCK_SEED value so failures can be reproduced
Snapshot tests pin the seed in beforeAll and reset it in afterAll
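One way to cover the "log the active seed" item is to resolve and validate the seed in a single place. This is a sketch under assumptions: `resolveSeed` and `formatSeedLog` are hypothetical helpers, the default of 42 mirrors the fallback used earlier, and the environment is passed in as a plain object (in practice, `process.env`).

```typescript
// Hypothetical helpers for resolving and logging the mock seed.
type EnvLike = Record<string, string | undefined>;

interface ResolvedSeed {
  seed: number;
  source: 'MOCK_SEED' | 'default';
}

const DEFAULT_SEED = 42;

function resolveSeed(env: EnvLike): ResolvedSeed {
  const raw = env.MOCK_SEED?.trim();
  if (raw === undefined || raw === '') {
    return { seed: DEFAULT_SEED, source: 'default' };
  }
  // Reject values such as "12abc" that parseInt would silently truncate.
  if (!/^\d+$/.test(raw)) {
    throw new Error(`MOCK_SEED must be a non-negative integer, got "${raw}"`);
  }
  const seed = Number(raw);
  if (!Number.isSafeInteger(seed)) {
    throw new Error(`MOCK_SEED is too large to represent exactly: "${raw}"`);
  }
  return { seed, source: 'MOCK_SEED' };
}

// One log line that can be pasted back into MOCK_SEED to reproduce a run.
function formatSeedLog(env: EnvLike): string {
  const { seed, source } = resolveSeed(env);
  return `[mocks] MOCK_SEED=${seed} (source: ${source})`;
}
```

Printing `formatSeedLog(process.env)` once at mock startup puts the value in the CI log, where it can be copied into a local `.env.local`.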

Core concept 3 — Conditional logic and CI/CD integration

Production APIs rarely return identical payloads for identical requests. Realistic simulation requires context-aware routing — parsing request bodies, validating auth headers, and evaluating query parameters before returning a response. This logic must survive the same CI gates that run your unit and integration tests.

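The following TypeScript MSW handler checks the Authorization header, applies field projection, and paginates results in a single endpoint (it verifies that a bearer token is present; it does not evaluate roles):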

// mocks/handlers/users.ts
import { http, HttpResponse } from 'msw';
import { generateUser } from '../../utils/mock-seed';

const PAGE_SIZE = 20;

export const userHandlers = [
  http.get('/api/v1/users', ({ request }) => {
    const auth = request.headers.get('Authorization');
    if (!auth?.startsWith('Bearer ')) {
      return HttpResponse.json({ error: 'Unauthorized' }, { status: 401 });
    }

    const url = new URL(request.url);
    // Fall back to page 1 when the query value is missing, non-numeric, or below 1.
    const page = Math.max(1, parseInt(url.searchParams.get('page') ?? '1', 10) || 1);
    const fields = url.searchParams.get('fields')?.split(',');

    const users = Array.from({ length: PAGE_SIZE }, (_, i) =>
      generateUser((page - 1) * PAGE_SIZE + i)
    );

    const projected = fields
      ? users.map(u =>
          Object.fromEntries(
            fields.filter(f => f in u).map(f => [f, u[f as keyof typeof u]])
          )
        )
      : users;

    return HttpResponse.json({
      data: projected,
      meta: { page, pageSize: PAGE_SIZE, total: 500 }
    });
  }),

  http.get('/api/v1/users/:id', ({ params, request }) => {
    const auth = request.headers.get('Authorization');
    if (!auth?.startsWith('Bearer ')) {
      return HttpResponse.json({ error: 'Unauthorized' }, { status: 401 });
    }

    const id = Array.isArray(params.id) ? params.id[0] : params.id;
    const numericId = parseInt(id, 10);
    if (isNaN(numericId) || numericId < 1 || numericId > 500) {
      return HttpResponse.json({ error: 'Not found' }, { status: 404 });
    }

    return HttpResponse.json(generateUser(numericId));
  })
];

CI integration surface: register the handlers in your test setup file (vitest.setup.ts or jest.setup.ts) and start the MSW server (setupServer from msw/node) before the suite runs. In dockerized mock environments, the same handlers can be served from a Node process inside a sidecar container, making them available to every service in docker-compose.yml rather than only the single frontend under test.

# docker-compose.mock.yml
services:
  mock-api:
    build:
      context: .
      dockerfile: Dockerfile.mock
    environment:
      - MOCK_SEED=${MOCK_SEED:-42}
      - NETWORK_PROFILE=${NETWORK_PROFILE:-local}
    ports:
      - "3001:3001"
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:3001/health"]
      interval: 10s
      timeout: 5s
      retries: 3

This aligns with mock lifecycle management principles: the mock service has a well-defined start, health-check, and teardown surface that CI can orchestrate reliably.

Core concept 4 — Network realism and operational concerns

Data realism extends beyond payload structure to delivery characteristics. Applications must handle latency spikes, partial failures, and bandwidth constraints. Integrating network condition simulation into local mock layers ensures frontend and mobile clients are tested against realistic transport-layer behaviour before those conditions appear in production.

// mocks/middleware/throttle.ts
import { delay } from 'msw';

type NetworkProfile = 'local' | 'staging' | 'poor_network' | 'offline';

const LATENCY_PROFILES: Record<NetworkProfile, { min: number; max: number }> = {
  local:        { min: 50,   max: 200  },
  staging:      { min: 300,  max: 800  },
  poor_network: { min: 1000, max: 3000 },
  offline:      { min: 5000, max: 5000 }
};

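/** Resolves the profile on every call so tests can override NETWORK_PROFILE. */
function currentProfile(): { min: number; max: number } {
  const name = process.env.NETWORK_PROFILE ?? 'local';
  // Unknown names fall back to 'local' instead of crashing on an undefined lookup.
  return Object.prototype.hasOwnProperty.call(LATENCY_PROFILES, name)
    ? LATENCY_PROFILES[name as NetworkProfile]
    : LATENCY_PROFILES.local;
}

/** Call at the top of an MSW handler to apply profile-matched network delay. */
export async function applyNetworkDelay(): Promise<void> {
  const { min, max } = currentProfile();
  // Uniform jitter in [min, max); a fixed value when min === max.
  const jitter = Math.floor(Math.random() * (max - min)) + min;
  await delay(jitter);
}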

/** 5% random failure gate — toggled by SIMULATE_FAILURES=true */
export function shouldSimulateFailure(): boolean {
  return process.env.SIMULATE_FAILURES === 'true' && Math.random() < 0.05;
}

Apply both utilities inside any handler that needs transport realism:

// mocks/handlers/products.ts
import { http, HttpResponse } from 'msw';
import { applyNetworkDelay, shouldSimulateFailure } from '../middleware/throttle';

export const productHandlers = [
  http.get('/api/v1/products', async () => {
    await applyNetworkDelay();

    if (shouldSimulateFailure()) {
      return HttpResponse.json(
        { error: 'Service temporarily unavailable' },
        { status: 503 }
      );
    }

    return HttpResponse.json([
      { id: '1', name: 'Widget Pro', price: 49.99, stock: 142 },
      { id: '2', name: 'Gadget Plus', price: 89.99, stock: 0 }
    ]);
  })
];

Configure the network profile at the environment boundary:

# .env.local
NETWORK_PROFILE=local
SIMULATE_FAILURES=false

# GitHub Actions — degraded-network integration job
env:
  NETWORK_PROFILE: poor_network
  SIMULATE_FAILURES: "true"

Operational health check: expose a GET /health endpoint in your mock server that returns { status: "ok", seed: MOCK_SEED, profile: NETWORK_PROFILE }. Docker Compose healthcheck and CI wait-for scripts can gate dependent services on this endpoint, preventing race conditions at startup.
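The payload described above can be built and checked with two small pure functions. This is only a sketch: `buildHealthPayload` and `isHealthy` are hypothetical names, the defaults mirror the Compose file's fallbacks (42 and local), and the environment is passed in explicitly rather than read from `process.env`.

```typescript
// Hypothetical helpers for the mock server's /health endpoint.
type EnvLike = Record<string, string | undefined>;

interface HealthPayload {
  status: 'ok';
  seed: string;
  profile: string;
}

// Server side: the body to return from GET /health.
function buildHealthPayload(env: EnvLike): HealthPayload {
  return {
    status: 'ok',
    seed: env.MOCK_SEED ?? '42',
    profile: env.NETWORK_PROFILE ?? 'local'
  };
}

// Client side: what a wait-for script might check before starting dependants.
function isHealthy(body: unknown): body is HealthPayload {
  if (typeof body !== 'object' || body === null) return false;
  const candidate = body as Record<string, unknown>;
  return (
    candidate.status === 'ok' &&
    typeof candidate.seed === 'string' &&
    typeof candidate.profile === 'string'
  );
}
```

Echoing the seed and profile in the health response also gives CI a second place to confirm which configuration the mock server actually started with.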

Decision guide — Choosing a data generation approach

Use this matrix when deciding how to generate mock data for a given scenario:

Scenario	Recommended approach	Why
New API surface, no backend yet	Schema-driven generation from OpenAPI draft	Stays aligned as the spec evolves
Regression test suite requiring snapshot stability	Deterministic seed + checked-in generated fixtures	Tests are reproducible without a running generator
Auth, pagination, or field projection logic	Dynamic MSW handler with conditional branching	Fixtures cannot express request-context dependencies
Mobile app under poor connectivity	Network delay middleware (`NETWORK_PROFILE=poor_network`)	Exercises timeout handling, retry logic, and skeleton states
Multi-service integration test in CI	Docker Compose mock sidecar with health checks	Shared across all consumers; no per-service MSW setup
Contract drift detection	AJV CI gate against spec-exported JSON Schema	Catches mismatches before they reach staging

When a scenario spans multiple rows — for example, a mobile integration test with auth and poor connectivity — layer the approaches: start with schema-driven fixtures, serve them through a dynamic handler that validates auth headers, and wrap the handler with network delay middleware.

Common failure modes and mitigations

1. Fixtures pass CI but fail staging because they were not regenerated after a field was renamed in the spec.

Mitigation: run generate:mocks as a step in the CI job that also runs tsc or OpenAPI linting. If the spec hash changes but fixtures are not regenerated, fail the build.
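The hash comparison behind that mitigation can be sketched as follows. It assumes the generator writes a small manifest recording the digest of the spec it ran against; the manifest shape and the `checkFixtureFreshness` helper are hypothetical, and computing the digest itself (for example SHA-256 of the spec file) is left to the caller.

```typescript
// Hypothetical manifest written next to the generated fixtures.
interface FixtureManifest {
  specHash: string;    // digest of the spec the fixtures were generated from
  generatedAt: string; // ISO timestamp, informational only
}

type FreshnessResult = { ok: true } | { ok: false; reason: string };

// Compares the current spec digest with the one recorded at generation time.
function checkFixtureFreshness(
  currentSpecHash: string,
  manifest: FixtureManifest | null
): FreshnessResult {
  if (manifest === null) {
    return { ok: false, reason: 'No fixture manifest found; run generate:mocks.' };
  }
  if (manifest.specHash !== currentSpecHash) {
    return {
      ok: false,
      reason:
        `Fixtures were generated from spec ${manifest.specHash}, ` +
        `but the current spec is ${currentSpecHash}; run generate:mocks.`
    };
  }
  return { ok: true };
}
```

A CI step would exit non-zero and print `reason` whenever `ok` is false, which turns stale fixtures into a build failure instead of a staging surprise.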

2. faker.seed() is called inside a factory function, which resets the sequence on every call so each generated record is identical.

Mitigation: call faker.seed() exactly once at module load time. Audit with a lint rule that disallows faker.seed inside non-setup files.
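The effect is easy to see with any seeded generator. The sketch below uses a small stand-in PRNG (`mulberry32`) rather than faker, and both factory helpers are hypothetical; the same reasoning applies to calling faker.seed() inside a factory.

```typescript
// Stand-in seeded PRNG; faker's internal generator differs, the behaviour shown does not.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Anti-pattern: reseeding on every call restarts the sequence each time.
function makeReseedingFactory(seed: number): () => number {
  return () => {
    const rand = mulberry32(seed); // fresh generator per call
    return Math.floor(rand() * 1_000_000);
  };
}

// Recommended: seed once, then keep drawing from the same sequence.
function makeSeededOnceFactory(seed: number): () => number {
  const rand = mulberry32(seed);
  return () => Math.floor(rand() * 1_000_000);
}

const reseeding = makeReseedingFactory(42);
const seededOnce = makeSeededOnceFactory(42);

// Every "record" from the reseeding factory is the same value.
const duplicated = [reseeding(), reseeding(), reseeding()];
// The seeded-once factory advances through the sequence, and a second
// factory built with the same seed replays that sequence exactly.
const varied = [seededOnce(), seededOnce(), seededOnce()];
```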

3. MSW fails silently in test environments, so all requests fall through to the real network.

Mitigation: set onUnhandledRequest: 'error' when starting MSW (server.listen() in Node test runners, worker.start() in the browser) so unmatched requests fail tests loudly rather than making real HTTP calls. This is covered in depth under advanced MSW handler patterns.

4. Network delay middleware is not reset between test cases, causing timeout flakiness.

Mitigation: read process.env.NETWORK_PROFILE at handler invocation time, not at module load time. This allows per-test environment overrides without module cache invalidation.

5. Generated fixtures for large list endpoints grow big enough to slow test runs and inflate memory use.

Mitigation: cap generated list sizes at 50 items in development (use a MAX_MOCK_ITEMS env var) and use cursor-based pagination responses so tests do not load the full dataset. The response shaping techniques section covers how to structure paginated responses correctly.
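A minimal sketch of the cap plus cursor pagination follows. The helper names are hypothetical, and the cursor is simply a stringified offset for readability; a real API would more likely return an opaque cursor.

```typescript
interface Page<T> {
  data: T[];
  nextCursor: string | null; // null when there are no further pages
}

// Reads a MAX_MOCK_ITEMS-style value, falling back to 50 when unset or invalid.
function resolveMaxItems(raw: string | undefined, fallback = 50): number {
  const parsed = Number.parseInt(raw ?? '', 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Caps the dataset first, then serves one page starting at the cursor offset.
function paginate<T>(
  items: readonly T[],
  cursor: string | null,
  limit: number,
  maxItems: number
): Page<T> {
  const capped = items.slice(0, maxItems);
  const start = cursor === null ? 0 : Number.parseInt(cursor, 10);
  if (!Number.isInteger(start) || start < 0) {
    throw new Error(`Invalid cursor: ${cursor}`);
  }
  const end = Math.min(start + limit, capped.length);
  return {
    data: capped.slice(start, end),
    nextCursor: end < capped.length ? String(end) : null
  };
}

// Example: 120 generated items, capped at the default of 50, served 20 at a time.
const allItems = Array.from({ length: 120 }, (_, i) => ({ id: i + 1 }));
const maxItems = resolveMaxItems(undefined);
const firstPage = paginate(allItems, null, 20, maxItems);
```

Inside a handler, `cursor` would come from the request's query string and the `MAX_MOCK_ITEMS` value from the environment.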

6. Mock server port conflicts between local dev and CI parallel jobs.

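Mitigation: randomise the port via PORT=$(shuf -i 3000-4000 -n1), or publish only the container port in Docker Compose (ports: - "3001", with no fixed host port) so Docker assigns a free host port. Note that expose alone does not publish a port to the host. Pass the resolved URL to consumers via an env var.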

FAQ

What is the difference between static fixtures and schema-driven mock data?

Static fixtures are hand-authored JSON files committed to the repository. Schema-driven generation synthesises payloads from an OpenAPI or JSON Schema definition at build time, so every generated fixture reflects the current contract. When a field is renamed or a new required property is added, re-running the generator surfaces the gap immediately rather than waiting for a test to fail in staging.

Why do flaky tests often trace back to mock data randomness?

When @faker-js/faker is not seeded, each test run generates different values. A field that is sometimes null, sometimes a 300-character string, or sometimes a boundary integer causes assertions to pass intermittently. The failure is non-deterministic — it cannot be reproduced from the CI log alone. Seeding eliminates the variable.

How should the CI seed differ from the local developer seed?

Locally, a fixed seed (e.g. MOCK_SEED=42) is fastest to debug because the data is always the same. In CI, set the seed to the pipeline run ID (${{ github.run_id }} in GitHub Actions). This makes each run unique but fully reproducible: a developer can copy the run ID from a failed build log, set it as MOCK_SEED locally, and reproduce the exact dataset that broke the test.

When does network simulation belong in handlers versus a separate proxy layer?

Inline delay() middleware is appropriate when a single frontend project needs to test its own loading states and error boundaries. When multiple services share a mock stack — as in a dockerized mock environment — centralise throttle configuration in a proxy (WireMock, a local API gateway) so every consumer inherits the same transport profile without duplicating middleware. The proxy vs inline mocking strategies comparison covers this decision in detail.

How do I prevent mock data from drifting out of sync with the real API?

Run an AJV validation script against all generated fixture files on every CI build. Export your OpenAPI spec as JSON Schema, hash the spec in the CI job, and fail the build when fixtures were not regenerated after a spec change. Additionally, integrate mock lifecycle management practices so that generated fixtures are treated as derived artefacts — never edited by hand — and regenerated from the canonical spec on every spec bump.

Schema-Driven Data Generation — generating type-safe fixtures from OpenAPI and JSON Schema definitions
Deterministic Seed Management — locking randomisation sequences for reproducible CI and local debugging
Advanced MSW Handler Patterns — dynamic handlers, conditional branching, and request-body parsing
Response Shaping Techniques — structuring paginated, filtered, and error responses
API Mocking Fundamentals & Architecture — the broader interception, lifecycle, and proxy architecture this data layer feeds into
