Implementing End-to-End Tracing with JWT Context in Node.js and Recoil Applications


A user reports a slow operation, but the logs show zero errors. The investigation begins. The backend team checks the API traces in Jaeger and finds that internal service latency is normal, with a P99 response time of 80ms. The problem doesn’t seem to be on the backend. The frontend team examines a browser performance recording and discovers that the network request isn’t even fired until 1.5 seconds after the button is clicked. This gap is an observability abyss. We have backend traces and frontend performance profiles, but they are disconnected. We can’t answer the most fundamental question: where exactly is time being spent across the entire lifecycle, from user click to data rendering?

This scenario is incredibly common in real-world projects. The root cause is the disconnect between frontend and backend monitoring systems. To build a complete picture of the user experience, we must link frontend actions with backend service calls in a single, unified trace. This article documents a complete, hands-on implementation: building end-to-end distributed tracing from scratch between a React/Recoil frontend and a Node.js backend. We’ll use OpenTelemetry as our standard, enrich traces with user identity information carried by JWTs, and ultimately visualize the entire user operation in Jaeger.

Initial Design and Technology Selection

Our goal: when a user triggers an action on the frontend, generate a globally unique traceId. This traceId and its context will travel with the API request to the backend. Upon receiving the request, the backend service will recognize this trace context and attach its own spans as children under the same trace.

Achieving this requires solving several core problems:

  1. Tracing Standard: How do we generate and propagate standards-compliant tracing data across heterogeneous frontend and backend environments?

    • Decision: OpenTelemetry (OTel). It provides a unified API and SDKs for both frontend JavaScript and backend Node.js, making it the de facto community standard.
  2. Frontend Context Management: In a React SPA, how do we generate, store, and automatically inject trace context into outgoing HTTP requests?

    • Initial Thought: Use React Context.
    • Final Decision: Our project already uses Recoil for state management, and holding authentication state (such as a JWT) is a common use case for it. The two concerns stay separate: Recoil manages the JWT, while the OTel Web SDK, through its ContextManager and HTTP instrumentation, handles the automatic injection of trace context.
  3. Cross-Boundary Context Propagation: How do we securely and reliably pass the frontend-generated trace context to the backend?

    • Option A: Custom HTTP Header, like X-Trace-Context. This is the most direct approach, but it needs hand-written injection and parsing on both sides, and non-standard headers risk being stripped by some gateways or proxies.

    • Option B: Standard Headers. The W3C Trace Context specification defines the standard traceparent and tracestate HTTP headers. This is OTel’s default propagation method and the best practice.

    • Option C: Embed in JWT Payload. This idea is tempting—encoding the traceId and spanId directly into the JWT. However, it was quickly rejected. JWTs typically have a longer lifecycle, whereas traces and spans are ephemeral. Re-issuing a JWT for every single request is unacceptable.

    • Final Decision: We’ll go with Option B, using the standard traceparent header. The JWT then has a different job: it carries stable user identity rather than trace context. On the backend, we verify the JWT, read claims such as userId or tenantId, and attach them as span attributes (Jaeger displays them as tags). This makes traces far easier to read in business terms and lets us filter for all operations performed by a specific user.

  4. Backend Implementation: How does the Node.js service receive the trace context and link it to its own traces?

    • Decision: Use the OpenTelemetry SDK for Node.js. Its provided instrumentations automatically create spans for common modules like Express and HTTP and handle the incoming traceparent header out of the box.
  5. Visualization and Storage: Where do we send the collected trace data?

    • Decision: Jaeger. It’s simple to deploy, has an intuitive UI, and is perfect as a starting point for a tracing backend. We’ll use Docker Compose to spin up a local Jaeger instance quickly.
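
The traceparent header chosen in Option B has a fixed shape defined by W3C Trace Context: version, trace ID, parent span ID, and flags, joined by hyphens. The following is a minimal sketch of parsing it by hand, purely to illustrate what the OTel propagator reads on the backend. The function name parseTraceparent is hypothetical, and in this project the SDK does this work automatically.

```javascript
// Illustrative only: OTel's W3C propagator performs this parsing for you.
// Format: <version>-<trace-id>-<parent-id>-<trace-flags>, all lowercase hex.
const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header) {
  const match = TRACEPARENT_RE.exec(String(header).trim());
  if (!match) return null;

  const [, version, traceId, parentSpanId, flags] = match;
  // The spec reserves version "ff" and treats all-zero IDs as invalid.
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(parentSpanId)) {
    return null;
  }
  return {
    version,
    traceId,
    parentSpanId,
    // Bit 0 of the flags byte is the "sampled" flag.
    sampled: (parseInt(flags, 16) & 0x01) === 0x01,
  };
}

// Example value taken from the W3C Trace Context specification.
const parsed = parseTraceparent(
  '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01'
);
console.log(parsed);
```

The traceId field is what ties frontend and backend spans together, and the sampled flag tells downstream services whether the caller decided to record the trace.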

The data flow for the entire architecture looks like this:

sequenceDiagram
    participant User
    participant ReactApp as React App (Recoil)
    participant OTelWebSDK as OTel Web SDK
    participant NodeAPI as Node.js API
    participant OTelNodeSDK as OTel Node.js SDK
    participant Jaeger

    User->>ReactApp: Clicks "Get Data" button
    ReactApp->>OTelWebSDK: Creates Root Span (e.g., "fetch-user-data")
    OTelWebSDK->>ReactApp: Generates traceparent header
    ReactApp->>NodeAPI: Makes API request to /api/data (with traceparent and JWT)
    NodeAPI->>OTelNodeSDK: Receives request, auto-parses traceparent header
    OTelNodeSDK->>NodeAPI: Creates Server Span, inheriting traceId
    NodeAPI->>NodeAPI: Middleware parses JWT, gets userId
    NodeAPI->>OTelNodeSDK: Adds userId as a tag to the current Span
    NodeAPI->>NodeAPI: Executes business logic (e.g., query DB)
    OTelNodeSDK->>NodeAPI: Auto-creates Child Span for downstream calls
    NodeAPI-->>ReactApp: Returns API response
    ReactApp->>ReactApp: Renders data
    ReactApp->>OTelWebSDK: Ends Root Span
    OTelWebSDK-->>Jaeger: Asynchronously exports frontend trace data
    OTelNodeSDK-->>Jaeger: Asynchronously exports backend trace data
    Jaeger->>Jaeger: Combines data into a complete trace

Step-by-Step Implementation: The Code is Key

1. Launch a Jaeger Instance

Create a docker-compose.yml file in your project root. This is the fastest way to start an All-in-One Jaeger instance, suitable for development and testing.

# docker-compose.yml
version: '3.8'
services:
  jaeger:
    image: jaegertracing/all-in-one:1.48
    container_name: jaeger
    ports:
      - "6831:6831/udp"      # Agent (Thrift UDP)
      - "16686:16686"        # Jaeger UI
      - "14268:14268"        # Collector (HTTP)
      - "4317:4317"          # OTLP gRPC receiver
      - "4318:4318"          # OTLP HTTP receiver
    environment:
      - COLLECTOR_OTLP_ENABLED=true

Run docker-compose up -d to start it. You should now be able to see the Jaeger UI at http://localhost:16686.

2. Configure the Node.js Backend Service

Our backend is a simple Express application with two endpoints: login and data fetching.

Project Structure:

/backend
  - package.json
  - server.js
  - tracer.js       # Core OTel configuration
  - auth.js         # JWT-related logic

Install Dependencies:

npm install express jsonwebtoken cors
npm install @opentelemetry/sdk-node @opentelemetry/api \
            @opentelemetry/auto-instrumentations-node \
            @opentelemetry/exporter-trace-otlp-http \
            @opentelemetry/sdk-trace-base \
            @opentelemetry/resources \
            @opentelemetry/semantic-conventions

tracer.js: OpenTelemetry Initialization

This is the core tracing configuration file for the backend. In a real project, this setup would be more complex, potentially including dynamic sampling, multiple exporters, etc.

// backend/tracer.js
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const {
  getNodeAutoInstrumentations,
} = require('@opentelemetry/auto-instrumentations-node');
const {
  Resource,
} = require('@opentelemetry/resources');
const {
  SemanticResourceAttributes,
} = require('@opentelemetry/semantic-conventions');
// Samplers are exported from sdk-trace-base (a dependency of sdk-node).
const { AlwaysOnSampler } = require('@opentelemetry/sdk-trace-base');

// Key Point 1: Define the service resource information.
// This identifier shows up as the Service Name in the Jaeger UI.
const resource = new Resource({
  [SemanticResourceAttributes.SERVICE_NAME]: 'my-backend-service',
  [SemanticResourceAttributes.SERVICE_VERSION]: '1.0.0',
});

// Key Point 2: Configure the Exporter.
// This specifies where to send trace data. We're using the OTLP HTTP protocol
// to send to our local Jaeger instance on port 4318.
const traceExporter = new OTLPTraceExporter({
  url: 'http://localhost:4318/v1/traces',
});

// Key Point 3: Configure the Sampler.
// In production, sampling everything is too expensive.
// For this demo, we use AlwaysOnSampler. In production, consider TraceIdRatioBasedSampler.
const sampler = new AlwaysOnSampler();

const sdk = new NodeSDK({
  resource,
  sampler,
  traceExporter,
  // Key Point 4: Auto-instrumentation.
  // This is where OTel shines. It automatically creates spans for popular libraries
  // (express, http, pg, redis, etc.).
  instrumentations: [getNodeAutoInstrumentations({
    // Disable instrumentations we don't need to reduce overhead.
    '@opentelemetry/instrumentation-fs': {
      enabled: false,
    },
  })],
});

// Graceful shutdown
process.on('SIGTERM', () => {
  sdk
    .shutdown()
    .then(() => console.log('Tracing terminated'))
    .catch((error) => console.log('Error terminating tracing', error))
    .finally(() => process.exit(0));
});

// Start the SDK
try {
  sdk.start();
  console.log('Tracing initialized');
} catch (error) {
  console.log('Error initializing tracing', error);
}

module.exports = sdk;

auth.js: JWT Logic and Trace Enrichment

This is where JWT and tracing come together. We define a middleware that not only verifies the JWT but also extracts user information from it and attaches it to the current span.

// backend/auth.js
const jwt = require('jsonwebtoken');
const { trace, context } = require('@opentelemetry/api');

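// Demo fallback only; in a real deployment, load the secret from the environment.
const JWT_SECRET = process.env.JWT_SECRET || 'my-super-secret-key-for-demo';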
const tracer = trace.getTracer('my-backend-service-auth');

function generateToken(userId) {
  return jwt.sign({ userId, role: 'user' }, JWT_SECRET, { expiresIn: '1h' });
}

// Core: Middleware to authenticate JWT and enrich the trace
function authenticateAndEnrichTrace(req, res, next) {
  const authHeader = req.headers.authorization;
  if (!authHeader || !authHeader.startsWith('Bearer ')) {
    return res.status(401).json({ error: 'Unauthorized: No token provided' });
  }

  const token = authHeader.split(' ')[1];

  try {
    const decoded = jwt.verify(token, JWT_SECRET);
    req.user = decoded;

    // Key Point: Attach user info to the currently active span.
    // OTel's auto-instrumentation has already created an active span for us.
    const currentSpan = trace.getSpan(context.active());
    if (currentSpan) {
      currentSpan.setAttribute('enduser.id', decoded.userId);
      currentSpan.setAttribute('enduser.role', decoded.role);
    }

    next();
  } catch (error) {
    // If the token is invalid, we record a span event.
    const currentSpan = trace.getSpan(context.active());
    if (currentSpan) {
        currentSpan.addEvent('jwt-verification-failed', {
            'error.message': error.message,
        });
    }
    return res.status(401).json({ error: 'Unauthorized: Invalid token' });
  }
}

module.exports = {
  generateToken,
  authenticateAndEnrichTrace,
};
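
One detail the middleware glosses over is what happens when a claim is missing, or when a token carries more than you want in your traces. A minimal sketch of a helper that maps verified claims to span attributes is shown below. The name claimsToSpanAttributes and the allow-list are assumptions for illustration, not part of the original code. It copies only known, non-empty claims, so fields such as email addresses do not reach the tracing backend by accident.

```javascript
// Hypothetical helper: map verified JWT claims to span attributes.
// Only allow-listed claims are copied; everything else is ignored.
const CLAIM_TO_ATTRIBUTE = {
  userId: 'enduser.id',
  role: 'enduser.role',
};

function claimsToSpanAttributes(claims) {
  const attributes = {};
  for (const [claim, attributeName] of Object.entries(CLAIM_TO_ATTRIBUTE)) {
    const value = claims ? claims[claim] : undefined;
    // Skip missing or empty values rather than recording "undefined".
    if (value !== undefined && value !== null && value !== '') {
      attributes[attributeName] = String(value);
    }
  }
  return attributes;
}

// Example with a decoded payload shaped like the one generateToken() signs.
const attrs = claimsToSpanAttributes({
  userId: 'user_alice',
  role: 'user',
  email: 'alice@example.com', // not allow-listed, so it is dropped
});
console.log(attrs);
```

In the middleware, the result could be applied in one call with currentSpan.setAttributes(attrs) instead of two separate setAttribute calls.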

server.js: The Express App

Note that require('./tracer'); must be the very first line to ensure all other modules are patched by OTel when they are required.

// backend/server.js
// MUST be imported first so OTel can correctly instrument other modules.
require('./tracer');

const express = require('express');
const cors = require('cors');
const { trace } = require('@opentelemetry/api');
const { generateToken, authenticateAndEnrichTrace } = require('./auth');

const app = express();
const port = 4000;

app.use(express.json());
app.use(cors()); // Allow cross-origin requests

const tracer = trace.getTracer('my-backend-service-server');

// Login endpoint: No auth required, returns a JWT
app.post('/api/login', (req, res) => {
  const { username } = req.body;
  // In a real app, you'd validate against a database here.
  if (username) {
    const userId = `user_${username.toLowerCase()}`;
    const token = generateToken(userId);
    res.json({ token });
  } else {
    res.status(400).json({ error: 'Username is required' });
  }
});

// Protected data endpoint
app.get('/api/data', authenticateAndEnrichTrace, (req, res) => {
  // This span becomes a child of whichever span is active for the request.
  tracer.startActiveSpan('fetch-data-from-db', (span) => {
    span.setAttribute('db.system', 'postgresql');
    span.setAttribute('db.statement', 'SELECT * FROM users WHERE id = ?');
    // Record the start event before the simulated delay, not after it.
    span.addEvent('db-query-start');

    // Simulate DB query latency
    setTimeout(() => {
      const data = {
        message: `Hello, ${req.user.userId}! This is protected data.`,
        timestamp: new Date().toISOString(),
      };

      span.addEvent('db-query-end');
      span.end();
      res.json(data);
    }, 200); // Simulate 200ms DB delay
  });
});

app.listen(port, () => {
  console.log(`Backend server listening at http://localhost:${port}`);
});

Start the backend service: node server.js.
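
Before wiring up the frontend, it can help to confirm that the backend joins a trace it did not start. The sketch below builds a syntactically valid traceparent value with random IDs. makeTraceparent is a hypothetical helper, and the commented request assumes the server above is running and that you already hold a token from /api/login.

```javascript
const { randomBytes } = require('node:crypto');

// Build a traceparent with random IDs and the sampled flag set.
function makeTraceparent() {
  const traceId = randomBytes(16).toString('hex'); // 32 hex chars
  const spanId = randomBytes(8).toString('hex');   // 16 hex chars
  return `00-${traceId}-${spanId}-01`;
}

const traceparent = makeTraceparent();
console.log(traceparent);

// Hypothetical manual check (needs the server running and a valid token):
// fetch('http://localhost:4000/api/data', {
//   headers: { traceparent, Authorization: `Bearer ${token}` },
// });
```

If propagation works, searching Jaeger for the trace ID portion of the printed value should show the backend spans recorded under that ID.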

3. Frontend React App Configuration (Recoil + OTel)

Project Structure:

/frontend
  - package.json
  - src/
    - App.js
    - state.js        # Recoil atoms
    - tracing.js      # Core OTel Web configuration
    - api.js          # Encapsulated API calls (axios)

Install Dependencies:

npx create-react-app frontend
cd frontend
npm install recoil axios
npm install @opentelemetry/sdk-trace-web @opentelemetry/sdk-trace-base \
            @opentelemetry/api \
            @opentelemetry/context-zone \
            @opentelemetry/instrumentation \
            @opentelemetry/instrumentation-fetch \
            @opentelemetry/instrumentation-xml-http-request \
            @opentelemetry/exporter-trace-otlp-http \
            @opentelemetry/resources \
            @opentelemetry/semantic-conventions

src/tracing.js: OTel Web SDK Initialization

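The frontend OTel config is similar to the backend’s, but two details matter. First, the choice of contextManager: ZoneContextManager is recommended for SPAs because it propagates context across async callbacks. Second, the instrumentation must match the HTTP client. Axios sends requests through XMLHttpRequest in the browser by default, so XMLHttpRequest has to be instrumented alongside fetch, or the traceparent header is never injected.

// frontend/src/tracing.js
import { WebTracerProvider } from '@opentelemetry/sdk-trace-web';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { SimpleSpanProcessor } from '@opentelemetry/sdk-trace-base';
import { ZoneContextManager } from '@opentelemetry/context-zone';
import { FetchInstrumentation } from '@opentelemetry/instrumentation-fetch';
// Required for axios (npm install @opentelemetry/instrumentation-xml-http-request).
import { XMLHttpRequestInstrumentation } from '@opentelemetry/instrumentation-xml-http-request';
import { registerInstrumentations } from '@opentelemetry/instrumentation';
import { Resource } from '@opentelemetry/resources';
import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions';

const resource = new Resource({
  [SemanticResourceAttributes.SERVICE_NAME]: 'my-frontend-service',
});

// The browser posts spans cross-origin, so the OTLP endpoint must allow
// requests from the app's origin (or sit behind a same-origin proxy).
const exporter = new OTLPTraceExporter({
  url: 'http://localhost:4318/v1/traces',
});

const provider = new WebTracerProvider({
  resource: resource,
});

// SimpleSpanProcessor exports each span as it ends; fine for a demo,
// while BatchSpanProcessor is the usual choice in production.
provider.addSpanProcessor(new SimpleSpanProcessor(exporter));

// ZoneContextManager is crucial for handling async operations in web apps.
provider.register({
  contextManager: new ZoneContextManager(),
});

// Key: cross-origin URLs that should receive the traceparent header.
const propagateTraceHeaderCorsUrls = [/http:\/\/localhost:4000\/.*/];

// Auto-instrument fetch and XMLHttpRequest with W3C Trace Context headers.
registerInstrumentations({
  instrumentations: [
    new FetchInstrumentation({ propagateTraceHeaderCorsUrls }),
    new XMLHttpRequestInstrumentation({ propagateTraceHeaderCorsUrls }),
  ],
});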

Import this at the very top of src/index.js to ensure it’s initialized early: import './tracing';
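
The propagateTraceHeaderCorsUrls option decides which cross-origin requests receive the traceparent header. The sketch below approximates that matching in plain JavaScript so you can check a pattern before shipping it. shouldPropagate is a hypothetical name, and the real check lives inside the OTel instrumentation, so treat this as an approximation rather than the library's code.

```javascript
// Approximation of how a URL is checked against the configured patterns.
// Assumption: string patterns must match exactly, RegExp patterns are tested.
function shouldPropagate(url, patterns, appOrigin) {
  // Same-origin requests always carry trace headers.
  if (new URL(url).origin === appOrigin) return true;

  const list = Array.isArray(patterns) ? patterns : [patterns];
  return list.some((pattern) =>
    typeof pattern === 'string' ? pattern === url : pattern.test(url)
  );
}

const patterns = [/http:\/\/localhost:4000\/.*/];
const appOrigin = 'http://localhost:3000';

console.log(shouldPropagate('http://localhost:4000/api/data', patterns, appOrigin));
console.log(shouldPropagate('https://cdn.example.com/lib.js', patterns, appOrigin));
console.log(shouldPropagate('http://localhost:3000/local.json', patterns, appOrigin));
```

Keeping the pattern narrow matters: an extra header on a cross-origin request triggers a CORS preflight, and a third-party host that does not allow traceparent may reject the request.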

src/state.js: Recoil Atom for JWT Management

// frontend/src/state.js
import { atom } from 'recoil';

export const jwtTokenState = atom({
  key: 'jwtTokenState',
  // Initialize from localStorage to persist login.
  default: localStorage.getItem('jwt_token') || null,
});

src/api.js: Encapsulated API Calls

We create an API client whose request interceptor adds the JWT to the Authorization header. The token itself is passed in from Recoil state by the component tree.

// frontend/src/api.js
import axios from 'axios';

const apiClient = axios.create({
  baseURL: 'http://localhost:4000',
});

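// Use an interceptor to dynamically add the Authorization header.
// Track the interceptor id so a token change replaces the old interceptor
// instead of stacking a new one on top of it (stale tokens would linger).
let authInterceptorId = null;

export const setupApiInterceptor = (token) => {
  if (authInterceptorId !== null) {
    apiClient.interceptors.request.eject(authInterceptorId);
  }
  authInterceptorId = apiClient.interceptors.request.use(
    (config) => {
      if (token) {
        config.headers.Authorization = `Bearer ${token}`;
      }
      return config;
    },
    (error) => Promise.reject(error)
  );
};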

export default apiClient;

src/App.js: The Business Component

This is where everything comes together. The component uses Recoil state, calls the API, and uses the OTel API to manually create a top-level span that wraps the entire user operation.

// frontend/src/App.js
import React, { useState, useEffect } from 'react';
import {
  RecoilRoot,
  useRecoilState,
  useRecoilValue,
  useSetRecoilState,
} from 'recoil';
import { trace } from '@opentelemetry/api';
import { jwtTokenState } from './state';
import apiClient, { setupApiInterceptor } from './api';

// Get a tracer instance.
const tracer = trace.getTracer('my-frontend-service-app');

function AuthComponent() {
  const [username, setUsername] = useState('alice');
  const setJwtToken = useSetRecoilState(jwtTokenState);

  const handleLogin = async () => {
    try {
      const response = await apiClient.post('/api/login', { username });
      const token = response.data.token;
      localStorage.setItem('jwt_token', token);
      setJwtToken(token);
    } catch (error) {
      console.error('Login failed:', error);
      alert('Login failed');
    }
  };

  return (
    <div>
      <h2>Login</h2>
      <input
        type="text"
        value={username}
        onChange={(e) => setUsername(e.target.value)}
      />
      <button onClick={handleLogin}>Login</button>
    </div>
  );
}

function DataComponent() {
  const [data, setData] = useState(null);
  const [error, setError] = useState(null);
  const setJwtToken = useSetRecoilState(jwtTokenState);

  const handleFetchData = () => {
    setData(null);
    setError(null);
    
    // Key: Create a Root Span to wrap the entire operation.
    tracer.startActiveSpan('ui.operation.fetch-data', async (span) => {
      try {
        span.setAttribute('component', 'DataComponent');
        
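        // The HTTP instrumentation from tracing.js auto-creates a child span and
        // injects headers. Axios uses XMLHttpRequest in the browser, so the XHR
        // instrumentation must be registered for this to happen.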
        const response = await apiClient.get('/api/data');
        setData(JSON.stringify(response.data, null, 2));

        span.setStatus({ code: 1 }); // 1 = OK
        span.addEvent('data-fetch-successful');
      } catch (err) {
        setError(err.response ? err.response.data.error : err.message);
        span.setStatus({ code: 2, message: err.message }); // 2 = ERROR
        span.recordException(err);
      } finally {
        // Ensure the span is always closed.
        span.end();
      }
    });
  };

  const handleLogout = () => {
    localStorage.removeItem('jwt_token');
    setJwtToken(null);
  };

  return (
    <div>
      <h2>Protected Data</h2>
      <button onClick={handleFetchData}>Fetch Protected Data</button>
      <button onClick={handleLogout}>Logout</button>
      {data && <pre>{data}</pre>}
      {error && <p style={{ color: 'red' }}>Error: {error}</p>}
    </div>
  );
}

function MainApp() {
  const jwtToken = useRecoilValue(jwtTokenState);

  // Update the axios interceptor whenever the token changes.
  useEffect(() => {
    setupApiInterceptor(jwtToken);
  }, [jwtToken]);

  return (
    <div style={{ padding: '20px' }}>
      <h1>Full-Stack Tracing Demo</h1>
      {jwtToken ? <DataComponent /> : <AuthComponent />}
    </div>
  );
}

function App() {
  return (
    <RecoilRoot>
      <MainApp />
    </RecoilRoot>
  );
}

export default App;

Now, start the frontend application: npm start.

The Payoff: Inspecting the Full Trace in Jaeger

Workflow:

  1. Open the frontend app at http://localhost:3000.
  2. Enter a username and click “Login”.
  3. Click “Fetch Protected Data”.
  4. Open the Jaeger UI at http://localhost:16686.
  5. Select my-frontend-service from the Service dropdown and click “Find Traces”.

You will see a trace named ui.operation.fetch-data. Click on it, and a complete, cross-stack waterfall diagram of the call chain will appear:

  1. Top-Level Span (Root): ui.operation.fetch-data, from my-frontend-service. This is the span we created manually in handleFetchData, representing the entire user operation lifecycle.
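  2. Child Span (Frontend): HTTP GET, also from my-frontend-service. This was created automatically by OTel’s browser HTTP instrumentation and measures the time from request start to response received. Because axios goes through XMLHttpRequest, this span appears only when the XHR instrumentation is registered; FetchInstrumentation alone does not cover axios.
  3. Child Span (Backend): GET /api/data, from my-backend-service. This was created automatically by the Node.js auto-instrumentations: the HTTP instrumentation creates the server span, and the Express instrumentation supplies the route name. It inherited the traceId from the incoming traceparent header, which connects the two halves of the trace. Depending on configuration, Express middleware and handler spans may also appear beneath it.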
  4. Grandchild Span (Backend): fetch-data-from-db, from my-backend-service. This is the span we created manually in our backend business logic to measure the database query time.

Click on the backend span (GET /api/data) and look at the Tags tab. You’ll see the attributes we added via our JWT middleware, including enduser.id: user_alice, which confirms that the identity enrichment works. Now we know not only the latency distribution of an entire operation but also which user it belongs to. The “1.5-second black box” from the opening scenario is no longer hidden: any delay before the request leaves the browser shows up as the gap between the start of the root span and the start of its HTTP child span.
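
To make the opening scenario concrete, the delay before the request leaves the browser is the difference between the root span's start and the HTTP child span's start. Here is a small sketch of that arithmetic over a simplified span list. The object shape and the timing values are invented for illustration and are not real Jaeger output.

```javascript
// Hypothetical, simplified spans (times in milliseconds since the click).
const spans = [
  { name: 'ui.operation.fetch-data', service: 'my-frontend-service', start: 0, end: 1790 },
  { name: 'HTTP GET', service: 'my-frontend-service', start: 1500, end: 1780 },
  { name: 'GET /api/data', service: 'my-backend-service', start: 1530, end: 1740 },
  { name: 'fetch-data-from-db', service: 'my-backend-service', start: 1535, end: 1735 },
];

function breakdown(spanList) {
  const byName = Object.fromEntries(spanList.map((s) => [s.name, s]));
  const root = byName['ui.operation.fetch-data'];
  const http = byName['HTTP GET'];
  const server = byName['GET /api/data'];

  return {
    // Time spent in the browser before the request was even sent.
    preRequestDelayMs: http.start - root.start,
    // Round trip as the browser saw it.
    networkAndBackendMs: http.end - http.start,
    // Time the backend spent handling the request.
    backendMs: server.end - server.start,
    // Time after the response arrived (state update, rendering).
    postResponseMs: root.end - http.end,
  };
}

console.log(breakdown(spans));
```

Each subtraction compares two timestamps taken on the same machine, which sidesteps clock skew between the browser and the server.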

Limitations and Future Iterations

While this setup solves the core problem, there are considerations for a production environment.

First, sampling strategy. We used AlwaysOnSampler, which generates a large volume of data and noticeable overhead in a high-traffic system. Production environments should sample more selectively: either head-based sampling with TraceIdRatioBasedSampler (wrapped in a ParentBasedSampler so the backend honors the sampled flag that arrives in traceparent), or tail-based sampling in a component such as the OpenTelemetry Collector, which keeps only “interesting” traces (e.g., those with errors or high latency).
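
Ratio-based head sampling works across services because the decision is a pure function of the trace ID, so every service that sees the same ID reaches the same verdict. A minimal sketch of that idea follows. It is not the algorithm TraceIdRatioBasedSampler uses internally, and shouldSample is a hypothetical name.

```javascript
// Illustrative deterministic sampler: NOT OTel's internal algorithm.
// Interprets the last 8 hex chars of the trace ID as a number in [0, 2^32).
function shouldSample(traceId, ratio) {
  if (ratio >= 1) return true;
  if (ratio <= 0) return false;
  const bucket = parseInt(traceId.slice(-8), 16);
  return bucket < Math.floor(ratio * 0x100000000);
}

const traceId = '4bf92f3577b34da6a3ce929d0e0e4736';

// The same ID always yields the same answer, on any service.
console.log(shouldSample(traceId, 0.1), shouldSample(traceId, 0.1));
// A higher ratio never drops a trace that a lower ratio kept.
console.log(shouldSample(traceId, 0.01), shouldSample(traceId, 0.5));
```

In the real SDK, the equivalent step would be passing a TraceIdRatioBasedSampler from @opentelemetry/sdk-trace-base as the sampler option in tracer.js.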

Second, frontend performance overhead. Including the Web OTel SDK adds to the frontend bundle size and introduces some runtime overhead. You should evaluate its impact on Core Web Vitals and may need to create a custom build with only the necessary instrumentations.

Third, context propagation robustness. ZoneContextManager handles most async scenarios, but context can still be lost. The best-known case is code that runs after an await when the bundle ships native async/await (an ES2017 or later build target), which zone.js cannot patch. For complex applications, test propagation thoroughly and, where needed, pass the context explicitly to the code that starts child spans.

Finally, deeper integration of JWT and tracing. We opted not to pass trace context in the JWT, but we can think in the other direction: upon login, generate a unique sessionId and place it in the JWT. On the backend, attach this sessionId as a tag to all spans. This way, even without relying on a traceId, you could filter in Jaeger by sessionId to see all operation traces from a single user session, which is invaluable for analyzing user behavior patterns.
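
A minimal sketch of that session idea follows, assuming Node's built-in crypto.randomUUID for the identifier and session.id as the attribute name. buildSessionClaims and sessionAttributes are hypothetical helpers, and signing would still go through jwt.sign as in auth.js.

```javascript
const { randomUUID } = require('node:crypto');

// Hypothetical: claims to sign at login, with one sessionId per login.
function buildSessionClaims(userId) {
  return { userId, role: 'user', sessionId: randomUUID() };
}

// Hypothetical: attributes to attach to spans once the token is verified.
function sessionAttributes(decodedClaims) {
  return decodedClaims.sessionId
    ? { 'session.id': decodedClaims.sessionId }
    : {};
}

const claims = buildSessionClaims('user_alice');
console.log(claims.sessionId, sessionAttributes(claims));
```

Because the identifier is generated once per login and travels inside the token, every request in that session carries the same value without any extra frontend code.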

