Build Production-Ready Event-Driven Microservices with Go, NATS JetStream, and OpenTelemetry

golang

Build Production-Ready Event-Driven Microservices with Go, NATS JetStream, and OpenTelemetry

Learn to build production-ready event-driven microservices using Go, NATS JetStream & OpenTelemetry. Complete guide with resilience patterns & observability.

Sep 13, 2025

Build Production-Ready Event-Driven Microservices with Go, NATS JetStream, and OpenTelemetry

I’ve been thinking about how modern systems demand more than just functionality—they need to be resilient, observable, and scalable. That’s why I want to share my approach to building production-ready event-driven microservices using Go, NATS JetStream, and OpenTelemetry. Let’s walk through this together.

Event-driven architecture fundamentally changes how services communicate. Instead of direct HTTP calls, services publish events that others can react to. This loose coupling makes systems more flexible and scalable. But how do we ensure reliability when things can fail at multiple points?

NATS JetStream provides persistent messaging with at-least-once delivery guarantees. Here’s how I set up a basic event structure:

type Event struct {
    ID          string          `json:"id"`
    Type        string          `json:"type"`
    AggregateID string          `json:"aggregate_id"`
    Data        json.RawMessage `json:"data"`
    Timestamp   time.Time       `json:"timestamp"`
}

When configuring NATS JetStream, I always start with a proper stream setup. This ensures messages are stored and available for consumption even if services restart:

// Configure stream with persistence
_, err := js.AddStream(&nats.StreamConfig{
    Name:     "ORDERS",
    Subjects: []string{"orders.>"},
    MaxAge:   24 * time.Hour,
    Replicas: 3,
})

Have you ever wondered how to track requests across multiple services? OpenTelemetry solves this with distributed tracing. I integrate it early in the development process:

func StartSpan(ctx context.Context, name string) (context.Context, trace.Span) {
    tracer := otel.Tracer("order-service")
    return tracer.Start(ctx, name)
}

For the order service, I implement idempotent handlers that can process the same event multiple times without side effects. This is crucial for reliability:

func (s *OrderService) HandleOrderCreated(ctx context.Context, event Event) error {
    ctx, span := StartSpan(ctx, "HandleOrderCreated")
    defer span.End()

    // Check if we've processed this event before
    if s.store.HasProcessed(event.ID) {
        return nil // Idempotent handling
    }
    
    // Process order creation logic
    return s.store.SaveOrder(event)
}

Payment processing requires careful error handling and retries. I use exponential backoff with jitter to prevent thundering herd problems:

func ProcessPaymentWithRetry(ctx context.Context, payment Payment) error {
    backoff := NewExponentialBackoff(100*time.Millisecond, 30*time.Second)
    for attempt := 0; attempt < maxRetries; attempt++ {
        err := processPayment(ctx, payment)
        if err == nil {
            return nil
        }
        time.Sleep(backoff.Next())
    }
    return ErrPaymentFailed
}

What happens when services need to coordinate across multiple events? I implement sagas for distributed transactions. Each step emits events that trigger the next action:

func (s *OrderSaga) Start(ctx context.Context, order Order) error {
    // Emit events that will be handled by different services
    events := []Event{
        NewEvent("order.created", order.ID, order),
        NewEvent("payment.requested", order.ID, PaymentRequest{Amount: order.Amount}),
    }
    return s.publisher.PublishAll(ctx, events)
}

Observability isn’t just about debugging—it’s about understanding system behavior. I instrument everything with metrics and traces:

func InstrumentHandler(handler http.Handler) http.Handler {
    return otelhttp.NewHandler(handler, "http-server",
        otelhttp.WithMessageEvents(otelhttp.ReadEvents, otelhttp.WriteEvents),
    )
}

Deployment requires careful configuration. I use Docker with proper health checks and environment-specific settings:

services:
  order-service:
    image: orders:latest
    environment:
      - NATS_URL=nats://nats-cluster:4222
      - JAEGER_ENDPOINT=http://jaeger:14268/api/traces
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3

Testing event-driven systems requires simulating different failure scenarios. I create comprehensive test suites that verify both happy paths and edge cases:

func TestOrderSaga_Compensation(t *testing.T) {
    // Test that compensating actions are triggered when steps fail
    saga := NewOrderSaga(testPublisher, testStore)
    err := saga.Start(context.Background(), testOrder)
    // Verify compensation events are published
}

Building production-ready systems means anticipating failures. I implement circuit breakers to prevent cascading failures:

func NewPaymentClient() *PaymentClient {
    return &PaymentClient{
        circuitBreaker: gobreaker.NewCircuitBreaker(gobreaker.Settings{
            Name:        "payment-service",
            Timeout:     30 * time.Second,
            ReadyToTrip: func(counts gobreaker.Counts) bool {
                return counts.ConsecutiveFailures > 5
            },
        }),
    }
}

The real value emerges when all these components work together. Services remain responsive even during partial failures, and we can trace every operation across the system.

I’d love to hear about your experiences with event-driven architectures. What challenges have you faced, and how did you solve them? Share your thoughts in the comments below, and if you found this useful, please like and share with others who might benefit from these patterns.

Share: Facebook Twitter Reddit LinkedIn WhatsApp Telegram Pinterest Email Instagram

golang

Build Production-Ready Event-Driven Microservices with Go, NATS JetStream, and OpenTelemetry

Our Creations

We are on Medium

Similar Posts

Building Production-Ready Event-Driven Microservices with Go NATS and OpenTelemetry Complete Guide

Production-Ready Event-Driven Microservices with Go, NATS JetStream, and gRPC: Complete Implementation Guide

How to Add Distributed Tracing to Go Microservices with Chi and OpenTelemetry

Cobra and Viper Integration: Build Professional Go CLI Tools with Advanced Configuration Management

How Caddy and Go Kit Simplify Secure, Observable Microservices in Go

Mastering Cobra and Viper Integration: Build Powerful Go CLI Apps with Advanced Configuration Management