Build Production-Ready Event-Driven Microservices with Go, NATS JetStream, and OpenTelemetry

Learn to build scalable event-driven microservices with Go, NATS JetStream & OpenTelemetry. Master resilience patterns, observability & production deployment.

I’ve been thinking a lot about how modern distributed systems handle complexity while maintaining reliability. The shift toward event-driven architectures isn’t just a trend; for many teams it has become a practical way to build scalable, resilient applications. Today, I want to share a practical approach to creating production-ready event-driven microservices using Go, NATS JetStream, and OpenTelemetry.

Why focus on this combination? Go’s simplicity and concurrency model make it a good fit for microservices. NATS JetStream adds persistence, replay, and at-least-once delivery on top of core NATS messaging, while OpenTelemetry gives us the traces and metrics we need to understand our system’s behavior in production. Together, they form a solid foundation for systems that have to handle real-world demands.

Let me show you how to set up the core infrastructure. We’ll use Docker Compose to orchestrate our dependencies:

# docker-compose.yml
version: '3.8'
services:
  nats:
    image: nats:2.10-alpine
    ports:
      - "4222:4222"
      - "8222:8222"
    command:
      - "--jetstream"
      - "--store_dir=/data"
      - "--http_port=8222"  # enable the monitoring endpoint published on 8222
    volumes:
      - nats_data:/data

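  jaeger:
    image: jaegertracing/all-in-one:1.50
    environment:
      - COLLECTOR_OTLP_ENABLED=true
    ports:
      - "16686:16686"  # Jaeger UI
      - "4317:4317"    # OTLP over gRPC
      - "4318:4318"    # OTLP over HTTP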

volumes:
  nats_data:

Now, how do we actually connect to NATS and get a JetStream context? Here’s a minimal implementation (creating the streams themselves is a separate step that is not shown here):

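package messaging

import (
    "fmt"
    "time" // used by ConsumeWithRetry

    "github.com/nats-io/nats.go"
)

// EventBus wraps a NATS connection and its JetStream context.
type EventBus struct {
    conn *nats.Conn
    js   nats.JetStreamContext
}

// NewEventBus connects to natsURL and obtains a JetStream context.
func NewEventBus(natsURL string) (*EventBus, error) {
    conn, err := nats.Connect(natsURL,
        nats.RetryOnFailedConnect(true), // keep trying if the server is not up yet
        nats.MaxReconnects(-1),          // never give up reconnecting
    )
    if err != nil {
        return nil, fmt.Errorf("connect to NATS: %w", err)
    }

    js, err := conn.JetStream()
    if err != nil {
        conn.Close() // do not leak the connection on failure
        return nil, fmt.Errorf("create JetStream context: %w", err)
    }

    return &EventBus{conn: conn, js: js}, nil
}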

But what happens when messages fail to process? We need to think about resilience patterns. Dead letter queues and retry mechanisms become crucial in production environments. Here’s how you might implement a consumer with built-in retry logic:

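// ConsumeWithRetry subscribes a queue group to subject and acks or naks each
// message based on the handler's result. The subject must already be bound
// to a stream.
func (eb *EventBus) ConsumeWithRetry(subject string, handler func(msg *nats.Msg) error) (*nats.Subscription, error) {
    return eb.js.QueueSubscribe(subject, "workers", func(msg *nats.Msg) {
        if err := handler(msg); err != nil {
            // Ask the server to redeliver after a delay rather than blocking
            // this callback with time.Sleep. The fixed delay is a placeholder;
            // scale it with the delivery count for exponential backoff.
            _ = msg.NakWithDelay(2 * time.Second)
            return
        }
        _ = msg.Ack() // an unacked message is redelivered after AckWait
    }, nats.ManualAck()) // acknowledge explicitly instead of auto-acking
}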

Observability is where OpenTelemetry shines. Have you ever struggled to trace a request through multiple services? Distributed tracing changes everything. Let me show you how to integrate it:

import (
    "context"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/attribute"
)

func processOrder(ctx context.Context, orderID string) {
    // Uses the globally registered TracerProvider; spans are no-ops until one is set.
    tracer := otel.Tracer("order-service")
    ctx, span := tracer.Start(ctx, "processOrder")
    defer span.End()

    // service.name belongs on the TracerProvider's resource, not on each span.
    span.SetAttributes(attribute.String("order.id", orderID))

    // Your business logic here; pass ctx on so child spans nest under this one.
}

What about schema evolution? As our system grows, our event schemas will inevitably change. We need strategies to handle backward compatibility. One approach is to include version information in every event:

type Event struct {
    ID      string      `json:"id"`
    Type    string      `json:"type"`
    Version int         `json:"version"`
    Data    interface{} `json:"data"`
    Time    time.Time   `json:"timestamp"`
}
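To make the version field concrete, here is a minimal sketch of a consumer that accepts two versions of one event and upgrades the older one on read. The Envelope type, the OrderCreatedV1 and OrderCreatedV2 payloads, and the USD default are all assumptions made up for this example. Keeping the payload as json.RawMessage lets the code inspect the version before decoding the data.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope mirrors the Event struct but keeps Data as raw JSON so the
// payload can be decoded after the version has been inspected.
type Envelope struct {
	Type    string          `json:"type"`
	Version int             `json:"version"`
	Data    json.RawMessage `json:"data"`
}

// Hypothetical payloads: v2 adds a currency code that v1 lacked.
type OrderCreatedV1 struct {
	OrderID     string `json:"order_id"`
	AmountCents int64  `json:"amount_cents"`
}

type OrderCreatedV2 struct {
	OrderID     string `json:"order_id"`
	AmountCents int64  `json:"amount_cents"`
	Currency    string `json:"currency"`
}

// decodeOrderCreated accepts either version and always returns the v2 shape.
func decodeOrderCreated(raw []byte) (OrderCreatedV2, error) {
	var env Envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return OrderCreatedV2{}, fmt.Errorf("decode envelope: %w", err)
	}
	switch env.Version {
	case 1:
		var v1 OrderCreatedV1
		if err := json.Unmarshal(env.Data, &v1); err != nil {
			return OrderCreatedV2{}, fmt.Errorf("decode v1 payload: %w", err)
		}
		// Upcast: this sketch assumes every v1 order was priced in USD.
		return OrderCreatedV2{OrderID: v1.OrderID, AmountCents: v1.AmountCents, Currency: "USD"}, nil
	case 2:
		var v2 OrderCreatedV2
		if err := json.Unmarshal(env.Data, &v2); err != nil {
			return OrderCreatedV2{}, fmt.Errorf("decode v2 payload: %w", err)
		}
		return v2, nil
	default:
		return OrderCreatedV2{}, fmt.Errorf("unsupported version %d", env.Version)
	}
}

func main() {
	old := []byte(`{"type":"order.created","version":1,"data":{"order_id":"o-1","amount_cents":1999}}`)
	order, err := decodeOrderCreated(old)
	fmt.Printf("%+v %v\n", order, err)
}
```

Upcasting at the edge keeps the rest of the service working against a single shape, and rejecting unknown versions makes an unplanned schema change fail loudly instead of silently dropping fields.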

Deployment considerations are equally important. How do we ensure our services are healthy in production? Health checks and proper monitoring become non-negotiable:

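// healthCheckHandler returns 503 while NATS is unreachable and 200 otherwise.
// checkNATSConnection is assumed to be defined elsewhere in the service.
func healthCheckHandler(w http.ResponseWriter, r *http.Request) {
    if err := checkNATSConnection(); err != nil {
        w.WriteHeader(http.StatusServiceUnavailable)
        return
    }
    w.WriteHeader(http.StatusOK)
}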

Building event-driven microservices requires careful consideration of many factors, from message delivery guarantees to observability and deployment strategies. The combination of Go, NATS JetStream, and OpenTelemetry provides a solid foundation, but the payoff comes from understanding how these components work together to create resilient, observable systems.

I’d love to hear about your experiences with event-driven architectures. What challenges have you faced, and how have you overcome them? Share your thoughts in the comments below, and if you found this useful, please consider sharing it with others who might benefit from these insights.
