Building Production-Ready Event-Driven Microservices with Go NATS JetStream and OpenTelemetry

I’ve been thinking a lot about how we build systems that not only work but thrive under pressure. The shift toward event-driven architectures isn’t just another trend—it’s a fundamental change in how we handle scale, resilience, and complexity in modern applications. That’s why I want to share my approach to building production-ready microservices using Go, NATS JetStream, and OpenTelemetry.

When you start with Go, you get more than just a language—you get a toolkit for building efficient, concurrent systems. Combine that with NATS JetStream’s persistent messaging capabilities and OpenTelemetry’s observability features, and you have a foundation that can handle real-world demands.

Have you ever wondered how services maintain consistency when everything happens asynchronously?

Let’s look at setting up a basic NATS JetStream connection. This isn’t just about connecting; it’s about building something that survives network partitions and server restarts.

nc, err := nats.Connect("nats://localhost:4222",
    nats.MaxReconnects(10),            // give up after 10 reconnect attempts
    nats.ReconnectWait(2*time.Second), // pause between reconnect attempts
    nats.Timeout(5*time.Second))       // initial connection timeout
if err != nil {
    log.Fatal("Connection failed:", err)
}
defer nc.Close()

// Obtain a JetStream context for persistent publish/subscribe.
js, err := nc.JetStream()
if err != nil {
    log.Fatal("JetStream context failed:", err)
}

This simple connection setup includes retry logic and timeouts—essential for production systems. But connections are just the beginning. The real power comes from how you structure your events and services.
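Part of that structure is the stream itself: JetStream rejects a publish on a subject no stream covers, so the stream should exist before the first event flows. Here's a minimal sketch of provisioning an ORDERS stream up front; the subject filter, storage type, and retention limits are my assumptions for illustration, not recommendations from any particular deployment.

```go
import (
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

// createOrdersStream provisions the stream the publisher relies on.
// Subjects, storage, and limits here are illustrative assumptions.
func createOrdersStream(js nats.JetStreamContext) {
	_, err := js.AddStream(&nats.StreamConfig{
		Name:      "ORDERS",
		Subjects:  []string{"orders.>"}, // every orders.* event lands here
		Storage:   nats.FileStorage,     // persist to disk, survive restarts
		Retention: nats.LimitsPolicy,    // keep messages until limits are hit
		MaxAge:    24 * time.Hour,       // drop events older than a day
	})
	if err != nil {
		log.Fatal("Stream creation failed:", err)
	}
}
```

`AddStream` is idempotent in practice for an identical config, so calling it at service startup is a common way to make deployments self-provisioning.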

What happens when your event schema needs to change without breaking existing consumers?

Event schema design requires careful thought. I prefer using Protocol Buffers for their strong typing and backward compatibility, but JSON works well too when you need flexibility. The key is having a clear versioning strategy from day one.

type OrderCreated struct {
    EventID     string    `json:"event_id"`
    Version     string    `json:"version"`
    Timestamp   time.Time `json:"timestamp"`
    OrderID     string    `json:"order_id"`
    CustomerID  string    `json:"customer_id"`
    Amount      float64   `json:"amount"`
    Items       []Item    `json:"items"`
}

func publishOrderCreated(js nats.JetStreamContext, order OrderCreated) error {
    data, err := json.Marshal(order)
    if err != nil {
        return err
    }
    
    _, err = js.Publish("orders.created", data, 
        nats.MsgId(order.EventID),
        nats.ExpectStream("ORDERS"))
    return err
}

Notice the MsgId and ExpectStream options? These aren’t just nice-to-haves—they prevent duplicate processing and ensure messages go to the right stream, which is crucial for data integrity.

Now, what about understanding what’s happening across all these distributed services?

That’s where OpenTelemetry shines. Instrumenting your services gives you visibility into performance bottlenecks and error patterns. Here’s how I typically set up tracing (note that newer OpenTelemetry Go releases have retired the dedicated Jaeger exporter in favor of OTLP, which Jaeger also ingests; the snippet below uses the older exporter API):

func initTracer() (*sdktrace.TracerProvider, error) {
    exporter, err := jaeger.New(jaeger.WithCollectorEndpoint(
        jaeger.WithEndpoint("http://jaeger:14268/api/traces")))
    if err != nil {
        return nil, err
    }

    tp := sdktrace.NewTracerProvider(
        sdktrace.WithBatcher(exporter),
        sdktrace.WithResource(resource.NewWithAttributes(
            semconv.SchemaURL,
            semconv.ServiceName("order-service"),
        )),
    )
    otel.SetTracerProvider(tp)
    return tp, nil
}

With tracing in place, you can follow a request as it moves through queues, services, and databases. This visibility is what separates hobby projects from production systems.

But what good is visibility if your services crash under load?

Resilience patterns like circuit breakers and retry mechanisms are non-negotiable. I use gobreaker for circuit breaking and a simple exponential backoff for retries:

var cb *gobreaker.CircuitBreaker

func init() {
    cb = gobreaker.NewCircuitBreaker(gobreaker.Settings{
        Name:    "payment-service",
        Timeout: 30 * time.Second,
        ReadyToTrip: func(counts gobreaker.Counts) bool {
            return counts.ConsecutiveFailures > 5
        },
    })
}

func processPaymentWithRetry(ctx context.Context, payment Payment) error {
    return backoff.Retry(func() error {
        // cb.Execute returns (interface{}, error); only the error matters
        // here, assuming processPayment returns an error.
        _, err := cb.Execute(func() (interface{}, error) {
            return nil, processPayment(ctx, payment)
        })
        return err
    }, backoff.WithContext(backoff.NewExponentialBackOff(), ctx))
}

This combination prevents cascading failures and gives overloaded services time to recover while maintaining system stability.

As we wrap up, I hope this gives you a practical starting point for your own event-driven systems. The patterns I’ve shared come from real-world experience building systems that handle millions of events daily. They’re battle-tested and ready for your production environments.

What challenges have you faced with event-driven architectures? I’d love to hear your experiences and solutions. If you found this useful, please share it with others who might benefit, and feel free to leave comments or questions below.
