
Building Production-Ready Event-Driven Microservices with Go, NATS JetStream, and OpenTelemetry Guide

Master event-driven microservices with Go, NATS JetStream & OpenTelemetry. Build production-ready systems with distributed tracing, resilience patterns & monitoring.


I’ve been designing distributed systems for over a decade, and nothing excites me more than building resilient microservices that handle real-world chaos. Last month, while debugging a production outage caused by lost messages between services, I knew we needed a better approach. That’s when I combined Go’s efficiency with NATS JetStream’s reliability and OpenTelemetry’s observability. The result? A rock-solid event-driven architecture that handles failures gracefully. Let me show you how I built it.

Our e-commerce order system processes transactions through four coordinated services. When you place an order, the journey begins with validation, moves through inventory checks and payment processing, and finally triggers customer notifications. Each step communicates via events, creating a responsive yet decoupled workflow. How do we ensure a payment failure doesn’t lose the entire order? That’s where our architecture shines.
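Before writing any handlers, it helps to pin down the subject hierarchy the four services publish on. The names below are one illustrative scheme, not anything prescribed by NATS; the small helper mimics how the `order.>` wildcard (used later when creating the stream) matches any subject with at least one token after the prefix:

```go
package main

import (
	"fmt"
	"strings"
)

// Illustrative subject names for each step of the order flow; the shared
// "order." prefix lets a single stream capture every order event.
const (
	SubjectOrderCreated     = "order.created"
	SubjectOrderValidated   = "order.validated"
	SubjectPaymentRequested = "order.payment.requested"
	SubjectCustomerNotified = "order.customer.notified"
)

// matchesOrderStream approximates the "order.>" wildcard: ">" matches
// one or more trailing tokens after the "order." prefix.
func matchesOrderStream(subject string) bool {
	return strings.HasPrefix(subject, "order.") && len(subject) > len("order.")
}

func main() {
	fmt.Println(matchesOrderStream(SubjectPaymentRequested)) // prints "true"
}
```

Keeping every subject under one prefix means a single stream retains the full order history, which is what makes replay after a payment failure possible.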

First, we define our event structure. Clear contracts between services prevent interpretation errors:

// Shared metadata embedded in every event
type BaseEvent struct {
    ID        string    `json:"id"`
    Type      string    `json:"type"`
    Source    string    `json:"source"`
    Timestamp time.Time `json:"timestamp"`
}

// NewBaseEvent stamps identity and timing (uuid is github.com/google/uuid)
func NewBaseEvent(eventType, source string) BaseEvent {
    return BaseEvent{
        ID:        uuid.NewString(),
        Type:      eventType,
        Source:    source,
        Timestamp: time.Now().UTC(),
    }
}

// Order creation event
type OrderCreatedEvent struct {
    BaseEvent
    OrderID string `json:"order_id"`
    Items   []Item `json:"items"`
}

func CreateOrderEvent(orderID string, items []Item) ([]byte, error) {
    event := OrderCreatedEvent{
        BaseEvent: NewBaseEvent("order.created", "order-service"),
        OrderID:   orderID,
        Items:     items,
    }
    return json.Marshal(event)
}

Connecting our services requires reliable messaging. NATS JetStream provides persistent streams that survive service restarts. Notice the retry logic - essential for real-world networks:

// Connecting to NATS with resilience
conn, err := nats.Connect("nats://localhost:4222",
    nats.MaxReconnects(5),
    nats.ReconnectWait(2*time.Second),
    nats.DisconnectErrHandler(func(_ *nats.Conn, err error) {
        log.Printf("NATS connection lost: %v", err)
    }))
if err != nil {
    return nil, fmt.Errorf("connection failed: %w", err)
}

js, err := conn.JetStream()
if err != nil {
    return nil, fmt.Errorf("jetstream init failed: %w", err)
}

// Create persistent stream (safe to call again with the same config)
_, err = js.AddStream(&nats.StreamConfig{
    Name:     "ORDERS",
    Subjects: []string{"order.>"},
})
if err != nil {
    return nil, fmt.Errorf("stream creation failed: %w", err)
}
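On the consuming side, a durable subscription with manual acknowledgement keeps in-flight work from being lost across restarts. A sketch, assuming the `ORDERS` stream above; the durable name and the `handleOrder` helper are hypothetical:

```go
// Durable consumer with manual acks: JetStream tracks delivery state
// under the durable name and redelivers anything unacked after AckWait.
_, err = js.Subscribe("order.created", func(msg *nats.Msg) {
    if err := handleOrder(msg.Data); err != nil {
        // Deliberately skip the ack; redelivery happens after AckWait
        log.Printf("order handling failed, will be redelivered: %v", err)
        return
    }
    msg.Ack()
},
    nats.Durable("order-workers"),
    nats.ManualAck(),
    nats.AckWait(30*time.Second),
)
if err != nil {
    return nil, fmt.Errorf("subscribe failed: %w", err)
}
```

Because delivery state lives on the server under the durable name, a restarted service resumes from its last acked message rather than reprocessing the whole stream.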

When services fail mid-operation, how do we track what happened? Distributed tracing answers this. We instrument handlers to propagate context:

// Payment handler with tracing: the upstream trace context arrives in
// the message headers and must be extracted before starting a span
func ProcessPayment(msg *nats.Msg) {
    ctx := otel.GetTextMapPropagator().Extract(context.Background(),
        propagation.HeaderCarrier(msg.Header))

    tracer := otel.Tracer("payment-service")
    ctx, span := tracer.Start(ctx, "process-payment")
    defer span.End()

    var event PaymentRequestedEvent
    if err := json.Unmarshal(msg.Data, &event); err != nil {
        span.RecordError(err)
        span.SetStatus(codes.Error, "unmarshal failed")
        return
    }
    // Payment logic here, passing ctx to downstream calls
    span.AddEvent("payment_processed")
}
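Extraction only finds a trace if the publisher injected one. A hedged sketch of the publishing side, using the OpenTelemetry propagation API to stamp W3C `traceparent` headers onto the message (the function name is illustrative):

```go
// publishWithTrace injects the current trace context into NATS headers
// so the consumer's span joins the same distributed trace.
func publishWithTrace(ctx context.Context, js nats.JetStreamContext,
    subject string, payload []byte) error {
    msg := nats.NewMsg(subject)
    msg.Data = payload
    otel.GetTextMapPropagator().Inject(ctx, propagation.HeaderCarrier(msg.Header))
    _, err := js.PublishMsg(msg)
    return err
}
```

Note that this only does something useful if a `TextMapPropagator` (typically `propagation.TraceContext{}`) has been registered globally; the default propagator is a no-op.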

For resilience, we implement circuit breakers. These prevent cascading failures when dependencies struggle:

// Inventory check with circuit breaker
cb := gobreaker.NewCircuitBreaker(gobreaker.Settings{
    Name:     "inventory-service",
    Timeout:  10 * time.Second,
    ReadyToTrip: func(counts gobreaker.Counts) bool {
        return counts.ConsecutiveFailures > 5
    },
})

result, err := cb.Execute(func() (interface{}, error) {
    return inventoryClient.ReserveItems(ctx, orderItems)
})

Testing event-driven systems presents unique challenges. We run an embedded NATS server in-process to verify behavior end to end:

// Testing event subscriptions against an embedded JetStream server
// (natsserver is github.com/nats-io/nats-server/v2/test)
func TestOrderCreation(t *testing.T) {
    opts := natsserver.DefaultTestOptions
    opts.Port = -1 // pick a free port
    opts.JetStream = true
    srv := natsserver.RunServer(&opts)
    defer srv.Shutdown()

    nc, err := nats.Connect(srv.ClientURL())
    if err != nil {
        t.Fatal(err)
    }
    defer nc.Close()
    js, _ := nc.JetStream()
    js.AddStream(&nats.StreamConfig{Name: "ORDERS", Subjects: []string{"order.>"}})

    received := make(chan *nats.Msg, 1)
    js.Subscribe("order.created", func(m *nats.Msg) { received <- m })

    // Publish test event
    js.Publish("order.created", orderCreatedJSON)
    select {
    case m := <-received:
        _ = m // Assert event contents here
    case <-time.After(2 * time.Second):
        t.Fatal("no event received")
    }
}

Deployment ties everything together. Our Docker Compose file brings up the entire stack:

# docker-compose.yaml
services:
  nats:
    image: nats:latest
    command: ["-js"]   # enable JetStream
    ports:
      - "4222:4222"
  jaeger:
    image: jaegertracing/all-in-one
    ports:
      - "16686:16686"
  order-service:
    build: ./cmd/order-service
    depends_on:
      - nats
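Services inside the compose network can ship spans to the `jaeger` container over OTLP. One way to bootstrap the tracer provider at startup; the endpoint uses the compose service name and default OTLP/HTTP port, which are assumptions about this particular stack:

```go
// initTracer wires OpenTelemetry's SDK to the Jaeger container's
// OTLP/HTTP endpoint and registers the W3C trace-context propagator.
func initTracer(ctx context.Context) (*sdktrace.TracerProvider, error) {
    exp, err := otlptracehttp.New(ctx,
        otlptracehttp.WithEndpoint("jaeger:4318"), // compose service name
        otlptracehttp.WithInsecure(),
    )
    if err != nil {
        return nil, err
    }
    tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exp))
    otel.SetTracerProvider(tp)
    // Without this, Inject/Extract never read or write traceparent headers
    otel.SetTextMapPropagator(propagation.TraceContext{})
    return tp, nil
}
```

Call the returned provider's `Shutdown` on exit so buffered spans are flushed before the container stops.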

What metrics should you monitor? These Prometheus counters reveal system health:

// Tracking processed events (package-level, so handlers can increment it)
var processedOrders = prometheus.NewCounterVec(prometheus.CounterOpts{
    Name: "orders_processed_total",
    Help: "Total processed orders",
}, []string{"status"})

func init() {
    prometheus.MustRegister(processedOrders)
}

// In order handler
processedOrders.WithLabelValues("success").Inc()
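Counters only help if Prometheus can scrape them. Exposing the default registry over HTTP is a couple of lines with promhttp (the port is illustrative):

```go
// Serve the registered metrics for Prometheus to scrape
func main() {
    http.Handle("/metrics", promhttp.Handler())
    log.Fatal(http.ListenAndServe(":9090", nil))
}
```

In a real service this runs alongside the NATS subscribers, typically on a separate port from any business API so scraping never competes with traffic.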

Building this taught me valuable lessons. Always assume networks will fail. Design for replayability. Treat observability as a core feature, not an afterthought. The combination of Go’s concurrency, JetStream’s persistence, and OpenTelemetry’s tracing creates systems that withstand real-world turbulence.

What challenges have you faced with microservices? Share your experiences below - I’d love to hear how you’ve solved reliability issues. If this approach resonates with you, consider sharing it with others facing similar architectural decisions.



