Build Production-Ready Event-Driven Microservices with Go, NATS, and OpenTelemetry: Complete Guide

golang

Build Production-Ready Event-Driven Microservices with Go, NATS, and OpenTelemetry: Complete Guide

Learn to build production-ready event-driven microservices using Go, NATS JetStream & OpenTelemetry. Complete tutorial with code examples, deployment & monitoring.

Oct 14, 2025

Build Production-Ready Event-Driven Microservices with Go, NATS, and OpenTelemetry: Complete Guide

I’ve been building distributed systems for years, and one question keeps coming up: how do we create microservices that don’t just work in development but thrive in production? This challenge led me to combine Go’s efficiency with NATS JetStream’s reliability and OpenTelemetry’s observability. Today, I want to show you how to build event-driven microservices that handle real-world complexity without falling apart.

Have you ever wondered what happens when your payment service goes down during peak traffic? Event-driven architecture with proper messaging can make your system resilient to such failures. Let’s start with the core event system that ties everything together.

// Event structure with tracing support
type Event struct {
    ID            string                 `json:"id"`
    Type          EventType              `json:"type"`
    Source        string                 `json:"source"`
    Subject       string                 `json:"subject"`
    Time          time.Time              `json:"time"`
    Data          json.RawMessage        `json:"data"`
    TraceID       string                 `json:"trace_id"`
    CorrelationID string                 `json:"correlation_id"`
}

func PublishOrderCreated(ctx context.Context, natsConn *nats.Conn, order *Order) error {
    span := trace.SpanFromContext(ctx)
    eventData := OrderCreatedData{
        OrderID:    order.ID,
        CustomerID: order.CustomerID,
        ProductID:  order.ProductID,
        Quantity:   order.Quantity,
        Amount:     order.Amount,
    }
    
    event, err := NewEvent(OrderCreatedEvent, "order-service", order.ID, eventData)
    if err != nil {
        return fmt.Errorf("failed to create event: %w", err)
    }
    
    event.TraceID = span.SpanContext().TraceID().String()
    eventBytes, _ := json.Marshal(event)
    
    return natsConn.Publish("orders.created", eventBytes)
}

Go’s concurrency model makes it perfect for handling multiple events simultaneously. The language’s built-in support for goroutines and channels aligns beautifully with event-driven patterns. But how do we ensure messages aren’t lost when services restart?

NATS JetStream provides persistent storage and exactly-once delivery semantics. Here’s how to set up a durable consumer that survives service restarts:

func SetupOrderConsumer(nc *nats.Conn) (*nats.Subscription, error) {
    js, err := nc.JetStream()
    if err != nil {
        return nil, err
    }

    // Create stream if it doesn't exist
    _, err = js.AddStream(&nats.StreamConfig{
        Name:     "ORDERS",
        Subjects: []string{"orders.*"},
    })

    // Create durable consumer
    return js.Subscribe("orders.created", handleOrderCreated, 
        nats.Durable("order-processor"),
        nats.DeliverAll(),
        nats.AckExplicit(),
    )
}

Observability isn’t just nice to have—it’s essential for production systems. When something goes wrong at 3 AM, you need to know why. OpenTelemetry gives you distributed tracing across all your services.

What if you could trace a single order through every service it touches? Here’s how to instrument a service:

func ProcessOrder(ctx context.Context, orderID string) error {
    ctx, span := tracer.Start(ctx, "ProcessOrder")
    defer span.End()

    // Add attributes to the span
    span.SetAttributes(
        attribute.String("order.id", orderID),
        attribute.Int64("start_time", time.Now().Unix()),
    )

    // Your business logic here
    err := reserveInventory(ctx, orderID)
    if err != nil {
        span.RecordError(err)
        return err
    }

    return nil
}

Circuit breakers prevent cascading failures when downstream services become unstable. The gobreaker library integrates beautifully with Go’s context pattern:

var cb *gobreaker.CircuitBreaker

func init() {
    cb = gobreaker.NewCircuitBreaker(gobreaker.Settings{
        Name:    "PaymentService",
        Timeout: 10 * time.Second,
    })
}

func ProcessPayment(ctx context.Context, payment *Payment) error {
    result, err := cb.Execute(func() (interface{}, error) {
        return paymentGateway.Charge(ctx, payment)
    })
    
    if err != nil {
        return fmt.Errorf("payment failed: %w", err)
    }
    
    return publishPaymentProcessed(ctx, result.(*PaymentResult))
}

Graceful shutdown ensures your service doesn’t drop messages when scaling down. Here’s a production-ready shutdown handler:

func main() {
    router := gin.Default()
    server := &http.Server{
        Addr:    ":8080",
        Handler: router,
    }

    go func() {
        if err := server.ListenAndServe(); err != nil && err != http.ErrServerClosed {
            log.Fatalf("server failed: %v", err)
        }
    }()

    quit := make(chan os.Signal, 1)
    signal.Notify(quit, syscall.SIGINT, syscall.SIGTERM)
    <-quit

    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()

    // Stop accepting new connections
    if err := server.Shutdown(ctx); err != nil {
        log.Fatalf("forced shutdown: %v", err)
    }

    // Close NATS connection
    natsConn.Close()
    log.Println("service stopped gracefully")
}

Dead letter queues handle messages that repeatedly fail processing. This prevents your main queues from getting stuck:

func handleOrderCreated(msg *nats.Msg) {
    ctx := context.Background()
    var event Event
    
    if err := json.Unmarshal(msg.Data, &event); err != nil {
        // Move to DLQ after multiple failures
        msg.NakWithDelay(5 * time.Second) // Retry after 5 seconds
        return
    }

    // Process the event
    if err := processOrderCreation(ctx, event); err != nil {
        metrics.DLQCounter.Add(ctx, 1)
        msg.Term() // Move to DLQ
        return
    }

    msg.Ack()
}

Docker deployment with health checks ensures your services stay healthy. Here’s a sample Dockerfile and health check:

FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o main ./cmd

FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /app/main .
EXPOSE 8080
HEALTHCHECK --interval=30s --timeout=3s \
  CMD wget --no-verbose --tries=1 --spider http://localhost:8080/health || exit 1
CMD ["./main"]

Building production-ready systems requires thinking about failure from day one. Event-driven architecture with proper observability turns potential disasters into manageable incidents. The combination of Go’s performance, NATS’s reliability, and OpenTelemetry’s insights creates a foundation that scales with your business.

I’ve deployed systems using this approach that handle millions of events daily with minimal downtime. The investment in proper architecture pays off when you can sleep through the night knowing your services will handle whatever comes their way. If this approach resonates with you, I’d love to hear about your experiences. Please share this article with your team and leave a comment about how you’re handling microservice complexity in your projects.

Share: Facebook Twitter Reddit LinkedIn WhatsApp Telegram Pinterest Email Instagram

golang

Build Production-Ready Event-Driven Microservices with Go, NATS, and OpenTelemetry: Complete Guide

Our Creations

We are on Medium

Similar Posts

How to Automate TLS Certificates in Go Using Caddy and Lego

Build Production-Ready Event-Driven Microservices with Go, NATS, and OpenTelemetry: Complete Guide

Building Robust Stream-Based Data Pipelines in Go for Real-World Scale

Build Production-Ready Event-Driven Microservices with Go, NATS JetStream, and OpenTelemetry

Building Resilient Go Services with Circuit Breakers and Intelligent Retries

Production-Ready Event-Driven Microservices: Go, NATS JetStream, and Kubernetes Complete Guide