Production-Ready Event-Driven Microservices with Go, NATS JetStream, and OpenTelemetry: Complete Tutorial

Learn to build production-ready event-driven microservices with Go, NATS JetStream & OpenTelemetry. Complete guide with error handling, tracing & deployment.

I’ve been thinking about how modern systems need to handle scale and complexity while remaining observable and maintainable. That’s why I want to share my approach to building production-ready event-driven microservices using Go, NATS JetStream, and OpenTelemetry. Let’s walk through this together.

When designing event-driven systems, I focus on creating services that communicate through events rather than direct API calls. This approach gives us better scalability and fault tolerance. But how do we ensure these distributed systems remain reliable and observable?

Let me show you how I structure a typical e-commerce order processing system. We’ll have services for orders, inventory, notifications, and payments—each handling specific business capabilities while communicating through events.

Here’s how I set up the project structure:

event-driven-ecommerce/
├── cmd/
│   ├── order-service/
│   ├── inventory-service/
│   ├── notification-service/
│   └── payment-service/
├── internal/
│   ├── events/
│   ├── messaging/
│   ├── observability/
│   └── shared/
├── pkg/
│   └── models/

Event schema design is crucial. I define clear event structures with proper validation:

type EventMetadata struct {
    ID          string    `json:"id" validate:"required"`
    Type        string    `json:"type" validate:"required"`
    Version     string    `json:"version" validate:"required"`
    Source      string    `json:"source" validate:"required"`
    Timestamp   time.Time `json:"timestamp" validate:"required"`
}

Have you considered how you’d handle event versioning when your business requirements change?
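
One pattern that serves me well is to keep the payload as raw JSON next to the metadata and upcast old versions at the consumer boundary. Here's a minimal sketch of that idea; the OrderCreatedV1/OrderCreatedV2 payload types and the upgradeV1ToV2 helper are hypothetical stand-ins for your own schemas:

type Event struct {
    Metadata EventMetadata   `json:"metadata"`
    Payload  json.RawMessage `json:"payload"`
}

// decodeOrderCreated normalizes every supported schema version to the latest one.
func decodeOrderCreated(e Event) (OrderCreatedV2, error) {
    switch e.Metadata.Version {
    case "1.0":
        var v1 OrderCreatedV1
        if err := json.Unmarshal(e.Payload, &v1); err != nil {
            return OrderCreatedV2{}, fmt.Errorf("decode order.created v1: %w", err)
        }
        return upgradeV1ToV2(v1), nil // translate old payloads at the boundary
    case "2.0":
        var v2 OrderCreatedV2
        if err := json.Unmarshal(e.Payload, &v2); err != nil {
            return OrderCreatedV2{}, fmt.Errorf("decode order.created v2: %w", err)
        }
        return v2, nil
    default:
        return OrderCreatedV2{}, fmt.Errorf("unsupported order.created version %q", e.Metadata.Version)
    }
}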

For messaging, I use NATS JetStream because it adds persistence on top of core NATS and, with message deduplication and double acknowledgements, gives you effectively exactly-once processing. Here's how I set up the connection:

func ConnectToNATS(url string) (*nats.Conn, error) {
    nc, err := nats.Connect(url,
        nats.MaxReconnects(5),
        nats.ReconnectWait(2*time.Second),
        nats.DisconnectErrHandler(func(nc *nats.Conn, err error) {
            log.Printf("Disconnected from NATS: %v", err)
        }),
    )
    if err != nil {
        return nil, fmt.Errorf("NATS connection failed: %w", err)
    }
    return nc, nil
}

What happens if your messaging system goes down temporarily? That’s where proper reconnection logic becomes essential.
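
Reconnection covers transient client-side outages, but before anything can be published, the stream that persists these events has to exist. Here's a minimal sketch of how I bootstrap it; the ORDERS stream name, subjects, and retention are illustrative choices, not requirements:

func EnsureOrdersStream(nc *nats.Conn) (nats.JetStreamContext, error) {
    js, err := nc.JetStream()
    if err != nil {
        return nil, fmt.Errorf("jetstream context: %w", err)
    }

    // AddStream is effectively idempotent for an identical configuration;
    // a conflicting existing stream surfaces as ErrStreamNameAlreadyInUse.
    _, err = js.AddStream(&nats.StreamConfig{
        Name:     "ORDERS",
        Subjects: []string{"orders.*"},
        Storage:  nats.FileStorage,
        MaxAge:   24 * time.Hour, // illustrative retention window
    })
    if err != nil && !errors.Is(err, nats.ErrStreamNameAlreadyInUse) {
        return nil, fmt.Errorf("create ORDERS stream: %w", err)
    }
    return js, nil
}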

Observability is non-negotiable in distributed systems. I integrate OpenTelemetry from the start:

func InitTracer(serviceName string) (*tracesdk.TracerProvider, error) {
    exporter, err := jaeger.New(jaeger.WithCollectorEndpoint(
        jaeger.WithEndpoint("http://jaeger:14268/api/traces"),
    ))
    if err != nil {
        return nil, err
    }

    tp := tracesdk.NewTracerProvider(
        tracesdk.WithBatcher(exporter),
        tracesdk.WithResource(resource.NewWithAttributes(
            semconv.SchemaURL,
            semconv.ServiceNameKey.String(serviceName),
        )),
    )
    return tp, nil
}
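
Roughly how I wire this into each service's main: register the provider globally, install a W3C trace-context propagator so trace IDs can travel inside event headers, and flush spans on shutdown.

func main() {
    tp, err := InitTracer("order-service")
    if err != nil {
        log.Fatalf("init tracer: %v", err)
    }
    otel.SetTracerProvider(tp)
    otel.SetTextMapPropagator(propagation.TraceContext{})

    // Flush any buffered spans before the process exits.
    defer func() {
        ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
        defer cancel()
        if err := tp.Shutdown(ctx); err != nil {
            log.Printf("tracer shutdown: %v", err)
        }
    }()

    // ... start the service ...
}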

When building publishers, I ensure they handle errors gracefully and include tracing context:

func (p *EventPublisher) PublishOrderCreated(ctx context.Context, order models.Order) error {
    _, span := tracer.Start(ctx, "publish_order_created")
    defer span.End()

    event := createOrderEvent(order)
    data, err := p.validator.SerializeEvent("order.created", event)
    if err != nil {
        span.RecordError(err)
        return fmt.Errorf("failed to serialize event: %w", err)
    }

    // JetStream's Publish returns a PubAck alongside the error; ignore the ack, never the error.
    if _, err := p.jetStream.Publish("orders.created", data); err != nil {
        span.RecordError(err)
        return fmt.Errorf("failed to publish event: %w", err)
    }
    return nil
}

How would you handle duplicate messages or ensure ordered processing when it matters?
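
JetStream gives you levers for both. A Nats-Msg-Id header lets the server drop duplicates that arrive within the stream's deduplication window, and keeping related events on the same subject preserves their order within the stream; consumers should still be idempotent, since delivery is ultimately at-least-once. A sketch of an idempotent publish, reusing the event ID as the dedup key (the publishIdempotent helper is illustrative):

func (p *EventPublisher) publishIdempotent(subject string, event Event, data []byte) error {
    msg := nats.NewMsg(subject)
    msg.Data = data
    // The server discards messages that repeat a Nats-Msg-Id within the
    // stream's Duplicates window, so a retried publish doesn't double-count.
    msg.Header.Set(nats.MsgIdHdr, event.Metadata.ID)

    if _, err := p.jetStream.PublishMsg(msg); err != nil {
        return fmt.Errorf("publish %s: %w", subject, err)
    }
    return nil
}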

For consumers, I implement robust patterns including retries and dead-letter queues:

func (c *OrderConsumer) ProcessOrders() error {
    _, err := c.jetStream.Subscribe("orders.created",
        func(msg *nats.Msg) {
            // Ideally the trace context is extracted from msg.Header here.
            ctx := context.Background()
            if err := c.processMessage(ctx, msg); err != nil {
                log.Printf("Processing failed: %v", err)
                if shouldRetry(err) {
                    msg.Nak() // ask for redelivery, up to MaxDeliver attempts
                } else {
                    msg.Term() // poison message: stop redelivering it
                }
                return
            }
            msg.Ack()
        },
        nats.ManualAck(), // we ack, nak, or term explicitly above
        nats.MaxDeliver(3),
    )
    if err != nil {
        return fmt.Errorf("subscribe to orders.created: %w", err)
    }
    return nil
}
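
Note that MaxDeliver only caps redeliveries; JetStream doesn't move the exhausted message anywhere on its own. It does emit a max-deliveries advisory, which I route into a dead-letter subject for inspection. A sketch, assuming the consumer also holds the core NATS connection (c.natsConn); the stream, consumer, and orders.dlq names are illustrative:

func (c *OrderConsumer) watchDeadLetters() error {
    // Advisory published when a message exhausts MaxDeliver for this
    // stream/consumer pair.
    const advisory = "$JS.EVENT.ADVISORY.CONSUMER.MAX_DELIVERIES.ORDERS.order-processor"

    _, err := c.natsConn.Subscribe(advisory, func(msg *nats.Msg) {
        // The advisory payload identifies the stream sequence of the failed
        // message; forward it to a DLQ subject for later review or replay.
        if err := c.natsConn.Publish("orders.dlq", msg.Data); err != nil {
            log.Printf("failed to forward dead letter: %v", err)
        }
    })
    return err
}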

Graceful shutdown is another critical aspect. I make sure services can complete ongoing work before terminating:

func (s *Service) Start() error {
    go func() {
        <-s.ctx.Done()
        log.Println("Shutting down gracefully...")
        s.stop()
    }()
    return s.run()
}
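
In practice I tie that context to OS signals and drain the NATS connection so in-flight messages finish before the process exits. A sketch of the wiring, with NewService and the natsConn field as assumed details of the service type:

func run() error {
    // Cancelled on SIGINT or SIGTERM, which triggers the shutdown path above.
    ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
    defer stop()

    svc, err := NewService(ctx)
    if err != nil {
        return err
    }

    errCh := make(chan error, 1)
    go func() { errCh <- svc.Start() }()

    select {
    case <-ctx.Done():
        // Drain stops new deliveries, waits for in-flight handlers, then closes.
        if err := svc.natsConn.Drain(); err != nil {
            log.Printf("drain failed: %v", err)
        }
        return nil
    case err := <-errCh:
        return err
    }
}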

Deployment considerations include health checks and proper resource limits:

livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
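
The probes assume matching endpoints in the service. A minimal sketch: liveness only confirms the process serves HTTP, while readiness reports unhealthy when the NATS connection drops so Kubernetes stops routing traffic during a broker outage (the handler wiring is illustrative):

func registerHealthEndpoints(mux *http.ServeMux, nc *nats.Conn) {
    // Liveness: the process is up and able to answer requests.
    mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
        w.WriteHeader(http.StatusOK)
    })

    // Readiness: only ready while the NATS connection is usable.
    mux.HandleFunc("/ready", func(w http.ResponseWriter, r *http.Request) {
        if nc == nil || !nc.IsConnected() {
            http.Error(w, "nats not connected", http.StatusServiceUnavailable)
            return
        }
        w.WriteHeader(http.StatusOK)
    })
}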

Building production-ready event-driven systems requires careful attention to error handling, observability, and deployment strategies. By combining Go’s efficiency with NATS JetStream’s reliability and OpenTelemetry’s observability, we create systems that scale well while remaining maintainable.

What challenges have you faced when building distributed systems? I’d love to hear about your experiences and solutions. If you found this helpful, please share it with others who might benefit from these patterns. Feel free to leave comments or questions below!
