Production-Ready Event-Driven Microservices: Go, NATS JetStream, and Kubernetes Complete Guide

Learn to build production-ready event-driven microservices with Go, NATS JetStream & Kubernetes. Master scalable architecture, error handling & deployment.

I’ve been thinking about building robust microservices lately. Production systems demand reliability, especially when processing events like orders or payments. How do we ensure messages aren’t lost during failures? That’s where NATS JetStream and Go’s concurrency shine. Let me show you how I build production-ready systems.

First, our architecture needs clear event definitions. I create shared structures all services understand:

// internal/shared/events/events.go
type EventType string

const (
    OrderCreated EventType = "order.created"
    PaymentProcessed EventType = "payment.processed"
)

type BaseEvent struct {
    ID        string
    Type      EventType
    Timestamp time.Time
}

type OrderCreatedEvent struct {
    BaseEvent
    Items []OrderItem
}
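To keep event envelopes consistent across services, I like giving these shared types a constructor that stamps the ID and timestamp in one place. A minimal sketch, assuming my own conventions for the helper name and the hex ID format (neither is fixed by the architecture):

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"time"
)

type EventType string

const (
	OrderCreated     EventType = "order.created"
	PaymentProcessed EventType = "payment.processed"
)

type BaseEvent struct {
	ID        string    `json:"id"`
	Type      EventType `json:"type"`
	Timestamp time.Time `json:"timestamp"`
}

// NewBaseEvent stamps a random ID and the current UTC time so every
// service produces identically shaped envelopes.
func NewBaseEvent(t EventType) BaseEvent {
	buf := make([]byte, 8)
	_, _ = rand.Read(buf) // crypto/rand; failure is practically impossible
	return BaseEvent{
		ID:        hex.EncodeToString(buf),
		Type:      t,
		Timestamp: time.Now().UTC(),
	}
}

func main() {
	e := NewBaseEvent(OrderCreated)
	fmt.Println(e.Type, len(e.ID))
}
```

With a single constructor, no service can forget the timestamp or invent its own ID scheme.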

For messaging, JetStream provides persistence. Here’s how I initialize it:

// internal/shared/messaging/jetstream.go
js, err := jetstream.New(nc)
if err != nil {
    log.Fatal().Err(err).Msg("jetstream init failed")
}
stream, err := js.CreateStream(ctx, jetstream.StreamConfig{
    Name:     "ORDERS",
    Subjects: []string{"orders.>"},
})
if err != nil {
    log.Fatal().Err(err).Msg("stream creation failed")
}

Notice the orders.> subject pattern? It lets us categorize events efficiently. Have you considered how subject hierarchies simplify routing?
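To build intuition for those hierarchies, here's a toy re-implementation of NATS-style subject matching: `*` matches exactly one token, `>` matches everything after. This is purely illustrative; the real matching happens inside the NATS server:

```go
package main

import (
	"fmt"
	"strings"
)

// matches reports whether a NATS-style pattern matches a subject.
// "*" matches exactly one token, ">" matches one or more trailing tokens.
func matches(pattern, subject string) bool {
	p := strings.Split(pattern, ".")
	s := strings.Split(subject, ".")
	for i, tok := range p {
		if tok == ">" {
			return len(s) > i // wildcard swallows the rest
		}
		if i >= len(s) || (tok != "*" && tok != s[i]) {
			return false
		}
	}
	return len(p) == len(s) // no leftover subject tokens
}

func main() {
	fmt.Println(matches("orders.>", "orders.created"))    // true
	fmt.Println(matches("orders.>", "orders.created.eu")) // true
	fmt.Println(matches("orders.*", "orders.created.eu")) // false
	fmt.Println(matches("payments.>", "orders.created"))  // false
}
```

One stream bound to `orders.>` captures every order event, while each consumer can still filter down to exactly the slice of the hierarchy it cares about.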

Services connect to NATS using resilient patterns:

nc, err := nats.Connect("nats://nats:4222",
    nats.MaxReconnects(10),
    nats.ReconnectWait(2*time.Second),
    nats.DisconnectErrHandler(logDisconnect),
)

This handles network blips gracefully. For order processing, I use Go’s channels with worker pools:

func (s *OrderService) ProcessOrders(ctx context.Context) error {
    cons, err := s.js.CreateOrUpdateConsumer(ctx, "ORDERS", jetstream.ConsumerConfig{
        Durable:       "order-service",
        FilterSubject: "orders.created",
    })
    if err != nil {
        return err
    }

    for i := 0; i < 5; i++ { // 5 workers
        go func() {
            iter, err := cons.Messages()
            if err != nil {
                return
            }
            defer iter.Stop()
            for {
                msg, err := iter.Next()
                if err != nil { // iterator closed or context cancelled
                    return
                }
                var event events.OrderCreatedEvent
                if err := json.Unmarshal(msg.Data(), &event); err != nil {
                    msg.Term() // malformed payload; retrying won't help
                    continue
                }
                if err := s.validateOrder(event); err != nil {
                    msg.Nak() // negative acknowledgment: redeliver later
                    continue
                }
                msg.Ack()
                s.publishEvent(events.OrderValidated, event)
            }
        }()
    }
    return nil
}

The worker pattern prevents overload. Each message gets either acknowledged or retried. What happens if validation fails repeatedly? That’s where dead-letter queues come in.
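The dead-letter decision itself is simple enough to sketch. This toy version uses a local `acker` interface and a hypothetical `maxDeliveries` limit standing in for `jetstream.Msg` and its delivery metadata, so the routing logic runs without a server:

```go
package main

import "fmt"

// acker is the subset of the message API the handler needs; a local
// interface so the routing logic can be tested without a NATS server.
type acker interface {
	Ack() error
	Nak() error
}

const maxDeliveries = 5 // illustrative limit, tune per workload

// routeFailure decides what to do with a message whose handler failed:
// retry (Nak) until the delivery count reaches maxDeliveries, then
// terminally Ack it so it stops redelivering and can be republished
// to a dead-letter subject instead.
func routeFailure(m acker, deliveries int) (deadLettered bool) {
	if deliveries >= maxDeliveries {
		m.Ack() // stop redelivery; the event goes to the DLQ stream
		return true
	}
	m.Nak() // ask JetStream to redeliver
	return false
}

type fakeMsg struct{ acks, naks int }

func (f *fakeMsg) Ack() error { f.acks++; return nil }
func (f *fakeMsg) Nak() error { f.naks++; return nil }

func main() {
	m := &fakeMsg{}
	fmt.Println(routeFailure(m, 1)) // false: retried
	fmt.Println(routeFailure(m, 5)) // true: dead-lettered
	fmt.Println(m.acks, m.naks)
}
```

In the real consumer you would read the delivery count from the message metadata and publish the failed payload to a dedicated dead-letter subject for later inspection.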

For payments, circuit breakers prevent cascading failures:

cb := gobreaker.NewCircuitBreaker(gobreaker.Settings{
    Name: "PaymentProcessor",
    Timeout: 30 * time.Second,
})

_, err := cb.Execute(func() (interface{}, error) {
    return s.processPayment(order)
})

Kubernetes deployments ensure resilience. Here’s a snippet from our payment service deployment:

# deployments/k8s/payment-deployment.yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
readinessProbe:
  httpGet:
    path: /readyz
    port: 8080
terminationGracePeriodSeconds: 30 # pod-level; allows time for in-flight requests

Observability is crucial. I instrument handlers with OpenTelemetry:

// internal/shared/monitoring/tracing.go
func TraceMiddleware() gin.HandlerFunc {
    return func(c *gin.Context) {
        ctx, span := tracer.Start(c.Request.Context(), c.Request.URL.Path)
        defer span.End()
        c.Request = c.Request.WithContext(ctx)
        c.Next()
    }
}

Distributed tracing connects events across services. How else would you debug a payment that got stuck between services?
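The key is that trace context rides on each message as a header. OpenTelemetry's `propagation.TraceContext` carrier does this for real; here's a stdlib-only sketch of the idea using the W3C `traceparent` format, with a plain map standing in for `nats.Header`:

```go
package main

import (
	"fmt"
	"strings"
)

// headers stands in for nats.Header so the sketch runs without NATS.
type headers map[string]string

// inject writes a W3C traceparent header: version-traceid-spanid-flags.
func inject(h headers, traceID, spanID string) {
	h["traceparent"] = fmt.Sprintf("00-%s-%s-01", traceID, spanID)
}

// extract recovers the trace and span IDs on the consuming side so the
// consumer's span joins the producer's trace.
func extract(h headers) (traceID, spanID string, ok bool) {
	parts := strings.Split(h["traceparent"], "-")
	if len(parts) != 4 {
		return "", "", false
	}
	return parts[1], parts[2], true
}

func main() {
	h := headers{}
	inject(h, "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7")
	tid, sid, ok := extract(h)
	fmt.Println(ok, tid == "4bf92f3577b34da6a3ce929d0e0e4736", sid)
}
```

With the producer injecting before `Publish` and the consumer extracting before starting its span, a stuck payment shows up as one continuous trace across both services instead of two disconnected fragments.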

Finally, graceful shutdowns prevent data loss:

func main() {
    messaging := initMessaging() // sets up the NATS/JetStream connection
    server := startHTTPServer()
    
    sigChan := make(chan os.Signal, 1)
    signal.Notify(sigChan, syscall.SIGTERM, syscall.SIGINT)
    
    <-sigChan
    ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)
    defer cancel()
    
    if err := server.Shutdown(ctx); err != nil {
        log.Error().Err(err).Msg("Server shutdown failed")
    }
    messaging.Close() // drain NATS so in-flight messages finish
}

This approach has served me well in production. The combination of Go’s efficiency, JetStream’s persistence, and Kubernetes’ orchestration creates bulletproof systems. What challenges have you faced with event-driven architectures?

If you found this useful, share it with your team. Comments? I’d love to hear about your implementation experiences!
