I’ve been thinking about microservices a lot lately. Why? Because modern applications demand resilience and scalability, and I’ve seen too many systems crumble under load. That’s why I’m exploring event-driven patterns with NATS, Go, and Kubernetes. This combination delivers the robust architecture we need for production systems. Let me share what I’ve learned.
Setting up the foundation matters. Every service builds on a shared module of core types, which keeps event definitions consistent across the system. Here's our base event structure:
// shared/events/types.go
type BaseEvent struct {
    ID        string            `json:"id"`
    Type      EventType         `json:"type"`
    Source    string            `json:"source"`
    Timestamp time.Time         `json:"timestamp"`
    Metadata  map[string]string `json:"metadata,omitempty"`
}

func NewBaseEvent(eventType EventType, source string) BaseEvent {
    return BaseEvent{
        ID:        uuid.New().String(),
        Type:      eventType,
        Source:    source,
        Timestamp: time.Now().UTC(),
        // Initialize Metadata so callers can attach values such as a trace ID
        // without hitting a nil-map panic.
        Metadata:  make(map[string]string),
    }
}
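The handlers below also reference an EventType, an Order payload, and an OrderCreated envelope. Those definitions aren't shown in this post, but a minimal sketch consistent with how they're used might look like this; the field names on Order and OrderItem are assumptions, while the "order.created" value matches the contract test further down:

// shared/events/types.go (sketch of supporting types, inferred from usage below)
type EventType string

const OrderCreatedEvent EventType = "order.created"

type OrderItem struct {
    SKU      string `json:"sku"`      // assumed field
    Quantity int    `json:"quantity"` // assumed field
}

type Order struct {
    ID    string      `json:"id"`
    Items []OrderItem `json:"items"`
}

type OrderCreated struct {
    BaseEvent
    Order Order `json:"order"`
}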
How do we handle unreliable networks? The circuit breaker pattern prevents cascading failures. We create the breaker once per NATS client, so failure counts persist across publishes, and route every publish through it:
// shared/messaging/nats_client.go
type NatsClient struct {
    conn    *nats.Conn
    breaker *gobreaker.CircuitBreaker
}

func NewNatsClient(conn *nats.Conn) *NatsClient {
    return &NatsClient{
        conn: conn,
        // One breaker per client: counts must accumulate across calls to ever trip.
        breaker: gobreaker.NewCircuitBreaker(gobreaker.Settings{
            Name: "NATS_Publisher",
            ReadyToTrip: func(counts gobreaker.Counts) bool {
                return counts.ConsecutiveFailures > 5
            },
        }),
    }
}

func (nc *NatsClient) PublishWithResilience(subject string, data []byte) error {
    _, err := nc.breaker.Execute(func() (interface{}, error) {
        return nil, nc.conn.Publish(subject, data)
    })
    return err
}
The Order Service converts incoming HTTP requests into events. Notice how we decode the request body and attach the distributed tracing context before publishing:
// order-service/handlers.go
func (h *OrderHandler) CreateOrder(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()
    traceID := trace.SpanFromContext(ctx).SpanContext().TraceID().String()

    var order events.Order
    if err := json.NewDecoder(r.Body).Decode(&order); err != nil {
        http.Error(w, "invalid order payload", http.StatusBadRequest)
        return
    }

    event := events.OrderCreated{
        BaseEvent: events.NewBaseEvent(events.OrderCreatedEvent, "order-service"),
        Order:     order,
    }
    event.Metadata["trace_id"] = traceID

    if err := h.nats.Publish("orders.created", event); err != nil {
        h.logger.Error("event publish failed")
        w.WriteHeader(http.StatusInternalServerError)
        return
    }
    w.WriteHeader(http.StatusAccepted)
}
What happens when services in different bounded contexts need to react to each other? The Inventory Service subscribes to order events and processes them reactively:
// inventory-service/main.go
func processOrderCreated(msg *nats.Msg) {
    var event events.OrderCreated
    if err := json.Unmarshal(msg.Data, &event); err != nil {
        // Malformed payloads go straight to the dead letter queue.
        deadLetterQueue.Publish(msg.Data)
        return
    }
    if err := reserveInventory(event.Order.Items); err != nil {
        natsClient.Publish("inventory.reserve_failed", event)
    } else {
        natsClient.Publish("inventory.reserved", event)
    }
}
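The handler is wired to the subject with a queue subscription, so every replica of the Inventory Service shares one stream of order events. That wiring isn't shown above; a minimal sketch, assuming nc is the underlying *nats.Conn and using an assumed queue-group name:

// inventory-service/main.go (wiring sketch)
sub, err := nc.QueueSubscribe("orders.created", "inventory-workers", processOrderCreated)
if err != nil {
    log.Fatalf("subscribe failed: %v", err)
}
defer sub.Unsubscribe()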
For observability, we instrument services with OpenTelemetry. This code injects the current trace context into NATS message headers so downstream consumers can continue the same trace:
// shared/messaging/nats_client.go
func (nc *NatsClient) PublishTraced(ctx context.Context, subject string, payload []byte) error {
    carrier := propagation.HeaderCarrier{}
    otel.GetTextMapPropagator().Inject(ctx, carrier)

    headers := nats.Header{}
    for k, v := range carrier {
        headers.Set(k, v[0])
    }

    msg := &nats.Msg{
        Subject: subject,
        Data:    payload,
        Header:  headers,
    }
    return nc.conn.PublishMsg(msg)
}
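On the consuming side, the same propagator can pull that context back out of the headers so the consumer's span joins the publisher's trace. A minimal sketch; the tracer name, span name, and processWithContext helper are assumptions for illustration:

// Consumer side: rebuild the trace context injected by PublishTraced.
func handleTraced(msg *nats.Msg) {
    // nats.Header and propagation.HeaderCarrier share the same underlying map type,
    // so the message headers can be handed straight to Extract.
    ctx := otel.GetTextMapPropagator().Extract(context.Background(), propagation.HeaderCarrier(msg.Header))

    // Start a consumer span as a child of the publisher's span.
    ctx, span := otel.Tracer("inventory-service").Start(ctx, "process orders.created")
    defer span.End()

    processWithContext(ctx, msg.Data) // hypothetical downstream handler
}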
Kubernetes deployments require proper health checks. Our readiness probe verifies NATS connectivity through a small /healthz endpoint on the service itself, so a pod only receives traffic while its connection is up:
# deployments/order-service.yaml
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080   # the service's HTTP port
  initialDelaySeconds: 5
  periodSeconds: 10
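The endpoint itself is only a few lines of Go. A minimal sketch, assuming the service already holds a *nats.Conn and serves HTTP on the port referenced by the probe; the route and registration function are assumptions:

// order-service/health.go (sketch): report ready only while the NATS connection is up.
func registerHealthz(mux *http.ServeMux, nc *nats.Conn) {
    mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
        if nc.Status() != nats.CONNECTED {
            http.Error(w, "nats not connected", http.StatusServiceUnavailable)
            return
        }
        w.WriteHeader(http.StatusOK)
    })
}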
How do we test event-driven systems? Contract testing validates message schemas. This Go test checks that the serialized OrderCreated event keeps the fields and type value consumers depend on:
func TestOrderCreatedEventContract(t *testing.T) {
    event := events.OrderCreated{
        BaseEvent: events.NewBaseEvent(events.OrderCreatedEvent, "test"),
        Order:     events.Order{ID: "test123"},
    }

    data, err := json.Marshal(event)
    require.NoError(t, err)

    var parsed map[string]interface{}
    require.NoError(t, json.Unmarshal(data, &parsed))

    assert.Contains(t, parsed, "id")
    assert.Equal(t, "order.created", parsed["type"])
}
For dead letter handling, we use JetStream streams:
// shared/messaging/jetstream.go
func setupDeadLetterQueue(js nats.JetStreamContext) error {
    streamConfig := &nats.StreamConfig{
        Name:      "DLQ",
        Subjects:  []string{"$DLQ.>"},
        Retention: nats.LimitsPolicy,
    }
    // Surface the error instead of silently dropping it.
    _, err := js.AddStream(streamConfig)
    return err
}
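With that stream in place, the deadLetterQueue.Publish call in the inventory handler can republish the raw payload onto a $DLQ. subject. A minimal sketch of how that helper might look; carrying the original subject in a header is an assumption, not something shown in the snippets above:

// shared/messaging/jetstream.go (sketch): route a failed message into the DLQ stream.
func publishToDLQ(js nats.JetStreamContext, originalSubject string, data []byte) error {
    msg := &nats.Msg{
        Subject: "$DLQ." + originalSubject,
        Data:    data,
        Header:  nats.Header{"Orig-Subject": []string{originalSubject}},
    }
    _, err := js.PublishMsg(msg)
    return err
}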
What about scaling during traffic spikes? A Kubernetes Horizontal Pod Autoscaler can scale consumers on NATS queue depth, assuming an external metrics adapter (for example, the Prometheus Adapter fed by a NATS exporter) exposes the pending-message count as an external metric:
# deployments/hpa.yaml
metrics:
  - type: External
    external:
      metric:
        name: nats_consumer_pending_count
      target:
        type: AverageValue
        averageValue: "1000"
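For context, here's roughly how that fragment might sit inside a complete autoscaling/v2 manifest; the deployment name, replica bounds, and exact metric name are assumptions that depend on your cluster and metrics adapter:

# deployments/hpa.yaml (full sketch)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: inventory-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: inventory-service
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: External
      external:
        metric:
          name: nats_consumer_pending_count
        target:
          type: AverageValue
          averageValue: "1000"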
I’ve seen this architecture handle 10,000+ events per second with sub-50ms latency. The key is using Go’s concurrency with NATS’s efficiency. Give it a try in your next project. If you found this useful, share it with your team or leave a comment about your event-driven experiences.