===================
Metrics and tracing
===================

* Set up Graphite as a Docker service
* Send simple events
* Technical events vs. business events
* Simple KPI reporting on top of Graphite
* Use that to monitor SLA violations

Done.

There are more complex approaches, such as DTrace and New Relic. But these
require dedicated (close to full-time) DevOps staff to get real value, and that
value is a gain of a few percent in our core operations. They are only worth
doing once a few percent win covers the cost of a full-time member of staff.

DTrace and Python: use it to trace a running program. Similar to New Relic?
https://docs.python.org/dev/howto/instrumentation.html
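
As a rough illustration of the "send simple events" step, the sketch below pushes one data point to Graphite using carbon's plaintext protocol (one ``path value timestamp`` line per metric). It assumes carbon is reachable on ``localhost`` at its default plaintext port 2003. The helper names ``format_metric`` and ``send_metric`` and the metric paths are hypothetical, chosen only to show one way of separating technical events from business events by prefix.

```python
import socket
import time

# Assumption: the Graphite (carbon) container exposes the plaintext
# listener on localhost:2003, which is carbon's default plaintext port.
CARBON_HOST = "localhost"
CARBON_PORT = 2003


def format_metric(path, value, timestamp=None):
    """Build one plaintext-protocol line: path, value, Unix timestamp, newline."""
    if timestamp is None:
        timestamp = time.time()
    return "{} {} {}\n".format(path, value, int(timestamp))


def send_metric(path, value, timestamp=None, host=CARBON_HOST, port=CARBON_PORT):
    """Open a TCP connection to carbon and send a single metric line."""
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Example calls (hypothetical metric names, require a running carbon):
#   send_metric("technical.web.request_ms", 182)   # technical event
#   send_metric("business.orders.placed", 1)       # business event
```

Keeping the prefixes distinct (``technical.*`` vs. ``business.*`` here) is one way to let KPI dashboards and SLA checks select the series they need without touching the other set.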