Monitoring Ruby applications with Heka and Grafana

Created by Maciej Mensfeld / @maciejmensfeld / mensfeld.pl

Why would you even bother?

  • Because things don't always work as they should
  • Because hardware fails
  • Because good performance is a constant requirement
  • Because monitoring helps catch issues

But how?

It depends ;)

There are no universal solutions

So let's have a small test-case

  • SOA based architecture
  • 20+ applications (and many more in development)
  • A few Rails-based, mostly Sinatra
  • JSON API based endpoints

So let's have a small test-case

  • 6 internal gems used by those apps
  • Moving towards zero IO apps (no writes to HDD)
  • Getting ready for fully ephemeral Docker containers
  • Heavy duty applications (500-700 req/s)
  • Multiple databases (200GB+)

Naive approach

    
    # Naive approach: wrap every task in a benchmark and persist the timing
    around_filter :method do |task|
      benchmark(task.name) do
        task.process
      end
    end
    
    def benchmark(task_name)
      started_at = Time.now
      result = yield
      # Mongoid simple object persisted on every single call
      Usage.create!(
        task_name:  task_name,
        time_taken: (Time.now.to_f - started_at.to_f) * 1000 # milliseconds
      )
      result
    end

    Naive approach

    • Seemed to work
    • Seemed to be fast enough
    • But it wasn't :-(
    • And if you forget about a MongoDB TTL index, you end up with 500GB+ of logs (see the sketch below)
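
    Under the naive approach, the least you need is a capped lifetime for the samples. Below is a sketch of such a Usage model with a TTL index; the field names follow the benchmark code above, while the 7-day window and the timestamps module are assumptions for illustration.

    class Usage
      include Mongoid::Document
      include Mongoid::Timestamps::Created

      field :task_name,  type: String
      field :time_taken, type: Float # milliseconds

      # MongoDB drops documents once created_at is older than expire_after_seconds
      index({ created_at: 1 }, expire_after_seconds: 7 * 24 * 3600)
    end

    The index still has to be created (e.g. via Mongoid's create_indexes rake task) - forgetting that step is exactly how the 500GB+ collection happens.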

    What would Jesus do?

    He probably doesn't care ;)

    But Mozilla does!

    Heka is an open source stream processing software system developed by Mozilla. Heka is a “Swiss Army Knife” type tool for data processing.

    Be friends with Heka because...

    • It works...
    • ... out of the box
    • There's a Docker container with it
    • It is fast
    • It accepts statsd messaging format (via plugin)
    • Accepts UDP packets (example below)
    • Squashes data for you
    • Works great with InfluxDB
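
    The statsd wire format that Heka's plugin understands is plain text over UDP: "key:1|c" for a counter, "key:12.5|ms" for a timer. A minimal sketch (host and port are assumptions for a locally running Heka agent):

    require 'socket'

    socket = UDPSocket.new
    # Count one request and report how long it took, statsd-style
    socket.send('request:1|c',     0, 'localhost', 8125)
    socket.send('request:12.5|ms', 0, 'localhost', 8125)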

    So let's do this better!

    We know that:

    • Storing data locally is too slow
    • Storing data locally is bad
    • TCP requests slow down business logic
    • Some losses are acceptable (fire & forget)
    • Polluting business logic with monitors is bad

    What do we really need?

    • A way to hook up to anything easily
    • How often things happen
    • How fast they are
    • How many errors occur
    • Nice charts
    • Cross-app comparisons

    What can we do?

    • Let's use AOP to wrap monitoring logic around business code
    • UDP should be more than enough
    • Heka to collect data
    • InfluxDB to store it
    • Grafana to graph it
    • A connection pool to avoid hammering the GC (see the sketch below)
    • Let's also wrap it as a gem :)
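
    As a rough sketch of the sending side, a fire & forget UDP sender behind a connection pool could look like this. The Usage module name matches the handlers shown later, but the connection_pool gem, the host/port settings and the key format are assumptions for illustration, not the exact internals of our gem:

    require 'socket'
    require 'connection_pool'

    module Usage
      HOST = ENV.fetch('HEKA_HOST', 'localhost')     # assumed Heka agent address
      PORT = Integer(ENV.fetch('HEKA_PORT', '8125')) # assumed statsd input port

      # Reuse a handful of sockets instead of allocating one per call
      POOL = ConnectionPool.new(size: 5, timeout: 1) { UDPSocket.new }

      # Counter packet, e.g. "request:1|c"
      def self.increment(key)
        send_packet("#{key}:1|c")
      end

      # Timer packet in milliseconds, e.g. "request:12.5|ms"
      def self.timing(key, ms)
        send_packet("#{key}:#{ms.round(2)}|ms")
      end

      def self.send_packet(payload)
        POOL.with { |socket| socket.send(payload, 0, HOST, PORT) }
      rescue SystemCallError, SocketError
        # Fire & forget: a lost metric must never break business logic
        nil
      end
    end

    The stats.* prefixes and .upper/.mean suffixes seen later in the Grafana queries come from the aggregation on the Heka side, not from the sender.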

    Aspect Oriented Programming

    In computing, AOP is a programming paradigm that aims to increase modularity by allowing the separation of cross-cutting concerns.

    There are many AOP libraries for Ruby. We use Aspector.

    With AOP we can create concerns that we can attach to any method of any class as before, around, or after advice.

    Simple aspect example

    
    class Handler < BaseHandler
      def handle
        # Sends a UDP packet about method usage
        Usage.increment(key)
      end
    
      before options[:method], aspect_arg: true do |aspect|
        aspect.handle
      end
    end
    
    # Counter for Sinatra app requests
    Handler.apply(App, method: :call, key: :request)
    
    # Counter for number of invocations of the process method
    Handler.apply(Processor, method: :process, key: :processor)

    More examples

    
    # Monitor number of invocations, time taken and number of errors
    Usage::ComplexHandler.apply(
      App, method: :call, key: :request
    )
    Usage::ComplexHandler.apply(
      FbService, method: :find, key: :fb_find
    )
    
    # We can always track how often we save Mongoid objects
    Usage::IncrementHandler.apply(
      Mongoid::Document, method: :save, key: :mg_save
    )

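    For reference, an around-style handler along these lines could look roughly like the sketch below. ComplexHandler's internals, the Usage.timing helper and the exact around-advice signature with aspect_arg are assumptions for illustration, not Aspector's documented API; the point is that one concern counts invocations, measures time taken and counts errors in a single place.

    class ComplexHandler < BaseHandler
      def handle
        Usage.increment(key)                               # how often
        started_at = Time.now
        result = yield                                     # run the wrapped method
        Usage.timing(key, (Time.now - started_at) * 1000)  # how fast, in ms
        result
      rescue StandardError
        Usage.increment("#{key}.errors")                   # how many errors
        raise
      end

      # The argument order with aspect_arg: true is an assumption here
      around options[:method], aspect_arg: true do |aspect, proxy, *args, &block|
        aspect.handle { proxy.call(*args, &block) }
      end
    end
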
    Let's gra(fana)ph it!

    • Rich graphing
    • Mixed styling
    • Multiple dashboards
    • InfluxDB query editor
    • Annotations
    • Multiple data sources
    • JSON graph editor

    JSON editor

    
    {
      "target": "",
      "function": "max",
      "column": "value",
      "series": "stats.request.times.upper",
      "query": "select max(value) from \"stats.request.times.upper\"",
      "groupby_field": "",
      "alias": "Requests time upper"
    },

    Downsides

    • Self maintained
    • Hard to debug
    • Silently fails on network issues
    • Needs some configuration between components
    • Won't auto-adapt to application changes
    • Useless for non-app-level issues (proxy, etc.)

    THE END - Q & A

    Maciej Mensfeld

    - Aspector
    - Heka
    - InfluxDB
    - Grafana