The kubelet knows which pods are running on its local node, so asking it for pod metadata is easy, cheap, and fast. It also avoids hitting the api-server, which can be under high load when all Fluentd reporters restart or query at the same time.
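A minimal sketch of such a local query, assuming the kubelet's read-only port (10255) is enabled; hardened clusters disable it and only expose the authenticated port 10250, which needs a token and TLS:

require 'json'
require 'net/http'

# ask the local kubelet (not the api-server) for the pods on this node
pods = JSON.parse(Net::HTTP.get(URI("http://localhost:10255/pods"))).fetch("items")
pods.each do |pod|
  metadata = pod.fetch("metadata")
  puts "#{metadata.fetch("namespace")}/#{metadata.fetch("name")}"
end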
Tag Archives: Ruby
Verify Pagerduty reaches On-Call by Cron
We had a few incidents where on-call devs missed their pages because of various spam-blocking setups or “do not disturb” settings.
We now run a small service that test-notifies everyone once a month to make sure notifications get through. Notifications go out shortly before each user’s ‘do not disturb’ window ends, so we do not wake anyone in the middle of the night but still test a realistic situation.
Our setup has more logging, stats, etc., but it goes something like this:
# configure user schedule
require 'yaml'

users = YAML.load <<~YAML
  - name: "John Doe"
    id: ABCD
    # cron: "* * * * * America/Los_Angeles" # every minute ... for local testing
    cron: "55 6 * * 2#1 America/Los_Angeles" # every first Tuesday of the month at 6:55am
  # ... more users here
YAML

# code to notify users
require 'json'
require 'faraday'

def create_test_incident(user)
  connection = Faraday.new
  response = nil
  2.times do
    response = connection.post do |req|
      req.url "https://api.pagerduty.com/incidents"
      req.headers['Content-Type'] = 'application/json'
      req.headers['Accept'] = 'application/vnd.pagerduty+json;version=2'
      req.headers['From'] = 'realusers@email.com' # incident owner
      req.headers['Authorization'] = "Token token=#{ENV.fetch("PAGERDUTY_TOKEN")}"
      req.body = {
        incident: {
          type: "incident",
          title: "Pagerduty Tester: Incident for #{user.fetch("name")}, press resolve",
          service: { id: ENV.fetch("SERVICE_ID"), type: "service_reference" },
          assignments: [{ assignee: { id: user.fetch("id"), type: "user_reference" } }]
        }
      }.to_json
    end
    if response.status == 429 # pagerduty rate-limits to 6 incidents/min/service
      sleep 60
      next
    end
    raise "Request failed #{response.status} -- #{response.body}" if response.status >= 300
    break # success, do not create a second incident
  end
  JSON.parse(response.body).fetch("incident").fetch("id")
end

# run on a schedule (no threading / forking)
require 'serial_scheduler'
require 'fugit'

scheduler = SerialScheduler.new
users.each do |user|
  scheduler.add("Notify #{user.fetch("name")}", cron: user.fetch("cron"), timeout: 10) do
    user_id = user.fetch("id")
    incident_id = create_test_incident(user)
    puts "Created incident for #{user_id} https://#{ENV.fetch('SUBDOMAIN')}.pagerduty.com/incidents/#{incident_id}"
  rescue StandardError => e
    puts "Creating incident for #{user_id} failed #{e}"
  end
end
scheduler.run
Rails Production Readonly Console
Using the sandbox console in production can be dangerous because it wraps the whole session in a transaction (rolled back on exit), which can hold locks for a long time. So I ended up implementing a toggle-able readonly mode: by default the console is readonly unless the user opts in to writing.
It’s not perfect (it can be circumvented with crafted SQL), but it is a lot simpler than setting up separate credentials and switching connections.
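The implementation itself is not shown here, but a minimal sketch of the idea is to intercept SQL at the adapter level and raise on write statements unless the user opted in (module and method names below are invented for illustration, not the actual code):

module ReadonlyConsole
  WRITE_SQL = /\A\s*(insert|update|delete|replace|alter|drop|truncate|create)\b/i

  class << self
    attr_accessor :writable

    # prepend the guard onto the current connection's adapter class
    def enable!
      self.writable = false
      ActiveRecord::Base.connection.class.prepend(Guard)
    end

    # opt in to writing for this process
    def writable!
      self.writable = true
    end
  end

  module Guard
    def execute(sql, *args, **kwargs)
      if !ReadonlyConsole.writable && sql.match?(ReadonlyConsole::WRITE_SQL)
        raise "Readonly console: call ReadonlyConsole.writable! to enable writes"
      end
      super
    end
  end
end

# e.g. in a console initializer:
# ReadonlyConsole.enable! if Rails.env.production?

This only guards queries that go through execute, so prepared statements or crafted SQL can still slip through, which is the caveat mentioned above.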
Rails Sum ActiveSupport Instrument Times
We wanted to show the sum of multiple ActiveSupport notifications during a long process, so here is a tiny snippet to do that; a more advanced version is used in Samson.
# sum activesupport notification durations (in ms) for the given metrics
def time_sum(metrics, &block)
  sum = Hash.new(0.0)
  add = ->(metric, start, finish, *) { sum[metric] += 1000 * (finish - start) }
  # wrap the block in one nested `subscribed` call per metric, then call it
  metrics.inject(block) do |inner, metric|
    -> { ActiveSupport::Notifications.subscribed(add, metric, &inner) }
  end.call
  sum
end

time_sum(["sql.active_record"]) { 10.times { User.first } } # {"sql.active_record" => 10.3}
Validating ActiveRecord Backlinks exist
Whenever a new association is added, we usually also need the opposite association so things get cleaned up properly during deletion.
To never forget this and audit the current state, these two tests can help.
def all_models
  models = Dir["app/models/**/*.rb"].grep_v(/\/concerns\//)
  models.size.must_be :>, 20
  models.each { |f| require f }
  ActiveRecord::Base.descendants
end

it "explicitly defines what should happen to dependencies" do
  bad = all_models.flat_map do |model|
    model.reflect_on_all_associations.map do |association|
      next if association.is_a?(ActiveRecord::Reflection::BelongsToReflection)
      next if association.options.key?(:through)
      next if association.options.key?(:dependent)
      "#{model.name} #{association.name}"
    end
  end.compact
  assert(
    bad.empty?,
    "These associations need a :dependent defined (most likely :destroy or nil)\n#{bad.join("\n")}"
  )
end

it "links all dependencies both ways so dependencies get deleted reliably" do
  bad = all_models.flat_map do |model|
    model.reflect_on_all_associations.map do |association|
      next if association.name == :audits
      next if association.options.fetch(:inverse_of, false).nil? # disabled on purpose
      next if association.inverse_of
      "#{model.name} #{association.name}"
    end
  end.compact
  assert(
    bad.empty?,
    <<~TEXT
      These associations need an inverse association.
      For example project has stages and stage has project.
      If automatic connection does not work, use `:inverse_of` option on the association.
      If the inverse association is missing AND the inverse should not be destroyed when the dependency is destroyed, use `inverse_of: nil`.
      #{bad.join("\n")}
    TEXT
  )
end