Async Sinatra (Eventmachine) vs Node Performance

Just made a small gist to compare the base performance of these two evented alternatives, since I could not find an existing comparison.

Node wins: 20ms per request vs 50ms for Async Sinatra.

It's a very simple test, but if you have some real-world stuff to test, it's a nice starting point.

“Read Ruby” (the Ruby 1.9 book) as PDF

Open books (like Read Ruby, organized by runpaint) are great,
but if there is no PDF version, how do you read or print them properly?

Converting to pdf…


require 'rubygems'
require 'open-uri'
require 'hpricot'

url = 'http://ruby.runpaint.org'

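# collect all hrefs, skipping anchors without one
links = Hpricot(open(url).read).search('a').map{|link| link['href'] }.compact
# keep only site-relative pages: drop in-page anchors, absolute URLs and mail links
links.reject!{|link| link.include?('#') || link.include?('//') || link.include?('@') }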
links -= ['/opensearch', '/toc']
links.unshift('/toc')
links.map!{|link| url+link }

content = links.map{|link| open(link).read }

# join the pages with blank lines ("\n", not "/n") between them
html = content.join("\n\n\n\n")

out = 'temp.html'
File.open(out,'w'){|f| f.print html }

`wkhtmltopdf #{out} temp.pdf`

(or download the version from 2010-09-21)

Have fun!

Negative queries with solr in multiple fields

We recently did some negative queries and had a lot of ‘fun’ with solr.
After reading and testing a bit we found a simple rule: negative queries for single words do not work (don't ask me why…), but this can be fixed by adding *:*

Does not work: title: -xxx / (-title:xxx)

When you are only interested in certain fields, query building gets rather complex:

contains foo and bar -> title:(foo bar) OR description:(foo bar)
contains foo or bar -> title:(foo OR bar) OR description:(foo OR bar)
does not contain foo or bar -> -title:(foo bar *:*) AND -description:(foo bar *:*)
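The three patterns above are mechanical enough to generate. A minimal sketch, assuming a hypothetical helper named solr_field_query (it is not part of solr or acts_as_solr) that only builds the query strings shown in the list:

```ruby
# Hypothetical helper, not part of solr or acts_as_solr: builds the three
# query shapes from the list above for an arbitrary set of fields.
def solr_field_query(fields, words, mode)
  case mode
  when :all   # every field group lists all words
    fields.map { |f| "#{f}:(#{words.join(' ')})" }.join(' OR ')
  when :any   # words are OR-ed inside each field group
    fields.map { |f| "#{f}:(#{words.join(' OR ')})" }.join(' OR ')
  when :none  # negated groups, each with the *:* workaround
    fields.map { |f| "-#{f}:(#{words.join(' ')} *:*)" }.join(' AND ')
  else
    raise ArgumentError, "unknown mode: #{mode.inspect}"
  end
end

fields = %w[title description]
puts solr_field_query(fields, %w[foo bar], :all)
# title:(foo bar) OR description:(foo bar)
puts solr_field_query(fields, %w[foo bar], :any)
# title:(foo OR bar) OR description:(foo OR bar)
puts solr_field_query(fields, %w[foo bar], :none)
# -title:(foo bar *:*) AND -description:(foo bar *:*)
```

It does no escaping of the words, so it is only a starting point for trusted input.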

The *:* is killed by acts_as_solr, so the parser needs a little fix too:


# lib/parser_methods.rb:80
# *:xxx -> *:xxx
# a : b -> a_t:b
query = "(#{query.gsub(/([^\*]) *: */, "\\1_t:")}) #{models}"

(see our branch on github)
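To see what the changed substitution does, it can be tried outside of acts_as_solr. A small sketch, assuming a hypothetical wrapper named add_text_suffix around the same gsub call:

```ruby
# Same substitution as in the patched parser line, wrapped in a
# hypothetical helper so it can be tried without acts_as_solr.
def add_text_suffix(query)
  # A ":" directly preceded by "*" is left alone; every other
  # "field:" (optionally with spaces around the colon) becomes "field_t:".
  query.gsub(/([^\*]) *: */, "\\1_t:")
end

puts add_text_suffix("title:foo")            # title_t:foo
puts add_text_suffix("a : b")                # a_t:b
puts add_text_suffix("*:*")                  # *:*
puts add_text_suffix("-title:(foo bar *:*)") # -title_t:(foo bar *:*)
```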

Hope this helps someone!

Cached .all(:include=>[:xxx]) on associations

When fetching an association via .all with :include, the result is not cached. It could be, since these are still the same records (unlike with :select/:conditions etc.).


user = User.first
user.comments.length # hits db
user.comments.length # cached

user = User.first
user.comments.all(:include=>:commenter).length  # hits db
user.comments.all(:include=>:commenter).length  # hits db
user.comments.length # hits db

Cached find all with includes
This can save queries when the same association is loaded repeatedly on the same record.


user = User.first
user.comments.load_target_with_includes([:commenter, :tags]).length # hits db
user.comments.load_target_with_includes([:commenter, :tags]).length # cached
user.comments.length # cached

Code


# do not load an association twice, when all we need are includes
# all(:include=>xxx) would always reload the target
class ActiveRecord::Associations::AssociationCollection
  def load_target_with_includes(includes)
    raise "cannot load associations of a new record" if @owner.new_record?

    if loaded?
      @target
    else
      @loaded = true
      @target = all(:include => includes)
    end
  end
end
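The control flow can be tried without a database. A toy sketch, assuming a hypothetical FakeCollection class whose query counter stands in for real db hits. It mimics the pattern only, not ActiveRecord itself:

```ruby
# Toy stand-in for an association collection; FakeCollection and its
# query counter are hypothetical and only mimic the control flow.
class FakeCollection
  attr_reader :queries

  def initialize(records)
    @records = records
    @queries = 0
    @loaded = false
    @target = nil
  end

  # Stands in for all(:include => ...): every call counts as a db hit.
  def all(options = {})
    @queries += 1
    @records.dup
  end

  def loaded?
    @loaded
  end

  # Same shape as load_target_with_includes: query once, then reuse.
  def load_target_with_includes(includes)
    return @target if loaded?
    @loaded = true
    @target = all(:include => includes)
  end
end

comments = FakeCollection.new(%w[first second])
comments.load_target_with_includes([:commenter, :tags]).length # counts a query
comments.load_target_with_includes([:commenter, :tags]).length # reuses the target
puts comments.queries # 1
```

One consequence of this shape: once the target is loaded, a later call with different includes returns the already loaded records and does not fetch the new includes.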

Big updates block database, use slow_update_all

Sometimes big updates that affect millions of rows kill our database (all queries hang/are blocked).
Therefore we built a simple solution:


class ActiveRecord::Base
  def self.slow_update_all(set, where, options={})
    ids_to_update = find_values(:select => :id, :conditions => where)
    ids_to_update.each_slice(10_000) do |slice|
      update_all(set, :id => slice)
      sleep options[:sleep] if options[:sleep]
    end
    ids_to_update.size
  end
end

This needs the ActiveRecord find_values extension.
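The batching itself is plain Ruby and can be tried in isolation. A minimal sketch, assuming a hypothetical each_id_batch helper in which the block stands in for update_all(set, :id => slice):

```ruby
# Plain-Ruby sketch of the batching in slow_update_all. The block stands in
# for update_all(set, :id => slice); nothing here talks to a database.
def each_id_batch(ids, batch_size, pause = nil)
  ids.each_slice(batch_size) do |slice|
    yield slice
    sleep pause if pause # give other queries a chance between batches
  end
  ids.size
end

batches = []
total = each_id_batch((1..25).to_a, 10) { |slice| batches << slice.size }
p batches # [10, 10, 5]
p total   # 25
```

Each batch is a separate, smaller statement, so other queries can run in between. The trade-off is that the update as a whole is no longer a single statement.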

Michael Grosser, the Blog
