Analyzing a log file with gawk

Today I needed to analyze a log file on the server and find the requests whose response time was greater than 400ms.

The log looks roughly like this:

Started GET "/xxx/xxx/l1/pnxx9" for 218.107.18.137 at Mon May 07 10:32:00 +0800 2012
  Processing by XXController#action as HTML
  Parameters: {"x"=>"xxx", "xx"=>"xxxx", "x"=>"xxxx", "x"=>nil}
...
...
...
... lot of render
Completed 200 OK in 222ms (Views: 166.1ms | ActiveRecord: 0.0ms)

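Each record is separated by a blank line, and the file is fairly large, around 200 MB.

I used gawk for this (RS= switches to paragraph mode, so each blank-line-separated block is one record, and -F'\n' makes each line a field):

gawk -v RS= -F'\n' '{ split($NF, a, " "); if (a[5] + 0 > 400) print $1 "\n" $2 "\n" $NF "\n" }' production.log > analyze.log

It finished in under a minute, which is acceptable.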
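
For comparison, the same filter can be sketched in Ruby. This is only an illustration under assumptions: `slow_requests` and `threshold_ms` are names made up here, and the sample log is invented to match the format shown above.

```ruby
# Split the log into blank-line-separated records and keep the slow ones.
# `slow_requests` and `threshold_ms` are illustrative names, not library API.
def slow_requests(log_text, threshold_ms = 400)
  records = log_text.split(/\n{2,}/).map(&:strip).reject(&:empty?)
  records.select do |record|
    last_line = record.lines.last.to_s
    # Matches e.g. "Completed 200 OK in 222ms (Views: ...)"
    match = last_line.match(/\ACompleted .*? in (\d+)ms/)
    match && match[1].to_i > threshold_ms
  end
end

# Invented sample data in the same shape as the log above.
sample = <<~LOG
  Started GET "/a" for 127.0.0.1 at Mon May 07 10:32:00 +0800 2012
    Processing by XXController#action as HTML
  Completed 200 OK in 222ms (Views: 166.1ms | ActiveRecord: 0.0ms)

  Started GET "/b" for 127.0.0.1 at Mon May 07 10:32:05 +0800 2012
    Processing by XXController#action as HTML
  Completed 200 OK in 1450ms (Views: 900.0ms | ActiveRecord: 12.3ms)
LOG

slow = slow_requests(sample)
slow.each do |record|
  lines = record.lines.map(&:chomp)
  # First line, second line and the "Completed" line, like the gawk version.
  puts lines[0], lines[1], lines[-1], ""
end
```

For a file of the size mentioned above, reading one paragraph at a time with File.foreach("production.log", "") would avoid loading everything into memory.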

Rails3 application booting process

booting process

  # == Booting process
  #
  # The application is also responsible for setting up and executing the booting
  # process. From the moment you require "config/application.rb" in your app,
  # the booting process goes like this:
  #
  #   1)  require "config/boot.rb" to setup load paths
  #   2)  require railties and engines
  #   3)  Define Rails.application as "class MyApp::Application < Rails::Application"
  #   4)  Run config.before_configuration callbacks
  #   5)  Load config/environments/ENV.rb
  #   6)  Run config.before_initialize callbacks
  #   7)  Run Railtie#initializer defined by railties, engines and application.
  #       One by one, each engine sets up its load paths, routes and runs its config/initializers/* files.
  #   9)  Custom Railtie#initializers added by railties, engines and applications are executed
  #   10) Build the middleware stack and run to_prepare callbacks
  #   11) Run config.before_eager_load and eager_load if cache classes is true
  #   12) Run config.after_initialize callbacks
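
The ordering of the callback steps above can be illustrated with a toy model. This is not Rails code: `ToyApplication`, `on` and `boot!` are hypothetical names, and only three of the phases are modelled.

```ruby
# Toy model of the callback ordering only; this is not how Rails is implemented.
class ToyApplication
  PHASES = [:before_configuration, :before_initialize, :after_initialize].freeze

  attr_reader :log

  def initialize
    @callbacks = Hash.new { |hash, phase| hash[phase] = [] }
    @log = []
  end

  # Register a block for one of the phases, in any order.
  def on(phase, &block)
    raise ArgumentError, "unknown phase #{phase}" unless PHASES.include?(phase)
    @callbacks[phase] << block
  end

  # Run the phases in their fixed order, regardless of registration order.
  def boot!
    PHASES.each do |phase|
      @callbacks[phase].each { |callback| callback.call(self) }
    end
    self
  end
end

app = ToyApplication.new
app.on(:after_initialize)     { |a| a.log << "after_initialize" }
app.on(:before_configuration) { |a| a.log << "before_configuration" }
app.on(:before_initialize)    { |a| a.log << "before_initialize" }
app.boot!
puts app.log
```

The point of the sketch is that the phase a callback is registered for decides when it runs, not the order in which it was registered.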

How to call expire_fragment in rails console

Today I wanted to expire a fragment cache in my view.

<% cache('zires') do %>
  ...
<% end %>

So I opened the Rails console (rails c production), but found that Rails.cache.exist?('zires') returned false.

Why?

Because the cache key is wrong! Check fragment_cache_key(key): Rails adds a namespace, :views, to the key.

The easiest way:

ActionController::Base.new.expire_fragment('zires')
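
To see why the bare key misses, here is a minimal sketch of the namespacing step. It assumes a plain string key ends up stored as "views/zires"; `toy_fragment_cache_key` is a made-up stand-in, and a Hash stands in for Rails.cache.

```ruby
# Simplified stand-in for the namespacing done by fragment_cache_key.
# Assumption: a plain string key is prefixed with the :views namespace.
def toy_fragment_cache_key(key, namespace = :views)
  "#{namespace}/#{key}"
end

store = {} # stands in for Rails.cache in this sketch
store[toy_fragment_cache_key('zires')] = '<p>cached html</p>'

puts store.key?('zires')        # the bare key is not in the store
puts store.key?('views/zires')  # the namespaced key is
```

Under that assumption, checking Rails.cache.exist?('views/zires') in the console should find the fragment, though going through expire_fragment avoids having to know the prefix at all.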

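How does Rails serve static files in the public directory

How does Rails 3 handle static files under the public directory?

First of all, in production Rails 3.1 does not serve static files by default, because a web server such as Apache or Nginx can do that for us. So if you get 404s for static files in production, you may need to enable it in config/environments/production.rb:

# Enable Rails's static asset server (off by default in production, since Apache or nginx will already do this)
config.serve_static_assets = true

As we know, since Rails 3 Rails has been closely tied to Rack, and talking about Rack means talking about middleware. Let's first see which middleware Rails uses: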

rake middleware

use ActionDispatch::Static
use Rack::Lock
use ActiveSupport::Cache::Strategy::LocalCache
use Rack::Runtime
use Rails::Rack::Logger
use ActionDispatch::ShowExceptions
use ActionDispatch::DebugExceptions
use ActionDispatch::RemoteIp
use Rack::Sendfile
use ActionDispatch::Callbacks
use ActiveRecord::ConnectionAdapters::ConnectionManagement
use ActiveRecord::QueryCache
use ActionDispatch::Cookies
use ActionDispatch::Session::CookieStore
use ActionDispatch::Flash
use ActionDispatch::ParamsParser
use Rack::MethodOverride
use ActionDispatch::Head
use ActionDispatch::BestStandardsSupport
run Blog::Application.routes

OK, ActionDispatch::Static is the one we are looking for; it handles static files.
Reading the source of ActionDispatch::Static shows that most of the work is done by FileHandler:

class Static
  def initialize(app, path, cache_control=nil)
    @app = app
    @file_handler = FileHandler.new(path, cache_control)
  end
end

FileHandler follows the standard Rack convention and responds to call:

class FileHandler
 def initialize(root, cache_control)
   @root          = root.chomp('/')
   @compiled_root = /^#{Regexp.escape(root)}/
   @file_server   = ::Rack::File.new(@root, cache_control)
 end

 def call(env)
   @file_server.call(env)
 end
end

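So underneath it is simply Rack::File (defined in `rack/file`). Rack::File uses the request path to look up the file under the given directory.

The test cases for ActionDispatch::Static also show what it does: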

def test_serves_static_index_at_root
  assert_html "/index.html", get("/index.html")
  assert_html "/index.html", get("/index")
  assert_html "/index.html", get("/")
  assert_html "/index.html", get("")
end

def test_serves_static_file_in_directory
  assert_html "/foo/bar.html", get("/foo/bar.html")
  assert_html "/foo/bar.html", get("/foo/bar/")
  assert_html "/foo/bar.html", get("/foo/bar")
end

def test_serves_static_index_file_in_directory
  assert_html "/foo/index.html", get("/foo/index.html")
  assert_html "/foo/index.html", get("/foo/")
  assert_html "/foo/index.html", get("/foo")
end

OK, that should make things clear. All that is left is to specify the path, which is "public", in the Rails::Application file:

if config.serve_static_assets
  middleware.use ::ActionDispatch::Static, paths["public"].first, config.static_cache_control
end
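
To make the lookup behaviour concrete, here is a toy middleware in the spirit of ActionDispatch::Static, written without the rack gem. `ToyStatic` is a hypothetical name, the lookup order (path, path.html, path/index.html) is inferred from the test cases above, and the fixed Content-Type is a simplification.

```ruby
require "tmpdir"

# Toy middleware: serve a file from `root` when one matches the request path,
# otherwise pass the request on to the next app. Not the Rails implementation.
class ToyStatic
  def initialize(app, root)
    @app  = app
    @root = root.chomp("/")
  end

  def call(env)
    file = candidate_for(env["PATH_INFO"].to_s)
    return @app.call(env) unless file

    # Simplification: every file is served as HTML.
    [200, { "Content-Type" => "text/html" }, [File.read(file)]]
  end

  private

  # Try the path itself, then "<path>.html", then "<path>/index.html".
  def candidate_for(path)
    return nil if path.include?("..") # refuse directory traversal
    base = File.join(@root, path).chomp("/")
    [base, "#{base}.html", File.join(base, "index.html")].find { |f| File.file?(f) }
  end
end

Dir.mktmpdir do |public_dir|
  File.write(File.join(public_dir, "index.html"), "<h1>home</h1>")
  fallback = ->(_env) { [404, { "Content-Type" => "text/plain" }, ["not found"]] }
  app = ToyStatic.new(fallback, public_dir)

  p app.call("PATH_INFO" => "/")[0]        # served from index.html
  p app.call("PATH_INFO" => "/missing")[0] # falls through to the next app
end
```

Placing such a middleware first in the stack means a request for an existing file never reaches the router, which matches ActionDispatch::Static appearing at the top of the rake middleware output.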

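nginx notes

1) Malicious domain binding

Today I found that someone had maliciously pointed a domain at our IP. The fix is to add a default server in Nginx.
If you have read the article "How nginx processes a request", you know that Nginx picks a server block to handle each request: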

  server {
    listen       80;
    server_name  nginx.org  www.nginx.org;
    ...
  }

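With the configuration above, requests that are not for nginx.org or www.nginx.org are also handled by this server, because no default_server is specified and nginx falls back to the first server defined for that port.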

  server{
    listen       80;
    server_name  nginx.org  www.nginx.org;
    ...
  }

  server{
    listen       80 default_server;
    rewrite   ^(.*) http://www.nginx.org permanent;
    ...
  }

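This configuration does the job. A request that is not for one of the specified domains goes to the default_server below them, which permanently redirects it to our site. That solves the malicious domain binding problem.

2) gzip

See the documentation for the HttpGzipModule.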
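
The selection rule can be modelled in a few lines of Ruby. This is a toy model under assumptions: it only handles exact server names (real nginx also supports wildcard and regex names), and `ServerBlock` and `pick_server` are names invented for the illustration.

```ruby
# Toy model of how a server block is chosen for a Host header.
ServerBlock = Struct.new(:names, :default, :label)

def pick_server(blocks, host)
  blocks.find { |block| block.names.include?(host) } || # exact name match
    blocks.find(&:default) ||                           # explicit default_server
    blocks.first                                        # otherwise the first block
end

site     = ServerBlock.new(["nginx.org", "www.nginx.org"], false, "site")
catchall = ServerBlock.new([], true, "redirect")

puts pick_server([site], "evil.example").label           # without a default: "site"
puts pick_server([site, catchall], "evil.example").label # with a default: "redirect"
puts pick_server([site, catchall], "www.nginx.org").label
```

The first call shows the original problem: with no default_server, the unknown host is answered by the real site.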

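How to test a mounted Rack app in Rails 3

When testing routes, you can use:

class RouteTest < ActionController::TestCase
  # The assertion has to live inside a test method to be run
  def test_posts_show_route
    assert_routing '/posts/1', { :controller => "posts", :action => "show", :id => "1" }
  end
end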

We also know:

route = Dummy::Application.routes
route.recognize_path "/posts/1"
#=> { :controller => "posts", :action => "show", :id => "1" }

route.generate :controller => "posts", :action => "show", :id => "1"
#=> "/posts/1"

But in a case like the following:

My::Application.routes.draw do
  mount Rack::App => "/rack"
end

the mounted app cannot be found through the route set:

route = My::Application.routes
route.recognize_path "/rack"
#=> "No route matches /rack"

So how can it be tested? Here is one way:

routes = Dummy::Application.routes.routes
server = routes.select { |r| r.app.instance_of?(Rack::App) }.pop
# test mount successful
assert !server.nil?
path   = server.path.spec.to_s
# test mount path correct
assert_equal('/rack', path)

Note: the test above only covers whether the Rack app is mounted at the right path; it does not test whether the returned content is correct!
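
To cover the response itself, one option is to call the Rack app directly, since a Rack app is just an object that responds to call. The sketch below uses a hypothetical `hello_app` lambda in place of the real mounted app and a hand-built env hash instead of a full Rack environment.

```ruby
# Hypothetical stand-in for the mounted app; any object responding to `call` works.
hello_app = lambda do |env|
  [200, { "Content-Type" => "text/plain" }, ["Hello from #{env['PATH_INFO']}"]]
end

# A minimal env; a real Rack environment carries many more keys.
env = { "REQUEST_METHOD" => "GET", "PATH_INFO" => "/ping" }
status, headers, body = hello_app.call(env)

# A Rack body only promises `each`, so collect the chunks that way.
chunks = []
body.each { |chunk| chunks << chunk }
text = chunks.join

puts status
puts headers["Content-Type"]
puts text
```

In a real test the lambda would be replaced by the app being mounted, and the three values would be checked with assert_equal.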

Sprockets and Rails3.1

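The Asset Pipeline in Rails 3.1 is really cool to use. The `Asset Pipeline` is powered mainly by Sprockets.

Sprockets

Sprockets is simple: it needs a Sprockets Environment instance to handle all the assets. In Rails that is `YourApp::Application.assets`. Print it in the console to take a look: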

YourApp::Application.assets

#=>#<Sprockets::Environment:0x1c3c840 root="/xxx/xxxx/your_app", paths=["...", ...], digest="aa7d0db7619379e13b08335dee027df2">

You can see a paths instance variable here. It holds Rails' default asset paths, and it can be configured in application.rb:

config.assets.paths << "#{Rails.root}/foo/bar"

Sprockets::BundledAsset

e = YourApp::Application.assets
asset = e['application.js']
# => #<Sprockets::BundledAsset ...>
asset.to_s
# => The content of asset
asset.length
asset.mtime
asset.pathname

javascript_include_tag "application"

Under the hood, this searches the paths of the Sprockets Environment instance for the 'application.js' file and returns its content if it is found.

Sprockets Directives
  • require
    require path inserts the contents of the asset source file specified by path. If the file is required multiple times, it will appear in the bundle only once.

  • include
    include path works like require, but inserts the contents of the specified source file even if it has already been included or required.

  • require_directory
    require_directory path requires all source files of the same format in the directory specified by path. Files are required in alphabetical order.

  • require_tree
    require_tree path works like require_directory, but operates recursively to require all files in all subdirectories of the directory specified by path.

  • require_self
    require_self tells Sprockets to insert the body of the current source file before any subsequent require or include directives.

  • depend_on
    depend_on path declares a dependency on the given path without including it in the bundle. This is useful when you need to expire an asset’s cache in response to a change in another file.

API

The change you wanted was rejected. Maybe you tried to change something you didn’t have access to.

422 Unprocessable Entity (WebDAV) (RFC 4918)
The request was well-formed but was unable to be followed due to semantic errors.

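In a Rails app, this 422 error is almost always caused by the `authenticity token`. If you don't want to add the `authenticity token` by hand when submitting a form, you can skip the check for that action (which also turns off CSRF protection for it):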

class SomeController < ApplicationController
  protect_from_forgery :except => :some_action
 
  def some_action
    ...
  end
end

About the append_features method

While writing the previous post about the concern deprecation, I came across two more interesting things, so I am recording them here as well.

append_features

When this module is included in another, Ruby calls append_features in this module, passing it the receiving module in mod. Ruby’s default implementation is to add the constants, methods, and module variables of this module to mod if this module has not already been added to mod or one of its ancestors. See also Module#include.

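We know that extended and included are two of Ruby's hook methods. It turns out that include first calls append_features, which does the actual mixing in, and then calls included; extend works the same way with extend_object and extended.

Comparing Modules and Classes

One line in `append_features` is very simple but worth a closer look: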
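
The call order is easy to observe in plain Ruby. In this small sketch, `Tracked`, `Widget` and the `events` list are made-up names used only to record what Ruby calls and when.

```ruby
module Tracked
  # Records which hooks ran, in order.
  def self.events
    @events ||= []
  end

  def self.append_features(base)
    events << "append_features(#{base})"
    super # the default implementation does the actual mixing in
  end

  def self.included(base)
    events << "included(#{base})"
  end

  def hello
    "hello"
  end
end

class Widget
  include Tracked
end

puts Tracked.events
puts Widget.new.hello
```

If the call to super is left out of append_features, the module's methods are never added to Widget, which shows that append_features is where the work happens and included is only a notification.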

def append_features(base)
  ...
  return false if base < self
  ...
end

It turns out modules can be compared too:

mod <=> other_mod → -1, 0, +1, or nil

Comparison—Returns -1 if mod includes other_mod, 0 if mod is the same as other_mod, and +1 if mod is included by other_mod. Returns nil if mod has no relationship with other_mod or if other_mod is not a module.

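Here, however, the comparison is used to check for an inheritance relationship, which is rather clever: if base already inherits from self, the methods are already in the method lookup chain, so it returns false right away.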
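
A quick way to get a feel for these operators is to try them on a small hierarchy. `Greeter`, `Base` and `Child` below are names invented for the example.

```ruby
module Greeter; end

class Base
  include Greeter
end

class Child < Base; end

p Base < Greeter     # true: Base includes Greeter
p Child < Base       # true: Child inherits from Base
p Greeter < Base     # false: Greeter is an ancestor of Base, not a descendant
p Base < String      # nil: the two are unrelated
p Base <=> Greeter   # -1
p Greeter <=> Base   # 1
p Base <=> Base      # 0
p String <=> Integer # nil
```

So `base < self` inside append_features is true exactly when base already has the module among its ancestors.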

The InstanceMethods module inside ActiveSupport::Concern will be no longer included automatically

The InstanceMethods module inside ActiveSupport::Concern will be no longer included automatically. Please define instance methods directly in …

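Today I ran into this deprecation warning in a Rails 3.2 project and went to look at the source; you can compare the two versions yourself.

After comparing them, I found that only this one line was dropped:

base.send :include, const_get("InstanceMethods")

There is no longer any need to define an InstanceMethods module; just define the methods directly.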

old

module M
  extend ActiveSupport::Concern

  module ClassMethods
    ...
  end

  module InstanceMethods
    def some_instance_method
      ...
    end
  end
end

new

module M
  extend ActiveSupport::Concern

  module ClassMethods
    ...
  end
 
  # directly define method
  def some_instance_method
    ...
  end

end

Why?

Ruby's include already brings in instance methods, so InstanceMethods really was redundant. My guess is that older versions introduced InstanceMethods to mirror ClassMethods.
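
The idea can be demonstrated without ActiveSupport. The sketch below is a toy stand-in, not the real ActiveSupport::Concern (which hooks append_features and also handles dependencies): `ToyConcern`, `Taggable` and `Post` are hypothetical names, and only the ClassMethods convention is covered.

```ruby
# Toy stand-in for ActiveSupport::Concern: only the ClassMethods convention.
module ToyConcern
  # Runs when a module that extends ToyConcern is included into `base`.
  def included(base)
    super
    base.extend(const_get(:ClassMethods)) if const_defined?(:ClassMethods)
  end
end

module Taggable
  extend ToyConcern

  module ClassMethods
    def tag_prefix
      "tag:"
    end
  end

  # Instance method defined directly on the module, no InstanceMethods wrapper.
  def tag
    "#{self.class.tag_prefix}#{name}"
  end
end

class Post
  include Taggable

  attr_reader :name

  def initialize(name)
    @name = name
  end
end

puts Post.tag_prefix
puts Post.new("ruby").tag
```

The instance method `tag` reaches Post through plain include, while `tag_prefix` needs the extra extend step, which is why only ClassMethods requires special treatment.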