In Files

  • ripper/lib/ripper.rb
  • ripper/lib/ripper/core.rb
  • ripper/lib/ripper/filter.rb
  • ripper/lib/ripper/lexer.rb
  • ripper/lib/ripper/sexp.rb
  • ripper/ripper.c

Class/Module Index [+]

Quicksearch

Ripper

Ripper is a Ruby script parser.

You can get information from the parser with event-based style. Information such as abstract syntax trees or simple lexical analysis of the Ruby program.

Usage

Ripper provides an easy interface for parsing your program into a symbolic expression tree (or S-expression).

Understanding the output of the parser may come as a challenge, it's recommended you use PP to format the output for legibility.

require 'ripper'
require 'pp'

pp Ripper.sexp('def hello(world) "Hello, #{world}!"; end')
  #=> [:program,
       [[:def,
         [:@ident, "hello", [1, 4]],
         [:paren,
          [:params, [[:@ident, "world", [1, 10]]], nil, nil, nil, nil, nil, nil]],
         [:bodystmt,
          [[:string_literal,
            [:string_content,
             [:@tstring_content, "Hello, ", [1, 18]],
             [:string_embexpr, [[:var_ref, [:@ident, "world", [1, 27]]]]],
             [:@tstring_content, "!", [1, 33]]]]],
          nil,
          nil,
          nil]]]]

You can see in the example above, the expression starts with :program.

From here, a method definition at :def, followed by the method's identifier :@ident. After the method's identifier comes the parentheses :paren and the method parameters under :params.

Next is the method body, starting at :bodystmt (stmt meaning statement), which contains the full definition of the method.

In our case, we're simply returning a String, so next we have the :string_literal expression.

Within our :string_literal you'll notice two @tstring_content, this is the literal part for Hello, and !. Between the two @tstring_content statements is a :string_embexpr, where embexpr is an embedded expression. Our expression consists of a local variable, or var_ref, with the identifier (@ident) of world.

Resources

Requirements

  • ruby 1.9 (support CVS HEAD only)

  • bison 1.28 or later (Other yaccs do not work)

License

Ruby License.

Constants

EVENTS

This array contains name of all ripper events.

EXPR_ARG

newline significant, +/- is an operator.

EXPR_ARG_ANY

equals to (EXPR_ARG | EXPR_CMDARG)

EXPR_BEG

ignore newline, +/- is a sign.

EXPR_BEG_ANY

equals to (EXPR_BEG | EXPR_MID | EXPR_CLASS)

EXPR_CLASS

immediate after `class', no here document.

EXPR_CMDARG

newline significant, +/- is an operator.

EXPR_DOT

', no reserved words.

EXPR_END

newline significant, +/- is an operator.

EXPR_ENDARG

ditto, and unbound braces.

EXPR_ENDFN

ditto, and unbound braces.

EXPR_END_ANY

equals to (EXPR_END | EXPR_ENDARG | EXPR_ENDFN)

EXPR_FITEM

symbol literal as FNAME.

EXPR_FNAME

ignore newline, no reserved words.

EXPR_LABEL

flag bit, label is allowed.

EXPR_LABELED

flag bit, just after a label.

EXPR_MID

newline significant, +/- is an operator.

EXPR_NONE

equals to 0

EXPR_VALUE

equals to EXPR_BEG

PARSER_EVENTS

This array contains name of parser events.

SCANNER_EVENTS

This array contains name of scanner events.

Version

version of Ripper

Public Class Methods

dedent_string(input, width) → Integer click to toggle source

USE OF RIPPER LIBRARY ONLY.

Strips up to width leading whitespaces from input, and returns the stripped column width.

 
               static VALUE
parser_dedent_string(VALUE self, VALUE input, VALUE width)
{
    int wid, col;

    StringValue(input);
    wid = NUM2UINT(width);
    col = dedent_string(input, wid);
    return INT2NUM(col);
}
            
lex(src, filename = '-', lineno = 1, **kw) click to toggle source

Tokenizes the Ruby program and returns an array of an array, which is formatted like [[lineno, column], type, token, state]. The filename argument is mostly ignored. By default, this method does not handle syntax errors in src, use the raise_errors keyword to raise a SyntaxError for an error in src.

require 'ripper'
require 'pp'

pp Ripper.lex("def m(a) nil end")
#=> [[[1,  0], :on_kw,     "def", FNAME    ],
     [[1,  3], :on_sp,     " ",   FNAME    ],
     [[1,  4], :on_ident,  "m",   ENDFN    ],
     [[1,  5], :on_lparen, "(",   BEG|LABEL],
     [[1,  6], :on_ident,  "a",   ARG      ],
     [[1,  7], :on_rparen, ")",   ENDFN    ],
     [[1,  8], :on_sp,     " ",   BEG      ],
     [[1,  9], :on_kw,     "nil", END      ],
     [[1, 12], :on_sp,     " ",   END      ],
     [[1, 13], :on_kw,     "end", END      ]]
 
               # File ripper/lib/ripper/lexer.rb, line 50
def Ripper.lex(src, filename = '-', lineno = 1, **kw)
  Lexer.new(src, filename, lineno).lex(**kw)
end
            
lex_state_name(integer) → string click to toggle source

Returns a string representation of lex_state.

 
               static VALUE
ripper_lex_state_name(VALUE self, VALUE state)
{
    return rb_parser_lex_state_name(NUM2INT(state));
}
            
new(src, filename="(ripper)", lineno=1) → ripper click to toggle source

Create a new Ripper object. src must be a String, an IO, or an Object which has gets method.

This method does not starts parsing. See also #parse and ::parse.

 
               static VALUE
ripper_initialize(int argc, VALUE *argv, VALUE self)
{
    struct parser_params *p;
    VALUE src, fname, lineno;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    rb_scan_args(argc, argv, "12", &src, &fname, &lineno);
    if (RB_TYPE_P(src, T_FILE)) {
        p->lex.gets = ripper_lex_io_get;
    }
    else if (rb_respond_to(src, id_gets)) {
        p->lex.gets = ripper_lex_get_generic;
    }
    else {
        StringValue(src);
        p->lex.gets = lex_get_str;
    }
    p->lex.input = src;
    p->eofp = 0;
    if (NIL_P(fname)) {
        fname = STR_NEW2("(ripper)");
        OBJ_FREEZE(fname);
    }
    else {
        StringValueCStr(fname);
        fname = rb_str_new_frozen(fname);
    }
    parser_initialize(p);

    p->ruby_sourcefile_string = fname;
    p->ruby_sourcefile = RSTRING_PTR(fname);
    p->ruby_sourceline = NIL_P(lineno) ? 0 : NUM2INT(lineno) - 1;

    return Qnil;
}
            
parse(src, filename = '(ripper)', lineno = 1) click to toggle source

Parses the given Ruby program read from src. src must be a String or an IO or a object with a gets method.

 
               # File ripper/lib/ripper/core.rb, line 17
def Ripper.parse(src, filename = '(ripper)', lineno = 1)
  new(src, filename, lineno).parse
end
            
sexp(src, filename = '-', lineno = 1, raise_errors: false) click to toggle source
EXPERIMENTAL

Parses src and create S-exp tree. Returns more readable tree rather than ::sexp_raw. This method is mainly for developer use. The filename argument is mostly ignored. By default, this method does not handle syntax errors in src, returning nil in such cases. Use the raise_errors keyword to raise a SyntaxError for an error in src.

require 'ripper'
require 'pp'

pp Ripper.sexp("def m(a) nil end")
  #=> [:program,
       [[:def,
        [:@ident, "m", [1, 4]],
        [:paren, [:params, [[:@ident, "a", [1, 6]]], nil, nil, nil, nil, nil, nil]],
        [:bodystmt, [[:var_ref, [:@kw, "nil", [1, 9]]]], nil, nil, nil]]]]
 
               # File ripper/lib/ripper/sexp.rb, line 34
def Ripper.sexp(src, filename = '-', lineno = 1, raise_errors: false)
  builder = SexpBuilderPP.new(src, filename, lineno)
  sexp = builder.parse
  if builder.error?
    if raise_errors
      raise SyntaxError, builder.error
    end
  else
    sexp
  end
end
            
sexp_raw(src, filename = '-', lineno = 1, raise_errors: false) click to toggle source
EXPERIMENTAL

Parses src and create S-exp tree. This method is mainly for developer use. The filename argument is mostly ignored. By default, this method does not handle syntax errors in src, returning nil in such cases. Use the raise_errors keyword to raise a SyntaxError for an error in src.

require 'ripper'
require 'pp'

pp Ripper.sexp_raw("def m(a) nil end")
  #=> [:program,
       [:stmts_add,
        [:stmts_new],
        [:def,
         [:@ident, "m", [1, 4]],
         [:paren, [:params, [[:@ident, "a", [1, 6]]], nil, nil, nil]],
         [:bodystmt,
          [:stmts_add, [:stmts_new], [:var_ref, [:@kw, "nil", [1, 9]]]],
          nil,
          nil,
          nil]]]]
 
               # File ripper/lib/ripper/sexp.rb, line 70
def Ripper.sexp_raw(src, filename = '-', lineno = 1, raise_errors: false)
  builder = SexpBuilder.new(src, filename, lineno)
  sexp = builder.parse
  if builder.error?
    if raise_errors
      raise SyntaxError, builder.error
    end
  else
    sexp
  end
end
            
slice(src, pattern, n = 0) click to toggle source
EXPERIMENTAL

Parses src and return a string which was matched to pattern. pattern should be described as Regexp.

require 'ripper'

p Ripper.slice('def m(a) nil end', 'ident')                   #=> "m"
p Ripper.slice('def m(a) nil end', '[ident lparen rparen]+')  #=> "m(a)"
p Ripper.slice("<<EOS\nstring\nEOS",
               'heredoc_beg nl $(tstring_content*) heredoc_end', 1)
    #=> "string\n"
 
               # File ripper/lib/ripper/lexer.rb, line 269
def Ripper.slice(src, pattern, n = 0)
  if m = token_match(src, pattern)
  then m.string(n)
  else nil
  end
end
            
tokenize(src, filename = '-', lineno = 1, **kw) click to toggle source

Tokenizes the Ruby program and returns an array of strings. The filename and lineno arguments are mostly ignored, since the return value is just the tokenized input. By default, this method does not handle syntax errors in src, use the raise_errors keyword to raise a SyntaxError for an error in src.

p Ripper.tokenize("def m(a) nil end")
   # => ["def", " ", "m", "(", "a", ")", " ", "nil", " ", "end"]
 
               # File ripper/lib/ripper/lexer.rb, line 24
def Ripper.tokenize(src, filename = '-', lineno = 1, **kw)
  Lexer.new(src, filename, lineno).tokenize(**kw)
end
            

Public Instance Methods

column → Integer click to toggle source

Return column number of current parsing line. This number starts from 0.

 
               static VALUE
ripper_column(VALUE self)
{
    struct parser_params *p;
    long col;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (NIL_P(p->parsing_thread)) return Qnil;
    col = p->lex.ptok - p->lex.pbeg;
    return LONG2NUM(col);
}
            
debug_output → obj click to toggle source

Get debug output.

 
               VALUE
rb_parser_get_debug_output(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    return p->debug_output;
}
            
debug_output = obj click to toggle source

Set debug output.

 
               VALUE
rb_parser_set_debug_output(VALUE self, VALUE output)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    return p->debug_output = output;
}
            
encoding → encoding click to toggle source

Return encoding of the source.

 
               VALUE
rb_parser_encoding(VALUE vparser)
{
    struct parser_params *p;

    TypedData_Get_Struct(vparser, struct parser_params, &parser_data_type, p);
    return rb_enc_from_encoding(p->enc);
}
            
end_seen? → Boolean click to toggle source

Return true if parsed source ended by +_END_+.

 
               VALUE
rb_parser_end_seen_p(VALUE vparser)
{
    struct parser_params *p;

    TypedData_Get_Struct(vparser, struct parser_params, &parser_data_type, p);
    return RBOOL(p->ruby__end__seen);
}
            
error? → Boolean click to toggle source

Return true if parsed source has errors.

 
               static VALUE
ripper_error_p(VALUE vparser)
{
    struct parser_params *p;

    TypedData_Get_Struct(vparser, struct parser_params, &parser_data_type, p);
    return RBOOL(p->error_p);
}
            
filename → String click to toggle source

Return current parsing filename.

 
               static VALUE
ripper_filename(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    return p->ruby_sourcefile_string;
}
            
lineno → Integer click to toggle source

Return line number of current parsing line. This number starts from 1.

 
               static VALUE
ripper_lineno(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (NIL_P(p->parsing_thread)) return Qnil;
    return INT2NUM(p->ruby_sourceline);
}
            
parse click to toggle source

Start parsing and returns the value of the root action.

 
               static VALUE
ripper_parse(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (!NIL_P(p->parsing_thread)) {
        if (p->parsing_thread == rb_thread_current())
            rb_raise(rb_eArgError, "Ripper#parse is not reentrant");
        else
            rb_raise(rb_eArgError, "Ripper#parse is not multithread-safe");
    }
    p->parsing_thread = rb_thread_current();
    rb_ensure(ripper_parse0, self, ripper_ensure, self);

    return p->result;
}
            
state → Integer click to toggle source

Return scanner state of current token.

 
               static VALUE
ripper_state(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (NIL_P(p->parsing_thread)) return Qnil;
    return INT2NUM(p->lex.state);
}
            
token → String click to toggle source

Return the current token string.

 
               static VALUE
ripper_token(VALUE self)
{
    struct parser_params *p;
    long pos, len;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (NIL_P(p->parsing_thread)) return Qnil;
    pos = p->lex.ptok - p->lex.pbeg;
    len = p->lex.pcur - p->lex.ptok;
    return rb_str_subseq(p->lex.lastline, pos, len);
}
            
yydebug → true or false click to toggle source

Get yydebug.

 
               VALUE
rb_parser_get_yydebug(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    return RBOOL(p->debug);
}
            
yydebug = flag click to toggle source

Set yydebug.

 
               VALUE
rb_parser_set_yydebug(VALUE self, VALUE flag)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    p->debug = RTEST(flag);
    return flag;
}
            
There is an updated format of the API docs for this version here.