RDoc::Markup parses plain text documents and attempts to decompose them into their constituent parts. Some of these parts are high-level: paragraphs, chunks of verbatim text, list entries and the like. Other parts happen at the character level: a piece of bold text, a word in code font. This markup is similar in spirit to that used on WikiWiki webs, where folks create web pages using a simple set of formatting rules.
RDoc::Markup itself does no output formatting: this is left to a different set of classes.
RDoc::Markup is extendable at runtime: you can add new markup elements to be recognised in the documents that RDoc::Markup parses.
RDoc::Markup is intended to be the basis for a family of tools which share the common requirement that simple, plain-text should be rendered in a variety of different output formats and media. It is envisaged that RDoc::Markup could be the basis for formatting RDoc style comment blocks, Wiki entries, and online FAQs.
This code converts input_string to HTML. The conversion takes place in the convert method, so you can use the same RDoc::Markup converter to convert multiple input strings.
  require 'rdoc/markup/to_html'

  h = RDoc::Markup::ToHtml.new

  puts h.convert(input_string)
You can extend the RDoc::Markup parser to recognise new markup sequences, and to add special processing for text that matches a regular expression. Here we make WikiWords significant to the parser, and also make the sequences {word} and <no>text…</no> signify strike-through text. We then subclass the HTML output class to deal with these:
  require 'rdoc/markup'
  require 'rdoc/markup/to_html'

  class WikiHtml < RDoc::Markup::ToHtml
    def handle_special_WIKIWORD(special)
      "<font color=red>" + special.text + "</font>"
    end
  end

  m = RDoc::Markup.new
  m.add_word_pair("{", "}", :STRIKE)
  m.add_html("no", :STRIKE)

  m.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)

  wh = WikiHtml.new
  wh.add_tag(:STRIKE, "<strike>", "</strike>")

  puts "<body>#{wh.convert ARGF.read}</body>"
List entries look like:
  *       text
  1.      text
  [label] text
  label:: text
Flag it as a list entry, and work out the indent for subsequent lines
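The four list-entry forms and the indent calculation can be illustrated with a small standalone sketch. It is not RDoc's implementation: the `LIST_PATTERNS` table, the `classify_list_entry` helper and the type names are hypothetical, and the indent is assumed to be the column where the entry text starts.

```ruby
# Hypothetical patterns, one per list-entry form shown above.
# Each captures: leading whitespace, the marker, and the entry text.
LIST_PATTERNS = {
  bullet: /\A(\s*)([*-])\s+(.*)\z/,
  number: /\A(\s*)(\d+\.)\s+(.*)\z/,
  label:  /\A(\s*)\[([^\]]+)\]\s+(.*)\z/,
  note:   /\A(\s*)(\S.*?)::\s+(.*)\z/
}.freeze

# Returns a hash describing the list entry, or nil if the line is not one.
def classify_list_entry(line)
  LIST_PATTERNS.each do |type, pattern|
    match = pattern.match(line)
    next unless match

    _leading, marker, text = match.captures
    # Assumption: subsequent lines of the entry line up with its text.
    indent = line.length - text.length
    return { type: type, marker: marker, text: text, indent: indent }
  end
  nil
end

p classify_list_entry("* text")
p classify_list_entry("label:: text")
```

A caller could compare the returned `:indent` with the leading whitespace of the following lines to decide whether they continue the same entry.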
Take a block of text and use various heuristics to determine its structure (paragraphs, lists, and so on). Invoke an event handler as we identify significant chunks.
  # File rdoc/markup.rb, line 98
  def initialize
    @am = RDoc::Markup::AttributeManager.new
    @output = nil
  end
Add to the sequences recognized as general markup.
  # File rdoc/markup.rb, line 115
  def add_html(tag, name)
    @am.add_html(tag, name)
  end
Add to other inline sequences. For example, we could add WikiWords using something like:
parser.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)
Each wiki word will be presented to the output formatter via the accept_special method.
  # File rdoc/markup.rb, line 128
  def add_special(pattern, name)
    @am.add_special(pattern, name)
  end
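To show how a registered pattern might hand matching text to a formatter, here is a standalone sketch that does not use RDoc. The `TinySpecials` and `RedWikiWords` classes are hypothetical, and dispatching on a `handle_special_NAME` method is an assumption modelled on the WikiHtml example earlier in this document.

```ruby
# Hypothetical registry of named "special" patterns (not RDoc's classes).
class TinySpecials
  def initialize
    @specials = {} # name => Regexp
  end

  def add_special(pattern, name)
    @specials[name] = pattern
  end

  # Replace every match with whatever the handler returns for it.
  def convert(text, handler)
    @specials.reduce(text) do |result, (name, pattern)|
      result.gsub(pattern) { |match| handler.public_send("handle_special_#{name}", match) }
    end
  end
end

# Hypothetical output handler: wraps each WikiWord in a red font tag.
class RedWikiWords
  def handle_special_WIKIWORD(text)
    "<font color=red>" + text + "</font>"
  end
end

specials = TinySpecials.new
specials.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)
puts specials.convert("See WikiWord now", RedWikiWords.new)
```

With more than one pattern registered, this sketch would rescan text already produced by an earlier handler, so it is only suitable for illustrating a single special.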
Add to the sequences used to add formatting to an individual word (such as bold). Matching entries will generate attributes that the output formatters can recognize by their name.
  # File rdoc/markup.rb, line 108
  def add_word_pair(start, stop, name)
    @am.add_word_pair(start, stop, name)
  end
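The idea of a word pair producing a named attribute that a formatter later maps to output can be sketched without RDoc. The `TinyAttributeManager` class and its `apply` method below are hypothetical, and the sketch assumes a word pair only wraps a single run of non-whitespace characters.

```ruby
# Hypothetical stand-in for an attribute manager (not RDoc's class).
class TinyAttributeManager
  def initialize
    @word_pairs = {} # attribute name => Regexp matching start + word + stop
  end

  def add_word_pair(start, stop, name)
    @word_pairs[name] = /#{Regexp.escape(start)}(\S+?)#{Regexp.escape(stop)}/
  end

  # tags maps an attribute name to its [open, close] output strings,
  # playing the role of a formatter's tag table.
  def apply(text, tags)
    @word_pairs.reduce(text) do |result, (name, pattern)|
      open_tag, close_tag = tags.fetch(name)
      result.gsub(pattern) { "#{open_tag}#{$1}#{close_tag}" }
    end
  end
end

am = TinyAttributeManager.new
am.add_word_pair("{", "}", :STRIKE)
puts am.apply("a {gone} word", { STRIKE: ["<strike>", "</strike>"] })
```

Keeping the attribute name (`:STRIKE`) separate from the output strings is what lets different formatters render the same markup differently.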
For debugging, we allow access to our line contents as text.
  # File rdoc/markup.rb, line 361
  def content
    @lines.as_text
  end
We take a string, split it into lines, work out the type of each line, and from there deduce groups of lines (for example all lines in a paragraph). We then invoke the output formatter using a Visitor to display the result.
  # File rdoc/markup.rb, line 138
  def convert(str, op)
    lines = str.split(/\r?\n/).map { |line| Line.new line }
    @lines = Lines.new lines

    return "" if @lines.empty?

    @lines.normalize
    assign_types_to_lines
    group = group_lines

    # call the output formatter to handle the result
    #group.each { |line| p line }
    group.accept @am, op
  end
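The split, classify, group and visit steps described for convert can be sketched in a few lines of standalone Ruby. This is a simplified illustration, not the code in rdoc/markup.rb: `Block`, `line_type`, `group_lines`, `TinyHtmlFormatter` and the three line types are all hypothetical, and the sketch does no HTML escaping.

```ruby
# Hypothetical container for a run of same-typed lines.
Block = Struct.new(:type, :lines)

# Step 1: work out the type of a single line (very rough heuristics).
def line_type(line)
  return :blank    if line.strip.empty?
  return :list     if line =~ /\A\s*(?:[*-]|\d+\.)\s+\S/
  return :verbatim if line =~ /\A\s+\S/
  :paragraph
end

# Step 2: group consecutive lines of the same type; blank lines separate
# groups (Enumerable#chunk drops elements whose key is nil).
def group_lines(str)
  typed = str.split(/\r?\n/).map { |line| [line_type(line), line] }
  typed.chunk { |type, _line| type == :blank ? nil : type }
       .map { |type, pairs| Block.new(type, pairs.map { |_type, line| line.strip }) }
end

# Step 3: a visitor with one accept_* method per block type.
class TinyHtmlFormatter
  def accept_paragraph(block)
    "<p>#{block.lines.join(' ')}</p>"
  end

  def accept_verbatim(block)
    "<pre>#{block.lines.join("\n")}</pre>"
  end

  def accept_list(block)
    items = block.lines.map { |l| "<li>#{l.sub(/\A(?:[*-]|\d+\.)\s+/, '')}</li>" }
    "<ul>#{items.join}</ul>"
  end
end

# Hand each group to the formatter; empty input yields an empty string.
def convert(str, formatter)
  group_lines(str).map { |b| formatter.public_send("accept_#{b.type}", b) }.join("\n")
end

puts convert("Hello\nworld\n\n* one\n* two\n\n  code here", TinyHtmlFormatter.new)
```

Because the formatter is passed in as an argument, the same grouping logic can drive any object that responds to the `accept_*` methods, which mirrors the separation between parsing and output formatting described at the top of this page.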