How to parse HTML getting the text from H1 … H6?

Use #map to get an array of the heading texts first, then #join to build a string with the delimiter of your choice.



#!/usr/bin/env ruby

require 'nokogiri'

html = <<-STRING
<h1>foo</h1><h2>bar</h2>

bar


STRING

doc = Nokogiri::HTML::DocumentFragment.parse(html)


doc.css('h1, h2').map(&:text).join(" ") # => "foo bar"


More: