Parsing simple XML with Nokogiri

Solution 1

Replace this:

@links = doc.xpath('//links/item').map do |i|
  {'title' => i.xpath('//title'), 'url' => i.xpath('//url')}
end

with:

@links = doc.xpath('//links/item').map do |i|
  {'title' => i.xpath('title'), 'url' => i.xpath('url')}
end

Explanation:

//title 

and

//url

are absolute XPath expressions: they start at the document root regardless of the current node, so they select every title element and every url element in the XML document, respectively.

Contrast this with:

title

and

url

These are relative XPath expressions: they are evaluated against the current node and select only its title and url children, respectively.

Solution 2

The trouble here is that the XPath //title searches for titles from the root of the document, and so returns all title tags. The XPath title, by contrast, searches within the context of the given node, which is what you want. The same applies to url.

@links = doc.xpath('//links/item').map do |i|
  {'title' => i.xpath('title'), 'url' => i.xpath('url')}
end
Author: Vincent

Updated on May 06, 2022

Comments

  • Vincent
    Vincent about 2 years

    I have the following XML:

    <links>
    
      <item>
        <title>Title 1</title>
        <url>http://www.example.com/url-1</url>
      </item>
    
      <item>
       <title>Title 2</title>
       <url>http://www.example.com/url-2</url>
      </item>
    
      <item>
        <title>Title 3</title>
        <url>http://www.example.com/url-3</url>
      </item>
    
    </links>
    

    And, I would like to convert it to a HTML list:

    <ul>
      <li><a href="http://www.example.com/url-1">Title 1</a></li>
      <li><a href="http://www.example.com/url-2">Title 2</a></li>
      <li><a href="http://www.example.com/url-3">Title 3</a></li>
    </ul>
    

    Currently I have this:

    Controller:

    require 'nokogiri'
    doc = Nokogiri::XML(...)
    
    @links = doc.xpath('//links/item').map do |i|
      {'title' => i.xpath('//title'), 'url' => i.xpath('//url')}
    end
    

    Template:

    <ul>
      <% @links.each do |l| %>
        <li><a href="<%= l['url'] %>"><%= l['title'] %></a></li>
      <% end %>
    </ul> 
    

    Resulting HTML:

    <ul>
      <li><a href="http://www.example.com/url-1http://www.example.com/url-2http://www.example.com/url-3">Title 1Title 2Title 3</a></li>
      <li><a href="http://www.example.com/url-1http://www.example.com/url-2http://www.example.com/url-3">Title 1Title 2Title 3</a></li>
      <li><a href="http://www.example.com/url-1http://www.example.com/url-2http://www.example.com/url-3">Title 1Title 2Title 3</a></li>
    </ul>
    

    What am I doing wrong? Is there a more optimal way of doing this?

  • Matchu
    Matchu over 13 years
    Answer revoked, +1, since I assume you actually know what you're talking about. I don't know Xpath, and just guessed xD
  • Dimitre Novatchev
    Dimitre Novatchev over 13 years
    @Matchu: Yes, I do know XPath and happen to be ranked #1 by rep in this tag. :) But your answer was correct -- you needn't delete it. Undelete it and I'll upvote.
  • Matchu
    Matchu over 13 years
    Thanks :) I'm usually pretty picky about just letting one answer be up when there are dupes within seconds, though, since my OCD wins out over my insatiable hunger for rep. Thanks, though! You are a gentleman and a scholar :o
  • Dimitre Novatchev
    Dimitre Novatchev over 13 years
    @Matchu: Please... Undelete your answer. I want to upvote it.
  • Dimitre Novatchev
    Dimitre Novatchev over 13 years
    Wow... I saw your undelete only now. Of course, the fully deserved +1. And I admired your explanation of the RegEx that decides if a number representation is composite!!!