XPath and/or syntax: is there any shorter way to write this XPath?

Solution 1

If you have XPath 2.0 available, try the matches() function, or even tokenize() as suggested by Ranon in his great answer.

With XPath 1.0, one way to shorten the expression could be this:

/Products/Product[
    CategoryPath/ProductCategoryPath[
        contains(., 'Damen')
            and (  contains(., 'Halbschuhe')
                or contains(.,    'Sneaker')
                or contains(., 'Ballerinas') )] ]

A convenient one-liner for easier copy-paste:

/Products/Product[CategoryPath/ProductCategoryPath[contains(.,'Damen') and (contains(.,'Halbschuhe') or contains(.,'Sneaker') or contains(.,'Ballerinas'))]]
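To get a feel for how much room this buys against the length limit, here is a small Python sketch that compares the two expressions as plain strings. Python is only my choice for illustration; nothing in your setup needs it. Both strings are copied from this thread (whitespace normalised), and the variable names are made up.

```python
# The long expression from the question (extra spaces normalised).
original = (
    "/Products/Product["
    "contains(CategoryPath/ProductCategoryPath,'Halbschuhe') and "
    "contains(CategoryPath/ProductCategoryPath,'Damen') or "
    "contains(CategoryPath/ProductCategoryPath,'Sneaker') and "
    "contains(CategoryPath/ProductCategoryPath,'Damen') or "
    "contains(CategoryPath/ProductCategoryPath,'Ballerinas') and "
    "contains(CategoryPath/ProductCategoryPath,'Damen')]"
)

# The one-liner from this answer.
shortened = (
    "/Products/Product[CategoryPath/ProductCategoryPath["
    "contains(.,'Damen') and (contains(.,'Halbschuhe') or "
    "contains(.,'Sneaker') or contains(.,'Ballerinas'))]]"
)

# The repeated location path is where most of the length goes.
repeated = "CategoryPath/ProductCategoryPath"
print("original: ", len(original), "chars,", original.count(repeated), "x path")
print("shortened:", len(shortened), "chars,", shortened.count(repeated), "x path")
```

Most of the saving comes from writing the CategoryPath/ProductCategoryPath step once and referring to it as "." inside the predicate.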

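I tried to preserve your expression exactly as it was; the changes should not alter its behaviour, provided each Product has only one ProductCategoryPath, as in your sample. (With several, XPath 1.0's contains() on the original path looks only at the first one, while this version matches if any of them qualifies.)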
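The shortening itself is just factoring the repeated 'Damen' test out of the three and-clauses. In XPath, and binds tighter than or, so the original reads as (A and D) or (B and D) or (C and D). If you want to convince yourself that D and (A or B or C) is the same thing, a brute-force truth table does it. This is a minimal Python sketch, not XPath: it assumes only that the four contains() calls each yield true or false, and the function names are invented for the example. Python's and/or have the same relative precedence as XPath's.

```python
from itertools import product

def original(halbschuhe, sneaker, ballerinas, damen):
    # Mirrors the long expression; no parentheses, `and` binds tighter than `or`.
    return halbschuhe and damen or sneaker and damen or ballerinas and damen

def shortened(halbschuhe, sneaker, ballerinas, damen):
    # Mirrors the factored form: the shared 'Damen' test is written once.
    return damen and (halbschuhe or sneaker or ballerinas)

def equivalent():
    # Try all 16 combinations of the four contains() results.
    return all(
        original(*flags) == shortened(*flags)
        for flags in product([False, True], repeat=4)
    )

print(equivalent())
```

This only checks the boolean logic; it says nothing about how an engine evaluates the location paths.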

There are even shorter solutions, but they would have to make assumptions about the XML structure, and those could break in some hidden way we can't see without the full context, so we're not going that way.

Solution 2

If your XPath engine supports XPath 2.0, it can be done in an even more convenient (and probably more efficient) way:

//Product[
  CategoryPath/ProductCategoryPath[
    tokenize(., '\s') = ('Halbschuhe', 'Sneaker', 'Ballerinas') and contains(., 'Damen')
  ]
]

fn:tokenize($input, $pattern) splits a string on a regular expression (here '\s', a single whitespace character; you could also pass a plain space). The = operator is a general comparison with existential semantics: it returns true if any string in the sequence on the left equals any string in the sequence on the right.
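If you cannot run XPath 2.0 yourself, the behaviour can be approximated in a few lines of Python. This is only a rough emulation under my own assumptions: re.split stands in for fn:tokenize, a nested any() stands in for the general comparison, and the function name and the 'Sneakersocken' category are invented for the example.

```python
import re

def xpath2_like_match(category_path,
                      wanted=("Halbschuhe", "Sneaker", "Ballerinas")):
    # Stand-in for tokenize(., '\s'): split on single whitespace characters.
    tokens = re.split(r"\s", category_path)
    # Stand-in for the general comparison (=): true if ANY token equals ANY wanted value.
    any_match = any(token == value for token in tokens for value in wanted)
    # Stand-in for contains(., 'Damen'): a plain substring test.
    return any_match and "Damen" in category_path

# Category path from the sample file in the question.
print(xpath2_like_match("Damen > Sale > Schuhe > Sneaker > Sneaker Low"))
# Hypothetical paths to show the edge cases.
print(xpath2_like_match("Herren > Schuhe > Sneaker"))
print(xpath2_like_match("Damen > Schuhe > Sneakersocken"))
```

One thing this makes visible: the token comparison matches whole words only, so a category like the made-up 'Sneakersocken' would pass the contains() version but not the tokenize() version. Depending on your data that is either a bug or a feature.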

Author: Bram

Updated on June 23, 2022

Comments

  • Bram
    Bram over 1 year

    I'm filtering a big file that contains types of shoes for children, men as well as women.

    Now I want to filter out certain types of women's shoes. The following XPath works, but the program I'm using has an XPath length limitation, so I'm wondering if there is a shorter / more efficient way to construct this XPath:

    /Products/Product[contains(CategoryPath/ProductCategoryPath,'Halbschuhe') and contains(CategoryPath/ProductCategoryPath,'Damen') or  contains(CategoryPath/ProductCategoryPath,'Sneaker') and contains(CategoryPath/ProductCategoryPath,'Damen') or contains(CategoryPath/ProductCategoryPath,'Ballerinas') and contains(CategoryPath/ProductCategoryPath,'Damen')]
    

    Edit: Added requested file sample

    <Products>
        <!-- snip -->
        <Product ProgramID="4875" ArticleNumber="GO1-f05-0001-12">
            <CategoryPath>
                <ProductCategoryID>34857489</ProductCategoryID>
                <ProductCategoryPath>Damen &gt; Sale &gt; Schuhe &gt; Sneaker &gt; Sneaker Low</ProductCategoryPath>
                <AffilinetProductCategoryPath>Kleidung &amp; Accessoires?</AffilinetProductCategoryPath>
            </CategoryPath>
            <Price>
                <DisplayPrice>40.95 EUR</DisplayPrice>
                <Price>40.95</Price>
            </Price>
        </Product>
        <!-- snip -->
    </Products>
    
  • Bram
    Bram over 10 years
    Just tried that and it doesn't work, so either it doesn't support XPath 2.0 or there is another reason why it's not working. Any idea how to shorten it without using XPath 2.0?
  • Jens Erat
    Jens Erat over 10 years
    What error message do you get? If you tell us which engine you're using, we can tell you more about whether it supports 2.0. Otherwise you'll be stuck with @Slanec's answer. fn:tokenize(...) should be preferred if possible, though, as it's probably faster and more general.
  • Bram
    Bram over 10 years
    Super, this works and gives exactly the same number of records as my lengthy XPath. Thanks a ton!
  • Bram
    Bram over 10 years
    I get an XPath error: no matching records found. Going to use Slanec's method. Thanks a ton for the help.
  • Jens Erat
    Jens Erat over 10 years
    Which XPath engine/interpreter are you using?
  • Bram
    Bram over 10 years
    WP import all, an import program for feeds. Semi-related question: do you know a good, simple XML editor that can open XML files, filter them using XPath, and then save the filtered records as an XML file?
  • Jens Erat
    Jens Erat over 10 years
    Have a look at BaseX which actually is an XML database (for XQuery which has XPath built-in), but is free, very lightweight and easy to use with great visual feedback when using the GUI.
  • Jens Erat
    Jens Erat over 10 years
    "WP import all" will probably use PHP's DOMXPath which only supports XPath 1.0, so bad luck. :)