Making Xerces parse a string instead of a file


Solution 1

Create a MemBufInputSource and parse that:

xercesc::MemBufInputSource myxml_buf(
    reinterpret_cast<const XMLByte*>(myxml.c_str()), myxml.size(),
    "myxml (in memory)");
parser->parse(myxml_buf);

Solution 2

Use the following overload of XercesDOMParser::parse():

void XercesDOMParser::parse(const InputSource& source);

passing it a MemBufInputSource:

MemBufInputSource src((const XMLByte*)myxml.c_str(), myxml.length(), "dummy", false);
parser->parse(src);

Solution 3

I'm doing it another way. If this is incorrect, please tell me why, but it seems to work. This is what parse expects:

DOMDocument* DOMLSParser::parse(const DOMLSInput * source )

So you need to pass a DOMLSInput instead of an InputSource:

xercesc::DOMImplementation * impl = xercesc::DOMImplementation::getImplementation();
xercesc::DOMLSParser *parser = ((xercesc::DOMImplementationLS*)impl)->createLSParser(xercesc::DOMImplementationLS::MODE_SYNCHRONOUS, 0);
xercesc::DOMDocument *doc;

// Wrapper4InputSource adopts (and later deletes) the MemBufInputSource by default.
xercesc::Wrapper4InputSource source(new xercesc::MemBufInputSource((const XMLByte *) (myxml.c_str()), myxml.size(), "A name"));
doc = parser->parse(&source);

Author by

Andry


Updated on April 06, 2022

Comments

  • Andry
    Andry about 2 years

    I know how to create a complete DOM from an XML file just using XercesDOMParser:

    xercesc::XercesDOMParser *parser = new xercesc::XercesDOMParser();
    parser->parse(path_to_my_file);
    parser->getDocument(); // From here on I can access all nodes and do whatever I want
    

    Well, that works... but what if I want to parse a string? Something like:

    std::string myxml = "<root>...</root>";
    xercesc::XercesDOMParser *parser = new xercesc::XercesDOMParser();
    parser->parse(myxml);
    parser->getDocument(); // From here on I can access all nodes and do whatever I want
    

    I'm using version 3. Looking inside AbstractDOMParser, I see that the parse method and its overloads only parse files.

    How can I parse from a string?

  • Fred Foo
    Fred Foo over 13 years
    It's a "fake system id" that's used in error messages and "any entities which are referred to from this entity via relative paths/URLs will be relative to this fake system id". See API docs.
  • Andry
    Andry over 13 years
    larsmans, could you please tell me why my app hits a segmentation fault when I call Terminate(), even though your code works and the XML prints correctly?
  • Fred Foo
    Fred Foo over 13 years
    @Andry, I can't tell with just this info. Can you try copying the string with new char[] and setting the 4th (adoptBuffer) ctor argument to true? (see xerces.apache.org/xerces-c/apiDocs-3/…)
  • Andry
    Andry over 13 years
    Well I discovered it... see here... absurd ahaha xerces.apache.org/xerces-c/faq-parse-2.html#faq-7
  • Fred Foo
    Fred Foo over 13 years
    @Andry: I know, the Xerces rules for memory allocation are overly complicated. They seem never to have heard of RAII. Too bad.
  • Silver
    Silver about 11 years
    How can I figure out in what namespace MemBufInputSource and Wrapper4InputSource are in? I'm having serious trouble with namespaces in xerces. Ty
  • Jesse Chisholm
    Jesse Chisholm over 8 years
    @Andry - sadly, the absurd ahaha link has aged into oblivion. :( Appears to have become: xerces.apache.org/xerces-c/faq-parse-3.html#faq-7
  • Mike S
    Mike S over 8 years
    How did you solve the seg fault? I'm having the same problem.
  • villasv
    villasv over 8 years
    It's in the xercesc namespace, but you also need #include <xercesc/framework/MemBufInputSource.hpp>. I'm two years late, but I had the same issue and someone else may run into it later.
  • villasv
    villasv over 8 years
    Also +1 for also specifying the necessary cast.
  • user23573
    user23573 over 8 years
    Thanks for hinting this. This answer seems to be closer to the actual DOM Programming Guide
  • steiryx
    steiryx almost 4 years
    I tried to use this but it seems to fail. When I replaced the double quotes with single quotes and added \n\ at the end of each line, parsing seemed OK. I did that based on the MemParse.cpp sample in Xerces. Do you know what the problem is?
  • Silver
    Silver almost 4 years
    Hi, this is an answer from 7 years ago, so I have no clue what I was doing at the time. Maybe if you can elaborate I can think it through with you. Where/on what line did you change the quotes and add the \n?
  • chars
    chars about 2 years
    Don't know if this is why any of the above comments were seeing seg faults, but MemBufInputSource doesn't seem to work unless you first initialize the system. More detail in answer below.