getChildNodes giving unexpected result


Solution 1

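That's because the whitespace around the child elements is parsed as TEXT_NODE (#text) nodes: one before the first child element, one between each pair of child elements, and one after the last.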

The following marks where the text nodes sit among the children of the first <object>.

<object flag="complete" id="objId" version="1">
    <TEXT_NODE />
    <variable_value variable_id="varId">ValueGoesHere</variable_value>
    <TEXT_NODE />
    <reference item_ref="2"/>
    <TEXT_NODE />
</object>

This can be verified by modifying your code:

        // S is assumed to be a String holding the XML from the question
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        Document dom = dBuilder.parse(new ByteArrayInputStream(S.getBytes()));
        NodeList collected_objects = dom.getElementsByTagName("object");
        System.out.println("Number of collected objects are "
                + collected_objects.getLength());

        for (int i = 0; i < collected_objects.getLength(); i++) {

            Node aNode = collected_objects.item(i);
            // get children of "objects"
            NodeList refNodes = aNode.getChildNodes();

            System.out.println("# of chidren are " + refNodes.getLength());

            // print the type, name, and value of each child node
            for (int x = 0; x < refNodes.getLength(); x++) {
                Node n = refNodes.item(x);
                System.out.println(n.getNodeType() + " = " + n.getNodeName() + "/" + n.getNodeValue());
            }

            // print attributes of "objects"

            NamedNodeMap attributes = aNode.getAttributes();
            for (int a = 0; a < attributes.getLength(); a++) {
                Node theAttribute = attributes.item(a);
                System.out.println(theAttribute.getNodeName() + "="
                        + theAttribute.getNodeValue());

            }

        }

The output:

Number of collected objects are 2
# of chidren are 5
3 = #text/          
1 = variable_value/null
3 = #text/          
1 = reference/null
3 = #text/        
flag=complete
id=objId
version=1
# of chidren are 3
3 = #text/          
1 = reference/null
3 = #text/        
comment=objComment
flag=complete
id=objId
version=1

Here, 3 is Node.TEXT_NODE and 1 is Node.ELEMENT_NODE.
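The verification code above relies on a variable S that is not shown. As a rough self-contained sketch, the XML from the question can be inlined and the children of each <object> tallied by node type. The class name ChildNodeTally and the helper tally are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ChildNodeTally {

    // The XML from the question, inlined so the example is self-contained.
    static final String XML =
          "<collected_objects>\n"
        + "  <object flag=\"complete\" id=\"objId\" version=\"1\">\n"
        + "    <variable_value variable_id=\"varId\">ValueGoesHere</variable_value>\n"
        + "    <reference item_ref=\"2\"/>\n"
        + "  </object>\n"
        + "  <object comment=\"objComment\" flag=\"complete\" id=\"objId\" version=\"1\">\n"
        + "    <reference item_ref=\"1\"/>\n"
        + "  </object>\n"
        + "</collected_objects>";

    // Hypothetical helper: returns {elementCount, textCount} for the direct
    // children of the index-th <object> element.
    static int[] tally(String xml, int index) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document dom = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList children = dom.getElementsByTagName("object").item(index).getChildNodes();
        int elements = 0;
        int texts = 0;
        for (int i = 0; i < children.getLength(); i++) {
            short type = children.item(i).getNodeType();
            if (type == Node.ELEMENT_NODE) {
                elements++;
            } else if (type == Node.TEXT_NODE) {
                texts++;
            }
        }
        return new int[] { elements, texts };
    }

    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 2; i++) {
            int[] t = tally(XML, i);
            System.out.println("object " + i + ": elements=" + t[0] + ", text=" + t[1]);
        }
    }
}
```

With the default parser settings this should report 2 elements and 3 text nodes for the first object, and 1 element and 2 text nodes for the second, which adds up to the 5 and 3 seen in the question.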

Solution 2

Make sure there is no whitespace between the children of the <object> node. Whitespace between elements is parsed as text nodes, and getChildNodes() returns those along with the elements.

Otherwise, testing each child with

childNode.getNodeType() == Node.ELEMENT_NODE

should be enough to skip the text nodes.
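To illustrate the whitespace point, here is a small sketch that parses the same <object> twice, once written on a single line and once indented, and compares the child counts. The class name WhitespaceChildren and the helper childCount are invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class WhitespaceChildren {

    // No whitespace between the child elements.
    static final String COMPACT =
          "<object>"
        + "<variable_value variable_id=\"varId\">ValueGoesHere</variable_value>"
        + "<reference item_ref=\"2\"/>"
        + "</object>";

    // Same content, but each child element is on its own indented line.
    static final String INDENTED =
          "<object>\n"
        + "  <variable_value variable_id=\"varId\">ValueGoesHere</variable_value>\n"
        + "  <reference item_ref=\"2\"/>\n"
        + "</object>";

    // Hypothetical helper: number of direct child nodes of the root element.
    static int childCount(String xml) throws Exception {
        Document dom = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return dom.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("compact:  " + childCount(COMPACT));
        System.out.println("indented: " + childCount(INDENTED));
    }
}
```

The compact form should give 2 children and the indented form 5. Since hand-edited XML is usually indented, filtering by node type tends to be the more robust choice.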

Solution 3

You are expecting only ELEMENT nodes to be counted, but getChildNodes() returns every node type, including text nodes. If you are interested only in child elements, add the check below to the loop over the children.

// inside the loop over aNode.getChildNodes()
Node child = refNodes.item(x);
if (child.getNodeType() == Node.ELEMENT_NODE) {
    ...
}
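One way to package that check, sketched here with a made-up helper named elementChildren in a made-up class ElementChildren, is to collect the element children into a list once and then process only those.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ElementChildren {

    // First <object> from the question, including the indentation whitespace.
    static final String XML =
          "<object flag=\"complete\" id=\"objId\" version=\"1\">\n"
        + "  <variable_value variable_id=\"varId\">ValueGoesHere</variable_value>\n"
        + "  <reference item_ref=\"2\"/>\n"
        + "</object>";

    // Hypothetical helper: returns only the direct children that are elements,
    // skipping text nodes, comments, and any other node types.
    static List<Element> elementChildren(Node parent) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) child);
            }
        }
        return result;
    }

    // Parses an XML string with the default (non-validating) DOM parser.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Element object = parse(XML).getDocumentElement();
        List<Element> kids = elementChildren(object);
        System.out.println("# of child elements: " + kids.size());
        for (Element kid : kids) {
            System.out.println(kid.getTagName());
        }
    }
}
```

For the first <object> in the question this should list variable_value and reference, which is the count of 2 the asker expected.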
Author by

user837208

Updated on July 16, 2022

Comments

  • user837208 almost 2 years

    My XML looks like this:

    <collected_objects>
      <object flag="complete" id="objId" version="1">
        <variable_value variable_id="varId">ValueGoesHere</variable_value>
        <reference item_ref="2"/>
      </object>
      <object comment="objComment" flag="complete" id="objId" version="1">
        <reference item_ref="1"/>
      </object>
    </collected_objects>
    

    I am processing it using the code below:

    Document dom = parser.getDocument();
    NodeList collected_objects = dom.getElementsByTagName("object");
    System.out.println("Number of collected objects are " + collected_objects.getLength());

    for (int i = 0; i < collected_objects.getLength(); i++) {

        Node aNode = collected_objects.item(i);
        // get children of "object"
        NodeList refNodes = aNode.getChildNodes();

        System.out.println("# of chidren are " + refNodes.getLength());

        // print attributes of "object"
        NamedNodeMap attributes = aNode.getAttributes();
        for (int a = 0; a < attributes.getLength(); a++) {
            Node theAttribute = attributes.item(a);
            System.out.println(theAttribute.getNodeName() + "=" + theAttribute.getNodeValue());
        }

    }
    

    It outputs:

    Number of collected objects are 2
    # of chidren are 5
    flag=complete
    id=objId
    version=1
    # of chidren are 3
    comment=objComment
    flag=complete
    id=objId
    version=1
    

    My question is: why does "# of chidren are" print 5 and 3 respectively? Shouldn't I expect 2 and 1, since the first object has "variable_value" and "reference" and the second object has only "reference"?

    Essentially, my intent is to process the children of each "object".