How to get element's value from XML using SAX parser in startElement?

You can get the element's name in startElement and endElement, and its attributes in startElement. The element's text content is not available there; the parser delivers it separately, through the characters method.

Here is a very basic example of how to get the value of an element using a ContentHandler:

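import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class YourHandler extends DefaultHandler {

    // true while the parser is between <firstName> and </firstName>
    boolean inFirstNameElement = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals("firstName")) {
            inFirstNameElement = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("firstName")) {
            inFirstNameElement = false;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inFirstNameElement) {
            // do something with the characters in the <firstName> element,
            // e.g. new String(ch, start, length)
        }
    }
}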

For a simple document, setting a boolean flag for each tag is fine. For a more complex scenario, you might prefer to store the flags in a map keyed by element name, or to create one or more Employee classes mapped to your XML: instantiate one every time <employee> is found in startElement, populate its properties as the child elements are read, and add it to a Collection in endElement.
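The map idea above could be sketched roughly as follows. This is only an illustration, not part of the original answer: the class name MapHandler and the helper method read are made up for the example, and it assumes each <employee> contains only flat child elements with text in them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class MapHandler extends DefaultHandler {
    // One map per <employee>, keyed by child element name.
    final List<Map<String, String>> records = new ArrayList<>();
    private Map<String, String> current; // non-null only inside <employee>
    private String openElement;          // child element currently being read

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals("employee")) {
            current = new LinkedHashMap<>();
        } else if (current != null) {
            openElement = qName;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null && openElement != null) {
            // merge() concatenates if the parser delivers the text in several chunks
            current.merge(openElement, new String(ch, start, length), String::concat);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("employee")) {
            records.add(current);
            current = null;
        }
        openElement = null;
    }

    // Hypothetical convenience method for trying the handler on an XML string.
    static List<Map<String, String>> read(String xml) {
        MapHandler handler = new MapHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.records;
    }
}
```

With this layout no per-tag boolean is needed; adding a new child element to the XML only adds a new key to the map.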

Here is a complete ContentHandler example that works with your example file. I hope it helps you get started:

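import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SimpleHandler extends DefaultHandler {

    // Plain data holder for one <employee> element
    class Employee {
        public String firstName;
        public String lastName;
        public String location;
        public Map<String, String> attributes = new HashMap<>();
    }
    boolean isFirstName, isLastName, isLocation;
    Employee currentEmployee;
    List<Employee> employees = new ArrayList<>();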
    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes atts) throws SAXException {
        if(qName.equals("employee")) {
            currentEmployee = new Employee();
            for(int i = 0; i < atts.getLength(); i++) {
                currentEmployee.attributes.put(atts.getQName(i),atts.getValue(i));
            }
        }
        if(qName.equals("firstName")) { isFirstName = true; }
        if(qName.equals("lastName"))  { isLastName = true;  }
        if(qName.equals("location"))  { isLocation = true;  }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if(qName.equals("employee")) {
            employees.add(currentEmployee);
            currentEmployee = null;
        }
        if(qName.equals("firstName")) { isFirstName = false; }
        if(qName.equals("lastName"))  { isLastName = false;  }
        if(qName.equals("location"))  { isLocation = false;  }
    }

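    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // The parser may deliver one element's text in several chunks,
        // so append to the field instead of overwriting it.
        String chunk = new String(ch, start, length);
        if (isFirstName) {
            currentEmployee.firstName = currentEmployee.firstName == null
                    ? chunk : currentEmployee.firstName + chunk;
        }
        if (isLastName) {
            currentEmployee.lastName = currentEmployee.lastName == null
                    ? chunk : currentEmployee.lastName + chunk;
        }
        if (isLocation) {
            currentEmployee.location = currentEmployee.location == null
                    ? chunk : currentEmployee.location + chunk;
        }
    }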

    @Override
    public void endDocument() throws SAXException {
        for(Employee e: employees) {
            System.out.println("Employee ID: " + e.attributes.get("id"));
            System.out.println("  First Name: " + e.firstName);
            System.out.println("  Last Name: " + e.lastName);
            System.out.println("  Location: " + e.location);
        }
    }
}
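
To try a handler such as SimpleHandler, it has to be passed to a SAX parser. A minimal sketch of that step is below; the class name SaxRunner, the helper parse, and the inline XML string are assumptions made for the example, and the anonymous handler only counts <employee> elements so that the snippet compiles on its own.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxRunner {

    // Feeds an XML string to any SAX handler, e.g. new SimpleHandler().
    static void parse(String xml, DefaultHandler handler) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<employees>"
                + "<employee id=\"111\"><firstName>Rakesh</firstName></employee>"
                + "<employee id=\"112\"><firstName>John</firstName></employee>"
                + "</employees>";
        final int[] count = {0};
        // Stand-in handler; replace it with your own DefaultHandler subclass.
        parse(xml, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (qName.equals("employee")) {
                    count[0]++;
                }
            }
        });
        System.out.println("employees seen: " + count[0]);
    }
}
```

To read from a file instead of a string, SAXParser also has a parse overload that takes a java.io.File.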
Author by

sakura

Updated on July 29, 2022

Comments

  • sakura
    sakura almost 2 years

    Is it possible to get the content of an element in an XML file from within the startElement method that a SAX handler overrides?

    Below is the specification.

    1) XML file

    <employees>
       <employee id="111">
          <firstName>Rakesh</firstName>
          <lastName>Mishra</lastName>
          <location>Bangalore</location>
       </employee>
       <employee id="112">
          <firstName>John</firstName>
          <lastName>Davis</lastName>
          <location>Chennai</location>
       </employee>
       <employee id="113">
          <firstName>Rajesh</firstName>
          <lastName>Sharma</lastName>
          <location>Pune</location>
       </employee>
    </employees>
    

    2) startElement function

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        .......code in here..........
    }
    

    3) Expected result

    element name   : employee
    attribute name : id
    attribute value: 111
    firstName      : Rakesh
    lastName       : Mishra
    location       : Bangalore
    
    element name   : employee
    attribute name : id
    attribute value: 112
    firstName      : John
    lastName       : Davis
    location       : Chennai
    
    element name   : employee
    attribute name : id
    attribute value: 113
    firstName      : Rajesh
    lastName       : Sharma
    location       : Pune
    
  • sakura
    sakura almost 10 years
    Excuse me! What about this case? Suppose we don't know the tag names in advance, but we want to get the tag name, attribute names, attribute values, and tag value all at once. Is that possible?
  • helderdarocha
    helderdarocha almost 10 years
    As shown above, in the startElement method you can read the tag name (qName) and all attributes from the atts variable (atts.getQName(i) and atts.getValue(i)). To read the tag's text value, however, you need the characters method together with flags, as shown above. If you run the example above you should get the result you are expecting.
  • sakura
    sakura almost 10 years
    What about a different XML file? Do we need to write different code? I notice that your code tests for specific tag names in its conditions. If we don't know the tag names in advance, what should we do?
  • helderdarocha
    helderdarocha almost 10 years
    You can simply print the tag name if you wish, and instead of setting flags based on the tag name, do that based on their relative positions (create a Map and store the names and contexts as you read them).
  • helderdarocha
    helderdarocha almost 10 years
    SAX is intended for sequential reading of an XML file, which is necessary when you need to extract bits of information from large files. If you want to obtain all the data at once by simply using methods to extract the data you wish, you might prefer to use an object model API, such as DOM or XPath.
  • Ludovic Kuty
    Ludovic Kuty over 5 years
    Please note that "SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks". Thus you need to accumulate data.
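
Putting the last few comments together, a handler that knows no tag names in advance and accumulates text across chunks might look like the sketch below. It is an illustration under assumptions rather than anyone's posted solution: the class name DumpHandler, the helper dump, and the output format are invented for the example, and it assumes text appears only in leaf elements (no mixed content).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class DumpHandler extends DefaultHandler {
    final List<String> lines = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // discard whitespace seen before this element
        lines.add("element: " + qName);
        for (int i = 0; i < atts.getLength(); i++) {
            lines.add("  attribute: " + atts.getQName(i) + "=" + atts.getValue(i));
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element, so accumulate.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            lines.add("  text: " + value);
        }
        text.setLength(0);
    }

    // Hypothetical helper: parses an XML string and returns the recorded lines.
    static List<String> dump(String xml) {
        DumpHandler handler = new DumpHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.lines;
    }
}
```

The text of an element is only complete once endElement is called, which is why the value is recorded there and not in startElement or characters.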