XPath select all text content for a <div> except for a specific tag <h5>
Try the following XPath expression:
//div[@class='detalhescolunadados_blocos'][1]//text()[not(ancestor::h5)]
This will return:
$ xmllint --html --shell so.html
/ > xpath //div[@class='detalhescolunadados_blocos'][1]//text()[not(ancestor::h5)]
Object is a Node Set :
Set contains 2 nodes:
1 TEXT
content=
2 TEXT
content= Sala de estar/jantar,2 vagas de gar...
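The predicate `not(ancestor::h5)` works because the heading text is a text node nested inside `<h5>`, so excluding text nodes that have an `h5` ancestor leaves only the text sitting directly in the div. As a rough illustration of that filtering logic, here is a minimal sketch that uses Python's standard `html.parser` instead of an XPath engine. The names `FirstBlockText`, `first_block_text`, and `VOID_TAGS` are hypothetical helpers invented for this example, and the sketch simplifies things: it takes the first matching div in document order and drops whitespace-only text, which the XPath expression does not do.

```python
from html.parser import HTMLParser

# Sample markup copied from the question.
HTML = """
<div class="detalhes_colunadados">
  <div class="detalhescolunadados_blocos">
    <h5>Descrição completa</h5>
    Sala de estar/jantar,2 vagas de garagem cobertas.<br>
  </div>
  <div class="detalhescolunadados_blocos">
    <h5>Valores</h5>
    Venda: R$ 600.000,00<br>
    Condomínio: R$ 660,00<br>
  </div>
</div>
"""

# Tags without a closing tag; they must not change the nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class FirstBlockText(HTMLParser):
    """Collect text inside the first matching div, skipping text under <h5>."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0      # nesting depth inside the target div (0 = outside)
        self.h5_depth = 0   # greater than 0 while inside an <h5>
        self.done = False   # True once the first matching div has closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
            if tag == "h5":
                self.h5_depth += 1
        elif tag == "div" and dict(attrs).get("class") == self.css_class:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.done or not self.depth or tag in VOID_TAGS:
            return
        if tag == "h5" and self.h5_depth:
            self.h5_depth -= 1
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        # Rough equivalent of text()[not(ancestor::h5)] within the target div.
        if self.depth and not self.done and not self.h5_depth and data.strip():
            self.chunks.append(data.strip())


def first_block_text(html, css_class="detalhescolunadados_blocos"):
    parser = FirstBlockText(css_class)
    parser.feed(html)
    parser.close()
    return parser.chunks


print(first_block_text(HTML))
```

If a real XPath engine is available, evaluating the expression from the answer directly is simpler; the sketch is only meant to show which text nodes the predicate keeps and which it discards.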
Author: bslima
Updated on July 24, 2022
Comments
-
bslima almost 2 years
I searched for and tried several solutions to this problem, but none of them worked. I have this HTML:
<div class="detalhes_colunadados">
  <div class="detalhescolunadados_blocos">
    <h5>Descrição completa</h5>
    Sala de estar/jantar,2 vagas de garagem cobertas.<br>
  </div>
  <div class="detalhescolunadados_blocos">
    <h5>Valores</h5>
    Venda: R$ 600.000,00<br>
    Condomínio: R$ 660,00<br>
  </div>
</div>
I want to use XPath to extract only the text content of the first div with class="detalhescolunadados_blocos", excluding the text inside h5 tags.
I tried: //div[@class='detalhescolunadados_blocos']/[1]/*[not(self::h5)]
-
Gilles Quenot over 11 years
Why not use xmllint --html --xpath '//foo' file.html? =)
-
nwellnhof over 11 years
Thanks for pointing me to the --xpath option. It's actually undocumented.
-
bslima over 11 years
Thanks a lot. I was forgetting that the text is a child of h5; I even tried //text()[not(self::h5)].