Extract GET URIs or their responses from Wireshark capture to separate file(s)

Solution 1

While this may be doable with Wireshark, it is orders of magnitude easier with Bro.

Extracting URIs

Run Bro on your trace file (a pcap capture, such as one saved from Wireshark):

bro -r <trace>

This invocation generates a bunch of log files in the current directory. The one you are interested in is http.log. You can filter the output to obtain only the GET requests:

bro-cut id.orig_h id.resp_h method host uri < http.log | awk '$3 == "GET"'

Example output:

192.168.1.104   212.96.161.238  GET update.avg.com  /softw/90/update/avg9infowin.ctf
192.168.1.104   77.67.44.206    GET backup.avg.cz   /softw/90/update/u7avi1777u1705ff.bin
192.168.1.104   198.189.255.75  GET aa.avg.com  /softw/90/update/u7iavi2511u2510ff.bin
192.168.1.104   77.67.44.206    GET backup.avg.cz   /softw/90/update/x8xplsb2_118c8.bin

As you can see, the last two columns make up the full URL. To remove the space in-between, you could use awk to concatenate the last two fields.
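The concatenation mentioned above could look roughly like the following sketch. The function name get_urls and the output file name get_urls.txt are made up for illustration, and the "http://" prefix assumes the capture contains plain HTTP traffic. The demonstration rows reuse the layout of the example output, with one method changed to POST to show that the filter drops it.

```shell
#!/bin/sh
# Print one full URL per GET request from bro-cut output on stdin.
# Expected columns (tab-separated): id.orig_h id.resp_h method host uri
# Assumption: plain HTTP traffic, so "http://" is prepended to host + uri.
get_urls() {
    awk -F '\t' '$3 == "GET" { print "http://" $4 $5 }'
}

# Intended use against a real log (requires Bro, so left commented out):
#   bro-cut id.orig_h id.resp_h method host uri < http.log | get_urls > get_urls.txt

# Self-contained demonstration with two rows in the same column layout.
# The second row uses POST (changed for illustration) and is filtered out.
printf '%s\t%s\t%s\t%s\t%s\n' \
    192.168.1.104 212.96.161.238 GET update.avg.com /softw/90/update/avg9infowin.ctf \
    192.168.1.104 77.67.44.206 POST backup.avg.cz /softw/90/update/x8xplsb2_118c8.bin \
    | get_urls
```

Splitting on tabs rather than on any whitespace keeps the column positions stable even if a field happens to be empty or contains a space.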

Extracting Files

Note: the upcoming Bro 2.1 release will have major improvements for file extraction. Until then, you can extract all files from an HTTP stream by specifying the MIME type of the files to store:

bro -r <trace> 'HTTP::extract_file_type = /video\/avi/'

Bro sniffs the MIME type of an HTTP body and, if it matches the regular expression /video\/avi/, creates a file with the prefix http-item. You can change the prefix by redefining the HTTP::extraction_prefix variable.

Solution 2

Solution for question 1:

Use the tshark utility. It is easy to install with "sudo apt-get install tshark".

The command I use for this is:

tshark -R 'tcp.port==80 && (http.request.method == "GET" || http.request.method == "HEAD" || http.request.method == "POST")' -r eth2uplink_00001 -T fields -e ip.dst -e http.request.method -e http.request.full_uri > requests_eth2_00001
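Since question 1 asks only for the GET URLs in a text file, the field output of the command above can be reduced further. The following is a minimal sketch: the function name only_get_uris and the file name get_uris.txt are made up for illustration, and the demonstration rows are illustrative rather than real capture output. Note that newer tshark releases expect -Y for single-pass display filters where the command above uses -R, so the commented invocation uses -Y on that assumption.

```shell
#!/bin/sh
# Keep only the URI column for GET requests from tshark field output on stdin.
# Expected columns (tab-separated): ip.dst http.request.method http.request.full_uri
only_get_uris() {
    awk -F '\t' '$2 == "GET" && $3 != "" { print $3 }'
}

# Intended use (requires tshark and the capture file, so left commented out;
# -Y is assumed here for newer tshark versions, older ones take -R):
#   tshark -r eth2uplink_00001 -Y 'tcp.port==80 && http.request' \
#       -T fields -e ip.dst -e http.request.method -e http.request.full_uri \
#       | only_get_uris > get_uris.txt

# Self-contained demonstration with two illustrative rows; the HEAD row is dropped.
printf '%s\t%s\t%s\n' \
    212.96.161.238 GET http://update.avg.com/softw/90/update/avg9infowin.ctf \
    77.67.44.206 HEAD http://backup.avg.cz/softw/90/update/x8xplsb2_118c8.bin \
    | only_get_uris
```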

All Wireshark display filters are listed here: https://www.wireshark.org/docs/dfref/

In my experience this is a better approach than using Bro, because Bro is complicated to install: it has specific dependencies that can be hard to satisfy.

I currently don't have a solution for question 2, but I believe one can be constructed along similar lines. The following options may be useful for what you are trying to do:

-O Only show packet details of these protocols (comma-separated)

-x Add output of hex and ASCII dump (Packet Bytes)

See tshark --help for more information.

Hope this helps. Thanks.

Author by

TheLostOne

Updated on June 05, 2022

Comments

  • TheLostOne
    TheLostOne almost 2 years

    Issue

    I use Wireshark to capture an HTTP video stream and I've used the following filter to filter out the relevant GET requests.

    http.request.uri contains "identifier" && http.request.method == "GET" && ip.addr == xxx.xxx.xxx.xxx
    

    Questions

    1. Is it possible to extract all GET URLs to a separate .txt file?

    2. Or is it possible to extract the raw response packets (without the header) which match the filter above to separate files, so that I eventually have a bunch of individual video files?

    I hope I made myself clear enough ;-)

    Thank you

  • gertvdijk
    gertvdijk over 8 years
    I'm new to bro. How do I create a trace file? A quick look at the documentation doesn't provide an answer to me. The question is in the scope of a Wireshark capture (PCAP) so it would be useful to include how to create a bro trace file from a PCAP.
  • mavam
    mavam over 8 years
    Wireshark also uses libpcap to get packets, either from a trace or from a live interface. For reproducibility, one typically creates a trace file as opposed to sniffing from an interface. That's independent of Bro/Wireshark. On UNIX systems, just use tcpdump. On Windows, you can also use Wireshark to save packets from an interface to a file; then you have a trace. By the way, Bro can also read traffic from a live interface when invoked as bro -i <iface>.
  • gertvdijk
    gertvdijk over 8 years
    Ah, so you mean a trace file is just a pcap file? That's confusing if you don't know that. ;-)
  • mavam
    mavam over 8 years
    You got it. The community often uses the words interchangeably, with PCAP being the de-facto standard for network traffic traces.
  • 16851556
    16851556 about 3 years
    @mavam the app is now apparently named "zeek", not "bro". Can you please mention here and in the answer how to get the trace file you are writing about?