Grok Issue with Multiple IP's in NginX Logstash

Not sure if you're still having this issue, but if so, here's what will work for you.

Given this log format:

log_format custom '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" "$host" "$http_x_forwarded_for"';

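the grok pattern you've specified doesn't account for the trailing "$host" "$http_x_forwarded_for" portion.

Your grok isn't tagged as a failure because grok patterns aren't anchored by default: yours matches everything up to the user agent and silently ignores the rest of the line.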

In any event, this pattern will work with the log format above:

%{IP:clientip} %{NOTSPACE:ident} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:"(?:%{URI:referrer}|-)"|%{QS:referrer})(?:;|) %{QS:agent} "%{NOTSPACE:host}" "(?<x_forwarded_for>%{IP:xff_clientip}, .*)"
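
For context, here is a minimal sketch of how that pattern might sit in a Logstash filter block. It assumes the raw access-log line is in the default "message" field. The single-quoted string is there so the double quotes inside the pattern need no escaping. One thing to check against your own pipeline: as far as I know, if the event already carries a "host" field from the input, grok adds the captured value to it rather than replacing it, unless you list that field under the grok "overwrite" option.

```
filter {
  grok {
    # Assumes the raw nginx access-log line is in the "message" field.
    match => {
      "message" => '%{IP:clientip} %{NOTSPACE:ident} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:"(?:%{URI:referrer}|-)"|%{QS:referrer})(?:;|) %{QS:agent} "%{NOTSPACE:host}" "(?<x_forwarded_for>%{IP:xff_clientip}, .*)"'
    }
  }
}
```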

Applied to your sample line, it results in the following fields:

httpversion      1.1
request          /api/filter/14928/content?api_key=apikey&site=website
timestamp        28/Sep/2015:12:39:56 +1000
auth             -
host             my.website.com
agent            "-"
x_forwarded_for  1.144.97.102, 1.144.97.102, 1.144.97.102, 127.0.0.1, 172.31.26.59
clientip         172.31.7.219
bytes            101
response         403
xff_clientip     1.144.97.102
ident            -
port
verb             GET
referrer

Note that you've got a couple of new fields that you wouldn't have had before.

The first ("x_forwarded_for" => 1.144.97.102, 1.144.97.102, 1.144.97.102, 127.0.0.1, 172.31.26.59) is the contents of the last set of quotes, i.e. $http_x_forwarded_for from the log format.
The second ("xff_clientip" => 1.144.97.102) is just the first IP in that list, which should be the actual source IP of the request.
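
If you'd rather have downstream filters and dashboards keep using clientip, one option is to overwrite it with xff_clientip once grok has run. This is only a sketch: the "proxy_ip" field name is hypothetical, and is there in case you still want the address nginx itself saw.

```
filter {
  # Only touch events where grok actually extracted a forwarded address.
  if [xff_clientip] {
    mutate {
      # "proxy_ip" is a hypothetical field name; it keeps the address nginx saw.
      add_field => { "proxy_ip" => "%{clientip}" }
    }
    mutate {
      # Overwrite clientip with the first address from X-Forwarded-For.
      replace => { "clientip" => "%{xff_clientip}" }
    }
  }
}
```

The two mutate blocks are kept separate so that the original clientip is saved before it is replaced.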

If it were me, I'd also run the x_forwarded_for field through a mutate filter to break it into an array:

mutate {
  split  => { "x_forwarded_for" => ", " }
}

Author: geniestacks

Updated on September 18, 2022

Comments

  • geniestacks
    geniestacks over 1 year

    i've got an issue with logging from my webservers, which have an elb and then a varnish layer in front of the nginx layer.

    varnish is setup properly for X-Forwarded-For and logs come through normally with the correct 'client.ip' being logged.

    however, nginx logs are coming through with a whole list of IPs in the request. the default grok behaviour seems to set the client IP to the last in the list, i.e. the elb and varnish servers, which messes up my client.ip field for nginx logs. the correct client IP should be the first (or at least one of the first few) in the list.

    here's an example:

    172.31.7.219 - - [28/Sep/2015:12:39:56 +1000] "GET /api/filter/14928/content?api_key=apikey&site=website HTTP/1.1" 403 101 "-" "-" "my.website.com" "1.144.97.102, 1.144.97.102, 1.144.97.102, 127.0.0.1, 172.31.26.59"

    problem is i haven't been able to tweak the grok to handle such a line. the heroku grok debugger doesn't seem to work for this log line and my grok, but they are working in logstash, i.e. not tagging a grok failure.

    i've attempted to debug the specific parts but i haven't found a way to do what i need with IP/IPORHOST where there is a comma-separated list of IP addresses. i need to be able to specify which IP it should use, i.e. the first in the list should be the client.ip, not the last.

    my nginx grok is:

    NGINXACCESS %{IP:clientip} %{NGUSER:ident} %{NGUSER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:"(?:%{URI:referrer}|-)"|%{QS:referrer})(?:;|) %{QS:agent}

    any ideas on grok to cover that log?

    • GregL
      GregL over 8 years
      Can you post the access_log and log_format directives from your Nginx configuration? Also, is the grok failing, and if so, where?
    • geniestacks
      geniestacks over 8 years
      this is the custom log format for all nginx logs: ``` log_format custom '$remote_addr - $remote_user [$time_local] ' '"$request" $status $body_bytes_sent ' '"$http_referer" "$http_user_agent" ' '"$host" ' '"$http_x_forwarded_for" '; ```
  • Anton Roslov
    Anton Roslov over 7 years
    You can also try something like "(?<x_forwarded_for>%{IP:clientip}(?:, .*)?)" when the list consists of a single IP
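
Building on the comment above, here is a sketch of the full pattern with the list tail made optional, so that a line with a single forwarded address also matches. It keeps the xff_clientip field name from the answer rather than reusing clientip. The extra "-" alternative is an assumption on my part: nginx writes "-" when the header is absent, and without it such lines would not match.

```
filter {
  grok {
    # Assumes the raw nginx access-log line is in the "message" field.
    match => {
      "message" => '%{IP:clientip} %{NOTSPACE:ident} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:"(?:%{URI:referrer}|-)"|%{QS:referrer})(?:;|) %{QS:agent} "%{NOTSPACE:host}" "(?:-|(?<x_forwarded_for>%{IP:xff_clientip}(?:, .*)?))"'
    }
  }
}
```

When the "-" branch matches, neither x_forwarded_for nor xff_clientip is set, so any later filter that uses them should be guarded with a conditional.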