When logging, when is an error fatal?

logging error-handling log4net log4j


Solution 1

I consider fatal errors to be when your application can't do any more useful work. Non-fatal errors are when there's a problem but your application can still continue to function, even at a reduced level of functionality or performance.

Examples of fatal errors include:

Running out of disk space on the logging device when you're required to keep logging.
Total loss of network connectivity in a client application.
Missing configuration information if no default can be used.

Non-fatal errors would include:

A server where a single session fails for some reason but you can still service other clients.
An intermittent error, such as a lost session, if a new session can be established.
Missing configuration information if a default value can be used.
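
The configuration case from both lists can be sketched in a few lines. This is only an illustration in Python, whose standard logging module has a CRITICAL level that plays roughly the role of Fatal in log4j and log4net. The names get_setting and MissingConfigError are made up for the example.

```python
import logging

logger = logging.getLogger("app.config")

# Hypothetical exception type, introduced only for this sketch.
class MissingConfigError(Exception):
    """Raised when a required setting has no value and no default."""

# Sentinel so that None can still be passed as a legitimate default.
_NO_DEFAULT = object()

def get_setting(config, key, default=_NO_DEFAULT):
    """Return config[key], fall back to a default, or fail fatally."""
    if key in config:
        return config[key]
    if default is not _NO_DEFAULT:
        # Non-fatal: the application keeps working with a fallback value.
        logger.error("Setting %r is missing, using default %r", key, default)
        return default
    # Fatal: no useful work is possible without this value.
    logger.critical("Required setting %r is missing and has no default", key)
    raise MissingConfigError(key)
```

The same missing key is logged at two different levels depending only on whether the application can carry on without it.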

Solution 2

An error is fatal if something is missing, or a situation occurs, that means the application simply cannot continue. Possible examples are a missing required config file, or an exception that 'bubbles up' and is caught by an unhandled-exception handler.
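
For the 'bubbles up' case, a last-resort handler might look something like the sketch below. It uses Python's sys.excepthook and the standard logging module (CRITICAL is the closest equivalent of Fatal there), not log4j or log4net. The names log_unhandled and install_fatal_hook are hypothetical.

```python
import logging
import sys

logger = logging.getLogger("app")

def log_unhandled(exc_type, exc_value, exc_traceback):
    """Record the exception that is about to end the process."""
    if issubclass(exc_type, KeyboardInterrupt):
        # Ctrl-C is a user request, not a fatal error: keep default behaviour.
        sys.__excepthook__(exc_type, exc_value, exc_traceback)
        return
    logger.critical(
        "Unhandled exception, shutting down",
        exc_info=(exc_type, exc_value, exc_traceback),
    )

def install_fatal_hook():
    """Call once at start-up so uncaught exceptions are logged as fatal."""
    sys.excepthook = log_unhandled
```

Anything that reaches this hook was, by definition, not handled anywhere else, which is what makes it a reasonable candidate for the fatal level.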

Solution 3

I would use fatal if my next step is for the application to terminate, or at least to stop doing any further work. If the application is part of a batch or there are multiple processes running, this can be useful for tracing what happened.

If there is a chance of recovery (e.g., loss of network connection with retries for a while), I would not use fatal.
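
A minimal sketch of that retry idea, in Python with the standard logging module (CRITICAL standing in for Fatal). The function connect_with_retries, its parameters, and the choice of ConnectionError are assumptions made for the example.

```python
import logging
import time

logger = logging.getLogger("app.net")

def connect_with_retries(connect, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call connect() up to `attempts` times; go fatal only if all of them fail."""
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError as exc:
            if attempt == attempts:
                # Recovery window exhausted: the application cannot continue.
                logger.critical("Giving up after %d attempts: %s", attempts, exc)
                raise
            # Still recoverable, so this is deliberately not logged as fatal.
            logger.warning("Attempt %d of %d failed: %s", attempt, attempts, exc)
            sleep(delay_seconds)
```

Only the final failure is logged at the fatal level. Earlier failures are warnings because a later attempt may still succeed.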

If I have multiple service threads activated by a main thread and one of them fails because of some bad input but the application can still serve new requests, I do not consider it fatal.
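
The service-thread case could be sketched as follows, again in Python with the standard logging module. The names serve and handle_safely are hypothetical, and treating ValueError as "bad input" is an assumption of the example.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

logger = logging.getLogger("app.server")

def handle_safely(handle, request):
    """Run one request; bad input fails only that request."""
    try:
        return handle(request)
    except ValueError as exc:
        # Other requests can still be served, so this is an error, not fatal.
        logger.error("Request %r failed: %s", request, exc)
        return None

def serve(requests, handle, workers=4):
    """Process requests on worker threads, returning results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: handle_safely(handle, r), requests))
```

For example, serve(["1", "x", "3"], int) logs one error for "x" and still returns results for the other two requests.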

Solution 4

To make this answer short and sweet, if your application crashes, I would consider that fatal. If you cannot connect to an important resource such as a database or a required service, that would be fatal. Overall, I would say that if it keeps your application from running correctly and affects the user, I would classify it as a fatal error.

But the most important way to classify errors is to consistently follow a rule of thumb such as rule 69 in C++ Coding Standards:

"Develop a practical, consistent, and rational error handling policy early in design, and then stick to it."



Author by

Jason Whitehorn


Updated on June 05, 2022

Comments

Jason Whitehorn almost 2 years

In logging frameworks like log4j & log4net you have the ability to log various levels of information. Most of the levels have obvious intentions (such as what a "Debug" log is vs. an "Error"). However, one thing that I have always been timid about is classifying my logging as "Fatal".

What types of errors are so severe that they should be classified as fatal? While this is somewhat case-driven, what are some of the rules of thumb that you use when deciding between logging an exception as fatal or simply as an error?
Mitch Wheat over 15 years

Loss of network connectivity might not be fatal. It might be temporary.
Mitch Wheat over 15 years

If the application has crashed, it's a bit late to log the fact!
paxdiablo over 15 years

I meant total loss as in "not recoverable" or "took too long to recover", which is why the non-fatal errors include the intermittent version.
kevindaub over 15 years

However, you would be able to log some information about the crash.
vikingsteve over 10 years

Nice answer, but one thing - how can an application easily determine if a loss of network connectivity is temporary or fatal?
paxdiablo over 10 years

@vikingsteve, it can try periodically for x seconds (for some sensible value of x). If it recovers, it was temporary. If not, it's "permanent". That's one possibility. Or, if seven threads can communicate fine and it's just the eighth one having troubles, that too could be a non-total loss.
TestyTest about 8 years

Use Fatal in the upper ring of the exception tree. Most of the time this is the try-catch block in the Main method of a console app, Application_Error in an ASP.NET/MVC app, or an ErrorHandler behavior in a WCF service. Use Error when you, as the programmer, have full control and want to log a specific faulty situation for which you have a recoverable fallback path (for example, a default value).
paxdiablo about 8 years

@Patrick, that may be good advice, I won't comment on that since it seems to me not so much a hard-and-fast rule - there may well be situations where it makes more sense to handle fatal issues locally rather than just delivering them blindly up through the hierarchy (for example, more information may be available as to what the error was at the lower levels). However, I believe it's tangential to the question which is what sorts of things should be considered fatal.