Derivative of sigmoid


Solution 1

The two ways of doing it are equivalent (since mathematical functions have no side effects and always return the same output for a given input), so you might as well do it the (faster) second way.

Solution 2

Dougal is correct. Just do

f = 1/(1+exp(-x))
df = f * (1 - f)
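As a minimal Python sketch of the two lines above (the function names `sigmoid` and `sigmoid_derivative_from_output` are illustrative, not from the original post), the point is that `exp` is evaluated once and the derivative reuses the stored result:

```python
import math

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative_from_output(f):
    """Derivative of the sigmoid, given its already computed output f."""
    return f * (1.0 - f)

x = 0.0
f = sigmoid(x)                             # the only call to exp
df = sigmoid_derivative_from_output(f)     # reuses f, no second exp
print(f, df)
```

Because the sigmoid is a pure function, reusing `f` gives the same value as recomputing it inside the derivative.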

Solution 3

A little algebra gives a closed form in x alone, so that df does not have to call f:
df = exp(-x)/(1+exp(-x))^2

Derivation:

df = 1/(1+e^-x) * (1 - (1/(1+e^-x)))
df = 1/(1+e^-x) * (1+e^-x - 1)/(1+e^-x)
df = 1/(1+e^-x) * (e^-x)/(1+e^-x)
df = (e^-x)/(1+e^-x)^2
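The derivation can be checked numerically. The sketch below (helper names are hypothetical) compares the closed form with f * (1 - f) at a few sample points; note that Python writes the power as `**` rather than `^`:

```python
import math

def df_closed_form(x):
    """Closed form: e^(-x) / (1 + e^(-x))^2."""
    e = math.exp(-x)
    return e / (1.0 + e) ** 2

def df_from_output(x):
    """Same derivative, written as f * (1 - f)."""
    f = 1.0 / (1.0 + math.exp(-x))
    return f * (1.0 - f)

# The two expressions should agree up to floating-point rounding.
for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
    assert math.isclose(df_closed_form(x), df_from_output(x), rel_tol=1e-12)
```

Both forms cost one `exp` per input, so the closed form is mainly useful when the output of f is not already available.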

Solution 4

Pass the output of your sigmoid function to your SigmoidDerivative function, where it serves as f(x) in the following:

dy/dx = f(x)' = f(x) * (1 - f(x))
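A rough sketch of how this might look for a single output neuron follows. The weights, inputs, target, and the squared-error delta rule are all assumptions chosen for illustration; the original post does not specify them.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(output):
    """Takes the neuron's output f(x), not the raw input x."""
    return output * (1.0 - output)

# Hypothetical neuron with two inputs (values are made up).
weights = [0.4, -0.2]
inputs = [1.0, 2.0]
target = 1.0

# Forward pass: the output is computed once and kept.
net = sum(w * i for w, i in zip(weights, inputs))
output = sigmoid(net)

# Backward pass: reuse the stored output (assumes a squared-error loss).
delta = (target - output) * sigmoid_derivative(output)
print(output, delta)
```

The derivative function never sees `net`, so the sigmoid is not evaluated a second time during the backward pass.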
Author by

rflood89


Updated on July 31, 2022

Comments

  • rflood89
    rflood89 almost 2 years

    I'm creating a neural network using the backpropagation technique for learning.

    I understand we need to find the derivative of the activation function used. I'm using the standard sigmoid function

    f(x) = 1 / (1 + e^(-x))
    

    and I've seen that its derivative is

    dy/dx = f(x)' = f(x) * (1 - f(x))
    

    This may be a daft question, but does this mean that we have to pass x through the sigmoid function twice when evaluating the equation, so that it expands to

    dy/dx = f(x)' = 1 / (1 + e^(-x)) * (1 - (1 / (1 + e^(-x))))
    

    or is it simply a matter of taking the already calculated output of f(x), which is the output of the neuron, and substituting that value for f(x)?