Determining Bias for Neural Network Perceptrons?


Solution 1

The short answer is, it depends...

  1. In most cases (I believe) you can treat the bias just like any other weight (so it might get initialised to some small random value), and it will get updated as you train your network. The idea is that all the biases and weights will converge on some useful set of values.

  2. However, you can also set the weights manually (with no training) to get some special behaviours: for example, you can use the bias to make a perceptron behave like a logic gate (assume binary inputs X1 and X2 are either 0 or 1, and the activation function is scaled to give an output of 0 or 1).

OR gate: W1=1, W2=1, Bias=0

AND gate: W1=1, W2=1, Bias=-1

You can solve the classic XOR problem by using AND and OR as the first layer in a multilayer network and feeding them into a third perceptron with W1=3 (from the OR gate), W2=-2 (from the AND gate) and Bias=-2, like this:

(diagram: XOR built from an OR and an AND perceptron feeding a third perceptron)

(Note: these values will be different if your activation function is scaled to -1/+1, i.e. an sgn function. There is a small code sketch of these gates after the list below.)


  3. As to how to set the learning rate, that also depends(!), but I think something like 0.01 is usually recommended. Basically you want the system to learn as quickly as possible, but not so quickly that the weights fail to converge properly.
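
To make the gate example in point 2 concrete, here is a minimal Python sketch (not from the answer itself) assuming the 0/1 step activation described there; the function names are purely illustrative:

    def perceptron(x1, x2, w1, w2, bias):
        """Single perceptron with a 0/1 step activation: fires when the weighted sum plus bias is > 0."""
        return 1 if (w1 * x1 + w2 * x2 + bias) > 0 else 0

    def xor(x1, x2):
        """Two-layer network: OR and AND feed a third perceptron, using the weights from the answer."""
        or_out = perceptron(x1, x2, w1=1, w2=1, bias=0)      # OR gate
        and_out = perceptron(x1, x2, w1=1, w2=1, bias=-1)    # AND gate
        return perceptron(or_out, and_out, w1=3, w2=-2, bias=-2)

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "->", xor(a, b))   # prints 0, 1, 1, 0 across the four input pairs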

Solution 2

Since @Richard has already answered the greater part of the question, I'll only elaborate on the learning rate. From what I've read (and it works for me), there is a very simple formula you can use to update the learning rate at each iteration k:

learningRate_k = constant/k

The 0th iteration is obviously excluded, since you'd be dividing by zero. The constant can be whatever you want (except 0, of course, since that makes no sense :D), but the easiest choice is simply 1, which gives you

learningRate_k = 1/k

The resulting series obeys two basic rules:

  • lim_(t→∞) Σ_{k=1..t} learningRate_k = ∞
  • lim_(t→∞) Σ_{k=1..t} learningRate_k² < ∞

Note that the convergence of your perceptron is directly connected to this learning rate series. It starts big (for k=1 you get 1/1=1) and gets smaller with every update of your perceptron, since - as in real life - you learn a lot when you first encounter something new, and less and less later on.
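
To illustrate, here is a rough Python sketch of that schedule plugged into a standard perceptron training loop (the loop, the toy data and all names are illustrative, not part of the answer above):

    import random

    def train(samples, epochs=50, constant=1.0):
        """Standard perceptron updates, but with learningRate_k = constant / k."""
        w = [random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5)]
        b = random.uniform(-0.5, 0.5)         # bias initialised like any other weight
        k = 0
        for _ in range(epochs):
            for (x1, x2), target in samples:
                k += 1                        # k starts at 1, so we never divide by zero
                lr = constant / k             # learningRate_k = constant / k
                out = 1 if (w[0] * x1 + w[1] * x2 + b) > 0 else 0
                err = target - out
                w[0] += lr * err * x1
                w[1] += lr * err * x2
                b += lr * err                 # the bias is updated with the same rule
        return w, b

    # Toy usage: learn the OR function; the returned weights typically separate it correctly.
    or_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
    print(train(or_data))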


Comments


    This is one thing I don't quite understand as I begin learning about neural networks: what should the "bias" initially be set to? I understand the perceptron calculates its output based on:

    P * W + b > 0

    and then you can update the bias with a learning rule like b = b + [ G - O ], where G is the correct output and O is the actual output (1 or 0), to calculate a new bias... but what about an initial bias? I don't really understand how this is calculated, or what initial value should be used besides just "guessing". Is there any type of formula for this?

    Pardon me if I'm mistaken on anything; I'm still learning the whole neural network idea before I implement my own (crappy) one.

    The same goes for the learning rate... most books and such just kinda "pick one" for μ.
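
For what it's worth, here is a tiny Python sketch of the update this comment describes, with purely illustrative values: the bias can simply start at 0 or a small random number, and the learning rule then pushes it toward whatever value the data requires, so the exact starting point matters little.

    import random

    mu = 0.01                                # the learning rate the comment calls mu
    b = random.uniform(-0.1, 0.1)            # initial bias: small random value (0 works too)
    w = [random.uniform(-0.1, 0.1), random.uniform(-0.1, 0.1)]

    def output(x1, x2):
        return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

    (x1, x2), G = (1, 0), 1                  # one illustrative training example
    O = output(x1, x2)
    b += mu * (G - O)                        # same shape as b = b + [ G - O ], scaled by mu
    w[0] += mu * (G - O) * x1
    w[1] += mu * (G - O) * x2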