How to convert distance into probability?
Solution 1
I think there are multiple ways of doing this:
As Adam suggested, use (1/d) / sum(1/d).
Use the square, or an even higher power, of the inverse distance, e.g. (1/d^2) / sum(1/d^2). This makes the class probability distribution more skewed. For example, with two classes, if 1/d gives a 40%/60% prediction, 1/d^2 gives roughly 31%/69%, and higher powers push it further apart.
Use softmax (https://en.wikipedia.org/wiki/Softmax_function) on the negative distances, i.e. exp(-d) / sum(exp(-d)).
Use exp(-d^2/sigma^2) / sum[exp(-d^2/sigma^2)], which imitates Gaussian likelihoods. Sigma could be the average within-cluster distance, or simply set to 1 for all clusters.
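The four options above can be sketched in a few lines. This is only an illustrative Python version using the standard library, not the asker's MATLAB code; the function names, the `eps` guard against a zero distance, and the shift by the smallest distance (which leaves the result unchanged but avoids underflow) are all assumptions made for the sketch.

```python
import math


def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def inverse_power(distances, power=1, eps=1e-12):
    """(1/d^power) / sum(1/d^power); eps is an assumed guard against d == 0."""
    return normalize([1.0 / (d + eps) ** power for d in distances])


def softmax_neg(distances):
    """exp(-d) / sum(exp(-d)), shifted by min(d) for numerical stability."""
    m = min(distances)
    return normalize([math.exp(-(d - m)) for d in distances])


def gaussian(distances, sigma=1.0):
    """exp(-d^2/sigma^2) / sum(exp(-d^2/sigma^2)), with one shared sigma."""
    m = min(distances)
    return normalize([math.exp(-(d * d - m * m) / sigma ** 2) for d in distances])


if __name__ == "__main__":
    d = [2.0, 3.0]  # hypothetical distances to two classes
    for name, probs in [
        ("1/d", inverse_power(d)),
        ("1/d^2", inverse_power(d, power=2)),
        ("softmax(-d)", softmax_neg(d)),
        ("gaussian, sigma=1", gaussian(d)),
    ]:
        print(name, [round(p, 3) for p in probs])
```

Going down the list, each scheme puts more of the probability mass on the nearest class for the same pair of distances, so the choice is mostly about how sharp you want the result to be.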
Solution 2
You could try inverting your distances to get a likelihood measure, i.e. the bigger the distance x, the smaller its inverse 1/x. Then you can normalize: probability = (1/distance) / sum(1/distance).
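As a worked example of this normalization, here is a small Python sketch that uses the two training rows and the sample quoted in the question. Treating each quoted row as its own class, and the labels "Class 1" and "Class 2", are assumptions made purely for illustration; the question does not say which rows belong to which class.

```python
import math

# The two training rows quoted in the question; the labels are hypothetical.
training = {
    "Class 1": [44, 12, 53, 29, 35, 30, 49],
    "Class 2": [54, 36, 58, 30, 38, 24, 37],
}
sample = [40, 30, 50, 25, 40, 25, 30]


def euclidean(a, b):
    """Euclidean distance: sqrt of the sum of squared element-wise differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


distances = {label: euclidean(sample, row) for label, row in training.items()}

# Invert each distance, then divide by the sum of the inverses.
# This fails with ZeroDivisionError if the sample coincides with a training row.
inverse = {label: 1.0 / d for label, d in distances.items()}
total = sum(inverse.values())
probabilities = {label: w / total for label, w in inverse.items()}

for label in training:
    print(f"{label}: distance {distances[label]:.2f}, probability {probabilities[label]:.1%}")
```

With these two rows the distances come out at about 27.9 and 19.4, so the closer row gets roughly 59% and the farther one roughly 41%.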
niko_dry
Updated on June 08, 2022
Can anyone shed some light on my MATLAB program? I have data from two sensors and I'm doing a kNN classification for each of them separately. In both cases the training set is a set of vectors, 42 rows in total, like this: [44 12 53 29 35 30 49; 54 36 58 30 38 24 37;..]
Then I get a sample, e.g. [40 30 50 25 40 25 30], and I want to classify the sample to its closest neighbor. As a criterion of proximity I use the Euclidean metric, sqrt(sum(Y^2)), where Y is the element-wise difference between the sample and a training vector. This gives me an array of distances between the sample and each class of the training set.
So, two questions:
- Is it possible to convert the distances into a distribution of probabilities, something like Class 1: 60%, Class 2: 30%, Class 3: 5%, Class 5: 1%, etc.?
Added: up to this moment I'm using the formula probability = distance / sum of distances, but I cannot plot a correct cdf or histogram. This gives me a distribution of sorts, but I see a problem: if the distance is large, for example 700, the closest class will still get the biggest probability, which would be wrong because the distance is too big for the sample to be matched to any of the classes.
- If I were able to get two probability density functions, I guess I would then take some product of them. Is it possible?
Any help or remark is highly appreciated.
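For the second question, one common way to combine two per-sensor class distributions is to multiply them class by class and renormalize. The Python sketch below is only an illustration under two assumptions that the question does not confirm: the two sensors are conditionally independent given the class, and all classes are equally likely a priori. The function name `fuse` and the example numbers are hypothetical.

```python
def fuse(p_a, p_b):
    """Multiply two class-probability lists element-wise and renormalize."""
    if len(p_a) != len(p_b):
        raise ValueError("both sensors must score the same set of classes")
    joint = [a * b for a, b in zip(p_a, p_b)]
    total = sum(joint)
    if total == 0:
        raise ValueError("the sensors assign zero joint probability to every class")
    return [j / total for j in joint]


# Hypothetical per-class probabilities from each sensor (not measured data).
sensor_1 = [0.6, 0.3, 0.1]
sensor_2 = [0.5, 0.4, 0.1]
fused = fuse(sensor_1, sensor_2)
print([round(p, 3) for p in fused])
```

Classes that both sensors favor gain probability after fusion, while a class that either sensor rates near zero is effectively ruled out, which is worth keeping in mind if one sensor is unreliable.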