Python implementation of the Wilson Score Interval?

python algorithm statistics ranking

12,992

Solution 1

Reddit uses the Wilson score interval for comment ranking, an explanation and python implementation can be found here

#Rewritten code from /r2/r2/lib/db/_sorts.pyx

from math import sqrt

def confidence(ups, downs):
    n = ups + downs

    if n == 0:
        return 0

    z = 1.0 #1.44 = 85%, 1.96 = 95%
    phat = float(ups) / n
    return ((phat + z*z/(2*n) - z * sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n))

Solution 2

I think this one has a wrong wilson call, because if you have 1 up 0 down you get NaN because you can't do a sqrt on the negative value.

The correct one can be found when looking at the ruby example from the article How not to sort by average page:

return ((phat + z*z/(2*n) - z * sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n))

Solution 3

To get the Wilson CI without continuity correction, you can use proportion_confint in statsmodels.stats.proportion. To get the Wilson CI with continuity correction, you can use the code below.

# cf. 
# [1] R. G. Newcombe. Two-sided confidence intervals for the single proportion, 1998
# [2] R. G. Newcombe. Interval Estimation for the difference between independent proportions:        comparison of eleven methods, 1998

import numpy as np
from statsmodels.stats.proportion import proportion_confint

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 
def propci_wilson_cc(count, nobs, alpha=0.05):
    # get confidence limits for proportion
    # using wilson score method w/ cont correction
    # i.e. Method 4 in Newcombe [1]; 
    # verified via Table 1
    from scipy import stats
    n = nobs
    p = count/n
    q = 1.-p
    z = stats.norm.isf(alpha / 2.)
    z2 = z**2   
    denom = 2*(n+z2)
    num = 2.*n*p+z2-1.-z*np.sqrt(z2-2-1./n+4*p*(n*q+1))    
    ci_l = num/denom
    num = 2.*n*p+z2+1.+z*np.sqrt(z2+2-1./n+4*p*(n*q-1))
    ci_u = num/denom
    if p == 0:
        ci_l = 0.
    elif p == 1:
        ci_u = 1.
    return ci_l, ci_u


def dpropci_wilson_nocc(a,m,b,n,alpha=0.05):
    # get confidence limits for difference in proportions
    #   a/m - b/n
    # using wilson score method WITHOUT cont correction
    # i.e. Method 10 in Newcombe [2]
    # verified via Table II    
    theta = a/m - b/n        
    l1, u1 = proportion_confint(count=a, nobs=m, alpha=0.05, method='wilson')
    l2, u2 = proportion_confint(count=b, nobs=n, alpha=0.05, method='wilson')
    ci_u = theta + np.sqrt((a/m-u1)**2+(b/n-l2)**2)
    ci_l = theta - np.sqrt((a/m-l1)**2+(b/n-u2)**2)     
    return ci_l, ci_u


def dpropci_wilson_cc(a,m,b,n,alpha=0.05):
    # get confidence limits for difference in proportions
    #   a/m - b/n
    # using wilson score method w/ cont correction
    # i.e. Method 11 in Newcombe [2]    
    # verified via Table II  
    theta = a/m - b/n    
    l1, u1 = propci_wilson_cc(count=a, nobs=m, alpha=alpha)
    l2, u2 = propci_wilson_cc(count=b, nobs=n, alpha=alpha)    
    ci_u = theta + np.sqrt((a/m-u1)**2+(b/n-l2)**2)
    ci_l = theta - np.sqrt((a/m-l1)**2+(b/n-u2)**2)     
    return ci_l, ci_u


# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 
# single proportion testing 
# these come from Newcombe [1] (Table 1)
a_vec = np.array([81, 15, 0, 1])
m_vec = np.array([263, 148, 20, 29])
for (a,m) in zip(a_vec,m_vec):
    l1, u1 = proportion_confint(count=a, nobs=m, alpha=0.05, method='wilson')
    l2, u2 = propci_wilson_cc(count=a, nobs=m, alpha=0.05)
    print(a,m,l1,u1,'   ',l2,u2)

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 
# difference in proportions testing 
# these come from Newcombe [2] (Table II)
a_vec = np.array([56,9,6,5,0,0,10,10],dtype=float)
m_vec = np.array([70,10,7,56,10,10,10,10],dtype=float)
b_vec = np.array([48,3,2,0,0,0,0,0],dtype=float)
n_vec = np.array([80,10,7,29,20,10,20,10],dtype=float)

print('\nWilson without CC')
for (a,m,b,n) in zip(a_vec,m_vec,b_vec,n_vec):
    l, u = dpropci_wilson_nocc(a,m,b,n,alpha=0.05)
    print('{:2.0f}/{:2.0f}-{:2.0f}/{:2.0f} ; {:6.4f} ; {:8.4f}, {:8.4f}'.format(a,m,b,n,a/m-b/n,l,u))

print('\nWilson with CC')
for (a,m,b,n) in zip(a_vec,m_vec,b_vec,n_vec):
    l, u = dpropci_wilson_cc(a,m,b,n,alpha=0.05)
    print('{:2.0f}/{:2.0f}-{:2.0f}/{:2.0f} ; {:6.4f} ; {:8.4f}, {:8.4f}'.format(a,m,b,n,a/m-b/n,l,u))

HTH

Solution 4

The accepted solution seems to use a hard-coded z-value (best for performance).

In the event that you wanted a direct python equivalent of the ruby formula from the blogpost with a dynamic z-value (based on the confidence interval):

import math

import scipy.stats as st


def ci_lower_bound(pos, n, confidence):
    if n == 0:
        return 0
    z = st.norm.ppf(1 - (1 - confidence) / 2)
    phat = 1.0 * pos / n
    return (phat + z * z / (2 * n) - z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)) / (1 + z * z / n)

Solution 5

If you'd like to actually calculate z directly from a confidence bound and want to avoid installing numpy/scipy, you can use the following snippet of code,

import math

def binconf(p, n, c=0.95):
  '''
  Calculate binomial confidence interval based on the number of positive and
  negative events observed.  Uses Wilson score and approximations to inverse
  of normal cumulative density function.

  Parameters
  ----------
  p: int
      number of positive events observed
  n: int
      number of negative events observed
  c : optional, [0,1]
      confidence percentage. e.g. 0.95 means 95% confident the probability of
      success lies between the 2 returned values

  Returns
  -------
  theta_low  : float
      lower bound on confidence interval
  theta_high : float
      upper bound on confidence interval
  '''
  p, n = float(p), float(n)
  N    = p + n

  if N == 0.0: return (0.0, 1.0)

  p = p / N
  z = normcdfi(1 - 0.5 * (1-c))

  a1 = 1.0 / (1.0 + z * z / N)
  a2 = p + z * z / (2 * N)
  a3 = z * math.sqrt(p * (1-p) / N + z * z / (4 * N * N))

  return (a1 * (a2 - a3), a1 * (a2 + a3))


def erfi(x):
  """Approximation to inverse error function"""
  a  = 0.147  # MAGIC!!!
  a1 = math.log(1 - x * x)
  a2 = (
    2.0 / (math.pi * a)
    + a1 / 2.0
  )

  return (
    sign(x) *
    math.sqrt( math.sqrt(a2 * a2 - a1 / a) - a2 )
  )


def sign(x):
  if x  < 0: return -1
  if x == 0: return  0
  if x  > 0: return  1


def normcdfi(p, mu=0.0, sigma2=1.0):
  """Inverse CDF of normal distribution"""
  if mu == 0.0 and sigma2 == 1.0:
    return math.sqrt(2) * erfi(2 * p - 1)
  else:
    return mu + math.sqrt(sigma2) * normcdfi(p)

View more solutions

12,992

Author by

Jeff Bauer

I write software for medical-based applications and spend a lot of time riding my bike. Current technologies I work with: Python, Django, AWS Current open source projects I contribute to: boto salt Ubuntu cloud-init

Updated on June 02, 2022

Comments

Jeff Bauer almost 2 years

After reading How Not to Sort by Average Rating, I was curious if anyone has a Python implementation of a Lower bound of Wilson score confidence interval for a Bernoulli parameter?
- luke14free about 12 years
  
  for more precision if n*p-cap*(1-p-cap) is below a certain threshold, say 30-35 I'd use a t-distribution with df: (pos+neg)-2 instead of a normal distr. anyhow. just my two cents though
agf about 12 years

If you're just going to post a link, do it in a comment. If you're posting it as an answer, provide more info from the content and / or pull out the code so not everyone needs to follow the link, and the answer has value even if the link goes dead.
Jeff Bauer about 12 years

I've added the code from Steef's link so I can accept his answer.
Vladtn over 11 years

This answer should be corrected to include the modification below!
Amelio Vazquez-Reina almost 11 years

@Vladtn I just updated it with Gullevek's answer. Let me know if there is anything else wrong with it.
Wesley over 10 years

I'd just like to add that for a 95% confidence interval the z score should be 1.96, not 1.6.
Anentropic over 10 years

@Wesley yes, and I believe 1.0 = 85% was also wrong, have updated the answer... there is a table of values here dummies.com/how-to/content/…
Loisaida Sam Sandberg over 6 years

Broken link to explanation/implementation :\
Tom N Tech over 5 years

The link is broken :(
Pip almost 5 years

The link is fixed.
rjh over 2 years

I'm confused about z = 1.0. If compared against the st.norm.ppf-based solution below, it looks like 1.0 is equivalent to a confidence interval of 68.269% which is a bit of a strange default.
WCN over 2 years

This does not give the result given by NIST in their recipe, which involves solving for the root of an equation. itl.nist.gov/div898/handbook/prc/section2/prc241.htm
Thomas David Baker over 2 years

print(binconf(50, 100)) => (0.26291792852889806, 0.41206457669597374) … 50 positive events, 100 total events gives a range where the upper bound is below 0.5?
onestop about 2 years

"I think this one has a wrong wilson call" - which one?