Partial Correlations in R


From the Value section of help('pcor'):

Value

estimate a matrix of the partial correlation coefficient between two variables

p.value a matrix of the p value of the test

statistic a matrix of the value of the test statistic

n the number of samples

gp the number of given variables

method the correlation method used

The details section gives

Details

Partial correlation is the correlation of two variables while controlling for a third or more other variables.
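
To make that definition concrete, here is a minimal base-R sketch on simulated data (the variables x, y, z and the seed are invented for illustration, not taken from the question). It computes the Pearson partial correlation of x and y given z in two equivalent ways: by correlating the residuals left after regressing each variable on z, and with the first-order partial correlation formula.

```r
# Simulated data: z drives both x and y (illustrative only)
set.seed(1)
n <- 200
z <- rnorm(n)
x <- 2 * z + rnorm(n)
y <- -z + rnorm(n)

# Way 1: remove the linear effect of z from x and y, then correlate what is left
r_resid <- cor(resid(lm(x ~ z)), resid(lm(y ~ z)))

# Way 2: first-order partial correlation formula from the pairwise correlations
r_xy <- cor(x, y)
r_xz <- cor(x, z)
r_yz <- cor(y, z)
r_formula <- (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))

all.equal(r_resid, r_formula)  # compares the two up to floating-point tolerance
```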

For your result

$estimate
                Bugs.Project Orgs.Project Changes.Project
Bugs.Project       1.0000000    0.3935535       0.9749296
Orgs.Project       0.3935535    1.0000000      -0.1800788
Changes.Project    0.9749296   -0.1800788       1.0000000

The partial correlation of Changes.Project and Orgs.Project is -0.1800788. This is the correlation of Changes.Project and Orgs.Project controlling for Bugs.Project.

The partial correlation of Changes.Project and Bugs.Project is 0.9749296. This is the correlation of Changes.Project and Bugs.Project controlling for Orgs.Project.

The partial correlation of Orgs.Project and Bugs.Project is 0.3935535. This is the correlation of Orgs.Project and Bugs.Project controlling for Changes.Project.
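
The same reading applies to every off-diagonal entry: with three columns, each pair is controlled for the one remaining column. As a rough sketch of where such a matrix can come from, Pearson partial correlations can be read off the inverse of the correlation matrix. The data below are simulated and only borrow the question's column names, and this illustrates the general technique rather than reproducing the source of pcor.

```r
# Simulated stand-in for the question's data (values are made up)
set.seed(42)
n <- 150
Changes.Project <- rexp(n, rate = 1 / 200)
Bugs.Project    <- 0.04 * Changes.Project + rnorm(n, sd = 2)
Orgs.Project    <- 2 + 0.3 * Bugs.Project - 0.005 * Changes.Project + rnorm(n)
d <- data.frame(Bugs.Project, Orgs.Project, Changes.Project)

# Partial correlations from the inverse correlation matrix P:
# entry (i, j) is -P[i, j] / sqrt(P[i, i] * P[j, j])
P <- solve(cor(d))
pc <- -cov2cor(P)
diag(pc) <- 1
pc

# Check one entry against the residual definition:
# Orgs.Project vs Bugs.Project, controlling for Changes.Project
r_check <- cor(resid(lm(Orgs.Project ~ Changes.Project, data = d)),
               resid(lm(Bugs.Project ~ Changes.Project, data = d)))
all.equal(pc["Orgs.Project", "Bugs.Project"], r_check)
```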

If you are only interested in this third case, you can get the same information from

pcor.test(y.data$Orgs.Project, y.data$Bugs.Project, y.data$Changes.Project)
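
A sketch of that equivalence, assuming the ppcor package is installed. The data frame here is simulated and only reuses the question's column names, so the numbers will not match the output above. The single estimate from pcor.test should agree with the corresponding cell of the pcor matrix.

```r
library(ppcor)  # assumes install.packages("ppcor") has already been run

# Simulated stand-in for y.data (values are made up)
set.seed(7)
n <- 150
Changes.Project <- rexp(n, rate = 1 / 200)
Bugs.Project    <- 0.04 * Changes.Project + rnorm(n, sd = 2)
Orgs.Project    <- 2 + 0.3 * Bugs.Project - 0.005 * Changes.Project + rnorm(n)
y.data <- data.frame(Bugs.Project, Orgs.Project, Changes.Project)

# All pairwise partial correlations, each controlling for the remaining column
full <- pcor(y.data)

# Only Orgs.Project vs Bugs.Project, controlling for Changes.Project
single <- pcor.test(y.data$Orgs.Project, y.data$Bugs.Project, y.data$Changes.Project)

single$estimate
full$estimate["Orgs.Project", "Bugs.Project"]
```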
Author by

user1897691

Updated on June 16, 2022

Comments

  • user1897691
    user1897691 almost 2 years

    I am trying to compute a partial correlation in R. I have the two data sets that I want to compare and currently only one controlled variable. (This will change in the future)

    I have looked online to try to work this out myself but it is difficult to understand the terminology used on the websites I have looked at. Can someone please explain how I would go about doing this and perhaps provide a simple example?

    Data is in the following form:

                    Project.Name Bugs.Project Changes.Project Orgs.Project
    1     platform_external_svox            4             161            2
    3 platform_packages_apps_Nfc           13             223            2
    5      platform_system_media           36             307            2
    7     platform_external_mtpd            2              30            2
    9            platform_bionic           42            1061            4
    

    I want the correlation between Bugs.Project and Orgs.Project with Changes.Project as a controlled variable. I have downloaded the ppcor library since it looks like it has the functionality that I need. I am unsure how to use it, however. How do I add my data to a matrix and use the pcor function?

    This is what I've been trying:

    y.data <- data.frame(
    bpp=c(projRelateBugsOrgs[2]),
    opp=c(projRelateBugsOrgs[4]),
    cpp=c(projRelateBugsOrgs[3])
    )
    
    test <- pcor(y.data)
    

    I just used an example I found and tried to use my data in place of theirs. I don't understand my output.

    It looks like this:

    $estimate
                    Bugs.Project Orgs.Project Changes.Project
    Bugs.Project       1.0000000    0.3935535       0.9749296
    Orgs.Project       0.3935535    1.0000000      -0.1800788
    Changes.Project    0.9749296   -0.1800788       1.0000000
    
    $p.value
                    Bugs.Project Orgs.Project Changes.Project
    Bugs.Project     0.00000e+00  2.09795e-07       0.0000000
    Orgs.Project     2.09795e-07  0.00000e+00       0.0264442
    Changes.Project  0.00000e+00  2.64442e-02       0.0000000
    
    $statistic
                    Bugs.Project Orgs.Project Changes.Project
    Bugs.Project        0.000000     5.190442       53.122165
    Orgs.Project        5.190442     0.000000       -2.219625
    Changes.Project    53.122165    -2.219625        0.000000
    
    $n
    [1] 150
    
    $gp
    [1] 1
    
    $method
    [1] "pearson"
    

    I think I want something from the $estimate table but I'm not exactly sure what it's giving me.

  • user1897691
    user1897691 over 11 years
    First off, thank you for suggesting the help manual. I didn't know that was in R. Also, what I'm attempting is to find the correlation between the number of bugs in a project and the number of organizations that worked on that project. The number of changes on a project is a factor to consider since more changes usually means more bugs. What I mean by controlled variable is just that it needs to be taken into account but it's not what I'm looking for, if that makes sense?
  • mnel
    mnel over 11 years
    I worked that out in the end. Hopefully my edit is now clearer.
  • rg255
    rg255 about 11 years
    @mnel Sorry to add to a closed question, but this is just a quick addition: I am also using pcor and pcor.test. When I use just one controlling variable I get the same correlation values from pcor and pcor.test, but when I use two controlling variables by calling pcor.test(x, y, z = c(z1, z2)) I get a different result from the pcor output. Could you possibly explain this? Is it wrong to use the pcor.test function for more than one controlling variable?
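
On the two-control-variable case in the last comment, one likely cause (an assumption, since the actual data and full call are not shown) is that c(z1, z2) concatenates the two vectors into a single vector of length 2n instead of forming two columns. Binding the controls column-wise, for example with cbind(z1, z2) or data.frame(z1, z2), keeps them as separate variables. A base-R sketch with simulated data:

```r
# Simulated data with two control variables (illustrative only)
set.seed(3)
n <- 100
z1 <- rnorm(n)
z2 <- rnorm(n)
x <- z1 + z2 + rnorm(n)
y <- z1 - z2 + rnorm(n)

length(c(z1, z2))    # one vector of length 2 * n
dim(cbind(z1, z2))   # an n-by-2 matrix: two separate control variables

# Partial correlation of x and y given both z1 and z2, via residuals
r_two <- cor(resid(lm(x ~ z1 + z2)), resid(lm(y ~ z1 + z2)))

# Same quantity from the inverse correlation matrix of all four columns
P <- solve(cor(cbind(x, y, z1, z2)))
r_mat <- -P["x", "y"] / sqrt(P["x", "x"] * P["y", "y"])
all.equal(r_two, r_mat)
```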