How can I save my secret keys and password securely in my version control system?

python django git version-control

36,164

Solution 1

Heroku pushes the use of environment variables for settings and secret keys:

The traditional approach for handling such config vars is to put them under source - in a properties file of some sort. This is an error-prone process, and is especially complicated for open source apps which often have to maintain separate (and private) branches with app-specific configurations.

A better solution is to use environment variables, and keep the keys out of the code. On a traditional host or working locally you can set environment vars in your bashrc. On Heroku, you use config vars.

With Foreman and .env files, Heroku provides an enviable toolchain to export, import and synchronise environment variables.
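For local development outside that toolchain, a .env file can be read with a few lines of Python. This is only a minimal sketch: the function names `parse_env_text` and `load_env_file` are made up for illustration, and it assumes plain KEY=VALUE lines with no quoting or multi-line values, which real .env tooling may handle differently.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines into a dict, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"Not a KEY=VALUE line: {raw_line!r}")
        values[key.strip()] = value.strip()
    return values


def load_env_file(path, environ):
    """Copy variables from a .env file into environ, keeping values already set."""
    with open(path, encoding="utf-8") as handle:
        for key, value in parse_env_text(handle.read()).items():
            environ.setdefault(key, value)
```

Calling `load_env_file(".env", os.environ)` early in startup would then make the values visible to the rest of the application, provided the .env file itself is kept out of version control.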

Personally, I believe it's wrong to save secret keys alongside code. It's fundamentally inconsistent with source control, because the keys are for services extrinsic to the code. The one boon would be that a developer can clone HEAD and run the application without any setup. However, suppose a developer checks out a historic revision of the code. Their copy will include last year's database password, so the application will fail against today's database.

With the Heroku method above, a developer can check out last year's app, configure it with today's keys, and run it successfully against today's database.
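On the Python side, reading a key from the environment could look roughly like the sketch below. The helper name `get_secret` and the variable names in the comments are assumptions for illustration, not part of Django or Heroku.

```python
import os


def get_secret(name, environ=os.environ):
    """Return a required environment variable, failing loudly if it is missing.

    Raising at startup makes a misconfigured machine obvious immediately,
    instead of failing later when the secret is first used.
    """
    try:
        return environ[name]
    except KeyError:
        raise RuntimeError(f"Set the {name} environment variable") from None


# In a Django settings.py one might then write (hypothetical variable names):
# SECRET_KEY = get_secret("DJANGO_SECRET_KEY")
# DATABASE_PASSWORD = get_secret("DATABASE_PASSWORD")
```

Because the code only names the variable, any revision of the application can be run with whatever keys are current.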

Solution 2

You're exactly right to want to encrypt your sensitive settings file while still maintaining the file in version control. As you mention, the best solution would be one in which Git will transparently encrypt certain sensitive files when you push them so that locally (i.e. on any machine which has your certificate) you can use the settings file, but Git or Dropbox or whoever is storing your files under VC does not have the ability to read the information in plaintext.

Tutorial on Transparent Encryption/Decryption during Push/Pull

This gist https://gist.github.com/873637 shows a tutorial on how to use Git's smudge/clean filter driver with openssl to transparently encrypt pushed files. You just need to do some initial setup.

Summary of How it Works

You'll basically be creating a .gitencrypt folder containing 3 bash scripts,

clean_filter_openssl 
smudge_filter_openssl 
diff_filter_openssl

which are used by Git for encryption (clean), decryption (smudge), and supporting Git diff, respectively. A master passphrase and salt (fixed!) are defined inside these scripts and you MUST ensure that .gitencrypt is never actually pushed. Example clean_filter_openssl script:

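#!/bin/bash
# Clean filter: Git pipes the file content to stdin and stores whatever is written to stdout.

SALT_FIXED=<your-salt> # 24 or less hex characters
PASS_FIXED=<your-passphrase>

# Quote the expansions so a passphrase containing spaces is passed as one argument.
openssl enc -base64 -aes-256-ecb -S "$SALT_FIXED" -k "$PASS_FIXED"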

The smudge_filter_openssl and diff_filter_openssl scripts are similar. See the gist.

Your repo with sensitive information should have a .gitattributes file (unencrypted and included in the repo) that assigns the openssl filter and diff driver to your files. Those drivers in turn point to the .gitencrypt directory, which contains everything Git needs to encrypt/decrypt the project transparently and must be present on your local machine.

.gitattributes contents:

* filter=openssl diff=openssl

Finally, you will also need to add the following content to your .git/config file (merge.renormalize is a configuration setting, so it goes here rather than in .gitattributes):

[filter "openssl"]
    smudge = ~/.gitencrypt/smudge_filter_openssl
    clean = ~/.gitencrypt/clean_filter_openssl
[diff "openssl"]
    textconv = ~/.gitencrypt/diff_filter_openssl
[merge]
    renormalize = true

Now, when you push the repository containing your sensitive information to a remote repository, the files will be transparently encrypted. When you pull to a local machine which has the .gitencrypt directory (containing your passphrase), the files will be transparently decrypted.

Notes

I should note that this tutorial does not describe a way to only encrypt your sensitive settings file. This will transparently encrypt the entire repository that is pushed to the remote VC host and decrypt the entire repository so it is entirely decrypted locally. To achieve the behavior you want, you could place sensitive files for one or many projects in one sensitive_settings_repo. You could investigate how this transparent encryption technique works with Git submodules http://git-scm.com/book/en/Git-Tools-Submodules if you really need the sensitive files to be in the same repository.

The use of a fixed passphrase could theoretically lead to brute-force vulnerabilities if attackers had access to many encrypted repos/files. IMO, the probability of this is very low. As a note at the bottom of this tutorial mentions, not using a fixed passphrase will result in local versions of a repo on different machines always showing that changes have occurred with 'git status'.
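The reason a fixed salt matters can be illustrated with a small Python sketch. It is only an analogy: it uses PBKDF2 from the standard library as a stand-in, which is not necessarily the key derivation that `openssl enc` applies, and the passphrase and salt values are placeholders. The point is simply that the same passphrase with the same salt always yields the same key (and so the same ciphertext for the same file), while a fresh random salt does not.

```python
import hashlib
import os


def derive_key(passphrase, salt):
    """Derive a 32-byte key from a passphrase and salt with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, 100_000)


fixed_salt = bytes.fromhex("0123456789abcdef")  # placeholder, not a recommended value
same_a = derive_key("example passphrase", fixed_salt)
same_b = derive_key("example passphrase", fixed_salt)

random_a = derive_key("example passphrase", os.urandom(8))
random_b = derive_key("example passphrase", os.urandom(8))

print(same_a == same_b)      # fixed salt: both runs derive the identical key
print(random_a == random_b)  # random salts: the keys differ, barring an 8-byte collision
```

If the encrypted output changed on every run, Git would compare it with what is already stored and report the file as modified even though the plaintext is unchanged.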

Solution 3

The cleanest way in my opinion is to use environment variables. You won't have to deal with .dist files for example, and the project state on the production environment would be the same as your local machine's.

I recommend reading The Twelve-Factor App's config chapter, and the other chapters too if you're interested.

Solution 4

I suggest using configuration files for that and not versioning them.

You can however version examples of the files.

I don't see any problem with sharing development settings. By definition they should contain no valuable data.
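As a rough sketch of that layout in Python, the loader below reads an unversioned settings file and, if it is missing, points at the versioned example. The file names `local_settings.json` and `local_settings.json.dist`, and the function name `load_settings`, are assumptions chosen for illustration.

```python
import json
from pathlib import Path


def load_settings(directory):
    """Load the unversioned settings file, pointing at the example if it is missing."""
    directory = Path(directory)
    real = directory / "local_settings.json"          # listed in .gitignore
    example = directory / "local_settings.json.dist"  # committed, dummy values only
    if not real.exists():
        raise FileNotFoundError(
            f"{real} not found; copy {example.name} to {real.name} and fill in real values"
        )
    with real.open(encoding="utf-8") as handle:
        return json.load(handle)
```

The example file documents which keys are expected, while the real values never enter the repository.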

Solution 5

An option would be to put project-bound credentials into an encrypted container (TrueCrypt or KeePass) and push it.

Update as answer from my comment below:

Interesting question btw. I just found this: github.com/shadowhand/git-encrypt which looks very promising for automatic encryption

Author by

Chris W.

Python was my first love, but Javascript is growing on me.

Updated on April 07, 2020

Comments

Chris W. about 4 years

I keep important settings like the hostnames and ports of development and production servers in my version control system. But I know that it's bad practice to keep secrets (like private keys and database passwords) in a VCS repository.

But passwords--like any other setting--seem like they should be versioned. So what is the proper way to keep passwords version controlled?

I imagine it would involve keeping the secrets in their own "secrets settings" file and having that file encrypted and version controlled. But what technologies? And how to do this properly? Is there a better way entirely to go about it?

I ask the question generally, but in my specific instance I would like to store secret keys and passwords for a Django/Python site using git and github.

Also, an ideal solution would do something magical when I push/pull with git--e.g., if the encrypted passwords file changes a script is run which asks for a password and decrypts it into place.

EDIT: For clarity, I am asking about where to store production secrets.
- John Mee almost 12 years
  
  Actually front up some money to keep the whole repo private.
- Chris W. almost 12 years
  
  @JohnMee I actually already pay for a private repository, but the point remains--you shouldn't keep sensitive information in your repository.
- msw almost 12 years
  
  I think a large part of the reason satisfying answers will be hard to get is that the old-fashioned plaintext password to connect to a database is a relic of a less hostile era. The proper answer is something like "your code shouldn't need a secret", but the systems you are accessing don't give you much choice.
- User almost 12 years
  
  Will you access this confidential data only from Python? Then I would recommend a special file object.
- Chris W. almost 12 years
  
  @user1320237 a special file object? What is that? Where can I find documentation?
- Colonel Panic almost 12 years
  
  Why? There's zilch value in version controlling passwords for external services. The principal value of version control is that you can inspect historic revisions of your application known to be in working order and run them. However, old passwords are useless to you. If they've been revoked, they won't ever work again.
- Jonathan Hartley over 11 years
  
  @ColonelPanic what you say makes sense, but in that case, what do you recommend?
- User over 10 years
  
  Possible duplicate: programmers.stackexchange.com/questions/205606/…
Chris W. almost 12 years

It would be nice to have something which I could automate. Such that if my encrypted password file changes it automatically decrypts the new file.
Chris W. almost 12 years

But then where to store the canonical password records? It would make me nervous to have that data sitting only in a configuration file on a machine which might blow up some day.
schneck almost 12 years

Interesting question btw. I just found this: github.com/shadowhand/git-encrypt which looks very promising for automatic encryption.
Chris W. almost 12 years

Wow, great. The description of git-encrypt sounds like exactly what I'm looking for: "When working with a remote git repository which is hosted on a third-party storage server, data confidentiality sometimes becomes a concern. This article walks you through the procedures of setting up git repositories for which your local working directories are as normal (un-encrypted) but the committed content is encrypted." (Of course, I only want a subset of my content encrypted...)
Chris W. almost 12 years

But I'll still need to keep track of those secrets somewhere. E.g., KeePass or something along those lines, right?
Tony Abou-Assaleh almost 12 years

@schneck post your comment as an answer so that Chris could accept it - sounds like it's what he's looking for.
Chris W. almost 12 years

Oh very interesting. This sounds almost exactly like what I want (except it's encrypting the entire repository).
Chris W. almost 12 years

It seems like environment variables are a good way to run the application with the secret settings... but it still doesn't answer the question of where to keep those settings.
Samy Dindane almost 12 years

You should usually have a README file for each of your apps. In there, specify which environment variables should be set, and every time you deploy a project, just follow the steps and set each of them. You can also create a shell script with many export MY_ENV_VAR=, and when you deploy, just fill it with the right values and source it. If by keep you mean version the settings, you shouldn't be doing this in the first place.
dgh almost 12 years

You could either keep all the sensitive setting files for multiple applications in one encrypted repository or add the encrypted repository with the sensitive settings to your project as a Git submodule as described here git-scm.com/book/en/Git-Tools-Submodules .
dgh almost 12 years

Storing production passwords/settings in an (encrypted) submodule is not uncommon. stackoverflow.com/questions/11207284/… . It would even make it easier to manage settings across projects.
Steve Buzonas almost 12 years

@ChrisW. If the machine blows up, you don't necessarily need the password anymore... However, if you only have one copy of the data on your production machine, that should raise up a red flag. But that doesn't mean it should be in VCS. There should be RAID, full backups supplemented by incremental backups on magnetic and optical media. A lot of corporations have a change control procedure which may dictate how and where to store passwords and other sensitive materials on paper as well.
tiktak almost 12 years

@ChrisW I don't want to be rough but it seems you aren't telling us the truth, and the passwords you want to store are used not in development but in production. Isn't this true? Otherwise, why do you care about a development or test machine and development passwords? Nobody would do that.
tiktak almost 12 years

BTW, at our company, all the development passwords are available on paper and on the intranet. Because they have no value. They're there because the software we develop needs authentication.
tiktak almost 12 years

.dist files are what I was talking about: examples of real configuration files. A good practice is that it should be possible to run the software just by removing the ".dist" extension (or better: copying the file), that is, you should be able to try software in seconds, without having to spend the whole day configuring it.
tiktak almost 12 years

Then the software needs to decrypt the password file.
Chris W. almost 12 years

@tiktak, you are correct--my question is about what to do with regard to production passwords. I don't particularly care about storing development passwords in a VCS in the clear. Sorry if I haven't made that clear enough.
Willian almost 12 years

Well, only when deploying the site does the password get decrypted and written to a plain text password file.
Chris W. almost 12 years

Also, upvote for The Twelve-Factor App--really great stuff.
Nikolay Fominyh almost 12 years

This answer hasn't received enough attention, but it is the one most in line with the Linux way.
Jonathan Hartley over 11 years

So if environment vars are set in your bashrc, and you're deploying a new server, then what creates the bashrc? Doesn't that just move the passwords out of your source code repo, and into your deployment config? (which is presumably also in the source code repo, or in a repo of its own?)
Jonathan Hartley over 11 years

@Samy: And if you've automated deployment?
Samy Dindane over 11 years

@JonathanHartley: Updating your production environment shouldn't change environment variables. If you want to update them, you should do it manually, whether you deploy a new version of your application or not.
Jonathan Hartley over 11 years

I'm worried about the case of creating new server instances. This might be done automatically to scale for traffic, or for some shops, this is a routine part of deploying a new version of your application. Setting environment variables manually each time doesn't seem appropriate in these cases.
Samy Dindane over 11 years

@JonathanHartley I'm not an expert in the subject but the most simple way I can think about is having a shell script that creates another script which will be used to set all the needed environment variables (in other servers) by one single execution.
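One way to picture the "script that creates another script" idea is the Python sketch below, which renders a shell file of export lines from a mapping of names to values. The function name `render_export_script` and the sample variables are hypothetical, and where the values come from is left open.

```python
import shlex


def render_export_script(variables):
    """Return shell text with one `export NAME=value` line per variable, safely quoted."""
    lines = ["#!/bin/sh"]
    for name in sorted(variables):
        # Reject names that do not look like identifiers, since they are not quoted.
        if not name.isidentifier():
            raise ValueError(f"Invalid variable name: {name!r}")
        # shlex.quote protects values containing spaces or shell metacharacters.
        lines.append(f"export {name}={shlex.quote(variables[name])}")
    return "\n".join(lines) + "\n"


print(render_export_script({"DATABASE_PASSWORD": "p@ss word", "API_KEY": "abc123"}))
```

The generator can be versioned, while the generated file (which holds the real values) would be ignored and sourced on the target machine.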
Jonathan Hartley over 11 years

@Samy: Right, that makes sense, and it's where my thoughts were headed too. But in that case, then the shell script is kept in a source code repo, right? So we're back where we started with the OP's problem.
Samy Dindane over 11 years

@JonathanHartley: I don't believe it's a bad idea to version useful scripts. The generated script should be ignored though.
Jonathan Hartley over 11 years

@Samy: Are we talking about a generated script? I missed that part. What generates it and when? Thanks for your patience.
Samy Dindane over 11 years

@JonathanHartley: Two answers before this one, I was talking about a script that creates another one. :) No problem, glad to help.
Jonathan Hartley over 11 years

@Samy I still don't understand how environment variables would be set. The 12 factor app page doesn't make that clear either (unless you're on Heroku, which my current project is not.) Are we saying that a generating script needs to ask a central config store "I'm machine X, please give me my config data", and that responds with the values of environment variables that should be set. In that case, I don't think you need a generated script any more. I'm wildly speculating here, am I barking up the right tree?
Steve about 9 years

@JonathanHartley your .bashrc shouldn't be in the code repo for your Django app.
Jonathan Hartley about 9 years

@SteveSP Agreed! I don't quite understand what you're getting at. Sorry to be dense.
Steve about 9 years

@JonathanHartley Hm maybe I misunderstood your comment? It sounded like you were saying that the deployment config/.bashrc was in the source code repo.
Jonathan Hartley about 9 years

Sorry, my comment is ambiguous, but that's because I am genuinely confused. I love the sound of this answer's point of view, but have never fully understood it. If I'm deploying to several different environments, each of which contains several hosts, and perhaps several types of hosts, then obviously I need to automate the creation of the .bashrc files that will exist on each host to set its environment variables. So is the answer saying I should have a second repo, separate from my source, which contains all the settings which will become environment variables in .bashrc on deployment?
Steve about 9 years

@JonathanHartley I don't think you need to automate the creation of the .bashrc. You could use some central wiki or README or something (which is not tracked in a VCS and that only company employees have access to) that contains these passwords and instructions on how to set them as environment variables. Environment variables only need to be configured once so each developer would do this when they first set up their laptops (or the servers for staging/production).
Jonathan Hartley almost 9 years

They only need to be configured once PER machine you deploy to. If your deployment process is "spin up a new machine and test it's OK before redirecting traffic to it and then shoot the old one in the head", which IMHO is best practice, then you really do need to automate the creation of whatever sets the env vars.
Hedde van der Heide over 7 years

Regulation and implementation of storing private data depends on the policy of the company the project's for. I highly doubt the project's source code is the proper place as any third party tester or programmer could see these
danio over 7 years

@JonathanHartley I would say that if you want to automate deployment then you store passwords for the deployment within that system. They are nothing to do with the development repo. e.g. if using something like puppet, puppet is responsible for deploying the bashrc and the passwords are therefore managed within that.
geekley almost 6 years

It might be worth checking github.com/AGWA/git-crypt for an updated solution. It has the advantage of allowing individual files to be encrypted, and it claims to be "provably semantically secure". The author of the gist himself suggested that this tool is better, at github.com/shadowhand/git-encrypt .
Chris W. over 2 years

Just thought I'd note that about a decade later I agree (as does the industry more broadly?) that saving secrets in VCS is "bad". So what do you do? Use some combination of environment variables, config files, and the secrets management helpers of your platform/infrastructure provider. Sadly difficult to be more specific since often the path of least resistance depends strongly on the environment you're running your app in. (I've switched the accepted answer to this response.)