Git: Removing carriage returns from source-controlled files
Solution 1
The approach you’ll have to use depends on how public your repository is.
If you are more or less the only one using the repository and don’t mind changing every commit SHA, you can settle the issue once and for all by running git filter-branch and applying dos2unix to all files in each commit. (If the repository is shared, everyone else will more or less have to re-clone it, so this is potentially dangerous.)
The better and easier option is to change the line endings only in the current heads. Your past commits will still have \r\n endings, but unless you do a lot of cherry-picking from the past this should not be a problem. Diff tools might complain a bit more often, of course, but normally you only diff against nearby commits, so the issue resolves itself as commits accumulate.
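As a rough sketch of the "current heads only" approach, the helper below strips the CR from CRLF endings in every text file under a directory. The function name crlf_to_lf_tree is made up for illustration, and it uses sed rather than dos2unix so that it runs where dos2unix is not installed. It assumes a grep that supports -I and --exclude-dir (GNU and BSD grep do) and file names without embedded newlines.

```shell
#!/bin/sh
# Hypothetical helper (not from the answer): strip the CR from CRLF line
# endings in every text file below a directory, leaving binaries alone.
crlf_to_lf_tree() {
  dir=${1:-.}
  cr=$(printf '\r')   # literal carriage return, so no sed-specific escapes are needed
  # grep -I treats binary files as non-matching, -r recurses, -l lists names;
  # the empty pattern matches any line, so every non-empty text file is listed.
  grep -Irl --exclude-dir=.git "" "$dir" | while IFS= read -r file; do
    tmp=$(mktemp) || break
    # Rewrite through a temp file, then copy back so file permissions are kept.
    sed "s/${cr}\$//" "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
  done
}

# Typical use from the repository root, followed by a single commit:
#   crlf_to_lf_tree . && git commit -am "Convert line endings to LF"
```

Because only the working tree is touched, no existing SHA changes; the conversion shows up as one ordinary commit on the current branch.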
And you’re correct that UNIX line endings are the standard. The best approach is to set up your editor to write only these endings, even on Windows. Otherwise, there is also the core.autocrlf setting, which you can use.
Addition to the history rewriting part:
The last time I did this, I used the following script to change all files to UNIX endings.
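#!/bin/bash
# Convert every file in the checked-out tree to UNIX line endings (via dos2unix).
all2dos() {
  find . -type f -exec dos2unix {} \;
}
# Note: the exported function only reaches the filter if git's /bin/sh is bash.
export -f all2dos
git filter-branch -f --tree-filter 'all2dos' --tag-name-filter cat --prune-empty -- --all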
Solution 2
This CRLF thing drove us crazy when we converted from svn to git in a central (bare-like) SCM environment. What ultimately got us was that we had copied the global .gitconfig file to everyone's home directory (yep, both Windows and Linux). The initial file came from a Windows system and had core.autocrlf=true and core.safecrlf=false, which played havoc with the Linux users (bash scripts didn't work, and there were all those awful ^M's). So at first we wrote checkout and clone scripts that ran dos2unix after those commands. Then I ran across the core.autocrlf and core.safecrlf config items and set them based on the OS:
Windows: core.autocrlf=true and core.safecrlf=false
Linux: core.autocrlf=input and core.safecrlf=false
These were set with:
---on Windows---
git config --global core.autocrlf true
git config --global core.safecrlf false
---on Linux---
git config --global core.autocrlf input
git config --global core.safecrlf false
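If the same setup script has to run on every machine, the per-OS choice can be made automatically. The following is only a sketch: autocrlf_for_os is a hypothetical helper, and it assumes that Windows users run it from Git Bash, MSYS2 or Cygwin, where `uname -s` reports a name starting with MINGW, MSYS or CYGWIN.

```shell
#!/bin/sh
# Hypothetical helper: print the core.autocrlf value used above for a given
# kernel name (defaults to the output of `uname -s`).
autocrlf_for_os() {
  case "${1:-$(uname -s)}" in
    MINGW*|MSYS*|CYGWIN*) echo true ;;   # Git Bash / MSYS2 / Cygwin on Windows
    *)                    echo input ;;  # Linux, macOS (Darwin), other UNIX-likes
  esac
}

# Example use in a setup script:
#   git config --global core.autocrlf "$(autocrlf_for_os)"
#   git config --global core.safecrlf false
```

Passing the kernel name as an argument is only there to make the function easy to try out; in normal use it is called without arguments.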
Then for our Linux developers we set up a little shell script, /usr/local/bin/gitfixcrlf:
#!/bin/sh
# remove local tree
git ls-files -z | xargs -0 rm
# checkout with proper crlf
git checkout .
They only had to run this once on their local sandbox clones. Any later clones were done correctly, and pushes and pulls were handled correctly from then on. This solved our multi-OS line-ending issues. Also note that Mac falls under the same configuration as Linux.
Solution 3
For the continuing solution, have a look at the core.autocrlf (and core.safecrlf) config parameters.
Converting the line endings once across your whole repository will create a single commit that is close to impossible to merge with (since every line in those files will be modified), but once you get past it, it should be no big deal. (Yes, you could use git filter-branch to make the modification all the way through history, but that's a bit scary.)
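To get a feel for how large that one-off commit would be, it can help to list the files that still contain CRLF endings first. This is a minimal sketch; list_crlf_files is a hypothetical name, and it assumes a grep that supports -I and --exclude-dir.

```shell
#!/bin/sh
# Hypothetical helper: list text files below a directory that contain at
# least one line ending in CR (that is, CRLF line endings).
list_crlf_files() {
  dir=${1:-.}
  cr=$(printf '\r')   # literal carriage return
  # -I skips binary files, -r recurses, -l prints only the matching file names.
  grep -Irl --exclude-dir=.git "${cr}\$" "$dir"
}

# Example: count the affected files in the current repository
#   list_crlf_files . | wc -l
```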
Updated on May 18, 2020
Comments
-
Muhammad Huzaifa almost 4 years
I've got a Git repository that has some files with DOS format (\r\n line endings). I would like to just run the files through dos2unix (which would change all files to UNIX format, with \n line endings), but how badly would this affect history, and is it recommended at all?
I assume that the standard is to always use UNIX line endings for source-controlled files, and optionally switch to OS-specific line endings locally?
-
Muhammad Huzaifa about 14 years
Related question for people interested in this: stackoverflow.com/questions/446244/…
-
Muhammad Huzaifa about 14 years
Thanks. Right now I'm the only person working on the repository, since it's pretty "young", so rewriting history shouldn't be a problem. But how well would git filter-branch play with github (I've put the repository on there)?
-
Debilski about 14 years
I think you’d have to delete all branches and tags on github to ensure that they can be created again. (It might work without that, but maybe it’s better to start anew.) Alternatively, you can delete the whole repo and then just push it again. This should be fine with github unless some people have cloned from it. In that case they will need to do the same, depending on how fluent they are with git.
-
Muhammad Huzaifa about 14 years
Alright. I just removed the repository and re-pushed it with the reworked history. I needed to fix some old multi-line commit messages anyway.
-
Muhammad Huzaifa about 14 years
The code you posted didn't work well for me, so I wrote the following:
git filter-branch --tree-filter 'grep -Irl --exclude-dir=.git "" . | xargs sudo dos2unix -p' HEAD
-
Senthil A Kumar over 13 years
Can git recognize whether a file is a text file or not? dos2unix doesn't work on binary files, so how does this work when run in a Git repo that contains both text and binary files?