How to use patch and diff to merge two files and automatically resolve conflicts


Solution 1

You don't need patch for this. diff extracts the changes between two files and patch applies them elsewhere, so that changes can be passed on without the unchanged parts of the file.

The tool for merging two versions of a file is merge, but as @vonbrand wrote, you need the "base" file from which your two versions diverged. To do a merge without it, use diff like this:

diff -DVERSION1 file1.xml file2.xml > merged.xml

It will enclose each set of changes in C-style #ifdef/#ifndef "preprocessor" commands, like this:

#ifndef VERSION1
<stuff only in file1.xml>
#endif
...
#ifdef VERSION1
<stuff only in file2.xml>
#endif

If a line or region differs between the two files, you'll get a "conflict", which looks like this:

#ifndef VERSION1
<version 1>
#else /* VERSION1 */
<version 2>
#endif /* VERSION1 */
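
To see all three kinds of region at once, here is a minimal sketch. The two sample files are made up for the demonstration, and it assumes a diff that supports -D (GNU and BSD diff do). Lines found only in the first file end up under #ifndef VERSION1, lines found only in the second file under #ifdef VERSION1, and a changed line produces an #ifndef/#else pair.

```shell
#!/bin/sh
# Hypothetical sample files, invented for this demonstration.
workdir=$(mktemp -d)
printf '%s\n' A B  C E F > "$workdir/file1.txt"
printf '%s\n' A B2 C F G > "$workdir/file2.txt"

# B vs B2 is a changed region, E exists only in file1, G only in file2.
# diff exits with status 1 when the files differ, so accept 0 or 1.
diff -DVERSION1 "$workdir/file1.txt" "$workdir/file2.txt" \
    > "$workdir/merged.txt" || [ $? -eq 1 ]

cat "$workdir/merged.txt"
```

The exact comment text after each #endif varies between diff versions, so don't rely on it when post-processing.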

So save the output in a file, and open it in an editor. Search for any places where #else comes up, and resolve them manually. Then save the file and run it through grep -v to get rid of the remaining #if(n)def and #endif lines:

grep -v '^#if' merged.xml | grep -v '^#endif' > clean.xml
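
If you would rather skip the manual step and always take the second file's side of a conflict, GNU diff's group-format options can probably do the whole thing in one pass. This is a sketch that assumes GNU diff (the --*-group-format options are not POSIX), and the sample files are made up.

```shell
#!/bin/sh
# Hypothetical sample files, invented for this demonstration.
workdir=$(mktemp -d)
printf '%s\n' A B  C E F > "$workdir/file1.txt"
printf '%s\n' A B2 C F G > "$workdir/file2.txt"

# %< prints the lines from the first file, %> the lines from the second,
# %= the lines common to both.
# old     = region only in file1 -> keep it
# new     = region only in file2 -> keep it
# changed = conflict             -> take file2's side
diff --unchanged-group-format='%=' \
     --old-group-format='%<' \
     --new-group-format='%>' \
     --changed-group-format='%>' \
     "$workdir/file1.txt" "$workdir/file2.txt" \
    > "$workdir/merged.txt" || [ $? -eq 1 ]   # diff exits 1 when files differ

cat "$workdir/merged.txt"
```

Be aware that diff decides what counts as a conflict by region, not by line. If a line that exists only in file1 sits directly next to a changed line, both land in the same changed region and the file1-only line is dropped.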

In the future, save the original version of the file. merge can give you much better results with the help of the extra information. (But be careful: merge edits one of the files in-place, unless you use -p. Read the manual).
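
For comparison, this is roughly what a three-way merge looks like once you do have the base file. merge ships with RCS and may not be installed, so this sketch swaps in diff3 -m from diffutils, which does the same kind of merge and writes the result to standard output. The three files are made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical files: a common ancestor and two versions derived from it.
workdir=$(mktemp -d)
printf '%s\n' one two three four five > "$workdir/base.txt"
printf '%s\n' ONE two three four five > "$workdir/mine.txt"    # changed line 1
printf '%s\n' one two three four FIVE > "$workdir/theirs.txt"  # changed line 5

# Argument order is: my version, common ancestor, their version.
# The two changes do not overlap, so both are applied without a conflict.
diff3 -m "$workdir/mine.txt" "$workdir/base.txt" "$workdir/theirs.txt" \
    > "$workdir/merged.txt"

cat "$workdir/merged.txt"
```

Because the base tells diff3 which side actually changed each line, it can pick the right version by itself, which a two-file diff can never do.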

Solution 2

sdiff (1) - side-by-side merge of file differences

Use the --output (-o) option; sdiff will then merge any two files interactively. At each difference you enter a simple command to select one side or to edit the change.

You should make sure that the EDITOR environment variable is set. The default editor for commands like "eb" is usually ed, a line editor.

EDITOR=nano sdiff -o merged.txt file1.txt file2.txt

Solution 3

merge(1) is probably nearer to what you want, but that requires a common ancestor to your two files.

A (dirty!) way of doing it is:

  1. Get rid of the first and last lines, use grep(1) to exclude them
  2. Smash the results together
  3. sort -u leaves a sorted list, eliminates duplicates
  4. Replace first/last line

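Hmm... something along the lines of the following (-h stops grep from prefixing each line with its file name when it is given more than one file):

echo '<resources>'; grep -hv resources file1 file2 | sort -u; echo '</resources>'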

might do.
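
Note that sort -u only removes lines that are identical as a whole, so a name with two different values stays in the output twice. One possible way around that is to key on the name attribute and let the later file win. The following is a sketch using awk, not part of the answer above, and it assumes one <color name="..."> element per line as in the question.

```shell
#!/bin/sh
# The sample files from the question.
workdir=$(mktemp -d)
cat > "$workdir/a.xml" <<'EOF'
<resources>
   <color name="same_in_b">#AAABBB</color>
   <color name="not_in_b">#AAAAAA</color>
   <color name="in_b_but_different_val">#AAAAAA</color>
   <color name="not_in_b_too">#AAAAAA</color>
</resources>
EOF
cat > "$workdir/b.xml" <<'EOF'
<resources>
   <color name="same_in_b">#AAABBB</color>
   <color name="in_b_but_different_val">#BBBBBB</color>
   <color name="not_in_a">#AAAAAA</color>
</resources>
EOF

# Split each line on double quotes, so $2 is the value of the name attribute.
awk -F'"' '
    /<color name="/ {
        if (!($2 in line)) order[++n] = $2   # remember first-seen order
        line[$2] = $0                        # later files overwrite earlier ones
    }
    END {
        print "<resources>"
        for (i = 1; i <= n; i++) print line[order[i]]
        print "</resources>"
    }
' "$workdir/a.xml" "$workdir/b.xml" > "$workdir/merged.xml"

cat "$workdir/merged.xml"
```

This is still text processing rather than XML parsing, so it breaks as soon as an element spans several lines or the attributes come in a different order.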

Solution 4

Here is a simple solution that works for merging up to 10 files:

#!/bin/bash

strip(){
    i=0
    for f; do
        sed -r '
            /<\/?resources>/ d
            s/>/>'$((i++))'/
        ' "$f"
    done
}

strip "$@" | sort -u -k1,1 -t'>' | sed '
    1 s|^|<resources>\n|
    s/>[0-9]/>/
    $ a </resources>
'

Please note that the argument that comes first takes precedence, so you have to call:

script b.xml a.xml

to get common values kept from b.xml rather than a.xml.
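
The precedence rule rests on sort -u keeping the first of several lines whose keys compare equal. GNU sort behaves that way, but POSIX does not say which line survives, so treat that as an assumption. A quick check, with two made-up lines that share the key before the first '>':

```shell
#!/bin/sh
# Two hypothetical lines with the same key (the text before the first ">")
# but different values. The digit mimics the file index the script inserts.
kept=$(printf '%s\n' 'name="x">0#BBBBBB' 'name="x">1#AAAAAA' |
       sort -u -t'>' -k1,1)

# With GNU sort, the line that came first in the input is the one kept.
echo "$kept"
```

If your sort keeps a different line, the argument order in the script will not give the precedence described here.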

script b.xml a.xml outputs:

<resources>
   <color name="in_b_but_different_val">#BBBBBB</color>
   <color name="not_in_a">#AAAAAA</color>
   <color name="not_in_b">#AAAAAA</color>
   <color name="not_in_b_too">#AAAAAA</color>
   <color name="same_in_b">#AAABBB</color>
</resources>

Solution 5

Another horrible hack - could be simplified, but :P

#!/bin/bash

i=0

while read -r line
do
    if [ "${line:0:13}" == '<color name="' ]
    then
        a_keys[$i]="${line:13}"
        a_keys[$i]="${a_keys[$i]%%\"*}"
        a_values[$i]="$line"
        i=$((i+1))
    fi
done < a.xml

i=0

while read -r line
do
    if [ "${line:0:13}" == '<color name="' ]
    then
        b_keys[$i]="${line:13}"
        b_keys[$i]="${b_keys[$i]%%\"*}"
        b_values[$i]="$line"
        i=$((i+1))
    fi
done < b.xml

echo "<resources>"

i=0

for akey in "${a_keys[@]}"
do
    print=1

    for bkey in "${b_keys[@]}"
    do
        if [ "$akey" == "$bkey" ]
        then
            print=0
            break
        fi
    done

    if [ $print == 1 ]
    then
        echo "  ${a_values[$i]}"
    fi

    i=$(($i+1))
done

for value in "${b_values[@]}"
do
    echo "  $value"
done

echo "</resources>"
Author by

CodeNoob

Updated on September 18, 2022

Comments

  • CodeNoob
    CodeNoob almost 2 years

    I have read about diff and patch, but I can't figure out how to apply them to what I need. I guess it's pretty simple, so to show my problem, take these two files:

    a.xml

    <resources>
       <color name="same_in_b">#AAABBB</color>
       <color name="not_in_b">#AAAAAA</color>
       <color name="in_b_but_different_val">#AAAAAA</color>
       <color name="not_in_b_too">#AAAAAA</color>
    </resources>
    

    b.xml

    <resources>
       <color name="same_in_b">#AAABBB</color>
       <color name="in_b_but_different_val">#BBBBBB</color>
       <color name="not_in_a">#AAAAAA</color>
    </resources>
    

    I want to have an output, which looks like this (order doesn't matter):

    <resources>
       <color name="same_in_b">#AAABBB</color>
       <color name="not_in_b">#AAAAAA</color>
       <color name="in_b_but_different_val">#BBBBBB</color>
       <color name="not_in_b_too">#AAAAAA</color>
       <color name="not_in_a">#AAAAAA</color>
    </resources>
    

    The merge should contain all lines according to these simple rules:

    1. any line which is only in one of the files
    2. if a line has the same name tag but a different value, take the value from the second

    I want to do this inside a bash script, so it does not necessarily have to be done with diff and patch if another program is a better fit.

    • tripleee
      tripleee over 11 years
      diff can tell you which lines are in one file but not the other, but only on the granularity of entire lines. patch is only suitable for making the same changes to a similar file (perhaps a different version of the same file, or an entirely different file where however the line numbers and surrounding lines for each change are identical to your original file). So no, they are not particularly suitable for this task. You might want to have a look at wdiff but the solution probably requires a custom script. Since your data looks like XML, you might want to look for some XSL tool.
  • CodeNoob
    CodeNoob over 11 years
    This works in this particular example, but NOT in general: if the name in_b_but_different_val has a value of #00AABB, sort will put that line on top and erase the second value instead of the first one.
  • frostschutz
    frostschutz over 11 years
    for the optimal solution in this case you'd have to parse the XML, with a real XML parser not the hacks above, and produce a new merged XML output from that. diff / patch / sort etc. are just all hacks tailored to "particular examples", for a general solution they're simply the wrong tools
  • vonbrand
    vonbrand over 11 years
    @alzheimer, whip up something simple to show us...
  • tripleee
    tripleee over 11 years
    echo is the default action, so xargs echo is superfluous. Why don't you simply tr '\n' '|' anyway?
  • frostschutz
    frostschutz over 11 years
    Good point - it's just a quick hack. I'll edit it.
  • lockwobr
    lockwobr almost 8 years
    I added something for if I had a conflict sed -e "s/^#else.*$/\/\/ conflict/g"
  • CMCDragonkai
    CMCDragonkai about 6 years
    I find vim works better as the EDITOR. But this is the best solution, and it comes with the diff command too!
  • Stephen Kitt
    Stephen Kitt about 4 years
    Could you explain how join would be used in this particular case?
  • Kusalananda
    Kusalananda about 4 years
    So, "using join" may be a correct answer, but it's useless unless one knew how to apply join to this particular issue. The join utility crucially does not read XML, for example.
  • G-Man Says 'Reinstate Monica'
    G-Man Says 'Reinstate Monica' about 4 years
    The problem / requirements in the question you linked to are significantly different from those in this question.