Cracking IT Interview



File Comparison - cmp, comm, diff


Sometimes it is unclear whether two files are identical or differ somewhere. Comparing two files by hand is slow and error-prone, especially when the files are large. Unix provides the cmp, comm and diff commands so that you can compare files without reading through their contents yourself. Each command works in its own way, so let's look at them one by one.






cmp 

"Stands for compare"

The two files are compared byte by byte. If they differ, cmp prints the byte number and the line number at which the first difference occurs, then stops.


If both files are identical, cmp doesn't display any message and simply returns to the prompt.


Syntax:

cmp file1 file2


example:

$cmp story.txt ShortStory.txt
story.txt ShortStory.txt differ: char 446, line 4
$
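The behaviour described above can be tried out with a small sketch like the one below. The file names a.txt and b.txt and their contents are made up for illustration, and the exact wording of the message ("char" or "byte") depends on the cmp implementation.

```shell
# Create two small sample files (hypothetical names) in a temporary directory.
tmpdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmpdir/a.txt"
printf 'one\ntwo\nthrEe\n' > "$tmpdir/b.txt"

# The files differ, so cmp reports the first differing byte and its line,
# and exits with a non-zero status.
cmp_status=0
cmp_output=$(cmp "$tmpdir/a.txt" "$tmpdir/b.txt") || cmp_status=$?
printf '%s\n' "$cmp_output"
echo "exit status: $cmp_status"

# Comparing a file with itself prints nothing and exits with status 0.
same_status=0
same_output=$(cmp "$tmpdir/a.txt" "$tmpdir/a.txt") || same_status=$?
echo "exit status: $same_status"

rm -rf "$tmpdir"
```

Here the first difference is the letter "e" versus "E" on the third line, so the message should point at line 3.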


Options to use with cmp:


  • cmp -l file1 file2  - prints the byte number (in decimal) and the two differing byte values (in octal) for every difference


$cmp -l story.txt ShortStory.txt
   446  12 162
   447 162 160
   448 160 151
   449 151 154
   451 154 141
   452 141 162

     ....

      .....


  • cmp -s file1 file2  - prints nothing and only sets the exit status: 0, 1 or 2

 

$cmp -s file1.txt file2.txt
$echo $?
1
$cat file3.txt
ls
$cat file4.txt
ls
$cmp -s file3.txt file4.txt
$echo $?
0
$


We check the exit status with "echo $?", where "$?" is the shell variable that holds the exit status of the last command.


Exit status in cmp command:

0, the two files are identical

1, the files differ

2, an error occurred, such as an inaccessible file or a missing argument
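These exit codes are what make "cmp -s" handy inside scripts. The following is only a sketch: files_match is a hypothetical helper name, and the sample files are created just for the demonstration.

```shell
# files_match is a hypothetical helper, not part of cmp itself.
# It maps the exit status of "cmp -s" to a short word.
files_match() {
    fm_status=0
    cmp -s "$1" "$2" || fm_status=$?
    case $fm_status in
        0) echo "identical" ;;
        1) echo "different" ;;
        *) echo "error" ;;
    esac
}

tmpdir=$(mktemp -d)
printf 'ls\n' > "$tmpdir/file3.txt"
printf 'ls\n' > "$tmpdir/file4.txt"
printf 'cd\n' > "$tmpdir/file5.txt"

result_same=$(files_match "$tmpdir/file3.txt" "$tmpdir/file4.txt")
result_diff=$(files_match "$tmpdir/file3.txt" "$tmpdir/file5.txt")
# The second file does not exist, so cmp should exit with a status above 1.
result_err=$(files_match "$tmpdir/file3.txt" "$tmpdir/no_such_file.txt" 2>/dev/null)

echo "$result_same"
echo "$result_diff"
echo "$result_err"

rm -rf "$tmpdir"
```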



  • cmp -z file1 file2  - compares file sizes first and fails the comparison if they are not equal (found in BSD versions of cmp; GNU cmp has no -z option)


$cmp -z story.txt ShortStory.txt
story.txt ShortStory.txt differ: size
$







comm

"Stands for common"


This command expects both files to be sorted, compares them line by line, and displays the output in 3 columns:


  • 1st column contains lines unique to the 1st file


  • 2nd column contains lines unique to the 2nd file, and


  • 3rd column contains lines common to both the files.


Syntax:

comm file1 file2


example:

$comm file3.txt file4.txt
                ls
cd
        rm
$cat file3.txt
ls
cd
$cat file4.txt
ls
rm
$
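Since comm expects sorted input, it is safer to sort the files first. A minimal sketch, reusing the contents of file3.txt and file4.txt from the example; the ".sorted" file names are just an assumption for illustration:

```shell
tmpdir=$(mktemp -d)
printf 'ls\ncd\n' > "$tmpdir/file3.txt"
printf 'ls\nrm\n' > "$tmpdir/file4.txt"

# Sort each input first so comm sees the lines in the order it expects.
sort "$tmpdir/file3.txt" > "$tmpdir/file3.sorted"
sort "$tmpdir/file4.txt" > "$tmpdir/file4.sorted"

# Column 1: only in the first file, column 2: only in the second,
# column 3: in both. Columns are separated by tab characters.
comm_output=$(comm "$tmpdir/file3.sorted" "$tmpdir/file4.sorted")
printf '%s\n' "$comm_output"

rm -rf "$tmpdir"
```

After sorting, "cd" comes before "ls", so the line unique to the first file is printed first.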


Options to use with comm:


  • comm -1 file1 file2  - suppresses column 1 (lines unique to file1)


  • comm -2 file1 file2  - suppresses column 2 (lines unique to file2)


  • comm -3 file1 file2  - suppresses column 3 (lines common to both files)


  • comm -i file1 file2  - case-insensitive comparison of lines (found in BSD versions of comm; GNU comm has no -i option)


Example:


$comm -1 file3.txt file4.txt
        ls
rm
$comm -2 file3.txt file4.txt
        ls
cd
$comm -3 file3.txt file4.txt
cd
        rm
$comm -i file3.txt file4.txt
                ls
cd
        rm
$
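The column options can also be combined, which is usually how comm is used in practice. The sketch below assumes two already sorted sample files with made-up names, first.txt and second.txt:

```shell
tmpdir=$(mktemp -d)
# Both sample files are already in sorted order.
printf 'cd\nls\n' > "$tmpdir/first.txt"
printf 'ls\nrm\n' > "$tmpdir/second.txt"

# Suppress columns 1 and 2: only lines common to both files remain.
common=$(comm -12 "$tmpdir/first.txt" "$tmpdir/second.txt")
# Suppress columns 2 and 3: only lines unique to the first file remain.
only_first=$(comm -23 "$tmpdir/first.txt" "$tmpdir/second.txt")
# Suppress columns 1 and 3: only lines unique to the second file remain.
only_second=$(comm -13 "$tmpdir/first.txt" "$tmpdir/second.txt")

echo "common: $common"
echo "only in first: $only_first"
echo "only in second: $only_second"

rm -rf "$tmpdir"
```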






diff

"Stands for Difference"


This command tells you which lines in one file have to be changed to make the two files identical.


Syntax:

diff file1 file2


example:

$diff file4.txt file3.txt
2c2
< rm
---
> cd
$diff file3.txt file4.txt
2c2
< cd
---
> rm
$cat file3.txt file4.txt
ls
cd
ls
rm
$
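To read this output: "2c2" means line 2 of the first file has to be changed to match line 2 of the second file, lines starting with "<" come from the first file and lines starting with ">" from the second. The sketch below recreates the two files from the example in a temporary directory and also captures the exit status of diff, which is 1 when differences are found.

```shell
tmpdir=$(mktemp -d)
printf 'ls\ncd\n' > "$tmpdir/file3.txt"
printf 'ls\nrm\n' > "$tmpdir/file4.txt"

# Work inside the temporary directory so the file names stay short.
cd "$tmpdir" || exit 1

# diff exits with a non-zero status when the files differ,
# so the status is captured without aborting the script.
diff_status=0
diff_output=$(diff file3.txt file4.txt) || diff_status=$?
printf '%s\n' "$diff_output"
echo "exit status: $diff_status"

cd / && rm -rf "$tmpdir"
```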


Options to use with diff:


  • diff -i file1 file2  - ignore-case: ignores differences in letter case


  • diff -E file1 file2  - ignore-tab-expansion: ignores changes due to tab expansion


  • diff -b file1 file2  - ignore-space-change: ignores changes in the amount of white space


  • diff -w file1 file2  - ignore-all-space: ignores all white space


  • diff -B file1 file2  - ignore-blank-lines: ignores changes whose lines are all blank



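Examples (here file3.txt and file4.txt have different contents from the earlier examples; they differ only in white space and blank lines):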


$diff -i file3.txt file4.txt
3,4c3,8
< cd
<
---
> cd
>
>
>
>
>
$diff -w file3.txt file4.txt
4a5,8
>
>
>
>
$diff -wB file3.txt file4.txt
$
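To see what each option ignores in isolation, here is a small sketch. The sample files (base.txt, case.txt, spaces.txt, blank.txt) and the check helper are assumptions made up for this illustration; each file differs from base.txt in exactly one way.

```shell
tmpdir=$(mktemp -d)
printf 'ls -l\ncd\n'   > "$tmpdir/base.txt"
printf 'LS -l\ncd\n'   > "$tmpdir/case.txt"    # differs only in letter case
printf 'ls   -l\ncd\n' > "$tmpdir/spaces.txt"  # differs only in the amount of white space
printf 'ls -l\n\ncd\n' > "$tmpdir/blank.txt"   # differs only by an extra blank line

# check is a hypothetical helper: it runs diff with one option against
# base.txt and reports whether diff printed any differences.
check() {
    check_out=$(diff "$1" "$tmpdir/base.txt" "$2") || true
    if [ -z "$check_out" ]; then
        printf '%s\n' "$1: no differences reported"
    else
        printf '%s\n' "$1: differences reported"
    fi
}

result_i=$(check -i "$tmpdir/case.txt")
result_b=$(check -b "$tmpdir/spaces.txt")
result_w=$(check -w "$tmpdir/spaces.txt")
result_B=$(check -B "$tmpdir/blank.txt")

printf '%s\n' "$result_i" "$result_b" "$result_w" "$result_B"

rm -rf "$tmpdir"
```

With each option matched to the kind of change it ignores, diff should print nothing for all four comparisons.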



Questions & Answers:


Qs: There are two files and I want to see only the lines common to both. How do I do that?


comm -1 -2 file1 file2


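Qs: What is the difference between the cmp, comm and diff commands?


cmp compares two files byte by byte and reports the first difference. comm compares two files line by line and shows the lines unique to each file and the lines common to both in three columns. diff shows which lines have to be changed to make the two files identical. Please refer to the sections above for details.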

