Cracking IT Interview



​Pattern Search:

grep (Globally search for a Regular Expression and Print)


The grep command searches a file for a pattern. A pattern is a string, keyword, or name, and it may or may not contain a regular expression. Every line of the file that matches the pattern is printed on the screen. grep can also filter file names when it is combined with "ls" or other commands through the pipe ("|") connector. In that case, the file name (or part of it) is given as the pattern.


So, whether we need to find a particular string in a file or find a file in a particular location, the grep filter plays an important role in both cases.







Example:


$cat player.txt
id    name    team
1     Amber   Blues
2     Boby    Reds
3     Cris    Blues
4     Rich    Blues
5     Mishel  Reds
6     Alfred  Blues
7     Gil     Blues
$


We take the file "player.txt" as an example and search for the pattern "Blues", because we want to know the names of the players who are on team "Blues".


$grep Blues player.txt
1     Amber   Blues
3     Cris    Blues
4     Rich    Blues
6     Alfred  Blues
7     Gil     Blues
$


Similarly, for "Reds" team player search:


$grep Reds player.txt
2     Boby    Reds
5     Mishel  Reds
$
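
The introduction says a pattern may contain a regular expression, but the searches so far use plain strings. The following is a minimal sketch of two simple regular expressions on the same data. It assumes a scratch directory created with "mktemp -d" (any empty directory would do) and recreates player.txt there; the variable names are only for illustration.

```shell
# Work in a scratch directory (mktemp -d is an assumption; any empty directory works).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample file used in the examples above.
cat > player.txt <<'EOF'
id    name    team
1     Amber   Blues
2     Boby    Reds
3     Cris    Blues
4     Rich    Blues
5     Mishel  Reds
6     Alfred  Blues
7     Gil     Blues
EOF

# "Reds$" matches only lines that END with "Reds".
reds=$(grep 'Reds$' player.txt)
echo "$reds"

# "^[1-3] " matches lines that START with 1, 2 or 3 followed by a space.
first_three=$(grep '^[1-3] ' player.txt)
echo "$first_three"
```

Quoting the pattern in single quotes keeps the shell from interpreting characters such as "$" and "[" before grep sees them.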


File search using grep command:


$ls -l
total 128
drwxr-xr-x  3 pinku  guest   512 Dec 24 08:10 A_dir
drwxr-xr-x  2 pinku  guest   512 Dec 24 08:12 B_dir
drwxr-xr-x  3 pinku  guest   512 Dec 25 00:14 C_dir
-rw-r-xr--  1 pinku  guest  6345 Dec 29 00:53 Cocoon
drwxrwxrwx  3 pinku  guest   512 Dec 24 08:53 G_dir
-rw-r--r--  1 pinku  guest  1017 Dec 29 01:04 ShortStory.txt.gz
-rw-r--r--  1 pinku  guest  5120 Jan  7 05:30 arch1.tar
-rw-r--r--  1 pinku  guest   187 Jan  7 05:24 arch2.tar
-rw-r--r--  1 pinku  guest   375 Jan  7 05:31 arch3.tar
drwxr--r--  2 pinku  guest   512 Jan 14 10:54 change_umask_dir
-rw-r--r--  1 pinku  guest     0 Jan 14 11:03 check_file
drwxr-xr-x  2 pinku  guest   512 Jan 14 10:50 check_perm_dir
-rwxr-xr-x  1 pinku  guest     0 Jan  6 03:40 file1.txt
-rwxr-xr-x  1 pinku  guest     0 Dec 24 03:06 file2.txt
-rw-r--r--  1 pinku  guest    42 Jan  1 01:18 file3.txt.gz
-rw-r--r--  2 pinku  guest    77 Jan  1 13:24 file4.txt
-rwxrwxrwx  3 pinku  guest    52 Dec 25 00:10 file5


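This is the full listing produced by "ls -l" alone. Now we filter it down to the files whose names contain the string "arch". Note that grep tests the whole "ls -l" line, so a match in any other column (owner, group, date) would be printed as well: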


$ls -l | grep arch
-rw-r--r--  1 pinku  guest  5120 Jan  7 05:30 arch1.tar
-rw-r--r--  1 pinku  guest   187 Jan  7 05:24 arch2.tar
-rw-r--r--  1 pinku  guest   375 Jan  7 05:31 arch3.tar
$


Similarly, for files whose names contain "file" as part of the file name:


$ls -l | grep file
-rw-r--r--  1 pinku  guest     0 Jan 14 11:03 check_file
-rwxr-xr-x  1 pinku  guest     0 Jan  6 03:40 file1.txt
-rwxr-xr-x  1 pinku  guest     0 Dec 24 03:06 file2.txt
-rw-r--r--  1 pinku  guest    42 Jan  1 01:18 file3.txt.gz
-rw-r--r--  2 pinku  guest    77 Jan  1 13:24 file4.txt
-rwxrwxrwx  3 pinku  guest    52 Dec 25 00:10 file5
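
Because grep looks at the whole "ls -l" line, a more targeted filter may be wanted when only the file name should be tested. One possible sketch, assuming a scratch directory with a few empty files named like the ones above: plain "ls" writes one name per line when its output goes to a pipe, so anchors such as "^" and "$" then apply to the name itself.

```shell
# Scratch directory with a few empty files (names borrowed from the listing above).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch arch1.tar arch2.tar check_file file1.txt file2.txt file5

# Names that START with "file": check_file is no longer matched.
names_starting_with_file=$(ls | grep '^file')
echo "$names_starting_with_file"

# Names that END with ".tar": the dot is escaped so it matches a literal dot.
tar_files=$(ls | grep '\.tar$')
echo "$tar_files"
```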








Options to use with grep :


  • grep -c "pattern" filename : counts the lines in the file that match the pattern and prints only that count.


$grep -c Blues player.txt
5


  • grep -n "pattern" filename : prints each matching line prefixed with its line number in the file.


$grep -n Blues player.txt
2:1     Amber   Blues
4:3     Cris    Blues
5:4     Rich    Blues
7:6     Alfred  Blues
8:7     Gil     Blues
$


  • grep -w "pattern" filename : matches the pattern only as a whole word, not as part of a longer word.


$grep Blu player.txt
1     Amber   Blues
3     Cris    Blues
4     Rich    Blues
6     Alfred  Blues
7     Gil     Blues


Here the team name is "Blues", yet the lines are displayed even though the pattern given is only "Blu". With the "-w" option the result is different: "Blu" alone matches nothing, while the whole word "Blues" matches.


$grep -w Blu player.txt
$grep -w Blues player.txt
1     Amber   Blues
3     Cris    Blues
4     Rich    Blues
6     Alfred  Blues
7     Gil     Blues
$


  • grep -i "pattern" filename : ignore case. With this option the pattern matches regardless of uppercase and lowercase letters.


$grep blues player.txt


The pattern above prints nothing because grep is case sensitive by default: "blues" is different from "Blues".


$grep -i bLuEs player.txt
1     Amber   Blues
3     Cris    Blues
4     Rich    Blues
6     Alfred  Blues
7     Gil     Blues
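
The options described so far can also be combined in one command. A small sketch, assuming the same player.txt recreated in a scratch directory (the variable names are only for illustration):

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > player.txt <<'EOF'
id    name    team
1     Amber   Blues
2     Boby    Reds
3     Cris    Blues
4     Rich    Blues
5     Mishel  Reds
6     Alfred  Blues
7     Gil     Blues
EOF

# -i ignores case, -w matches whole words only, -c prints just the count of matching lines.
count=$(grep -i -w -c blues player.txt)
echo "$count"

# -i and -n together: case-insensitive search with line numbers.
numbered=$(grep -in reds player.txt)
echo "$numbered"
```

Single-letter options can be written separately ("-i -n") or joined ("-in"); both forms behave the same.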


  • grep -e "pattern" : specifies a pattern explicitly. Repeating the option searches for several patterns with a single command and a single output.


Let's search for the players "Amber" and "Mishel" to see which teams they play in.


$grep -e Amber -e Mishel player.txt
1     Amber   Blues
5     Mishel  Reds
$
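
The same multiple-pattern search can probably also be written as one extended regular expression with the "-E" option, where "|" separates the alternatives. The sketch below, which again assumes player.txt recreated in a scratch directory, runs both forms so that their output can be compared.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > player.txt <<'EOF'
id    name    team
1     Amber   Blues
2     Boby    Reds
3     Cris    Blues
4     Rich    Blues
5     Mishel  Reds
6     Alfred  Blues
7     Gil     Blues
EOF

# Two patterns given with repeated -e options.
with_e=$(grep -e Amber -e Mishel player.txt)

# The same search as a single extended regular expression.
# The quotes stop the shell from treating "|" as a pipe.
with_E=$(grep -E 'Amber|Mishel' player.txt)

echo "$with_e"
echo "$with_E"
```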

  • grep -o "pattern" filename : prints only the matched text, one match per line, instead of the full lines. This is very helpful for counting every occurrence of the pattern in a file, in conjunction with the "wc -l" command and the "|" pipe operator. The "-c" option cannot do this because it counts matching lines, so a line that contains the pattern several times is counted only once.


$cat test.txt
Unix Linux Unix
Linux Unix

Linux Linux
Unix Unix
$


Let's take the file test.txt and compare the "-c" and "-o" options.


$grep -c Unix test.txt
3
$grep -o Unix test.txt
Unix
Unix
Unix
Unix
Unix
$
$grep -o Unix test.txt | wc -l
       5

$
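
Building on the "-o" idea, the occurrences of several words can be tallied in one pass. This is only a sketch: it assumes test.txt is recreated in a scratch directory, and it adds "sort", "uniq -c" and "awk", which the text above does not use.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Same contents as test.txt above (one blank line in the middle).
printf 'Unix Linux Unix\nLinux Unix\n\nLinux Linux\nUnix Unix\n' > test.txt

# -o prints every match on its own line; sort groups identical words;
# uniq -c counts each group; awk reshapes "count word" into "word=count".
tally=$(grep -o -E 'Unix|Linux' test.txt | sort | uniq -c | awk '{print $2 "=" $1}')
echo "$tally"
# prints:
# Linux=4
# Unix=5
```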








  • grep -v "pattern" filename : inverse selection, a kind of negation of the given pattern. It displays all the lines in the file that do not contain the pattern.


$grep -v Blues player.txt
id    name    team
2     Boby    Reds
5     Mishel  Reds
$
$grep -v Unix test.txt

Linux Linux
$
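
The "-v" option also works together with "-e" and "-c". A brief sketch, assuming player.txt recreated in a scratch directory: with several "-e" patterns, "-v" keeps only the lines that match none of them, which is one way to drop the header line as well.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > player.txt <<'EOF'
id    name    team
1     Amber   Blues
2     Boby    Reds
3     Cris    Blues
4     Rich    Blues
5     Mishel  Reds
6     Alfred  Blues
7     Gil     Blues
EOF

# Drop lines containing "Blues" AND the header line that starts with "id".
reds_only=$(grep -v -e Blues -e '^id' player.txt)
echo "$reds_only"

# -v with -c counts the lines that do NOT match (header plus two Reds players).
not_blues=$(grep -v -c Blues player.txt)
echo "$not_blues"
```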


This option is very useful for removing blank lines from the output. Let's take a "test.txt" file that contains two blank lines.


$ cat test.txt
Unix Linux Unix
Linux Unix

Linux Linux

Unix Unix
$
$ grep -v '^$' test.txt
Unix Linux Unix
Linux Unix
Linux Linux
Unix Unix
$

Here, "^$" is the blank-line pattern: the "^" symbol represents the beginning of the line and the "$" symbol represents the end of the line, with nothing in between. The above command only changes what is displayed on standard output. If you open the file, you will find that the blank lines are still present. To delete the blank lines permanently, write the filtered output to a temporary file and then move that file over the original:


$ cat test.txt
Unix Linux Unix
Linux Unix

Linux Linux

Unix Unix
$ grep -v '^$' test.txt > test.txt.$$ && mv test.txt.$$ test.txt
$
$ cat test.txt
Unix Linux Unix
Linux Unix
Linux Linux
Unix Unix
$
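
A slightly more defensive variant of the permanent clean-up is sketched below. It is not the command shown above: it assumes the "mktemp" utility is available to pick a unique temporary file name, and it uses the pattern "^[[:space:]]*$" so that lines holding only spaces or tabs are removed too.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# test.txt with one empty line and one line that holds only spaces.
printf 'Unix Linux Unix\nLinux Unix\n\nLinux Linux\n   \nUnix Unix\n' > test.txt

# Unique temporary file next to the original (assumes mktemp is installed).
tmpfile=$(mktemp test.txt.XXXXXX) || exit 1

# ">" starts from an empty file; "&&" replaces the original only if grep
# succeeded, so a failed grep never overwrites test.txt.
grep -v '^[[:space:]]*$' test.txt > "$tmpfile" && mv "$tmpfile" test.txt

cat test.txt
```

Note that grep returns a non-zero status when it selects no lines at all, so a file consisting only of blank lines would be left untouched by this sketch. The file created by mktemp may also have stricter permissions than the original.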


  • grep -l "pattern" * : lists the names of all files that contain the given pattern. This is very useful for finding a file when you have forgotten its name or are not sure of it.


$ grep -l Unix *
test.txt
grep: umaskfile_check: Permission denied
$
$ grep -l Linux *
test.txt
grep: umaskfile_check: Permission denied
$
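
Messages such as "Permission denied" are written to standard error, so they can be hidden by redirecting that stream when only the list of file names is wanted. A minimal sketch, assuming a scratch directory with two small files (notes.txt is a hypothetical file added for illustration):

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'Unix Linux Unix\nLinux Unix\n' > test.txt
printf 'Linux Linux\n' > notes.txt   # hypothetical second file

# 2>/dev/null discards error messages (for example for unreadable files),
# leaving only the names of the files that contain the pattern.
unix_files=$(grep -l Unix * 2>/dev/null)
linux_files=$(grep -l Linux * 2>/dev/null)

echo "$unix_files"
echo "$linux_files"
```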

 

