Thursday, September 24, 2009

Working with text files in Unix/Linux (part 1/3)

Removing repeated lines in a text: the strategy is not very intuitive at first, so the operation is divided into three parts:


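1) convert all the lines to lowercase where possible, so that lines differing only in case count as repeats.
2) sort the lines so that repeated lines end up next to each other (uniq only compares adjacent lines).
3) remove the repeated lines.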



$ cat file1.txt file2.txt file3.txt | tr 'A-Z' 'a-z' | sort | uniq
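As a rough illustration of the three steps, here is a minimal sketch that builds three small sample files and runs the same pipeline on them. The file contents and the temporary directory are made up for the example; only the file names come from the command above. The character ranges are quoted so the shell cannot expand them as filename patterns.

```shell
# Hypothetical sample data in a throwaway directory (an assumption for this sketch).
tmpdir=$(mktemp -d)
printf 'Apple\nbanana\n'  > "$tmpdir/file1.txt"
printf 'apple\nCherry\n'  > "$tmpdir/file2.txt"
printf 'BANANA\ncherry\n' > "$tmpdir/file3.txt"

# 1) lowercase, 2) sort so repeats are adjacent, 3) drop adjacent repeats.
deduped=$(cat "$tmpdir/file1.txt" "$tmpdir/file2.txt" "$tmpdir/file3.txt" \
  | tr 'A-Z' 'a-z' | sort | uniq)

printf '%s\n' "$deduped"

# Clean up the sample files.
rm -r "$tmpdir"
```

With this sample input the six lines collapse to three: apple, banana and cherry. If only the deduplicated list is needed, `sort -u` does the sorting and the removal in one step.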


View only the lines that are repeated (-D is a GNU option that prints every copy; use -d to print one copy of each):
$ cat file1.txt file2.txt file3.txt | tr 'A-Z' 'a-z' | sort | uniq -D
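To see what the repeated-lines view prints, the following sketch compares `uniq -D` with `uniq -d` on a small made-up input. It assumes GNU uniq, since `-D` is not part of POSIX; the variable names are only for illustration.

```shell
# Hypothetical input: "a" appears three times, "b" once, "c" twice.
sample=$(printf 'a\nb\na\nc\na\nc\n')

# -D (GNU only): print every copy of each repeated line.
all_copies=$(printf '%s\n' "$sample" | sort | uniq -D)

# -d: print a single copy of each repeated line.
one_each=$(printf '%s\n' "$sample" | sort | uniq -d)

printf 'uniq -D:\n%s\n' "$all_copies"
printf 'uniq -d:\n%s\n' "$one_each"
```

In both cases the line "b", which occurs only once, is left out.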


Count the number of unique lines:
$ cat file1.txt file2.txt file3.txt | tr 'A-Z' 'a-z' | sort | uniq | wc -l
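A small sketch of the counting step, using made-up input instead of the three files. Without options `wc` prints line, word and byte counts together, so `-l` is used here to get only the number of lines; the final `tr -d ' '` is an extra precaution for `wc` implementations that pad the number with spaces.

```shell
# Hypothetical input: four lines, three of them distinct once case is ignored.
unique_count=$(printf 'Apple\napple\nbanana\nCherry\n' \
  | tr 'A-Z' 'a-z' | sort | uniq | wc -l | tr -d ' ')

echo "unique lines: $unique_count"
```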


Print files from the last line to the first (each file is reversed in turn):
$ tac file1.txt file2.txt
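When several files are given, tac reverses each one separately and prints them in the order listed. The sketch below shows this with two made-up files; it assumes GNU coreutils, since tac is not available by default on every Unix.

```shell
# Hypothetical sample files in a throwaway directory (an assumption for this sketch).
tmpdir=$(mktemp -d)
printf '1\n2\n3\n' > "$tmpdir/file1.txt"
printf 'a\nb\n'    > "$tmpdir/file2.txt"

# Each file is printed last line first, file1.txt before file2.txt.
reversed=$(tac "$tmpdir/file1.txt" "$tmpdir/file2.txt")

printf '%s\n' "$reversed"

# Clean up the sample files.
rm -r "$tmpdir"
```

The result is 3, 2, 1 followed by b, a: the order of the files is kept and only the lines inside each file are reversed.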
