
Thursday 18 October 2012

How to remove duplicate lines inside a text file?


The Unix shell environment is designed for reading and manipulating text files (among other tasks). Configuration files, scripts and source code are plain text files that can be read in any text editor. For that reason, there are commands for tasks such as combining files, removing lines and columns, and searching for information. By combining shell commands with the scripting languages "awk" and "sed", you can perform high-level editing tasks, including removing duplicate lines from one or more text files, from the command line without ever opening a text editor.



An awk solution seen on #bash (Freenode):
awk '!seen[$0]++' filename
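
To see why the one-liner works, here is a small sketch that runs it on a throwaway file. The file names sample.txt and deduped.txt are made up for illustration and are not part of the original tip.

```shell
# Create a small sample file with repeated lines
# (sample.txt is a hypothetical name used only for this demo).
printf 'apple\nbanana\napple\ncherry\nbanana\n' > sample.txt

# seen is an awk associative array keyed by the whole line ($0).
# seen[$0]++ returns the count *before* incrementing, so it is 0 the
# first time a line appears. !0 is true, and awk's default action for
# a true pattern is to print the line. Later copies give a non-zero
# count, so the pattern is false and they are skipped.
awk '!seen[$0]++' sample.txt > deduped.txt

# Show the result: first occurrences only, in their original order.
cat deduped.txt
```

This keeps the first occurrence of each line and preserves the original order, which is the main difference from sort -u (that also removes duplicates but sorts the output). Note that awk holds every distinct line in memory, so very large files with many unique lines may need a different approach.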

 
