Extract new lines ONLY with diff command in linux

I needed a quick way to output only the new lines in a text file, without regard to line order. I wanted diff to leave out both the old lines and the unchanged lines. This is how I did it.

Start by sorting both your files. cYrus posted this solution on this website.

sort file-a > s-file-a
sort file-b > s-file-b
diff s-file-a s-file-b
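To see what the sorting step buys you, here is a minimal sketch using made-up sample data (the fruit names and the temporary directory are my own assumptions for illustration, not part of the original solution). The two files hold the same lines in a different order, and the second has one extra line.

```shell
# Hypothetical sample data in a scratch directory (assumption for illustration).
workdir=$(mktemp -d)
printf 'banana\napple\ncherry\n' > "$workdir/file-a"
printf 'cherry\nbanana\ndate\napple\n' > "$workdir/file-b"

# Sort both files so that line order no longer matters.
sort "$workdir/file-a" > "$workdir/s-file-a"
sort "$workdir/file-b" > "$workdir/s-file-b"

# diff exits with status 1 when the files differ, so "|| true"
# keeps the command from looking like a failure in scripts.
sorted_diff=$(diff "$workdir/s-file-a" "$workdir/s-file-b" || true)
printf '%s\n' "$sorted_diff"
```

Only the added line ("date") should show up in the output. Without sorting, diff would also report the reordered lines as changes.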

Sorting first makes sure that the diff command in Linux will not report lines as changed when they contain the same text but appear in a different order. (In other words, it lets diff ignore line order and compare only the text that has changed.)

Next we need to make sure that only new lines are output. According to this diff tutorial, you can change the line format of each line that diff outputs. I modified their Line Formats Example by taking out the %l after the old and unchanged lines.

diff --ignore-blank-lines -b --ignore-case \
--old-line-format='' \
--new-line-format='%l
' \
--unchanged-line-format='' \
old.csv new.csv > output.csv
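Putting both steps together, a rough end-to-end sketch might look like the following. The CSV rows and the temporary directory are invented for illustration. It also swaps in GNU diff's %L (the line including its trailing newline) for %l plus a literal newline, which keeps the option on one line. These line-format options are GNU diff features, so this assumes GNU diffutils is installed.

```shell
# Hypothetical sample data (assumption for illustration).
workdir=$(mktemp -d)
printf 'id,name\n2,bob\n1,alice\n' > "$workdir/old.csv"
printf 'id,name\n1,alice\n3,carol\n2,bob\n' > "$workdir/new.csv"

# Step 1: sort both files so line order is ignored.
sort "$workdir/old.csv" > "$workdir/s-old.csv"
sort "$workdir/new.csv" > "$workdir/s-new.csv"

# Step 2: print nothing for old and unchanged lines, and print new
# lines as-is. "|| true" absorbs diff's exit status 1 (files differ).
new_lines=$(diff \
  --old-line-format='' \
  --unchanged-line-format='' \
  --new-line-format='%L' \
  "$workdir/s-old.csv" "$workdir/s-new.csv" || true)

printf '%s\n' "$new_lines" > "$workdir/output.csv"
cat "$workdir/output.csv"
```

With this sample data, the only line that exists in new.csv but not in old.csv is the row for carol, so that is all that should land in output.csv.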
