Thursday, March 7, 2013

Count distinct values of a field in a text file on Unix

Here's a simple command to count the number of distinct values in a field in a delimited text file on *nix.
In this example, the text file is comma-separated and the field being counted is the second one.

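cut -d ',' -f 2 /files/data.csv | sort | uniq | wc -l

The sort is needed because uniq only collapses duplicate lines that are adjacent to each other; on unsorted input it would overcount.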
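To see the count in action, here is a minimal sketch on a throwaway sample file. The helper name count_distinct and the sample data are made up for illustration, and it assumes mktemp is available. It uses sort -u, which should be equivalent to sort | uniq here, because uniq on its own only collapses adjacent duplicates.

```shell
# Hypothetical helper: count distinct values in field $2 of the
# comma-separated file $1. tr strips the padding some wc versions add.
count_distinct() {
  cut -d ',' -f "$2" "$1" | sort -u | wc -l | tr -d ' '
}

# Build a small sample file (illustrative data only).
sample=$(mktemp)
printf '1,apple\n2,pear\n3,apple\n4,fig\n' > "$sample"

# The second field holds apple, pear, apple, fig: three distinct values.
count_distinct "$sample" 2   # prints 3
```

Without the sort, the same pipeline would report 4 for this file, since the two apple lines are not next to each other.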

On the other hand, if you also want a count of how many times each value occurs, use the following command:

cut -d ',' -f 2 /files/data.csv | sort | uniq -c
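If you'd like the most frequent values listed first, one possible variation is to sort the uniq -c output numerically in reverse. This is a sketch only: value_counts and the sample data are hypothetical names used for illustration, and mktemp is assumed to be available.

```shell
# Hypothetical helper: print "count value" pairs for field $2 of the
# comma-separated file $1, most frequent value first.
value_counts() {
  cut -d ',' -f "$2" "$1" | sort | uniq -c | sort -rn
}

# Build a small sample file (illustrative data only).
sample=$(mktemp)
printf '1,apple\n2,pear\n3,apple\n4,fig\n' > "$sample"

# apple occurs twice, so it is listed first with a count of 2.
value_counts "$sample" 2
```

The first sort groups identical values so that uniq -c can count them; the second sort -rn orders the resulting lines by that count.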

Simple does it - no need for awk or sed!
