To get the mean of column 1 (use `$2`, `$3`, and so on for another column), type:

`$ awk 'BEGIN{s=0;}{s=s+$1;}END{print s/NR;}' file`
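If you switch columns often, the column number can be passed in as a variable rather than edited into the program. A minimal sketch: the variable name `col` and the inline sample data are made up for illustration, and the `NR > 0` check only avoids a division by zero on empty input.

```shell
# Hypothetical two-column input; col selects which column to average.
mean=$(printf 'a 10\nb 20\nc 30\n' |
  awk -v col=2 '{ s += $col } END { if (NR > 0) print s / NR }')
echo "$mean"   # prints 20
```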

To get the standard deviation of column 1, type:

`$ awk '{sum+=$1; sumsq+=$1*$1} END {print sqrt(sumsq/NR - (sum/NR)^2)}' file`

or

`$ awk '{delta = $1 - avg; avg += delta / NR; mean2 += delta * ($1 - avg); } END { print sqrt(mean2 / NR); }' file`

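The second option updates the mean and the sum of squared deviations incrementally (Welford's method), so it works better with large datasets or large values: it never accumulates a huge sum of squares, which can overflow or lose precision when two nearly equal terms are subtracted.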
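To see the difference, you can feed both one-liners values with a small spread on top of a large offset. This is only a sketch with made-up numbers. The true population standard deviation of this data is sqrt(22.5), about 4.7434. What the first variant prints depends on floating-point rounding, so no expected value is given for it.

```shell
# Hypothetical data: small spread on top of a large offset.
data='1000000004\n1000000007\n1000000013\n1000000016\n'

# One-pass sum of squares: may print a wrong value or nan on this input.
naive=$(printf "$data" | awk '{sum+=$1; sumsq+=$1*$1} END {print sqrt(sumsq/NR - (sum/NR)^2)}')

# Incremental update: stays accurate on the same input.
stable=$(printf "$data" | awk '{delta = $1 - avg; avg += delta / NR; mean2 += delta * ($1 - avg)} END {printf "%.4f\n", sqrt(mean2 / NR)}')

echo "naive:  $naive"
echo "stable: $stable"   # 4.7434
```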

Sources: utah.edu/awk , commandlinefu.com/standard deviation with awk

I replaced NR with (NR-1) to get the right result

`$ awk '{delta = $1 - avg; avg += delta / NR; mean2 += delta * ($1 - avg); } END { print sqrt(mean2 / (NR-1)); }' file`

Indeed! Thanks for the input.

That’s only if it’s a “sample” standard deviation (i.e. the data does not represent the entire population).

Indeed! Dividing by (N-1) gives the sample standard deviation, used when the data is only a sample drawn from a larger population, while dividing by N gives the population standard deviation, used when the data covers the entire population.

Thanks for pointing that out; this was written a long time ago!
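Both versions can be printed in one pass, which makes the difference easy to check. A small sketch with made-up data (eight values with mean 5): the labels `pop_sd` and `sample_sd` and the `NR > 1` guard are my own additions, the latter to avoid dividing by zero when there is a single line.

```shell
# Hypothetical sample data, one value per line.
stats=$(printf '2\n4\n4\n4\n5\n5\n7\n9\n' |
  awk '{ delta = $1 - avg; avg += delta / NR; m2 += delta * ($1 - avg) }
       END { if (NR > 1) printf "mean=%.4f pop_sd=%.4f sample_sd=%.4f\n", avg, sqrt(m2 / NR), sqrt(m2 / (NR - 1)) }')
echo "$stats"
```

For this input the sum of squared deviations is 32, so the population value is sqrt(32/8) = 2.0000 and the sample value is sqrt(32/7), about 2.1381.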