Tag Archive

Merging catalogs and creating unique identifier in bash

For a certain project I had created a number of photometric catalogs, each corresponding to a specific observing field. I wanted to build the final (merged) catalog, but for this I needed to add a unique source identifier at the beginning of each row. I decided to create an F#-**** tag for each source, with “F#” corresponding to the field id and **** to a per-field source counter. The final command was:

for i in {1,2,4,5,6,7,8,9,10,11,12,13,16};do echo F$i.matches.all.cat;awk -v id="$i" 'FNR>1 {print "F"id"-"1+c++, $0}' F$i.matches.all.cat >> results.tmp; done

The loop runs over the field numbers for which a catalog named F*.matches.all.cat exists. The number of each field ($i) is passed to awk as an external variable (id), and awk uses it to build the unique identifier “Fid-counter”. The incremental counter (1+c++) is effectively the row number: adding 1 makes it start from 1 instead of 0, and the FNR>1 condition skips the first line of each catalog, which is a column description. All results are appended to the output file results.tmp (created automatically if it does not exist).
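
As a minimal sketch of what the loop does, the following builds two tiny made-up catalogs in a scratch directory and merges them the same way. The column names and values are purely illustrative (not from my data), and workdir and merged are names introduced only for this example. The counter is written here as (++c), which should give the same numbering as 1+c++.

```shell
# Scratch directory so that no real catalog is touched.
workdir=$(mktemp -d)

# Two toy catalogs (hypothetical columns); line 1 is the column description.
printf '#RA Dec Vmag\n10.1 -72.5 15.2\n10.3 -72.6 16.8\n' > "$workdir/F1.matches.all.cat"
printf '#RA Dec Vmag\n11.7 -73.0 14.9\n' > "$workdir/F2.matches.all.cat"

# Skip each header line (FNR>1) and prepend "F<field>-<counter>" to every row.
for i in 1 2; do
    awk -v id="$i" 'FNR>1 {print "F" id "-" (++c), $0}' \
        "$workdir/F$i.matches.all.cat" >> "$workdir/results.tmp"
done

merged=$(cat "$workdir/results.tmp")
echo "$merged"
```

Since awk is started once per catalog, the counter c restarts for every field, so the second catalog begins again at F2-1.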

Then, we can use sed to add the header:

sed -i '1i\#SourceID ...' results.tmp

Log and awk

The log function of awk returns the natural logarithm of its input, like:

...$ awk 'BEGIN{print log(100)}'
4.60517

So in order to obtain the logarithm of base 10 (or any other base), we just need to divide the result by the natural logarithm of the base, like:

...$ awk 'BEGIN{print log(100)/log(10)}'
2
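
The same change of base can be wrapped in a small user-defined function. This is only a sketch: logb is a name I made up for the example (it is not an awk built-in), and logs is just a shell variable holding the output.

```shell
# logb(x, b): logarithm of x in base b, via the change-of-base formula.
# (logb is a hypothetical helper name, not part of awk itself.)
logs=$(awk '
function logb(x, b) { return log(x) / log(b) }
BEGIN { print logb(8, 2), logb(1000, 10) }')
echo "$logs"
```

Both values should come out as 3, since print rounds numbers to six significant digits by default and so hides any tiny floating-point residue from the division.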

External variable inside awk

Sometimes a shell variable is needed inside awk (within a for or while loop, for example). In order to pass the variable to awk, use the -v option:
awk -v i="$i" '{ .. }' test
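
A short sketch of this inside a loop, using a throwaway two-line input file. The file contents and the names tmpfile and tagged are made up for the example.

```shell
# Toy input file (invented for this example).
tmpfile=$(mktemp)
printf 'alpha\nbeta\n' > "$tmpfile"

# On each pass, the current shell value of $i becomes the awk variable i.
tagged=$(for i in 1 2; do
    awk -v i="$i" '{ print "pass " i ": " $0 }' "$tmpfile"
done)
echo "$tagged"
```

Quoting "$i" on the shell side keeps a value that contains spaces in one piece; inside the awk program the variable is used without a dollar sign, because $i there would mean "field number i".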

Mean value and standard deviation of a column using awk

In order to get the mean value of column 1 (or any other) you type:
$ awk 'BEGIN{s=0;}{s=s+$1;}END{print s/NR;}' file

In order to get the standard deviation of column 1 you type:
$ awk '{sum+=$1; sumsq+=$1*$1} END {print sqrt(sumsq/NR - (sum/NR)^2)}' file

or, alternatively, with a running (incremental) update of the mean:
$ awk '{delta = $1 - avg; avg += delta / NR; mean2 += delta * ($1 - avg); } END { print sqrt(mean2 / NR); }' file

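The second option behaves better with large data sets or large values: it never accumulates a big sum of squares, so it is less prone to loss of precision (and, in extreme cases, overflow). Both commands divide by NR, so they give the population standard deviation.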
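
As a quick sanity check, here are the three commands run on a small made-up column of numbers. The values were chosen by me so that the mean and the (population) standard deviation come out as round numbers; datafile, mean, sd_naive and sd_running are names used only in this sketch.

```shell
# Toy data set: one value per line (invented for this example).
datafile=$(mktemp)
printf '2\n4\n4\n4\n5\n5\n7\n9\n' > "$datafile"

# Mean of column 1.
mean=$(awk '{ s += $1 } END { print s / NR }' "$datafile")

# Standard deviation from the sum and the sum of squares.
sd_naive=$(awk '{ sum += $1; sumsq += $1 * $1 }
    END { print sqrt(sumsq / NR - (sum / NR)^2) }' "$datafile")

# Standard deviation from a running update of the mean.
sd_running=$(awk '{ delta = $1 - avg; avg += delta / NR; mean2 += delta * ($1 - avg) }
    END { print sqrt(mean2 / NR) }' "$datafile")

echo "mean=$mean sd_naive=$sd_naive sd_running=$sd_running"
```

For these eight values the mean is 5 and both standard deviation commands give 2; the two approaches would only be expected to drift apart on much larger or less well-behaved data.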

Sources: utah.edu/awk, commandlinefu.com/standard deviation with awk