Tag Archive



Reading columns from a text file – fast and easy!

Although there is plenty of information on how to use genfromtxt (from numpy), I could not find a clear demonstration of how to easily read a text file with columns of data that include strings (reading columns of numbers alone is straightforward, with loadtxt for example).

Suppose that we have the following text file (named ‘input.txt’):

#object	q1	q2
id-001	120.	2212.
id-002	145.	1222.
id-222	123.	1142.

Then we can just use:

from numpy import genfromtxt
g = genfromtxt('input.txt', dtype=None, names=True)

This reads the whole file. dtype=None means that genfromtxt decides the data type of each column on its own, and names=True means that the names in the first row (the header) can be used to refer to each column, i.e.:

print(g['object'])
print(g['q2'])

will print:

['id-001' 'id-002' 'id-222']
[ 2212. 1222. 1142.]

That easy, that fast!
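For anyone who wants to try this without creating the file by hand, here is a minimal self-contained sketch. It writes the same three rows to a temporary file and reads them back. The encoding="utf-8" argument is my addition (it assumes NumPy 1.14 or later) so that the string column comes back as ordinary text on Python 3; the temporary directory is just for illustration.

```python
import os
import tempfile

import numpy as np

# Recreate the example file from the post (tab-separated, '#' header line).
content = (
    "#object\tq1\tq2\n"
    "id-001\t120.\t2212.\n"
    "id-002\t145.\t1222.\n"
    "id-222\t123.\t1142.\n"
)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "input.txt")
    with open(path, "w") as f:
        f.write(content)
    # dtype=None: guess the type of each column; names=True: use the header row.
    # encoding="utf-8" is an assumption (NumPy >= 1.14) to get str, not bytes.
    g = np.genfromtxt(path, dtype=None, names=True, encoding="utf-8")

print(g.dtype.names)  # column names taken from the header
print(g["object"])    # the string column
print(g["q2"])        # a numeric column
```

The result is a structured array, so each column is reached by its header name and keeps its own data type.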

Note on how to make multiple plots with pylab

IPython is a great tool for working with Python interactively. It imitates a MATLAB environment, especially when running with matplotlib and numpy [the pylab module; everything is imported automatically when IPython is started as: ipython --pylab]. So you can use it to load data and start playing around with it. Although I prefer to script these actions, it can be really fast and easy to use when you just want to look at some data and do some minor tasks. But since I don’t write these actions down, I keep forgetting them (well… after a serious number of repetitions there would be no need … but until then let me keep a note!).

So, one of these is how to make multiple plots on the same axes. After creating (or, most probably, loading) the data and doing all the necessary preparation, we type the following:
In [23]: fig = figure()
In [24]: ax = fig.add_subplot(111)
In [25]: p1 = ax.plot(x,y,'r-')
In [26]: p2 = ax.plot(x,z,'g-')
In [27]: draw()

which draws both curves in the same plot: y against x as a red line and z against x as a green line.
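The same steps can also be sketched as a standalone script, outside the pylab session. The data here (a sine and a cosine), the Agg backend and the output file name are all assumptions made for the sake of a runnable example; in practice x, y and z would be whatever you have loaded.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data standing in for the loaded columns.
x = np.linspace(0.0, 2.0 * np.pi, 100)
y = np.sin(x)
z = np.cos(x)

fig = plt.figure()
ax = fig.add_subplot(111)     # a single set of axes
p1 = ax.plot(x, y, 'r-')      # first curve, red line
p2 = ax.plot(x, z, 'g-')      # second curve on the same axes, green line

# Save to a file instead of calling draw() in an interactive session.
out_path = os.path.join(tempfile.gettempdir(), "two_curves.png")
fig.savefig(out_path)
```

Every further call to ax.plot adds another line to the same axes, so more curves can be added the same way.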

Buggy behaviour of unpack within numpy.loadtxt

The loadtxt routine from numpy (numpy.loadtxt) can be used to load columns of data from text files. It works great as long as all the columns are numbers, but becomes buggy when some of them contain strings.

Suppose we have a file (named ‘test.test’) with these values:

3103725.  1616.93596535  13.656  0  2011-05-23T23:49:35  
3139474.  1405.95436047  13.643  0  2011-05-23T23:51:16  
3026925.  1370.07921223  13.683  0  2011-05-23T23:54:40 
...

First, if we want to assign each column to its own variable, we have to use the unpack parameter, like:
>>> x, y, z = loadtxt('test.test', unpack=True, usecols=(0,1,2))

and if one of these columns contains strings, then we should declare this with dtype:
>>> x, y, z = loadtxt('test.test', unpack=True, dtype='f8,f8,S19', usecols=(0,1,4))
or
>>> x, y, z = loadtxt('test.test', unpack=True, dtype={'names':['n1','n2','s1'],'formats':['f','f','S19']}, usecols=(0,1,4))

and of course we should get 3 columns, with the third one being a column of strings.

But … this does not happen! The unpack parameter is buggy up to version 1.3.0 (at least) and was corrected in a later version (it certainly works after version 1.6). Since I didn’t have that version (and didn’t want to spend time upgrading), I used a solution given by Derek H. on the AstroPy list: the unpack parameter is not used at all, and the assignment is done with this workaround:
>>> a = loadtxt('test.test', dtype=[('x','f'),('y','f'),('z','S19')], usecols=[0,1,4])
>>> x, y, z = a['x'], a['y'], a['z']
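As a self-contained sketch of the workaround, the snippet below writes the three sample rows to a temporary file and loads columns 0, 1 and 4 through a structured dtype. Two things here are my assumptions rather than part of the original recipe: I use 'f8' and 'U19' (instead of 'f' and 'S19') to get double precision and ordinary text strings on Python 3, and the last call relies on a recent NumPy, where unpack=True with a structured dtype returns one array per field.

```python
import os
import tempfile

import numpy as np

# The sample rows from the post.
rows = (
    "3103725.  1616.93596535  13.656  0  2011-05-23T23:49:35\n"
    "3139474.  1405.95436047  13.643  0  2011-05-23T23:51:16\n"
    "3026925.  1370.07921223  13.683  0  2011-05-23T23:54:40\n"
)

# Assumption: 'f8' and 'U19' instead of the post's 'f' and 'S19'.
dt = [("x", "f8"), ("y", "f8"), ("z", "U19")]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "test.test")
    with open(path, "w") as f:
        f.write(rows)

    # Workaround: no unpack, pick the fields out of the structured array.
    a = np.loadtxt(path, dtype=dt, usecols=(0, 1, 4))
    x, y, z = a["x"], a["y"], a["z"]

    # On a recent NumPy, unpack=True also works with a structured dtype.
    x2, y2, z2 = np.loadtxt(path, dtype=dt, usecols=(0, 1, 4), unpack=True)

print(x)
print(z)
```

The field-by-field assignment does not depend on how unpack behaves, which is why it also works on the old versions mentioned above.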