Batch-processing Snapshots with Pynbody

Using pynbody as a batch system

One typical use for scripting software like python is to automate common tasks. You can use pynbody to run a standard series of analysis routines – for instance to produce small png files on the cluster where you’re running simulations. That way, you can just download 1 MB of images rather than the ~1 GB snapshot files to find out how a simulation is progressing.

Going a step further, you might also create a .data file using the python pickle module, giving fundamental properties of the snapshot like the mass of stars, hot gas, cold gas, radial profiles, a rotation curve. That way, when you want to make future plots comparing snapshots from different times or different parameter choices, you only have to use pickle to quickly open the 50 kB data file instead of opening the 1 GB snapshot file.

doall.py

There is an example script in pynbody/examples called doall.py that does a lot of generic analysis of a cosmological galaxy simulation. To go through it piece by piece, let’s look at the four main sections: the Preamble the part that pickle data, the part that make standard plots and the part that make standard images:

Preamble

import numpy as np
import pynbody
import pynbody.plot as pp
import pynbody.plot.sph
import pynbody.filt as filt
import pynbody.units as units
import pynbody.analysis.profile as profile
import sys, os, glob, pickle

simname = sys.argv[1]   # use first argument as simulation name to analyze
pp.plt.ion()            # plot in interactive mode to show all plots as 
                        #     they are made

s = pynbody.load(simname) # load file into pynbody
h = s.halos()             # find and load halos (using AHF if in path)
diskf = filt.Disc('40 kpc','2 kpc')  # filter for particles in disk
notdiskf = filt.Not(filt.Disc('40 kpc','3 kpc')) # particles outside of disk
i=1
if (len(sys.argv) > 2):
    # if there's a second argument, it can be a photogenic file
    # and analysis will follow the halo that contains those particles
    photiords = np.genfromtxt(sys.argv[2],dtype='i8')
    frac = np.float(len(np.where(np.in1d(photiords,h[i]['iord']))[0]))/len(photiords)
    print('i: %d frac: %.2f'%(i,frac))
    while(((frac) < 0.5) & (i<100)): 
        i=i+1
        frac = np.float(len(np.where(np.in1d(photiords,h[i]['iord']))[0]))/len(photiords)
        print('i: %d frac: %.2f'%(i,frac))
else:
    # otherwise follow largest halo with at least 2 stars
    while len(h[i].star) <2: i=i+1

if (i==100): sys.exit()
pynbody.analysis.angmom.faceon(h[i])  # align halo faceon
s.physical_units()   # change to physical units

Most of the lines in the preamble are commented as to their purpose. Overall,

  1. The appropriate modules are loaded with the import statements
  2. The simulation is loaded based on a command line argument
  3. If a photogenic file (gasoline thing) is given, the halo that includes those particle ids is loaded.
  4. The halo is rotated so the disk is face-on

pickle data

Jtot = np.sqrt(((np.multiply(h[i]['j'].transpose(),h[i]['mass']).sum(axis=1))**2).sum())            # calculate angular momentum
W = np.sum(h[i]['phi']*h[i]['mass']) # halo potential energy
K = np.sum(h[i]['ke']*h[i]['mass'])  # halo kinetic energy
absE = np.fabs(W+K)    # total halo energy
mvir=np.sum(h[i]['mass'].in_units('Msol'))  # virial mass
rvir=pynbody.array.SimArray(np.max(h[i]['r'].in_units('kpc')),'kpc')
# 3D density profile
rhoprof = profile.Profile(h[i],dim=3,type='log')
# Rotation curve
rcpro = profile.Profile(h[i], type='equaln', nbins = 50, max = '40 kpc')
# surface brightness profile
diskstars = h[i].star[filt.Disc('20 kpc','3 kpc')]
sbprof = profile.Profile(diskstars,type='equaln')
# Kinematic decomposition
#decompprof = pynbody.analysis.decomp(h[i])
#dec = h[i].star['decomp']

### Save important numbers using pickle.  
pickle.dump({'z':s.properties['z'],
             'time':s.properties['time'].in_units('Gyr'),
             'rvir':rvir,
             'mvir':mvir,
             'mgas': np.sum(h[i].gas['mass'].in_units('Msol')),
             'mstar': np.sum(h[i].star['mass'].in_units('Msol')),
#             'mdisk': np.sum(h[i].star[np.where(dec == 1)]['mass'].in_units('Msol')),
#             'msphere': np.sum(h[i].star[np.where(dec == 2)]['mass'].in_units('Msol')),
#             'mbulge': np.sum(h[i].star[np.where(dec == 3)]['mass'].in_units('Msol')),
#             'mthick': np.sum(h[i].star[np.where(dec == 4)]['mass'].in_units('Msol')),
#             'mpseudob': np.sum(h[i].star[np.where(dec == 5)]['mass'].in_units('Msol')),
             'mgashot': np.sum(h[i].gas[filt.HighPass('temp',1e5)]['mass'].in_units('Msol')),
             'mgascool': np.sum(h[i].gas[filt.LowPass('temp',1e5)]['mass'].in_units('Msol')),
             'Jtot':Jtot,'lambda':(Jtot / np.sqrt(2*np.power(mvir,3)*rvir*units.G)).in_units('1'),
             'denprof':{'r':rhoprof['rbins'].in_units('kpc'), 
                        'den':rhoprof['density']},
             'rotcur':{'r':rcpro['rbins'].in_units('kpc'), 
                       'vc':rcpro['rotation_curve_spherical'].in_units('km s^-1'),
                       'fourier':rcpro['fourier']},
             'sb':{'r':sbprof['rbins'].in_units('kpc'), 
                   'sb':sbprof['sb,i']}
             },
            open(simname+'.data','w'), pickle.HIGHEST_PROTOCOL)

  1. Some special pieces of data about the halo are calculated for storage.
  2. The data is saved using the python pickle module (pickle.dump)

Note

The kinematic decomposition section is commented out because it is a little slow.

  • The pickled data can be used to make comparisons as in the following code
import pynbody, sys
import matplotlib.pyplot as plt
import pickle, glob, numpy as np

runs = ['c.1td.05rp.1','mugs','esn1.2','td.01c.1rp.1','rp.175','noradpres','2crasy']
labels = ['Fiducial','MUGS','120% SN energy','Low Diffusion','High Diffusion','No ESF',
          'High ESF']
cs = ['r','k','c','g','y','m','b']
for j,r in enumerate(runs):
    simname=r+'/01024/g1536.01024'

    d = pickle.load(open(simname+'.data'))

    if j>0:
        plt.plot(d['rotcur']['r'],d['rotcur']['vc'],color=cs[j], 
                 lw=1, label=labels[j],alpha=0.5)
        if j==6:
            rad = d['rotcur']['r']
            plt.plot(rad,(2.0/np.pi)*np.max(d['rotcur']['vc'])*np.arctan(rad),
                     '--b',label='arctangent fit')
    else:
        plt.plot(d['rotcur']['r'],d['rotcur']['vc'],color=cs[j], 
                 lw=2, label=labels[j])
        rad = d['rotcur']['r']
        plt.plot(rad,(2.0/np.pi)*np.max(d['rotcur']['vc'])*np.arctan(4*rad),
                 '--r',label='arctangent fit')

plt.savefig('rc.eps')


make standard plots


### Make plots
try:
	pp.sbprofile(h[i],filename=simname+'.sbprof.png',center=False)
	pp.sfh(h[i],filename=simname+'.sfh.png',nbins=500)
	pp.rotation_curve(h[i],filename=simname+'.rc.png',quick=True,
					  max='40 kpc',center=False)
	pp.rotation_curve(h[i],filename=simname+'.rcparts.png',quick=True,
                          parts=True, legend=True, max='40 kpc',center=False)
	pp.rho_T(h[i],filename=simname+'.phase.png')
	pp.ofefeh(h[i].stars, filename=simname+'.ofefeh.png',
                  weights=h[i].stars['mass'].in_units('Msol'), scalemin=1e3,
                  scalemax=1e6, x_range=[-3,0.3],y_range=[-0.5,1.0])
	pp.mdf(h[i].stars,filename=simname+'.mdf.png', range=[-4,0.5])
	pp.density_profile(h[i].dark,filename=simname+'.dmprof.png',center=False)
	pp.guo(h,baryfrac=True,filename=simname+'.guo.png')
	pp.schmidtlaw(h[i],filename=simname+'.schmidt.png',center=False)
	pp.satlf(h[i],filename=simname+'.satlf.png')
except:
	pass

  • Inside of a try: except: statement, a series of standard plots are created. Because of the try: except: pass, if any of them fails, the script will continue.
  • For each of these procedures, the filename argument lets you save the plot as the given filename.
  • In some of the plots, fixed axis ranges are given so that quick movies can be made by putting all the image frames together.

make standard images

diskgas=s.gas[diskf]
### Make pictures: 
try:
    pp.sph.image(h[i].gas,filename=simname+'.facegas.png',width=30)
    pp.sph.image(h[i].star,filename=simname+'.facestar.png',width=30)
    pp.sph.image(diskgas,qty='temp',filename=simname+'.tempgasdiskface.png',
                 width=30,vmin=3,vmax=7)
    s.gas['hiden'] = s.gas['rho']*s.gas['HI']
    s.gas['hiden'].units = s.gas['rho'].units
    pynbody.plot.image(s.gas,qty='hiden',units='m_p cm^-2',width=1000,
                       center=False,filename=simname+'.hi500kpc.png',
                       vmin=14,vmax=22)
    pynbody.plot.image(s.gas,qty='hiden',units='m_p cm^-2',width=500,
                       center=False,filename=simname+'.hi250kpc.png',
                       vmin=14,vmax=22)
    oviif = pynbody.analysis.ionfrac.calculate(s.gas)
    s.gas['oviden'] = s.gas['rho']*s.gas['OxMassFrac']*oviif
    s.gas['oviden'].units = s.gas['rho'].units
    soviim = pynbody.plot.image(s.gas[notdiskf],qty='oviden',
                                units='16 m_p cm^-2', width=1000,
                                filename=simname+'.ovi500kpc.png',
                                vmin=12,vmax=17)
    s.gas['oxden'] = s.gas['rho']*s.gas['OxMassFrac']
    s.gas['oxden'].units = s.gas['rho'].units
    pynbody.plot.image(s.gas,qty='oxden',units='16 m_p cm^-2',
                       width=500,center=False,
                       filename=simname+'.ox500kpc.png',vmin=12,vmax=18)
    pynbody.analysis.angmom.sideon(h[i])
    pp.sph.image(h[i].gas,filename=simname+'.sidegas.png',width=30)
    pp.sph.image(h[i].star,filename=simname+'.sidestar.png',width=30)
    pp.sph.image(s.gas,qty='temp',filename=simname+'.tempgasside.png',
                 width=320,vmin=3,vmax=7)
    pp.sph.image(s.gas,qty='temp',filename=simname+'.tempgasdiskside.png',
                 width=30,vmin=3,vmax=7)
    pynbody.plot.image(s.gas,qty='temp',width=500,center=False,
                       filename=simname+'.temp500kpc.png',vmin=3,vmax=7)
except:
    pass
  • The images are similar to the plots in that they get saved to the filename argument.
  • One difference is that the images require a width to say how much of the simulation they should include.
  • vmin and vmax represent the dynamic range of the image.
  • A couple of auxillary quantities are calculated like the enrichment of a certain ion species.