Tohban OVRO-LWA Imaging Tutorial

From EOVSA Wiki
Jump to navigation Jump to search

Initial Setup

The OVRO-LWA has three solar modes that can operate concurrently. These are (1) the beamformer, which creates a high-resolution spectrogram of the solar activity each day, (2) a slow visibility mode that records data in CASA ms format for all 352 antennas and all 3072 frequencies at 10-s cadence, and (3) a fast visibility mode that records data for a 48-antenna subset (generally the outer antennas) and 768 frequencies at 1-s cadence. The recorders that record the data are all activated separately, so it is not guaranteed that data from all three modes are available at any one time. Also, because of the vast data volume most of the recorded data are not saved, but rather are overwritten after a day or so, hence any data that are wanted must be explicitly saved by copying it to another location. Again because of the large volume of data, such copying is too slow to save much data (at least at present), so we can generally save only about an hour of data per day.

Note: This tutorial only describes how to work with the slow visibility data at the moment.

Python Environment

The imaging pipeline is written in Python 3, so in order to use it one must set up a Python 3 environment. These instructions assume you are working in your own home directory on the Pipeline machine at OVRO. First enter the bash shell if you are not already in it. Type echo $0 to see what shell you are in. If that returns something other than -bash, type bash to enter the shell. Next check if you have the line alias loadpyenv3.8='source /home/user/.setenv_pyenv38' in your ~/.bash_aliases file. If not, add it using your favorite editor, then activate it with source ~/.bash_aliases. From there, you can type loadpyenv3.8 to enter the Python 3.8 environment. Finally, from your home folder, type git clone https://github.com/binchensun/ovro-lwa-solar to install the OVRO-LWA code. To test your Python environment, log out and log in again fresh, then type

$> loadpyenv3.8
$> ipython --pylab
import sys
sys.path.append('/home/dgary/ovro-lwa-solar')  # Replace with your own home directory
import solar_pipeline

If that succeeds, you should be ready to proceed.

Where to Find Data

The next step is to find the data you want to work with. You will need some calibration data as well as the solar data for your target date. As of this writing, the existing solar data on Pipeline, is in two separate places: /nas5/ovro-lwa-data (data up to 2023-09-03) and /nas6/ovro-lwa-data (data from 2023-09-18 and later). All of the existing beamformed data (spectrograms) are in /nas5/ovro-lwa-data/beam/beam-data.

This tutorial uses the example of the type II burst on 2023-07-28.

Examining the Spectrogram for Your Date

It is good practice to examine the spectrogram for your date/time, to guide your selection of frequencies and times to use for imaging. You can check the folders and subfolders in /nas5/ovro-lwa-data/beam/beam-data to see what files exist. Note that the filenames have the Modified Julian Data (mjd) followed by hours, minutes, seconds in the format <mjdday>.<hh><mm><ss>?????????? where the ? indicate more digits of the fraction of a second. The type II burst we are interested in started around 15:43 UT on 2023 July 28, which is MJD 060154, so the file we want is /nas5/ovro-lwa-data/beam/beam-data/202307/beam20230728/060153_152717110834334d2be, which starts at 15:27:17 UT. Generally these files contain 30 min of data. The type II continues into the next file, which is 060153_1558172229518804396.

To read and display this file, in iPython type

2023 July 28 Type II event spectrogram
import sys     # If not already loaded
sys.path.append('/nas5/ovro-lwa-data/beam/software/')
from lwa import lwa_read, lwa_plot
datadir = '/nas5/ovro-lwa-data/beam/beam-data/202307/beam20230728/'
data = lwa_read(datadir+'060153_152717110834334d2be', stokes='IV', timebin=1, freqbin=4)
lwa_plot(data, vmax=15000,vmin=10)

which defaults to log-scaled amplitudes and viridis color table for stokes I and linear-scaled amplitudes and grayscale for stokes V, as shown at left. You can examine lwa_plot? for more options.


Calibration and Imaging Script

The script below assumes some previous setup. First, a "home" directory needs to be created and the script must be run from that directory. Because of the large amount of disk space required, create your "home" directory on /data1. Mine is /data1/dgary/OVRO-LWA/20230728_workdir. Before running the script, you'll need to change the 7 lines indicated with the ***Change comments.

  1. The first such line is the list of frequency bands you want to image. In this case I have all 13 useful bands. Frequencies below 27 MHz rarely image well and in many cases we did not save the data for those frequencies anyway.
  2. The second is a string representing the date of the event, including an underscore (this is part of a filename).
  3. The third line is a list of solar times. These times have to exactly match existing filenames, so you'll have to do a listing of the data directory to check them. Warning: Doing a listing of the entire data directory is time consuming and not useful, since there are many thousands of files there. Instead, use something like: ls /nas5/ovro-lwa-data/20230728/slow/20230728_1553* to limit the number of files returned.
  4. The fourth line is the date string of the calibration data. This will almost always be the same as the date string of the data, but it is possible to use a calibration from a different date if not too far apart.
  5. The fifth line is the time of the calibration data. Again, this must exist. Usually the calibration is done at night so the time will be quite different, e.g. 0500 UT, and a command like ls /nas5/ovro-lwa-data/20230728/slow | head -20 will list the first 20 files in the folder, which are likely the calibration files. Unfortunately, no nighttime calibration exists for this date, so I had to use a daytime time, 15:40 UT.
  6. The sixth line is the path to the data.
  7. The seventh line is the path to the calibration data, again usually the same as that for the data.
import os, glob
import utils
from time import time
import solar_pipeline

freqs=[27,32,36,41,46,50,55,59,64,69,73,78,82]    # ***Change to the bands you want to image
datstr = '20230728_'                              # ***Change to the date of your event
solar_times = ['155306','155316','155326']        # ***Change to the times to use for solar imaging -- these times must exist!
caldatstr = '20230728_'                           # ***Change to the date of your cal data
cal_time = '154003'                               # ***Change to the time for your calibration
datapath = '/nas5/ovro-lwa-data/20230728/slow/'   # ***Change to path to your data
calpath = '/nas5/ovro-lwa-data/20230728/slow/'    # ***Change to path to your calibration data

home=os.getcwd()
for solar_time in solar_times:
    for freq in freqs:
        calib_ms=caldatstr+cal_time+'_'+str(freq)+"MHz.ms"     # Will be copied from calpath
        solar_ms=datstr+solar_time+'_'+str(freq)+"MHz.ms"      # Will be copied from datapath
        bcal='caltables/'+calib_ms.replace('ms','bcal')        # Will be created if it doesn't already exist
        imagename=datstr+solar_time+'_'+str(freq)+"MHz"
        image_fold = 'images/'

        # Create frequency folder, if it doesn't exist
        freq_fold=str(freq)+"MHz"
        if not os.path.isdir(freq_fold):
            os.mkdir(freq_fold)

        # Copy the solar data for this time (will be deleted later)
        print('Copying solar data to frequency folder')
        os.system("cp -r "+os.path.join(datapath,solar_ms)+"* "+freq_fold+"/")
        # Copy the calibration data (will be deleted later)
        print('Copying calibration data to frequency folder')
        os.system("cp -r "+os.path.join(datapath,calib_ms)+"* "+freq_fold+"/")

        os.chdir(freq_fold)
        if not os.path.isdir(image_fold):
            os.mkdir(image_fold)
        if not os.path.isfile(bcal):
            bcal = None
        if not os.path.isdir('caltables'):
            os.mkdir('caltables')

        try:
            solar_pipeline.image_ms(solar_ms=solar_ms,calib_ms=calib_ms,bcal=bcal,\
                        imagename=imagename,do_final_imaging=False,logfile='analysis_'+str(freq)+'.log')
            msname = datstr+solar_time+'_'+str(freq)+'MHz_final.ms'
            os.system("mv *calibrated_selfcalibrated_sun_only_sun_selfcalibrated_sun_only.ms final_ms/"+msname)
            os.system("rm -rf *.ms* *.fits *.gcal *.cl *.badants")
            # Make 10 images for this band (integrates over 19 or 20 subchannels, bandwidth ~0.4545 MHz)
            os.system('wsclean -no-dirty -size 1024 1024 -scale 1arcmin -weight uniform -minuv-l 10 -name '+imagename+' -niter 10000 -mgain 0.8 -beam-fitting-size 1 -pol I -join-channels -channels-out 10 final_ms/'+msname)
            # Convert images to heliocentric, move them to the final image folder, and delete all fits files
            files = glob.glob('*-image.fits')
            for imgfile in files:
                solar_pipeline.correct_primary_beam('final_ms/'+msname, imgfile.split('-image.fits')[0])
                helio_image = utils.convert_to_heliocentric_coords('final_ms/'+msname, imgfile)
                os.system('mv '+helio_image+' '+image_fold)
            os.system('rm *.fits')
        except:
            pass
        os.chdir(home)

What Happens When You Run the Script