EOVSA Data Analysis Tutorial 2022

From EOVSA Wiki
Jump to navigation Jump to search

1. Introduction to EOVSA and Datasets

EOVSA

EOVSA (Expanded Owens Valley Solar Array) is a solar-dedicated radio interferometer operated by the New Jersey Institute of Technology. EOVSA observes the full disk of the Sun at all times when the Sun is >10 degrees above the local horizon, which is season dependent and ranges from 7-12 hours duration centered on 20 UT. Like any radio interferometer, the fundamental measurement for imaging is the correlated amplitude and phase between each pair of antennas, which is called a “complex visibility.” EOVSA’s 13 antennas form 78 such visibilities at any frequency and instant of time, i.e. 78 measurements of the spatial Fourier transform of the solar brightness distribution. EOVSA records these visibilities at *451 science frequency channels each second, in four polarization products (XX, YY, XY, YX), as well as additional total flux measurements from each individual antenna. These data are then processed through a pipeline processing system whose data flow is shown in the block diagram (Figure 1). One of the outputs of the pipeline is a visibility database in a widely used open-standard format called a CASA measurement set (or “ms”; CASA is the Common Astronomy Software Applications package used by many modern interferometer arrays).

EOVSA delivers the radio interferometry data on the following three levels:

Figure 1: Pipeline block diagram
  • Level 0 - Raw visibility data - This includes observations of cosmic sources for phase calibration, and gain and pointing observations required for total power calibration.
  • Level 1 - Calibrated visibility data - After applying calibration and other preliminary processing to level 0 data, we create the calibrated visibility data in CASA ms format (second column in Figure 1). These visibility data have all of the required content to produce Level 2 images and spectrogram data in standard FITS format. The following tutorial will guide you through how to make EOVSA images and spectrogram from visibility data. We provide a set of standard ms’s for each day (pink solid boxes in Figure 1) for users who wish to start with visibility data. You can retrieve EOVSA 1-min averaged visibility data in CASA ms format from this page. Full-resolution EOVSA visibility data in CASA ms format will be provided per request. Please contact the *EOVSA team if you wish to have Level 1 visibility data for a specific event.
  • Level 2 - Images and spectrogram data in standard FITS format - Most users, however, will prefer to work with spectrogram (frequency-time) and image data, which are also outputs of the pipeline system shown in Figure 1 (orange boxes). Spectrograms are provided as standard FITS tables containing the frequency list, list of times, and data in both total power and a sum of amplitudes over intermediate-length baselines (cross power). Likewise, image data products are in FITS format with standard keywords and are converted into the Helioprojective Cartesian coordinate system compatible with the World Coordinate System (WCS) convention, along with correct registration for the spatial, spectral, and temporal coordinates. Both the spectrogram and image data products are calibrated and have physical radio intensity units (sfu for spectrograms and brightness temperature for radio images).

2. Browsing and Obtaining EOVSA data

Figure 2: EOVSA Browser
Figure 3: RHESSI Browser

EOVSA Browser

The most convenient way for browsing Level 2 EOVSA data is through the EOVSA Browser. The overview EOVSA dynamic spectra on the top-left panel are from the median of the uncalibrated cross-power visibilities at a few short baselines, which are not (but a good proxy of) the total-power dynamic spectra. The effects of spatial information blended in the cross-power visibilities can be clearly seen as the "U"-shaped features throughout the day, which are due to the movement of the Sun across the sky that effectively changes the length and orientation of the baselines. Flare emission can be seen in the dynamic spectra, which usually appears as vertical bright features across many frequency bands. The pipeline quicklook full-disk images are also displayed for the given frequencies along with multi-wavelength full-disk maps such as SDO AIA, HMI, BBSO H-alpha by hovering on the buttons on the bottom of each map.. More information can be found on this page. The figure on the right shows an example of the overview EOVSA dynamic spectrum for 2021 May 07.

RHESSI Browser

EOVSA data can also be browsed through RHESSI Browser. Check the "EOVSA Radio Data" box on the data selection area (top-left corner). Then select year/month/date to view the overall EOVSA dynamic spectrum. Note if a time is selected at early UTC hours (e.g., 0-3 UT), it will show the EOVSA dynamic spectrum from the previous day. Also, note that EOVSA data were not commissioned for spectroscopic imaging prior to April 2017.


Once you identify the flare time, you can find the full-resolution (1-s cadence) uncalibrated visibility files (in Miriad format) at this link. Each data file is usually 10 minutes in duration. The name convention is YYYYMMDD (folder name) /IDBYYYYMMDDHHMMSS (file name), where the time in the file name indicates the start time of the visibility data.

Other useful links

3. Software requirements and installation

EOVSA visibility data processing: SunCASA A Python wrapper around CASA for synthesis imaging and visualizing solar spectral imaging data. CASA is one of the leading software tools for "supporting the data post-processing needs of the next generation of radio astronomical telescopes such as ALMA and VLA", an international effort led by the National Radio Astronomy Observatory. The current version of CASA uses Python (3.6) interface. More information about CASA can be found on NRAO's CASA website. NOTE: CASA is available ONLY on UNIX-BASED PLATFORMS, and therefore, so is SunCASA.

EOVSA image data processing: GSFIT A IDL-widget(GUI)-based spectral fitting package called gsfit, which provides a user-friendly display of EOVSA image cubes and an interface to fast fitting codes (via platform-dependent shared-object libraries). For this tutorial, we will demonstrate using SunCASA to create EOVSA image and spectrogram from the visibility data observed during a X-class flare on 2022 March 30. There are two approaches in accessing the SunCASA package:

[convenient] Through this notebook, together with Google Colaboratory (colab) which hosts this notebook on free virtual environment that requires no setup and runs entirely in the cloud. If you are into this option, skip the following Installation section and go directly to the Tour Section.

4. Download Sample EOVSA Data for the Tutorial

5. Information on the EOVSA Visibility Data (CASA Measurement Set)

With the pip installation, the CASA modules may be used in a standard Pythonic manner. For example, CASA tasks can be invoked using import, while CASA tools are Python classes that first need to be instantiated to create usable objects. A useful first task to run is listobs, which provides a summary of the measurement set.

from casatasks import listobs
import os
msfile = 'IDB20210507_1840-1920XXYY.cal.10s.slfcaled.ms'
rc = listobs(vis=msfile, listfile = msfile + '.listobs', overwrite=True)

print(os.popen("cat " + msfile + ".listobs").read())

6. Self-calibration (optional)

Self-calibration is often needed in bringing out details of the flare at multiple frequencies. An example script, run in SunCASA, for doing self-calibration of the 2017 Aug 21 flare at ~20:20 UT can be found at this tutorial.

7. Cross-Power Dynamic Spectrum

The first module we introduce is dspec. This module allows you to generate a cross-power dynamic spectrum from an MS file, and visualize it. You can select a subset of data by specifying a time range, spectral windows/channels, antenna baseline, or uvrange. The selection syntax follows the CASA convention. More information of CASA selection syntax may be found in the above links or the Measurement Selection Syntax.

Figure 1: EOVSA cross power dynamic spectrum at stokes XX and YY
from suncasa.utils import dspec as ds
import matplotlib.pyplot as plt
           ## define the visbility data file
msfile = 'IDB20220330_concat_slfcaled__backsub.ms' 
           ## define the output filename of the dynamic spectrum 
specfile = msfile + '.dspec.npz'  
           ## select relatively short baselines within a length (here I use 0.15~0.5km), 
           ## and take a median cross all of them (with the domedian parameter)
           ## alternatively, you can use the "bl" parameter to select individual baseline(s)
uvrange = '0.15~0.5km'
           ## this step generates a dynamic spectrum and saves it to specfile
dspec=ds.get_dspec(vis=msfile, specfile=specfile, uvrange=uvrange, domedian=True)
           ## dspec is a Python dictionary that contains the resulting dynamic spectrum. 
           ## A copy is saved in "specfile" as a numpy npz file.
           ## Other optional parameters are available for more selection criteria 
           ## such as frequency range ("spw"), and time range ("timeran")
           ## Use "ds.get_dspec?" to see more options
dspec['spec'].shape
           ## (1, 1, 451, 1798) One polarization, 451 frequencies and 1798 times
dspec.keys()
           ## ['tim', 'pol', 'uvrange', 'bl', 'timeran', 'freq', 'spec', 'spw']

           ## The following command displays the resulting cross-power dynamic spectrum
           ## Use ds.plt_dspec? to check more plotting options
ds.plt_dspec(specfile)   #alternatively the next step can also be used
ds.plt_dspec(dspec)
          

Now you should have a popup window showing the dynamic spectrum. Hover your mouse over the dynamic spectrum, you can read the time and frequency information at the bottom of the window.

8. Quick-Look Imaging

Imaging EOVSA data involves image cleaning, as well as solar coordinate transformation and image registration. We bundled a number of these steps ino a module named qlookplot, allowing users to generate an observing summary plot showing the cross power dynamic spectrum, GOES light curves and EOVSA quick-look images. Now let us start with making a summary plot of EOVSA image at the spectral window 5 (5.4 GHz).

EOVSA full-Sun single-band quicklook image
## in SunCASA
from suncasa.utils import qlookplot as ql
msfile = 'IDB20170821201800-202300.4s.slfcaled.ms'
           ## (Optional) Supply the npz file of the dynamic spectrum from previous step. 
           ## If not provided, the program will generate a new one from the visibility.
specfile = msfile + '.dspec.npz'  
           ## set the time interval
timerange = '20:21:10~20:21:18'  
           ## select frequency range from 5 GHz to 6 GHz
spw = '5~6'  
           ## select stokes XX
stokes = 'RR'   
           ## turn off AIA image plotting, default is True
plotaia = False 

ql.qlookplot(vis=msfile, specfile=specfile, timerange=timerange, spw=spw, \
    stokes=stokes, plotaia=plotaia)

The empty panels are there as we only selected one polarization for imaging/spectroscopy. Feel free to explore by adjusting the above parameters, e.g. use a different time range ("timerange" parameter) and/or frequency range ("spw" parameter).

With qlookplot, it is easy to engage solar data from SDO/AIA in the summary plot. Setting plotaia=True in the qlookplot command will download SDO/AIA data at the given time to current directory and add it to the summary plot.

EOVSA full-Sun single-band quicklook image with SDO/AIA as background
## in SunCASA
ql.qlookplot(vis=msfile, specfile=specfile, timerange=timerange, spw=spw, \
    stokes=stokes, plotaia=True)

The resulting radio image is a 4-D datacube (in solar X-pos, Y-pos, frequency, and polarization), which is, by default, saved as a fits file msfile + '.outim.image.fits' under your working directory. The name of the output fits file can be specified using the "outfits" parameter. In this example, we combine all selected frequencies (specified in keyword "spw") into one image (i.e., multi-frequency synthesis). Therefore the third axis only has one plane. The fourth axis contains polarization. In this example, however, only the first one is populated ("XX"). Here is the relevant information in the header of the resulting FITS file.

                        
NAXIS   =                    4                                                  
NAXIS1  =                  512/ Nx
NAXIS2  =                  512/ Ny                                             
NAXIS3  =                    1/  number of frequency                                           
NAXIS4  =                    2/  number of polarization

By default, qlookplot produces a full sun radio image (512x512 with a pixel size of 5"). If you know where the radio source is (e.g., from the previous full-Sun imaging), you can make a partial solar image around the source by specifying the image center ("xycen"), pixel scale ("cell"), and image field of view ("fov"). Here we show an example that images a 8-s interval around 20:21:14 UT using multi-frequency synthesis in 12-14 GHz and a smaller restoring beam. The microwave source is show to bifurcate into two components, which correspond pretty well with the double flare ribbons in SDO/AIA.

EOVSA multi-frequency synthesis quicklook image
## in SunCASA
xycen = [375, 45]  ## image center for clean in solar X-Y in arcsec
cell=['2.0arcsec'] ## pixel scale
imsize=[128]   ## number of pixels in X and Y. If only one value is provided, NX = NY
fov = [100,100]  ## field of view of the zoomed-in panels in unit of arcsec
ql.qlookplot(vis=msfile, specfile=specfile, timerange='20:21:10~20:21:18', \
              spw='12GHz~14GHz', stokes='XX', restoringbeam=['6arcsec'],\
              imsize=imsize,cell=cell,xycen=xycen,fov=fov,calpha=1.0)

Next, we will make images for every single spectral window in this data set (from spw 1 to 30, spw 0 is in the visibility data, but is not calibrated for this event).

EOVSA multi-band quicklook image
## in SunCASA
xycen = [375, 45]  ## image center for clean in solar X-Y in arcsec
cell=['2.0arcsec'] ## pixel size
imsize=[128]   ## x and y image size in pixels. 
fov = [100,100]  ## field of view of the zoomed-in panels in unit of arcsec
spw = ['{}'.format(s) for s in range(1,31)]
clevels = [0.5, 1.0]  ## contour levels to fill in between.
calpha=0.35  ## now tune down the alpha
restoringbeam=['6arcsec']
ql.qlookplot(vis=msfile, specfile=specfile, timerange=timerange, spw=spw, stokes=stokes, \
            restoringbeam=restoringbeam,imsize=imsize,cell=cell, \
            xycen=xycen,fov=fov,clevels=clevels,calpha=calpha)

The output FITS file as a 4-d cube is saved to msfile + '.outim.image.fits' under your working directory. The 30 spectral windows used for spectral imaging are combined in the output FITS file. So NAXIS3 (frequency axis) is 30.

                         
NAXIS   =                    4 / number of array dimensions                     
NAXIS1  =                  128                                                  
NAXIS2  =                  128                                                  
NAXIS3  =                   30                                                  
NAXIS4  =                    2  

Quick-Look Imaging series

# In SunCASA
msfile = 'IDB20170821201800-202300.4s.slfcaled.ms'
specfile = msfile + '.dspec.npz' 
## set the time interval
timerange = '2017/08/21/20:21:10~2017/08/21/20:21:18'
## Bin width for time averaging
twidth = 1
## frequency range
spw = ['0~3']
## image center for clean in solar X-Y in arcsec
xycen = [375, 45]
## number of pixels in X and Y. If only one value is provided, NX = NY
imsize = [128]
## field of view of the zoomed-in panels in unit of arcsec
fov = [100., 100.]
## pixel scale
cell = ['2.0arcsec']
## select stokes XX
stokes = 'XX'
## for EOVSA data, set usemsphacenter to False
usemsphacenter = False
## set True if make a movie
mkmovie = True
## set True if generate compressed fits
docompress = True
#### ---- Control knobs for AIA plotting ---- ####
## set True if plot AIA images as the background
plotaia = True
## provide the path to the directory where the AIA fits files are located. Otherwise, set it to be ''
aiadir = 'Directory-where-AIA-fits-files-are-located'
## AIA passband. The options are [171,131,304,335,211,193,94,1600,1700]
aiawave = 171
## numbers of CPU threads for computing
ncpu = 2


ql.qlookplot(vis=msfile, specfile=specfile, timerange=timerange, spw=spw, 
          xycen=xycen, imsize=imsize, fov=fov, cell=cell, usemsphacenter=usemsphacenter, 
          plotaia=plotaia, aiadir=aiadir, aiawave=aiawave, 
          mkmovie=mkmovie, twidth=twidth, ncpu=ncpu, docompress=docompress, stokes=stokes)

9. Downloading fits files to your local system with Batch-Mode Imaging

This section is for interested users who wish to generate FITS files with full control on all parameters being used for synthesis imaging. We provide one example SunCASA script for generating 30-band spectral imaging maps, and another for iterating over time to produce a time series of these maps.

Producing a 30-band map cube for a given time

An example script can be found at this Github link. If you are on the AWS server Virgo, it is under /common/data/eovsa_tutorial/imaging_example.py. First, download or copy the script to your own working directory and cd to your directory.

# in SunCASA
CASA <##>: cd your_working_directory
CASA <##>: !cp /common/data/eovsa_tutorial/imaging_example.py ./
Example multi-frequency images at a single time integration

Second (optional), change inputs in the following block in your copy of the "imaging_example.py" script and save the changes. This block has definitions for time range, image center and FOV, antennas used, cell size, number of pixels, etc.

################### USER INPUT GOES IN THIS BLOK ################
vis = 'IDB20170821201800-202300.4s.slfcaled.ms'  # input visibility data
trange = '2017/08/21/20:21:00~2017/08/21/20:21:30' # time range for imaging
xycen = [380., 50.] # center of the output map (in solar X and Y, Unit: arcsec)
xran = [340., 420.]  # plot range in solar X. Unit: arcsec
yran = [10., 90.]  # plot range in solar Y. Unit: arcsec
antennas = '' # Use all 13 EOVSA 2-m antennas by default
npix = 256 # number of pixels in the image
cell = '1arcsec' # pixel scale in arcsec
pol = 'XX' # polarization to image, use XX for now
pbcor = True # correct for primary beam response?
outdir = './images' # Specify where to save the output fits files
outimgpre = 'EO' # Something to add to the output image name

Then, run the script in SunCASA via

CASA <##>: execfile('imaging_example.py')

The output is a combined 30-band FITS file saved under "outdir". The naming conversion is outimgpre + YYYYMMDDTHHMMSS_ALLBD.fits. In this example, it is "./images/EO20170821T202115.000_ALLBD.fits". Here is the detailed information of the axes of the output FITS file.

                         
NAXIS   =                    4 / number of array dimensions                     
NAXIS1  =                  128                                                  
NAXIS2  =                  128                                                  
NAXIS3  =                   30                                                  
NAXIS4  =                    2  

From this point, you can use your favorite language to read the fits files and plot them. The last block of the example script uses SunPy.map to generate the plot shown on the right. For SSWIDL users, please refer to Section 3.3.3.3.

Producing a time series of 30-band maps (a 4-d cube)

An example script can be found at this Github link. If you are on the AWS server Virgo, it is under /common/data/eovsa_tutorial/imaging_timeseries_example.py. The script enables parallel-processing (based on a customized task "ptclean3" -- do not ask me why we have "3" in the task name). To invoke more CPU processes for parallel-processing, change "nthreads" in the preambles from 1 to the number of threads you wish to use.

# in SunCASA
CASA <##>: cd your_working_directory
CASA <##>: !cp /common/data/eovsa_tutorial/imaging_timeseries_example.py ./

Similar as the previous script, change inputs in the following block of your copy of the "imaging_timeseries_example.py" script and save the changes.

Example multi-frequency images at one time frame in the output javascript movie
################### USER INPUT GOES IN THIS BLOK ####################
vis = 'IDB20170821201800-202300.4s.slfcaled.ms'  # input visibility data
specfile = vis + '.dspec.npz'  ## input dynamic spectrum
nthreads = 1  # Number of processing threads to use
overwrite = True  # whether to overwrite the existed fits files.
trange = ''  # time range for imaging, default to all times in the data
twidth = 1  # make one image out of every 2 time integrations
xycen = [380., 50.]  # center of the output map in solar X and Y. Unit: arcsec
xran = [340., 420.]  # plot range in solar X. Unit: arcsec
yran = [10., 90.]  # plot range in solar Y. Unit: arcsec
antennas = ''  # default is to use all 13 EOVSA 2-m antennas. 
npix = 128  # number of pixels in the image
cell = '2arcsec'  # pixel scale in arcsec
pol = 'XX'  # polarization to image, use XX for now
pbcor = True  # correct for primary beam response?
grid_spacing = 5. * u.deg  # grid spacing in degrees
outdir = './image_series/'  # Specify where to save the output fits files
imresfile = 'imres.npz'  # File to write summary of the imaging results
outimgpre = 'EO'  # Something to add to the image name

Then, run the script in SunCASA by

CASA <##>: execfile('imaging_timeseries_example.py')

The output fits images and a summary file imresfile are saved under "outdir" (in this example "./image_timeseries"). The naming convention of output fits images is the same as the previous section. The summary yields the time, frequency, and path to every image.

CASA <##>: imres = np.load(outdir + imresfile)['imres'].item()
CASA <##>: imres.keys()
['FreqGHz', 'ImageName', 'Succeeded', 'BeginTime', 'EndTime']
CASA <##>: ls image_series/*.fits | head -5
image_series/EO20170821T201800.500_ALLBD.fits
image_series/EO20170821T201804.500_ALLBD.fits
image_series/EO20170821T201808.500_ALLBD.fits
image_series/EO20170821T201812.500_ALLBD.fits
image_series/EO20170821T201816.500_ALLBD.fits

The last block of the example script uses SunPy.map to generate a series of plots and wraps them as a javascript movie.

Plotting Images in SSWIDL

All the output files are in standard FITS format in the Helioprojective Cartesian coordinate system (that most spacecraft solar image data adopt; FITS header CTYPE is HPLN-TAN and HPLT-TAN). They are fully compatible with the SSWIDL map suite which deals with FITS files. We have prepared SSWIDL routines to convert the single time or time-series FITS files to an array of SSWIDL map structure. The scripts are available in the Github repository of our tutorial. For those working on Virgo, local copies are placed under /common/data/eovsa_tutorial/. Two scripts are relevant to this tutorial:

  • casa_readfits.pro: read an array of multi-frequency FITS files into FITS header (index) and data.
  • casa_fits2map.pro: convert an array of multi-frequency FITS files into an array of SSWIDL map structure (adapted from P. T. Gallagher's hsi_fits2map.pro)

First, start SSWIDL from command line:

sswidl

Find and convert the fits files:

; cd to your path that contains casa_readfits.pro and casa_fits2map.pro. If on Virgo, just add the path
IDL> add_path, '/common/data/eovsa_tutorial/'
IDL> fitsdir = '/common/data/eovsa_tutorial/image_series/'
IDL> fitsfiles = file_search(fitsdir+'*_ALLBD.fits')
; The resulting fitfiles contains all multi-frequency FITS files under "fitsdir" 
; The following command converts the fits files to an array of map structures. 
; "casa_fits2map.pro" also work well on a single FITS file. 
; (Optional) keyword "calcrms" is to calculate rms and dynamic range (SNR) of the maps in a user-defined "empty" region 
;        of the map specified by "rmsxran" and "rmsyran"
;;; Here we load 10 time frames as a demonstration
IDL> casa_fits2map, fitsfiles[45:54], eomaps [, /calcrms, rmsxran=[260., 320.], rmsyran=[-70., 0.]]

The resulting maps have a shape of [# of frequencies, # of polarizations, # of times]

IDL> help,eomaps
EOMAPS          STRUCT    = -> <Anonymous> Array[30, 1, 10]

They have compatible keywords recognized by, e.g., plot_map.pro. Example of the output map keywords:

IDL> help,eomaps,/str
** Structure <36009f8>, 24 tags, length=131336, data length=131332, refs=1:
   DATA            DOUBLE    Array[128, 128]
   XC              DOUBLE           378.99175
   YC              DOUBLE           49.001472
   DX              DOUBLE           2.0000000
   DY              DOUBLE           2.0000000
   TIME            STRING    '21-Aug-2017 20:20:59.500'
   ID              STRING    'EOVSA XX 3.413GHz'
   DUR             DOUBLE           3.9999997
   XUNITS          STRING    'arcsec'
   YUNITS          STRING    'arcsec'
   ROLL_ANGLE      DOUBLE           0.0000000
   ROLL_CENTER     DOUBLE    Array[2]
   FREQ            DOUBLE           3.4125694
   FREQUNIT        STRING    'GHz'
   STOKES          STRING    'XX'
   DATAUNIT        STRING    'K'
   DATATYPE        STRING    'Brightness Temperature'
   BMAJ            DOUBLE        0.0097377778
   BMIN            DOUBLE        0.0097377778
   BPA             DOUBLE           0.0000000
   RSUN            DOUBLE           948.03989
   L0              FLOAT           0.00000
   B0              DOUBLE           6.9299225
   COMMENT         STRING    'Converted by CASA_FITS2MAP.PRO'
Example multi-frequency EOVSA images at one time plotted in SSWIDL

An example for plotting all images at a selected time:

; choose a time pixel
IDL> plttime = anytim('2017-08-21T20:21:15')
IDL> dt = min(abs(anytim(eomaps[0,0,*].time)-plttime),tidx)
; use maps at this time, first polarization, and all bands
IDL> eomap = reform(eomaps[*,0,tidx])
IDL> window,0,xs=1000,ys=800
IDL> !p.multi=[0,6,5]
IDL> loadct, 3
IDL> for i=0, 29 do plot_map,eomap[i],grid=5.,$
         tit=string(eomap[i].freq,format='(f5.2)')+' GHz', $
         chars=2.0,xran=[340.,420.],yran=[10.,90.]