Montage and Docker

Image Processing in a Container (cont.)

In the first half of this tutorial we built a Docker image with all the functionality we will need to construct a 3-color map of M51 in the SDSS u, g and i bands, and overlay 2MASS catalog sources and Spitzer MIPS image outlines on it.

In the second half we will actually do that work, including visualizing the result.

Building Mosaics using the Container

At the end of part one, we had just started our Docker container using:

 docker run --name montage_jupyter --rm -it -p 8888:8888 \
             -v /home/workshop_usr/Docker/work:/work montage/jupyter:latest

Once we do this, the container is running and we see a shell prompt from the container's Debian OS. Inside the container, our "permanent" disk space is mounted as /work, so we start by changing to that directory.

We will use the copy of Montage we installed through the Dockerfile to build the mosaics. Montage is a set of tools for manipulating astronomical images, with a focus on making mosaics. Where the datasets are reasonably well behaved, we can use a predefined sequence of steps:

1. Search image metadata for images overlapping our region of interest.
2. Retrieve the images from the permanent archive.
3. Reproject all images to a common frame.
4. Analyze the overlap regions between images.
5. Model and correct the backgrounds.
6. Coadd into a mosaic.

While these can be run in sequence, the steps are standard enough that they have also been packaged up in an "executive" process called mExec.
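For reference, the individual steps listed above correspond roughly to the Montage modules in the sketch below. This is a minimal illustration for a single band, not a description of what mExec does internally: the directory and table names are arbitrary, the `run` and `build_mosaic` helpers are hypothetical, and details such as unpacking compressed archive files are left out. It assumes a template header named M51.hdr already exists in the current directory.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Sketch of the manual sequence for one survey/band (e.g. SDSS g).
build_mosaic() {
    survey="$1"; band="$2"

    run mkdir -p raw projected diffs corrected || return 1

    # Search the archive metadata, then retrieve the images into raw/
    # (mArchiveExec is assumed here to download into the current directory).
    run mArchiveList "$survey" "$band" M51 0.3 0.3 remote.tbl || return 1
    run sh -c 'cd raw && mArchiveExec ../remote.tbl' || return 1

    # Reproject everything to the frame defined by the template header.
    run mImgtbl raw rimages.tbl || return 1
    run mProjExec -p raw rimages.tbl M51.hdr projected stats.tbl || return 1
    run mImgtbl projected pimages.tbl || return 1

    # Analyze the overlaps: difference each overlapping pair and fit planes.
    run mOverlaps pimages.tbl diffs.tbl || return 1
    run mDiffExec -p projected diffs.tbl M51.hdr diffs || return 1
    run mFitExec diffs.tbl fits.tbl diffs || return 1

    # Model the background corrections and apply them.
    run mBgModel pimages.tbl fits.tbl corrections.tbl || return 1
    run mBgExec -p projected pimages.tbl corrections.tbl corrected || return 1

    # Coadd the corrected images into the final mosaic.
    run mAdd -p corrected pimages.tbl M51.hdr "${band}band_manual.fits"
}
```

Calling `DRY_RUN=1; build_mosaic SDSS g` prints the command sequence without running anything, which is a safe way to inspect it before committing to the downloads.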

Montage reprojections and mosaics use a template FITS header to define the output projection, and there is a simple utility (mHdr) for creating such a header using a TAN (gnomonic) projection. So first we generate the header for a 0.3 degree region centered on M51, using 0.4 arcsecond pixels since that is approximately the pixel scale of the SDSS images. We then run mExec once per band, giving each band its own working directory (uband, gband, iband):

[Image: Montage modules]

cd /work

mHdr -p 0.4 M51 0.3 M51.hdr
mExec -q -d 2 -l -k -f M51.hdr SDSS u uband
mExec -q -d 2 -l -k -f M51.hdr SDSS g gband
mExec -q -d 2 -l -k -f M51.hdr SDSS i iband

The header file (M51.hdr) is readable text, so take a look at it if you like.
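As a rough cross-check when reading the header, the expected image size follows from the region size and pixel scale: 0.3 degrees is 1080 arcseconds, and at 0.4 arcseconds per pixel that is about 2700 pixels on a side. The helper below (a hypothetical name, not part of Montage) just does that arithmetic; the exact NAXIS values mHdr writes may differ slightly.

```shell
# Hypothetical helper: approximate pixels per side for a square region.
#   $1 = region size in degrees, $2 = pixel scale in arcseconds
mosaic_pixels() {
    awk -v size_deg="$1" -v pix_arcsec="$2" \
        'BEGIN { printf "%.0f\n", size_deg * 3600 / pix_arcsec }'
}

# Usage, with the values passed to mHdr above:
# mosaic_pixels 0.3 0.4
```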

We are running these steps interactively through a bash shell (the default CMD in the Dockerfile), but you can alternatively run each of them as a standalone Docker instance without the shell by appending the command you want to run to the end of the "docker run" call above. This is one way to fold the processing into a larger workflow.
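A minimal sketch of that non-interactive pattern follows. The `montage_run` wrapper and the `WORK_DIR` and `DRY_RUN` variables are hypothetical names introduced for illustration. The sketch assumes the image defines only a default CMD (no ENTRYPOINT), so a command appended to "docker run" replaces the bash shell, and it uses Docker's -w option to set the working directory in place of the interactive "cd /work". The -it, -p and --name options are dropped because no terminal, port or fixed container name is needed for a batch step.

```shell
# Host directory mounted as /work (the path used earlier in this tutorial).
WORK_DIR="${WORK_DIR:-/home/workshop_usr/Docker/work}"

# Hypothetical wrapper: run one command in a throwaway container.
# With DRY_RUN=1 it prints the docker command instead of running it.
montage_run() {
    set -- docker run --rm -v "$WORK_DIR":/work -w /work \
        montage/jupyter:latest "$@"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: the same steps as the interactive session, one container per step.
# montage_run mHdr -p 0.4 M51 0.3 M51.hdr
# for band in u g i; do
#     montage_run mExec -q -d 2 -l -k -f M51.hdr SDSS "$band" "${band}band"
# done
```

Because every step reads and writes under the mounted /work directory, the results persist on the host after each container exits.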

[Image: Docker command-line execution]

Visualizing the Mosaics

There are any number of ways to visualize FITS images but the Montage visualizer integrates particularly well with Jupyter notebooks, giving us an opportunity to illustrate how a container can support that sort of technology.

The Anaconda distribution we installed via the Dockerfile contains Jupyter by default. We added astroquery, which gives us access to pretty much all the on-line astronomical catalog data.

Normally, you start Jupyter and it starts a browser as a child process (the browser then contacts Jupyter for content). Here Jupyter (i.e. Python) will be running inside the Docker container on an Amazon server and the browser will be on our local desktop/laptop.

Also, when we run inside Docker we usually run as root. Jupyter is sensitive to this and by default declines to run as root, since that raises security concerns. So we have to add some extra arguments to the Jupyter startup to allow running as root and to prevent it from trying to start a browser.

Finally, we tell Jupyter to listen on all of the container's network interfaces (--ip=0.0.0.0) rather than only on localhost, so that connections forwarded through the mapped port can reach it.

   jupyter notebook --ip=0.0.0.0 --no-browser --allow-root

This responds with a message telling us how to connect through a browser:

      To access the notebook, open this file in a browser:
      file:///root/.local/share/jupyter/runtime/nbserver-20-open.html
      Or copy and paste one of these URLs:
      http://8c1229e88553:8888/?token=5137e31bb6f0cc03b610d435c6c46234f11da6497e19c6be
      http://127.0.0.1:8888/?token=5137e31bb6f0cc03b610d435c6c46234f11da6497e19c6be

However, these addresses are for use inside the container and we want to connect from our local (e.g. desktop/laptop) machine. This is why we mapped port 8888 when we started the container. If our browser connects to port 8888 on the Amazon machine, Docker forwards the connection to port 8888 inside the container, where Jupyter is listening. In my case the Amazon machine had IP address 34.221.172.99, so I need to tell my browser to connect to

   http://34.221.172.99:8888/?token=5137e31bb6f0cc03b610d435c6c46234f11da6497e19c6be

The IP address will be different for every user and the token string changes with each Jupyter run.
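Since only the host part of the URL changes, the substitution can be scripted. The helper below is a small sketch with a hypothetical name (`rewrite_jupyter_url`); it takes one of the URLs printed by Jupyter and the public address of the Docker host, and swaps in the new host while keeping the port and token untouched.

```shell
# Hypothetical helper: replace the host in a Jupyter URL.
#   $1 = URL printed by Jupyter, $2 = public IP or hostname of the Docker host
rewrite_jupyter_url() {
    # Replace only the text between "://" and the next ":" or "/".
    printf '%s\n' "$1" | sed -E "s#^(https?://)[^:/]+#\1$2#"
}

# Usage, with the example values from this page:
# rewrite_jupyter_url \
#   'http://127.0.0.1:8888/?token=5137e31bb6f0cc03b610d435c6c46234f11da6497e19c6be' \
#   34.221.172.99
```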

At this point, our local browser should show the list of files we have in the workspace. This will include the FITS mosaics we built above, and it should also include a notebook file (mViewer.ipynb) if you copied it into the work directory as instructed earlier. Click on that and follow the instructions on the page.

Here's a static HTML rendering of that page.