Background on the Choice of Architecture

The most effective way to add interactive data visualization to a system is through a modern Dashboard system. There are a number of these, some even "no code" implementations aimed at non-programmers. There is no way we can summarize all that here. What we can do is to characterize our needs and desires, choose an appropriate dashboard framework, and describe the work we had to do to make it meet our needs.

There is some functionality specific to astronomy that we can't realistically expect any off-the-shelf system to satisfy. So one of our requirements is a mechanism to extend the Dashboard system as simply as possible. Still, for most applications the Dashboard system is expected to provide for our needs out of the box.

Use Patterns

For astronomical analysis, the most effective runtime environment is currently Python, whether in the form of applications run from the command line, within Jupyter Notebooks, or even as CGI/WSGI apps under a web server like Apache. Preferably, we would like whatever we develop to be usable from all of these with little or no modification. Similarly, the operational "front-end" (GUI) should be any web browser, preferably without the need to build our own custom client-side JavaScript. Plotly/Dash also provides a mechanism for getting around having to write HTML files (sort of), and for a good reason that we will get to later.

Some Dashboard systems require the use of their own custom servers, but this is ultimately limiting, particularly when one wants to use the framework entirely on a local machine (e.g., the user's laptop) after simply importing Python modules, without having to explicitly configure and run any daemon processes. So a lot of the Dashboard systems, like Dash and Streamlit, are focused on a driver application (e.g., a Python program or a dashboard-specific interpreter). The actual GUI is rendered by an off-the-shelf browser acting as a child of the driver (by way of a behind-the-scenes application-local Flask web server in the case of Dash).

Why Dash?

In the design of a Dashboard framework, there is always a trade-off between simplicity on the one hand and flexibility (and usually speed) on the other. For instance, programming Streamlit feels more like laying out a document (albeit with things like interactive plots) and Dash feels more like traditional application/GUI implementation. Here we won't spend any more time on comparisons between dashboard systems and instead focus on the attributes of Dash that make it appealing to us.

With one major exception, everything in a Dash app is Python code, from page layout to interaction callbacks. The code in the layout section of a Dash app is a set of function calls like

      html.Div([html.P(id='response')])

This isn't grandstanding; it satisfies two needs at once. First, it allows you to program the equivalent of an HTML page in a form that closely mimics the HTML. So if you know HTML you can readily construct the layout code.

More importantly, it gives the Python code ownership of all the individual objects on the page. While these functions look like HTML (and ultimately the associated HTML gets rendered and sent to the browser), these are really React objects where every attribute is available to the Python program, both as something it can control and as a source of events. So the Dash app callbacks can react to everything going on in the interface.

Dash has taken the same approach to a number of other packages (Bootstrap for page layout, AG Grid for table interaction, and of course all of the Plotly visualization tools).

Dash advertises itself as a "stateless" framework, and I suspect most programmers' initial reaction to this is to question how any serious application can be built without state. However, this is a misreading of the statement. What it precludes is having constructs like persistent global variables in the code itself, because then there would have to be one instance of the code running for each user, and each user would need to reconnect to that instance for every operation. Not only would that be a real headache for managing interaction, but you would never know when the interaction ends so that you can clean up behind it.

Instead, state in a broader sense is managed through the use of file storage on the server side and browser storage on the client side. A consequence of this is that for web server use you can initialize a set of application instances at server start-up (e.g., using WSGI under Apache). These applications (or more accurately their callbacks) are then available as needed by any browser page associated with the app. This is different from having user-specific instances; any one of the WSGI instances can be arbitrarily used by any client instance, since none of the "state" needed by the instance is captured in the callback code; it is instead pointed to on the server disk or passed in from the client instance's memory. And when the client instance goes away, the WSGI code doesn't care.
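The pattern can be sketched in plain Python, independent of Dash itself. In this sketch the function `update_settings`, the `WORKDIR` location, and the shape of `client_state` are all assumptions made for illustration; in a real Dash app the client-side part would typically live in browser storage (for example a `dcc.Store` component) and be passed into the callback as an argument.

```python
import json
import tempfile
import uuid
from pathlib import Path

# Hypothetical server-side work area; a real deployment would configure this.
WORKDIR = Path(tempfile.gettempdir()) / "dash_sessions"


def update_settings(client_state, **changes):
    """Stateless callback body: everything it needs arrives as arguments.

    `client_state` stands in for what the browser sends with each request.
    It holds only a session key that points at a JSON file on the server disk.
    """
    WORKDIR.mkdir(parents=True, exist_ok=True)
    session_file = WORKDIR / f"{client_state['session']}.json"
    settings = json.loads(session_file.read_text()) if session_file.exists() else {}
    settings.update(changes)
    session_file.write_text(json.dumps(settings))
    return settings


# Two requests from the same browser page. Nothing is kept in module-level
# variables between them, so each could be served by a different worker.
state = {"session": uuid.uuid4().hex}
first = update_settings(state, stretch="linear")
second = update_settings(state, zoom=2.0)
```

Because the function keeps nothing in memory between calls, it makes no difference which server process handles which request, and an abandoned session leaves behind only a small file that can be cleaned up later.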

Finally, since Plotly itself is one of the most extensive data visualization toolkits (and obviously thoroughly integrated into Plotly Dash), pretty much any data tool needed already exists.

Astronomical Images

That is, except for astronomical image (and overlay) interaction. Dash can display JPEG/PNG images but proper display of FITS images, catalog overlays, etc. requires much more than this: zooming and panning, location picking, 3-color composites, cutouts, catalog and coordinate overlays and so on. Luckily, since Dash modules are simply wrapped React components (and Dash includes functionality for wrapping these React components), we can build our own astronomical image interaction component and distribute it through PyPI.

Also luckily, most of the functionality enumerated above already exists in the mViewer module in Montage (callable from Python), which builds static PNGs. By combining mViewer with a custom React component we have all the functionality we need. Then when something occurs that requires the image itself be changed (e.g., color stretch or cutouts, both of which are triggered by something other than the React component), server-side Python working with Montage modules can generate the updated PNG and hand it off to the React component for display.
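One possible way to do that hand-off, shown here only as a sketch, is to encode the regenerated PNG as a base64 data URI, which is a form that browser image elements (and Dash's `html.Img` `src` property) accept. The Montage step that produces the PNG is deliberately left out, the function name `png_to_data_uri` is ours, and whether the custom component takes a data URI or a file URL is an assumption rather than a description of the actual component.

```python
import base64


def png_to_data_uri(png_bytes):
    """Wrap raw PNG bytes in a data URI that a browser can display directly."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"


# Stand-in for a real image: just the 8-byte PNG file signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
uri = png_to_data_uri(PNG_SIGNATURE)
```

In practice the bytes would come from the file that the server-side code has just written, for example via `Path(png_path).read_bytes()`, and a callback would return the resulting string as the new image source.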

Data Apps

The range of Plotly graph types is impressive (2D/3D scatter plots; bar, line, pie, bubble and ternary charts; contours; histograms; error bars; etc.). All these can be interactive, and events in one graph can be linked to actions in another through callbacks. The above image/overlay component just adds to the list and allows us to construct Dash apps for specific astronomical purposes. For example, we can build a Dash app that takes a FITS image file and a set of astronomical catalog data tables covering the same region and use the above React component to display the image with the catalog overlays. The app can also display the tables themselves (using AG Grid Dash components).

Through callbacks in the app, selections in the data tables can be highlighted on the image, and selections on the image display can be used to highlight records in the table. While we could also add multiple simultaneous images or scatter plots of the tables to the mix (and interact between all of them), working with just one image and some tables is enough to illustrate the capabilities. It is likely that a set of such apps will be more useful as starting points for most users than some overly complicated superinterface.
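As a rough illustration of the image-to-table direction, the logic inside such a callback might look something like the following. This is a sketch under several assumptions: the catalog records are taken to already carry image pixel coordinates (`x`, `y`), the picked location arrives in the same pixel system, and the function name, record fields, and the 5-pixel tolerance are all hypothetical.

```python
import math


def nearest_record(records, x, y, max_dist=5.0):
    """Return the index of the record closest to pixel (x, y), or None.

    Records farther than `max_dist` pixels from the picked location are ignored.
    """
    best_index, best_dist = None, max_dist
    for index, rec in enumerate(records):
        dist = math.hypot(rec["x"] - x, rec["y"] - y)
        if dist <= best_dist:
            best_index, best_dist = index, dist
    return best_index


catalog = [
    {"id": "src-1", "x": 10.0, "y": 12.0},
    {"id": "src-2", "x": 40.0, "y": 8.0},
]
picked = nearest_record(catalog, 41.0, 9.0)      # close to src-2
missed = nearest_record(catalog, 100.0, 100.0)   # nothing within range
```

A callback wrapping this would take the picked location as its input and return the matching row for the table component to highlight; the table-to-image direction is the same idea in reverse.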

Notebooks

We call the above an application but in reality this kind of interactive visualization is still just a building block for a full end-to-end processing scenario. For that we need to identify data, retrieve it, and possibly analyze it. That can be done with a set of web forms or with a custom interface (possibly even based on Dash) but increasingly users are turning to frameworks like Jupyter Notebooks where sets of components are strung together to do specific tasks. Most such Notebooks are examples that can be adjusted and augmented for the user's specific needs rather than finished products in themselves.