The Recommendation of VOEvent 2.0 draws ever closer, having successfully run the gauntlet at the spring meeting of the International Virtual Observatory Alliance. Our thoughts now move beyond the standard to the exciting science that becomes possible once interoperability is solved: multiple authors and multiple software systems can contribute to the rapidly evolving picture of an astronomical transient, with machines fusing that data to make rapid, accurate decisions. We will need to think about how VOEvents are authored, forwarded, selected, stored, queried, and mined. In each of these cases, we wish to provide the most appropriate ‘representation’ of the data in the VOEvent. Below are some suggestions for this representation.
We want to preserve the semantics of VOEvent, so that, as far as is reasonable, the same data is available in every representation. We can focus on individual events, or extend the representation to the VOEvent aggregates that we have come to call ‘portfolios’. A portfolio is a collection of multi-sourced VOEvent packets whose subject is the same astronomical transient, associated through citation of one by another: the original observation is a VOEvent, combined with follow-ups or classification results that are also formatted as VOEvents. Each VOEvent is an observation of something, and it is that something that brings multiple events together.
(1) XML API
A single VOEvent is an XML file. This representation carries the most fidelity to the intent of the original author, even though some links may be replaced for caching. A portfolio is a collection of VOEvent files that are mutually connected through citations, forming a graph, and it can be stored as an archive such as zip or tar. Querying is through XPath, XQuery, or XML libraries like lxml or Jax. A custom API can be generated from the VOEvent schema through code binding.
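As a rough illustration of path-style querying on the XML representation, the sketch below uses Python's standard `xml.etree.ElementTree`, which supports only a limited subset of XPath (lxml offers the full language). The packet is hand-written and heavily trimmed, it has not been validated against the schema, and the IVORNs, Param names, and values are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed packet for illustration only (not schema-validated).
# Assumption: child elements are not namespace-qualified; only the root is.
PACKET = """<voe:VOEvent xmlns:voe="http://www.ivoa.net/xml/VOEvent/v2.0"
    ivorn="ivo://example.org/stream#followup-2" role="test" version="2.0">
  <What>
    <Param name="mag" value="17.3" unit="mag"/>
    <Group name="classification">
      <Param name="type" value="SN Ia"/>
      <Param name="probability" value="0.82"/>
    </Group>
  </What>
  <Citations>
    <EventIVORN cite="followup">ivo://example.org/stream#discovery-1</EventIVORN>
  </Citations>
</voe:VOEvent>"""

root = ET.fromstring(PACKET)

# Identifier of this event, taken from the root element's attribute.
ivorn = root.get("ivorn")

# Params that sit directly under What (i.e. not inside a Group).
top_level = {p.get("name"): p.get("value") for p in root.findall("./What/Param")}

# One specific Param, addressed by its Group and Param names.
prob = root.find("./What/Group[@name='classification']/Param[@name='probability']")

# Events cited by this one: the edges of the portfolio graph.
cited = [e.text for e in root.findall("./Citations/EventIVORN")]

print(ivorn, top_level, prob.get("value"), cited)
```

Following the `cited` identifiers from packet to packet is one way the files of a portfolio could be assembled into a graph.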
(2) Dictionary API
Each event can be thought of as a key-value dictionary, with one entry for each piece of data extracted from the event XML. Some keys are mandated by the VOEvent schema (e.g. AuthorName, ISOtime), and others come from the Group and Param name combinations specific to that stream, with the value an int, float, or string. Internal tables can be handled by allowing the value of such a key to be a vector: the values from the table column. A portfolio can then be a union of these dictionaries, each representing an event; to prevent name collisions, each key would also contain the name of the event it comes from. This representation is natural for presentation templates and dictionary expressions: a table column such as e['lightCurve']['Vmag'] can become a Python list of numbers. It is effective when a single portfolio is to be examined in detail or annotated, for example in the classification of light curves or the analysis of ephemerides. It can also take the form of a table structure, with each row holding Stream, Group, Param, and Value, and with queries that select on these tables.
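One possible shape for the dictionary representation is sketched below, again with the standard library only. The helper names `coerce`, `event_to_dict`, and `portfolio_to_dict` are hypothetical, the sample packet is invented, and keying the portfolio by each event's `ivorn` is just one way to keep names from colliding.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed packet with one Param, one Group, and one internal Table.
PACKET = """<voe:VOEvent xmlns:voe="http://www.ivoa.net/xml/VOEvent/v2.0"
    ivorn="ivo://example.org/stream#obs-1" role="test" version="2.0">
  <What>
    <Param name="filter" value="V"/>
    <Group name="position">
      <Param name="error" value="0.5"/>
    </Group>
    <Table name="lightCurve">
      <Field name="MJD"/>
      <Field name="Vmag"/>
      <Data>
        <TR><TD>55700.1</TD><TD>17.3</TD></TR>
        <TR><TD>55701.1</TD><TD>17.1</TD></TR>
      </Data>
    </Table>
  </What>
</voe:VOEvent>"""


def coerce(text):
    """Return the text as an int or float if it parses, otherwise unchanged."""
    for cast in (int, float):
        try:
            return cast(text)
        except (TypeError, ValueError):
            pass
    return text


def event_to_dict(root):
    """Flatten the What section of one event into nested dictionaries."""
    what = root.find("What")
    d = {}
    for p in what.findall("Param"):
        d[p.get("name")] = coerce(p.get("value"))
    for g in what.findall("Group"):
        d[g.get("name")] = {
            p.get("name"): coerce(p.get("value")) for p in g.findall("Param")
        }
    for t in what.findall("Table"):
        names = [f.get("name") for f in t.findall("Field")]
        rows = [[coerce(td.text) for td in tr.findall("TD")]
                for tr in t.findall("Data/TR")]
        # Each table column becomes a vector (a Python list).
        d[t.get("name")] = {n: [r[i] for r in rows] for i, n in enumerate(names)}
    return d


def portfolio_to_dict(roots):
    """Union of event dictionaries, keyed by the event each one comes from."""
    return {root.get("ivorn"): event_to_dict(root) for root in roots}


e = event_to_dict(ET.fromstring(PACKET))
portfolio = portfolio_to_dict([ET.fromstring(PACKET)])
print(e["lightCurve"]["Vmag"])
```

With this layout, `e['lightCurve']['Vmag']` is an ordinary list that can be passed straight to plotting or fitting code.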
(3) Relational table API
Here we are not representing a single astronomical transient but doing astro-informatics: many transients are held in a table for searching, selecting, sorting, visualizing, and clustering. The columns of the table come from the stream of which the events are instances. Each VOEvent is translated to a row in the table (internal tables are not shown in this representation). Some columns are defined in the VOEvent standard, such as sky position and time; others are part of the stream definition (Params). We can also represent a collection of similarly structured portfolios, for example an event from stream A together with one from stream B, one being the observation and the other the classification. For portfolios, there is a choice about joins: we can work with multiple tables, each representing the events from a given stream, with portfolio relationships represented as foreign keys; alternatively, we can fuse all stream tables into one, so that each portfolio is represented by a single record of this super-table. Working in this representation, we use SQL queries that depend on Param and other values from the events.
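The two join choices can be sketched with Python's built-in `sqlite3` module. Everything here is assumed for illustration: the table names `stream_a` and `stream_b`, the column names, and the sample rows. The first query joins two stream tables through a foreign key; the view then fuses them into a single super-table with one record per portfolio.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stream_a (              -- hypothetical observation stream
    ivorn    TEXT PRIMARY KEY,
    isotime  TEXT,                   -- from the VOEvent standard
    ra_deg   REAL,
    dec_deg  REAL,
    mag      REAL                    -- stream-specific Param
);
CREATE TABLE stream_b (              -- hypothetical classification stream
    ivorn       TEXT PRIMARY KEY,
    cites       TEXT REFERENCES stream_a(ivorn),  -- portfolio link
    label       TEXT,
    probability REAL
);
""")

conn.executemany(
    "INSERT INTO stream_a VALUES (?, ?, ?, ?, ?)",
    [
        ("ivo://example.org/a#1", "2011-06-01T12:00:00", 150.1, 2.2, 17.3),
        ("ivo://example.org/a#2", "2011-06-01T12:05:00", 151.0, 2.5, 19.0),
        ("ivo://example.org/a#3", "2011-06-01T12:10:00", 149.7, 1.9, 16.5),
    ],
)
conn.executemany(
    "INSERT INTO stream_b VALUES (?, ?, ?, ?)",
    [
        ("ivo://example.org/b#1", "ivo://example.org/a#1", "SN", 0.82),
        ("ivo://example.org/b#2", "ivo://example.org/a#3", "CV", 0.40),
    ],
)

# Choice 1: separate stream tables, joined through the citation foreign key.
rows = conn.execute(
    """
    SELECT a.ivorn, a.mag, b.label, b.probability
    FROM stream_a AS a
    JOIN stream_b AS b ON b.cites = a.ivorn
    WHERE a.mag < ? AND b.probability > ?
    ORDER BY a.mag
    """,
    (18.0, 0.5),
).fetchall()

# Choice 2: a fused super-table, one record per portfolio.
# LEFT JOIN keeps observations that have no classification yet.
conn.execute(
    """
    CREATE VIEW portfolio AS
    SELECT a.ivorn AS obs_ivorn, a.isotime, a.ra_deg, a.dec_deg, a.mag,
           b.ivorn AS class_ivorn, b.label, b.probability
    FROM stream_a AS a
    LEFT JOIN stream_b AS b ON b.cites = a.ivorn
    """
)
fused_count = conn.execute("SELECT COUNT(*) FROM portfolio").fetchone()[0]
unclassified = conn.execute(
    "SELECT obs_ivorn FROM portfolio WHERE class_ivorn IS NULL"
).fetchall()

print(rows, fused_count, unclassified)
```

The fused view is convenient for selection and clustering over whole portfolios, at the cost of NULL columns wherever a portfolio lacks an event from one of the streams.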