Autonomic Analytics System for natural gas traders

What is autonomic analytics?

The concept of autonomic analytics derives from the autonomic nervous system.

The autonomic nervous system is a division of the peripheral nervous system that unconsciously controls respiration, cardiac regulation, vasomotor activity and certain reflex actions such as coughing, sneezing and swallowing in the human body.

Akin to this division of the nervous system, several aspects of the fundamental, technical, pricing, mid-office, risk and back-office analytics processes in natural gas trading lend themselves to autonomic control and execution.  Currently, a significant portion of these tasks is carried out manually.

How does autonomic analytics apply to various functions in natural gas trading?

On a natural gas desk, there are a significant number of analysts developing various software artifacts that analyze market conditions, quantify risk or value various kinds of derivatives.

A fundamental analyst may develop a natural gas to coal switching model that requires fuel prices derived from natural gas market settlement prices for futures, basis, and cash prices.

The risk manager requires the same data for market risk analytics.  Mid-office personnel generate P&L, PLEX and VaR reports based on the same data sets.

Back office staff would require the same data for settlements, invoicing and dispute resolutions.  All these functions require the same settlement price data sets.

In small to medium-sized firms, each knowledge worker independently sources the same information using the tools available to them.  The source could be exchanges, data vendors or internal Tier 1 systems of record such as the ETRM system.  The methodology depends on the availability of the data sources and the skill set of the individual analyst.  Excel-based systems typically link to an Excel file, built from a downloaded CSV, on a share somewhere within the organization.  IT-savvy analysts may retrieve the data from the ETRM system using an application programming interface (API), assuming the IT governance function has granted them access to market data sets, or they may write scripts to download the data from the exchange over FTP and parse it.

In every case, separate tools are developed and run manually by several personas in the natural gas trading cycle.  The simple function of acquiring settlement data in the above example is thus replicated several times across the organization.  Additionally, the process is not systematic and is prone to breakdowns.

 

What if ……

……there was a single robust tool, written once for the entire organization, that ran autonomously on a schedule and obtained the data from the designated source with very little manual intervention?

After obtaining the information in the form of a CSV file, it then parsed that data into a single repository, coupled it with the relevant information used to query the data, and made it available to any knowledge worker or persona in the organization through a single application programming interface (API) with the ability to specify query criteria.

This software artifact could be triggered on a schedule, retrieve and store data, then generate a spreadsheet or a report, or trigger other software artifacts to generate the relevant pieces of information used by every person within the organization.

Going a step further, it then triggers an event that other software artifacts respond to and run predefined models to extract actionable insights that are then presented to the end user either on demand or via notifications as they become available.

All this without a single knowledge worker lifting a finger and hitting “Enter”.  That is autonomic analytics.
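The single repository and query API described above can be sketched in a few lines. This is only a minimal illustration, not the design of any particular product: the column layout, the contract symbols, the prices and the function names `load_settlements` and `query_settlements` are all hypothetical, and an in-memory SQLite database stands in for the shared repository.

```python
import csv
import io
import sqlite3

# Hypothetical settlement file layout with made-up prices; real exchange files differ.
SAMPLE_CSV = """trade_date,contract,settle
2020-01-02,NGG20,2.122
2020-01-02,NGH20,2.101
2020-01-03,NGG20,2.130
"""

def load_settlements(conn, csv_text):
    """Parse a settlement CSV and store it in one shared table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS settlements ("
        "trade_date TEXT, contract TEXT, settle REAL, "
        "PRIMARY KEY (trade_date, contract))"
    )
    rows = [
        (r["trade_date"], r["contract"], float(r["settle"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    # INSERT OR REPLACE makes reloading the same file idempotent.
    conn.executemany("INSERT OR REPLACE INTO settlements VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)

def query_settlements(conn, contract=None, start=None, end=None):
    """Return settlements that match the optional query criteria."""
    sql = "SELECT trade_date, contract, settle FROM settlements WHERE 1=1"
    params = []
    if contract is not None:
        sql += " AND contract = ?"
        params.append(contract)
    if start is not None:
        sql += " AND trade_date >= ?"
        params.append(start)
    if end is not None:
        sql += " AND trade_date <= ?"
        params.append(end)
    sql += " ORDER BY trade_date, contract"
    return conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
load_settlements(conn, SAMPLE_CSV)
front_month = query_settlements(conn, contract="NGG20")
```

The fundamental analyst, the risk manager and the back office would all call the same query function with different criteria instead of each sourcing the file independently.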

Requirements for an Autonomic Analytics System for trading natural gas?

The system would automate acquisition of the relevant data sets based on a trigger or some stimulus.  The trigger could be a scheduler, or an event generated by an observer or a manual trigger to begin the process of downloading and processing a data set.

NYMEX uploads settlements to its FTP server around 5:30 pm.  A task on a scheduler (Task Manager, Cron, Tidal or Quartz based) triggers a script that retrieves the data and stores it on a file system.  The script then sends out an event indicating completion of the download.  Several observers record this event.  Some of them trigger subsequent actions, such as extracting settlements for natural gas futures contracts and storing them in a fact table of a data warehouse, which in turn raises a new event.  A third event handler then triggers the creation and storage of spreads.  An event generated by the third handler triggers a visualization script.  If the visualization script is triggered before all of its prerequisite events have been recorded, it waits until the outstanding events occur and then runs.

In this manner, a sequence of tasks is accomplished based on schedules, sufficiency triggers or manual stimulus.
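The chain of events and sufficiency triggers described above can be sketched as a small in-process event bus. This is an illustrative sketch under stated assumptions: the `EventBus` class, the event names and the extra `weather_loaded` prerequisite are hypothetical, and in production the first event would be raised by a scheduler job rather than by hand.

```python
class EventBus:
    """Minimal in-process event bus with sufficiency triggers."""

    def __init__(self):
        self.recorded = set()   # events observed so far
        self.handlers = []      # (name, prerequisites, func, emitted event)
        self.ran = []           # handler names in the order they ran

    def register(self, name, requires, func, emits=None):
        self.handlers.append((name, frozenset(requires), func, emits))

    def publish(self, event):
        """Record an event, then run each handler whose prerequisites are all met."""
        queue = [event]
        while queue:
            self.recorded.add(queue.pop(0))
            for name, requires, func, emits in self.handlers:
                if name in self.ran or not requires <= self.recorded:
                    continue  # already ran, or still waiting on an event
                func()
                self.ran.append(name)
                if emits:
                    queue.append(emits)

log = []
bus = EventBus()
bus.register("extract_futures", {"download_complete"},
             lambda: log.append("futures stored"), emits="futures_stored")
bus.register("build_spreads", {"futures_stored"},
             lambda: log.append("spreads stored"), emits="spreads_stored")
# The visualization needs two events; 'weather_loaded' is a hypothetical second input.
bus.register("visualize", {"spreads_stored", "weather_loaded"},
             lambda: log.append("charts drawn"))

bus.publish("download_complete")   # a scheduler job would raise this event
after_download = list(log)         # the visualization is still waiting here
bus.publish("weather_loaded")      # the last prerequisite arrives, charts are drawn
```

The visualization handler is skipped on the first pass because one of its prerequisites is missing, and it runs as soon as the outstanding event is recorded.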

A trader or analyst arriving the next morning in time for market open is presented with the required data sets, reports and visualizations needed to analyze the market and make appropriate trading decisions.  Data, visualizations and narratives are syndicated to dashboards, enterprise systems, mobile apps, blogs and social apps like Slack.

 

An example of a manual process to generate a data product at a prominent data vendor to natural gas traders?

Energiewerks Ensights was used to implement an Autonomic Analytics System at a data / report vendor that generated natural gas production forecasts and daily natural gas supply and demand forecasts.

To begin with, the data / report vendor’s processes for generating the various components of the Supply and Demand report were all Excel based and manual.  The analyst would wake up in the wee hours of the morning every day and run a set of SQL-based queries in an in-house tool.  Each data set returned would be exported to a CSV file.  These CSV files were uploaded to a share on the data vendor’s file system.

The analyst would then detach the gas-weighted degree day report, which came in at 6:15 am.  This report contained actual observations for the previous day and forecast observations for the next 15 days.  The data would be saved into a CSV file, which was then appended to the historical data stored in another CSV file using clipboard-based cut and paste, and saved to a share.

Additional data sets, such as the industrial production numbers and other features for the linear regression models, would be downloaded.  For each component of the supply and demand report, the analyst would then open the individual model spreadsheet and run macros in it to import the relevant CSV files and manipulate the data into the groupings needed.  This was a sequence of manual steps.  Cut / paste this here.  Run Macro A.  Then copy this range over there manually and run Macro B, and so on for each component.  Generate graphs within Excel.

The report sent out to customers was generated manually using an Adobe InDesign template.  The analyst would open the InDesign template, add commentary on the L48 and regional S&D, manually cut and paste the generated graphs and tables into the template, then generate the outgoing PDF and save it to a share.

This process was very error prone, and it was not an easily transferable skill, as it required a fundamental analyst to learn how to use a high-end desktop publishing tool.  Add to that the licensing cost of the desktop publishing tool.

A second analyst would then create an email template, attach the outgoing pdf and run an email delivery program to deliver the report via email to about 1500 recipients.

This inefficient process was repeated every business day and produced just 5 of the roughly 150 reports generated daily across the firm.  The other reports were generated in the same manner, each with its own people-intensive process.

An example of an Autonomic Analytics System at work?

Replacing the above manual process with an Autonomic Analytics System, such as the Energiewerks Ensights toolkit, automated every report generation aspect of the recurring data product with no human touch.

The process was broken down into the following 8 categories of tasks:

  1. Acquire the various datasets
  2. Prepare the datasets for analytics by blending multiple data sets
  3. Store the blended data in a format that can be analyzed by numerical programming tools
  4. Analyze the data using multivariate linear regression
  5. Visualize the results in the form of line charts
  6. Generate PDF files and disseminate that information via the emailing service
  7. Orchestrate the above tasks on an optimum schedule determined by the arrival of data
  8. Operationalize the workflow by deploying tasks to the private cloud

 

For the specific use case detailed above, the Ensights toolkit’s acquire tools were used to query the in-house database using Structured Query Language (SQL).  The query results were wrapped into a tabular structure, in which features were extracted as needed by the model, duplicates removed, data cleaned, derived features generated, and historical, current and forecast feature sets blended together.  The result was then stored in an intermediate CSV file so that it could be ingested into the manual Excel system if required.
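The acquire and prepare steps can be illustrated with a short sketch. It does not show the Ensights toolkit's actual API: the `acquire`, `prepare` and `to_csv` functions, the `degree_days` table, its columns and the numbers are all hypothetical, and an in-memory SQLite database stands in for the in-house database.

```python
import csv
import io
import sqlite3

def acquire(conn, sql):
    """Run a SQL query and wrap the result in a list of dict rows."""
    cur = conn.execute(sql)
    cols = [d[0] for d in cur.description]
    return [dict(zip(cols, row)) for row in cur.fetchall()]

def prepare(history, forecast):
    """Blend history and forecast, drop duplicate dates, add a derived feature."""
    blended = {}
    for kind, rows in (("actual", history), ("forecast", forecast)):
        for row in rows:
            # The first row seen for a date wins, so actuals override forecasts.
            blended.setdefault(row["obs_date"], {**row, "kind": kind})
    out = []
    for date in sorted(blended):
        row = blended[date]
        # Derived feature (hypothetical): heating plus cooling degree days.
        row["total_dd"] = row["hdd"] + row["cdd"]
        out.append(row)
    return out

def to_csv(rows):
    """Serialize the blended rows to CSV text for the intermediate file."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Made-up sample data, including one duplicated actual observation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE degree_days (obs_date TEXT, hdd REAL, cdd REAL)")
conn.executemany(
    "INSERT INTO degree_days VALUES (?, ?, ?)",
    [("2020-01-01", 30.0, 0.0), ("2020-01-01", 30.0, 0.0), ("2020-01-02", 28.5, 0.0)],
)
history = acquire(conn, "SELECT obs_date, hdd, cdd FROM degree_days ORDER BY obs_date")
forecast = [
    {"obs_date": "2020-01-02", "hdd": 27.0, "cdd": 0.0},
    {"obs_date": "2020-01-03", "hdd": 25.0, "cdd": 0.0},
]
rows = prepare(history, forecast)
csv_text = to_csv(rows)
```

Writing the blended result as CSV keeps the intermediate output readable by the existing spreadsheets while the automated pipeline is phased in.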

The analyze tools were then triggered to extract the features from the CSV file and run a multivariate regression to predict the current and forecast values of the supply or demand component.  These results were then appended to historical results and stored separately in an intermediate file.

The dissemination phase tools were then activated, the template chosen and populated with the commentary, tables, graphics, narratives, charts etc., with associated branding.
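As an illustration of the analyze step, the sketch below fits a multivariate linear regression by ordinary least squares using the normal equations. It is written in pure Python only so that it is self-contained; a production system would more likely rely on a numerical library. The feature names, the sample values and the coefficients used to build the synthetic target are made up for the example and are not taken from the vendor's models.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_ols(features, target):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    X = [[1.0] + list(row) for row in features]  # prepend the intercept column
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, target)) for i in range(k)]
    return solve(xtx, xty)

def predict(beta, row):
    """Apply the fitted coefficients to one row of features."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))

# Synthetic data: (heating degree days, industrial production index).
features = [(30.0, 95.0), (25.0, 101.0), (20.0, 104.0), (10.0, 98.0), (5.0, 102.0)]
# The target is built from known, made-up coefficients so the fit can be checked.
target = [50.0 + 1.5 * hdd + 0.2 * ip for hdd, ip in features]

beta = fit_ols(features, target)
forecast_demand = predict(beta, (15.0, 100.0))
```

Because the synthetic target is an exact linear function of the features, the fitted coefficients recover the intercept and slopes used to build it, up to floating-point error.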

The pdf file generated was then coupled with an email template and sent out on a schedule to recipients.
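Coupling the generated PDF with an email template can be sketched with Python's standard `email` package. The addresses, subject, file name and the placeholder PDF bytes below are hypothetical, and the sketch only builds the message; it does not send anything.

```python
from email.message import EmailMessage

def build_report_email(sender, recipients, subject, body, pdf_bytes, filename):
    """Build an email message with the report attached as a PDF."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = sender                   # recipients are blind-copied
    msg["Bcc"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(body)
    msg.add_attachment(
        pdf_bytes, maintype="application", subtype="pdf", filename=filename
    )
    return msg

# Placeholder bytes standing in for the PDF produced by the dissemination tools.
pdf_bytes = b"%PDF-1.4\n% placeholder report\n"
msg = build_report_email(
    sender="reports@example.com",
    recipients=["trader1@example.com", "trader2@example.com"],
    subject="Daily Supply and Demand Report",
    body="Please find today's report attached.",
    pdf_bytes=pdf_bytes,
    filename="supply_demand.pdf",
)
attachments = list(msg.iter_attachments())
```

A scheduled task could then hand the message to a mail server, for example with `smtplib.SMTP(host).send_message(msg)`, where `host` is whatever relay the organization uses.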

The various stages of this entire process occurred at various times overnight.  By the time the analyst got in to work in the morning, or logged in remotely prior to the start of the trading day, the report was ready, and it was delivered to the customers with no further intervention from the analyst once the analyst approved its content.

The autonomic analytics system was expanded to encompass close to 75% of the reports generated.

Benefits of the autonomic analytics system

The analysts experienced the benefits of the autonomic analytics system directly through automation.  Due to the downturn in the industry and the resulting hiring freeze at this data vendor, any departure of an analyst increased the workload of the remaining analysts, as no replacement hires were made.  With the system in place, multiple reports were generated with reliable frequency and delivered to the customer on time.  A significant number of mundane manual tasks prone to human error were eliminated in favor of autonomic tasks.  These tasks just occurred.  No longer was there a need to monitor and redo significant amounts of work when handoffs occurred or analysts went on vacation or left the firm.  A significant portion of the analytics generation was institutionalized and autonomic, allowing the analysts to focus on generating new data products.

Additionally, analysts now focus on interpreting the results generated by the autonomic analytics system.  They are more efficient at sending out intraday research notes, based on observations and notifications received from the system, that alert traders to anomalies detected across the gas pipeline network, instead of having to constantly compile the data before analyzing it.