Built on a solid foundation of highly functional, lightweight domain objects, Metadata Technology has designed, developed, and matured its SDMX component architecture over the past few years to gain wide coverage of the SDMX standard. From querying web services to hosting web services, from consuming data to constructing data, Metadata Technology’s component stack contains components that meet the needs of most SDMX-aware systems, and it is continually growing. The ‘plug and play’ design, made possible by programming paradigms such as Dependency Injection and Aspect-Oriented Programming (AOP), allows users to pick and choose the components that aid in the development of a robust software system.
We set out to design a framework that was fully scalable, reusable, robust, and decoupled from the exchange format and version, with the primary goal of removing the complexity from the standard by offering simple, easy-to-use interfaces. We believe this is what we have achieved.
SDMX Beans
The SDMX Beans underpin the framework and are what may be referred to as ‘domain objects’, ‘data transfer objects’, or ‘business objects’. The SDMX Beans are an object representation of the information being exchanged. It does not matter whether the information was exchanged in SDMX, EDI, or CSV, as the SDMX Beans are not coupled to a particular syntax or version of the standard. Because the SDMX Beans are format and version independent, the software that uses them is decoupled from how the data was exchanged.
SDMX Beans – Flavour of Bean
Due to the conflicting use cases for the SDMX domain objects, Metadata Technology has defined three core ‘flavours of Bean’: the SDMX Beans (immutable), the SDMX Mutable Beans, and the SDMX Super Beans. The SDMX Bean is an immutable Bean closely related to the information model. The SDMX Mutable Bean is a mutable version of the SDMX Bean. The SDMX Super Bean, also immutable, contains artefacts by composition rather than by reference. The following sections expand on these points.
1. Immutable Beans
Immutable: ‘Once the object is created its state can never be changed’
Immutability is a very powerful concept, and absolutely necessary when writing SDMX domain objects. On creation the domain object validates all of its state, and from then on it can never be modified. Immutability ensures that any component performing an action that requires information from one or more SDMX Beans can be sure that the internal state of each object is valid.
If modification to the Bean is required, then this is still possible by creating a Mutable Bean Instance. Each Immutable Bean can create a Mutable Bean, and vice-versa. Mutable Beans are explained in the section below.
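The idea can be illustrated with a minimal Java sketch. The class below is a hypothetical, heavily simplified stand-in for an immutable Bean, not the actual SDMX Beans API: it validates its state once in the constructor, takes a defensive copy of its input, and exposes only read-only views.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for an immutable SDMX Bean (not the real API).
final class CodelistBeanSketch {
    private final String agencyId;
    private final String id;
    private final List<String> codes;

    CodelistBeanSketch(String agencyId, String id, List<String> codes) {
        // All state is validated once, at creation time.
        if (agencyId == null || agencyId.isEmpty()) {
            throw new IllegalArgumentException("agencyId is required");
        }
        if (id == null || id.isEmpty()) {
            throw new IllegalArgumentException("id is required");
        }
        if (codes == null || codes.isEmpty()) {
            throw new IllegalArgumentException("at least one code is required");
        }
        this.agencyId = agencyId;
        this.id = id;
        // Defensive copy: later changes to the caller's list cannot leak in.
        this.codes = Collections.unmodifiableList(new ArrayList<String>(codes));
    }

    String getAgencyId() { return agencyId; }
    String getId() { return id; }

    // Read-only view: attempts to modify it throw UnsupportedOperationException.
    List<String> getCodes() { return codes; }
}
```

Because there are no setters and the list is both copied and wrapped, a component that receives such an object does not need to re-validate it.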
The Downside and Dangers of Having Mutable Beans
Without the concept of immutability, an object can be modified at any point, by any process. It is important to note that even if process A is working on the object, there is nothing to stop another, unrelated process from working on it at the same time. Due to this mutable (changeable) nature, any component working with Mutable Beans would have to perform some level of validation to ensure that what it is processing is valid. Without checking validity the component is assuming the Bean to be valid, and this is where software systems break down and behave in unexpected ways. Never assume!
This changeable state could be a real problem in a software application such as an SDMX Registry. An SDMX Registry has to ensure referential integrity: all cross-referenced artefacts must exist. The danger of mutable beans is that an artefact’s cross-reference declarations can be modified even after they have been validated. The modification does not break any rules imposed by the SDMX standard, so regardless of the internal validation, the mutable bean would not detect any wrongdoing. However, the modification is invalid with regard to the part of the process that the registry is currently performing. It is not impossible to build solid systems with mutable beans, but it is very hard to build a simple, elegant solution with them, as the object can be in any state at any time.
2. Mutable Beans
Whilst Immutable Beans provide a very powerful and secure mechanism for transmitting data internally in a system, there still needs to be a mechanism to create the Beans in the first place. Metadata Technology Immutable Beans can be created from SDMX-ML or EDI, and also programmatically with the Mutable Beans. A Mutable Bean can exist in any state, valid or invalid, and therefore meets use cases such as creation by means of user interfaces. Each Mutable Bean has the means to create an Immutable instance. On creation the Immutable Bean is verified to ensure that all required fields have been set and that the set values are valid. Mutable Beans are not intended to be passed around a system, only to allow easy creation of the Immutable Beans.
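The round trip between the two flavours might look like the following Java sketch. All class and method names here are hypothetical and chosen for illustration; the point is only that the mutable object may be incomplete at any time, and that validation happens at the moment the immutable instance is created.

```java
// Hypothetical names throughout; a sketch of the mutable/immutable round trip.
final class ConceptBeanSketch {
    private final String id;
    private final String name;

    ConceptBeanSketch(ConceptMutableBeanSketch mutable) {
        // Validation happens here, when the immutable instance is created.
        if (mutable.getId() == null || mutable.getId().trim().isEmpty()) {
            throw new IllegalStateException("Concept id must be set");
        }
        if (mutable.getName() == null || mutable.getName().trim().isEmpty()) {
            throw new IllegalStateException("Concept name must be set");
        }
        this.id = mutable.getId();
        this.name = mutable.getName();
    }

    String getId() { return id; }
    String getName() { return name; }

    // Each immutable bean can hand out a fresh, editable copy of itself.
    ConceptMutableBeanSketch getMutableInstance() {
        ConceptMutableBeanSketch copy = new ConceptMutableBeanSketch();
        copy.setId(id);
        copy.setName(name);
        return copy;
    }
}

// May be in any state, valid or invalid, e.g. while a user fills in a form.
final class ConceptMutableBeanSketch {
    private String id;
    private String name;

    String getId() { return id; }
    void setId(String id) { this.id = id; }
    String getName() { return name; }
    void setName(String name) { this.name = name; }

    // Throws IllegalStateException if required fields are missing.
    ConceptBeanSketch getImmutableInstance() {
        return new ConceptBeanSketch(this);
    }
}
```

Editing the copy returned by `getMutableInstance()` leaves the original immutable object untouched; a new immutable instance has to be created, and therefore re-validated, before the change can be used elsewhere.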
3. Super Beans
‘The SDMX Super Beans provide referenced structures by composition rather than by reference’
The SDMX Super Beans are very powerful when used in appropriate situations. Many SDMX structures can contain cross references to other structures. These cross references are defined in the XML and in the SDMX Beans as simple references that give the structure type and unique identifiers. With only these few simple attributes it is possible to resolve a cross-referenced structure. Some structures rely heavily on cross references, and these tend to be the more powerful structures; one such example is the Data Structure Definition (DSD). The DSD is required to understand how the data is structured, and it cross references Codelists and Concepts. Many components that require information from a DSD also require the cross-referenced structures. The SDMX Super Beans provide these cross-referenced structures by composition rather than by reference, which means that a DataStructureSuperBean contains the Codelists and Concepts themselves as opposed to containing references to them.
An SDMX Super Bean is the full package in a single object, allowing for simple interfaces that define a single Super Bean as an input parameter as opposed to the SDMX Bean and all cross-referenced Beans. The Super Beans, just like the SDMX Beans, are immutable, meaning that one can be sure that they are not only valid but also contain valid referenced structures. Without the concept of Super Beans, any component requiring a structure along with its references would also need to validate that all the required structures are present, which would require the cross references to be resolved and additional error handling. The SDMX Super Beans simplify interfaces, simplify structure navigation, and remove the need for any additional validation and error-handling code.
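The difference between holding a reference and holding the referenced structure can be sketched in Java as follows. The types are hypothetical simplifications (a dimension that refers to a codelist by identifier, and a super-bean-like object that holds the resolved codes); the real Super Beans are considerably richer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Bean flavour (hypothetical): a dimension only knows the identifier of its codelist.
final class DimensionRefSketch {
    final String dimensionId;
    final String codelistId;

    DimensionRefSketch(String dimensionId, String codelistId) {
        this.dimensionId = dimensionId;
        this.codelistId = codelistId;
    }
}

// Super Bean flavour (hypothetical): every reference is resolved at creation,
// so the object contains the codes themselves.
final class DataStructureSuperSketch {
    private final Map<String, List<String>> codesByDimension;

    DataStructureSuperSketch(List<DimensionRefSketch> dimensions,
                             Map<String, List<String>> codelistsById) {
        Map<String, List<String>> resolved = new LinkedHashMap<String, List<String>>();
        for (DimensionRefSketch dimension : dimensions) {
            List<String> codes = codelistsById.get(dimension.codelistId);
            if (codes == null) {
                // Fail once, up front, rather than in every consuming component.
                throw new IllegalArgumentException(
                        "Unresolved codelist reference: " + dimension.codelistId);
            }
            resolved.put(dimension.dimensionId,
                    Collections.unmodifiableList(new ArrayList<String>(codes)));
        }
        this.codesByDimension = Collections.unmodifiableMap(resolved);
    }

    // Consumers navigate straight to the codes; no lookup or error handling needed.
    List<String> getCodes(String dimensionId) {
        return codesByDimension.get(dimensionId);
    }
}
```

A component that accepts a `DataStructureSuperSketch` needs a single parameter and can assume that every referenced codelist is present.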
Structure Parser
The Structure Parser is responsible for converting SDMX-ML structures into SDMX Beans; this includes Subscriptions, Registrations, and structure query documents. The structures may be in any version of SDMX, and may reside in a Structure Document, Registry Submission Document, or a Query Result Document. The Structure Parser is also capable of creating Super Beans and Mutable Beans, and of converting SDMX Beans back to SDMX-ML of any version. The Structure Parser has links to the EDI Parser, so it can also handle input and output of SDMX-EDI documents. Another main responsibility of the Structure Parser is to resolve references, both internal and external.
EDI Parser
The EDI Parser can read and write SDMX-EDI documents, both data and structure files. The result of reading an EDI document is an EDIWorkspace containing SDMX Beans, which represent the structures, and DataReader(s), which allow the data to be read in a generic way. The EDI Parser is also capable of creating SDMX-EDI from SDMX Beans and from data. As the data is passed in as a DataReader, the underlying data format can be anything (e.g. SDMX, EDI, CSV (as a specific API implementation)).
Structure Query Broker
The Structure Query Broker is responsible for consuming queries for structures in SDMX Bean form, and brokering the query to the requested ‘end point’. The end point may be a REST or SOAP web service, a TCP service, or a file. The Structure Query Broker communicates with the Structure Query Builder to build an SDMX representation of the query, in the required version of SDMX, and communicates with the File or Web Service Broker to perform the query. The Structure Query Broker is also responsible for communicating with the Structure Parser to parse the response document (as it could be in any version of SDMX), and both components work together to resolve any required dependencies. The Structure Query Broker will perform any additional queries required to obtain the requested dependencies. The result is a single, simple interface method for the user to call, which returns the requested structure(s). Behind the scenes many components work together, potentially performing multiple web service requests to multiple end points.
Structure Query Builder
The Structure Query Builder is responsible for consuming a structure query, in the form of an SDMX Bean, and building an SDMX representation of the query. The result is either a 2.1 REST URL, or an SDMX version 1.0/2.0/2.1 XML document that can be brokered to a web service.
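As an illustration of the REST case, the sketch below assembles a structure query URL following the SDMX 2.1 REST path convention of resource, agency, identifier, and version, with `all` and `latest` as the wildcard defaults. The class name, method signature, and base URL are assumptions made for this example and are not part of the component's API; URL encoding and the `detail` parameter are left out for brevity.

```java
// Hypothetical helper, not the Structure Query Builder's real interface.
final class StructureRestUrlSketch {

    // Builds {base}/{resource}/{agencyID}/{resourceID}/{version}[?references=...]
    static String build(String baseUrl, String resource, String agencyId,
                        String resourceId, String version, String references) {
        StringBuilder url = new StringBuilder(baseUrl);
        if (!baseUrl.endsWith("/")) {
            url.append('/');
        }
        url.append(resource)
           .append('/').append(orDefault(agencyId, "all"))
           .append('/').append(orDefault(resourceId, "all"))
           .append('/').append(orDefault(version, "latest"));
        // Optional: ask the service to include referenced structures as well.
        if (references != null && !references.isEmpty()) {
            url.append("?references=").append(references);
        }
        return url.toString();
    }

    private static String orDefault(String value, String fallback) {
        return (value == null || value.isEmpty()) ? fallback : value;
    }
}
```

For example, `StructureRestUrlSketch.build("https://example.org/ws/rest", "codelist", "ECB", "CL_FREQ", "1.0", "children")` yields `https://example.org/ws/rest/codelist/ECB/CL_FREQ/1.0?references=children`.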
Subscription Notification Processor
The Subscription Notification Processor is responsible for hosting a web service that can consume a Registry Subscription Event and react to it. The reaction may be to query the registry to obtain the full structure (the Registry Subscription contains the URN of the modified structure), and then to make a call on the appropriate interface. It is the responsibility of the user of this component to provide an implementation of the interface, allowing the user to decide how to react to structure events.
Structure Persistence
The Structure Persistence Component is responsible for persisting structures in a database layer. Transaction handling, reference resolution, and validation are provided. The database schema is created automatically thanks to Hibernate technology. However, as with all Fusion Components, each aspect of this component is loosely coupled, allowing users to provide their own implementations if required. This means a user could provide their own validation layer or database layer, and use only the aspects required to meet their use case. The implication is that it is theoretically possible for a user to plug the Structure Persistence Component onto an existing structural metadata repository, and either use the provided database only to store additional structures, or not use the provided DAO layer at all.
File Query Processor
The File Query Processor is responsible for processing an SDMX query against a file which contains SDMX or EDI structures. The File Query Processor can be used to resolve external references, where the referenced structure may exist in a file containing many structures. However, it could be used for more exotic use cases such as providing a Web Service on top of a simple SDMX/EDI file (with the aid of the SDMX Servlet Component) – passing the response back in any version of SDMX or EDI regardless of the underlying file format.
Data Parser
The Data Parser Component is responsible for Data Validation, Transformation, and Information Extraction. The Data Parser can also generate SDMX data schemas such as the 2.0 Compact Schema and the 2.1 Structure Specific Schema. All components that work with data have been designed to be fully scalable and performant. Therefore there is no upper limit on the size of the dataset being processed, and the processing is performed efficiently. Validation can be performed against a schema generated ‘on the fly’. However, we have found that for large datasets this type of validation is extremely slow. Consequently, the Data Parser component contains a Java validation engine which is in the order of 10 to 100 times faster than schema validation. Another benefit of the Java validation is that the error messages generated can be made more meaningful to the user, as they are given in the context of the SDMX Information Model and not in the context of the XML schema.
Data Query Broker
The Data Query Broker is responsible for consuming a Data Query, in the form of an SDMX Bean, and brokering the query to the required ‘end point’. The end point can be a REST or SOAP Web Service, or a File. The Data Query Broker communicates with the Data Query Builder component to build the appropriate representation of the SDMX Bean Query, in SDMX-ML or REST syntax. As with all other components, the SDMX query can be built in any version of SDMX. Once the Data Query Broker has sent the query, it is responsible for consuming the response and, if required, transforming the returned dataset to the requested version and format of SDMX or EDI (or any other requested format, such as CSV or a custom format).
Data Query Builder
The Data Query Builder is responsible for consuming an SDMX Bean representation of a Data Query, and building an SDMX-ML or REST representation in the required format and version of SDMX.
Data Reader
The Data Reader component contains implementations of the Data Reader Engine Interface. The Data Reader Engine Interface is a very powerful interface that has been designed to read multidimensional data with very simple interface methods. The Data Reader can be thought of as an iterator, allowing the user to iterate through the series and observations of a dataset, and to move to different locations within the dataset. All our components that work with data do so at the interface level, and do not concern themselves with the underlying format of the data. This allows the components to be fully decoupled from the way the data is exchanged, and to process the data at the level of the information that is exchanged. The Data Reader Component provides Data Reader Engine implementations for Compact 1.0/2.0, Generic 1.0/2.0/2.1, Structure-Specific 2.1, and delimited data such as CSV. This allows these data types to be read in a generic way. Moreover, the user can provide their own implementation for any other data type not included in this component: by simply adding a new implementation, the components can automatically transform the data into any other data format, or import the data into a database, with no additional code required.
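The iterator style of reading can be sketched in Java as follows. The interface and its method names are hypothetical and much smaller than a real Data Reader Engine; the delimited implementation assumes rows of the form `seriesKey,period,value` with the rows of each series grouped together.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, cut-down reader interface: components depend only on this.
interface DataReaderSketch {
    boolean moveNextSeries();
    boolean moveNextObservation();
    String getSeriesKey();
    String getObsTime();
    String getObsValue();
}

// One possible implementation, for rows of "seriesKey,period,value".
final class CsvDataReaderSketch implements DataReaderSketch {
    private final List<String[]> rows = new ArrayList<String[]>();
    private int seriesStart = -1;    // first row of the current series
    private int nextSeriesStart = 0; // first row of the following series
    private int obsIndex = -1;       // row of the current observation

    CsvDataReaderSketch(String csv) {
        for (String line : csv.split("\n")) {
            if (!line.trim().isEmpty()) {
                rows.add(line.trim().split(","));
            }
        }
    }

    public boolean moveNextSeries() {
        if (nextSeriesStart >= rows.size()) {
            return false;
        }
        seriesStart = nextSeriesStart;
        String key = rows.get(seriesStart)[0];
        int end = seriesStart;
        // Consecutive rows with the same key belong to the same series.
        while (end < rows.size() && rows.get(end)[0].equals(key)) {
            end++;
        }
        nextSeriesStart = end;
        obsIndex = seriesStart - 1; // positioned before the first observation
        return true;
    }

    public boolean moveNextObservation() {
        if (seriesStart < 0 || obsIndex + 1 >= nextSeriesStart) {
            return false;
        }
        obsIndex++;
        return true;
    }

    public String getSeriesKey() { return rows.get(seriesStart)[0]; }
    public String getObsTime() { return rows.get(obsIndex)[1]; }
    public String getObsValue() { return rows.get(obsIndex)[2]; }
}

// A format-agnostic consumer: it never sees CSV, only the interface.
final class ObservationCounterSketch {
    static Map<String, Integer> countBySeries(DataReaderSketch reader) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        while (reader.moveNextSeries()) {
            int n = 0;
            while (reader.moveNextObservation()) {
                n++;
            }
            counts.put(reader.getSeriesKey(), n);
        }
        return counts;
    }
}
```

Supporting another format would mean writing another implementation of the interface; `ObservationCounterSketch` would work with it unchanged.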
Data Writer
The Data Writer component contains implementations of the Data Writer Engine Interface. The Data Writer Engine Interface is a very powerful interface that has been designed to write multidimensional data with very simple interface methods. The Data Writer allows users, such as database administrators, to return SDMX, EDI, CSV, or other data from their systems without having to know the syntax. All the user has to understand is that they have a series and they have some observations; the Data Writer is responsible for outputting the appropriate data in the appropriate format. The Data Writer component provides implementations of the Data Writer Engine for Compact 1.0/2.0, Generic 1.0/2.0/2.1, Structure Specific 2.1, and delimited data such as CSV. As with all other components, the user can provide their own implementation of the Data Writer Engine, allowing data to be written in any format they require. The implication of this is that, by providing an implementation and working in conjunction with the SDMX Servlet component, the user can automatically consume an SDMX query of any format and version via a web service, and respond to the query in their own custom response format. All of this can be achieved by implementing a few simple methods.
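A minimal Java sketch of this style of interface is shown below. The interface, its two methods, and the CSV layout (`seriesKey,period,value`) are assumptions made for illustration and do not reproduce the real Data Writer Engine.

```java
// Hypothetical, cut-down writer interface: the caller only knows about
// series and observations, never about the output syntax.
interface DataWriterSketch {
    void startSeries(String seriesKey);
    void writeObservation(String obsTime, String obsValue);
}

// One possible implementation that emits "seriesKey,period,value" rows.
final class CsvDataWriterSketch implements DataWriterSketch {
    private final StringBuilder out;
    private String currentKey;

    CsvDataWriterSketch(StringBuilder out) {
        this.out = out;
    }

    public void startSeries(String seriesKey) {
        this.currentKey = seriesKey;
    }

    public void writeObservation(String obsTime, String obsValue) {
        if (currentKey == null) {
            throw new IllegalStateException("startSeries must be called first");
        }
        out.append(currentKey).append(',')
           .append(obsTime).append(',')
           .append(obsValue).append('\n');
    }
}
```

Code that exports data from a database would call `startSeries` and `writeObservation` on whichever implementation it is handed; swapping the implementation changes the output format without touching the export code.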
Constraints Engine
The Constraints Engine is a powerful component that is extremely useful for data dissemination. It processes dataset(s) to build a model of the valid key combinations. Given a selection of codes in various dimensions, the Constraints Engine determines all the valid codes, for each dimension, that are available for selection. This allows a query form to be built incrementally and prevents a user from making invalid selection combinations, which is a real asset when working with many dimensions and a sparse cube. The user can select codes for dimensions in any order and can even select invalid codes; the Constraints Engine will still be able to determine the resultant valid combinations.
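One plausible way to compute the available codes is sketched below in Java. This is an assumption about the general technique, not a description of the Constraints Engine's internals: the set of series keys present in the data is held in memory, and for each dimension the sketch collects the codes of every key that matches the user's selections in all the other dimensions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: each key is an array holding one code per dimension.
final class ConstraintsSketch {
    private final List<String[]> keys;

    ConstraintsSketch(List<String[]> keys) {
        this.keys = keys;
    }

    // selection maps a dimension index to the codes chosen for it.
    // Returns, per dimension, the codes that are still selectable.
    List<Set<String>> availableCodes(Map<Integer, Set<String>> selection) {
        int dimensionCount = keys.isEmpty() ? 0 : keys.get(0).length;
        List<Set<String>> result = new ArrayList<Set<String>>();
        for (int d = 0; d < dimensionCount; d++) {
            Set<String> codes = new TreeSet<String>();
            for (String[] key : keys) {
                if (matchesIgnoring(key, selection, d)) {
                    codes.add(key[d]);
                }
            }
            result.add(codes);
        }
        return result;
    }

    // A key matches if it satisfies the selection in every dimension except
    // the one being evaluated, so the user can still change that dimension.
    private static boolean matchesIgnoring(String[] key,
                                           Map<Integer, Set<String>> selection,
                                           int ignoredDimension) {
        for (Map.Entry<Integer, Set<String>> entry : selection.entrySet()) {
            int dimension = entry.getKey();
            Set<String> chosen = entry.getValue();
            if (dimension != ignoredDimension
                    && !chosen.isEmpty()
                    && !chosen.contains(key[dimension])) {
                return false;
            }
        }
        return true;
    }
}
```

With the keys `{A, USD}`, `{M, USD}`, and `{M, GBP}`, selecting `A` in the first dimension leaves only `USD` available in the second, while the first dimension still offers both `A` and `M`. Scanning every key for every dimension is adequate for a sketch; a production engine would need an indexed model for large cubes.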
Data Visualisation
The Data Visualisation component is responsible for creating a version of XML that is useful for data visualisation. The XML is not SDMX-ML; it is a bespoke XML which combines the data and the structures, and groups the information in a way that is easy for visual components to process. This component was designed to take the hard work away from the user interface and to decouple it from the standard SDMX-ML. It allows very simple graphing components to be written for GUIs, without the overhead of having to write SDMX parsers and logic.
If a GUI has to process the underlying SDMX then, once again, it would have to either couple itself to a version of the standard, and a data format, or would have to replicate the Parser components in order to make it format independent. This Data Visualisation Component relieves that pressure from the GUI.
It is appreciated that the data visualisation XML output format may not suit all requirements. However, it is an easy task to create a custom implementation: the Data Visualisation Component is more prescriptive of the paradigm than the implementation.
Other Components
- Data Dissemination
Web Service
The Web Service Component is extremely useful if web service support is required. This component provides an out-of-the-box solution which supports REST GET (the parametrised URL introduced in 2.1) for Structure and Data queries, and HTTP POST for queries and submissions. The Web Service Component will consume a query or a submission in any format and version of SDMX/EDI, and is responsible, with the aid of other core components, for parsing it and building an SDMX Bean representation. The Web Service Component is then responsible for brokering the information to the appropriate interface, for which the user of the component must provide an implementation.
An example of this is if an application wishes to support data queries, then all the application needs to do is to provide an implementation of the Data Query Processor Interface, and the SDMX Bean Retrieval Manager Interface (required to obtain the appropriate structural metadata). By providing an implementation of these two interfaces (there are already pre-provided implementations that meet common needs) the application will automatically support data queries via the REST URLs, or SDMX-ML query documents.
Error handling is automatically provided, as is compression of the response (GZIP) if this response format is supported by the client. Equally, the user may wish to support Structure submissions: this requires a single interface to be implemented. Of course, behind the scenes a lot of work is done by the Web Service component, as the Structures can be in the form of a Structure Document, a Registry Document, or an EDI document. Furthermore, it may be a DELETE submission, or APPEND submission and so on. The Web Service Component takes a lot of hard work out of providing SDMX Web Services, as the information can be provided in many different forms.
In simple terms, this component is responsible for simplifying the message, and brokering it to a single end point, taking all of the hard work out of the message processing. It is possible for the user to configure the component to describe exactly what is and is not supported, allowing clear error messages to be displayed when the user attempts to perform an unsupported action. A simple web form is also provided out of the box, allowing for documents to be submitted.
Web Service Broker
The Web Service Broker Component is responsible for calling Web Services, POSTing or GETting information. This component provides support for GZIP (compressed) responses, which can save a lot of time when consuming large amounts of data. The Web Service Broker also provides support for calling TCP services and sending emails.
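Handling a GZIP response on the client side can be sketched with the standard Java library alone. The helper below is hypothetical and is not the Web Service Broker's API; it assumes the caller has already obtained the response stream and the value of the `Content-Encoding` header, and that the body is UTF-8 text.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: reads a response body, decompressing it when the
// server indicated GZIP via the Content-Encoding header.
final class GzipResponseSketch {

    static String readBody(InputStream body, String contentEncoding) throws IOException {
        InputStream in = "gzip".equalsIgnoreCase(contentEncoding)
                ? new GZIPInputStream(body)
                : body;
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return buffer.toString("UTF-8");
        } finally {
            in.close();
        }
    }
}
```

Buffering the whole body in memory keeps the sketch short; for large datasets the decompressed stream would normally be passed straight to a parser instead.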
Other Components
- End-point Administrator
- Web Service Proxy
Structure Viewer
The Structure Viewer component provides many Flex visualisation and editing capabilities for SDMX Structures.
Structure Retriever
The Structure Retriever component provides support for calling SDMX Beans services from Flex, using BlazeDS. The Structure Retriever contains a cache, which streamlines the processing.
Graphing
The Graphing component is a Flex component which enables creation of Cartesian graphs and bar charts (both time series and cross sectional) and tables.
Other Components
- Query Form Builder
Server Technologies
Java 1.6
Spring 3.1
Hibernate
AspectJ
Web Technology
Adobe Flex
Build Technologies
Maven
CruiseControl
Artifactory
