XML Web Services
Data processing in the SOAP/XML Web services stack includes XML parsing, XML validation, and the mapping and conversion of XML content to internal programming-language-specific data structures, and vice versa. The X/O (XML/object) mapping problem raises type system isomorphism issues and run-time object coherence concerns. The overhead of the XML Web services stack is significant and must be reduced by generating XML schema-specific processing code, using compiler-based techniques that collapse the processing stages at run time to improve performance. These and other compiler/runtime challenges are addressed in our research.
Software: the gSOAP toolkit.
XML Web Services Data Processing Algorithms and Code Generation Tools
R. van Engelen, A Framework for Service-Oriented Computing with C and C++ Web Service Components, to appear in ACM Transactions on Internet Technology, 2008.
This article gives an extensive overview of XML Web services for C and C++ with a focus on compile-time and run-time algorithms for XML schema type system bindings to C/C++ types and XML schema instance parsing, validation, and object (de)serialization. The concepts behind the bindings are generic, which makes the approach applicable to other programming languages. The algorithms are implemented and validated in the gSOAP toolkit framework for developing C and C++ Web services components.
R. van Engelen, G. Gupta, and S. Pant, Developing Web Services for C and C++, in IEEE Internet Computing Journal, March, 2003, pages 53-61.
This paper gives an overview of C/C++ Web services development tools and libraries in comparison to the gSOAP toolkit. The gSOAP toolkit eases the development of Web service applications while providing a platform-independent high-performance Web services solution. The toolkit aims to reduce the time to market of products, lower the operational costs of services by limiting their computational requirements, and improve the quality of Web services through efficient SOAP message processing.
R. van Engelen, M. Govindaraju, and W. Zhang, Exploring Remote Object Coherence in XML Web Services, in the proceedings of IEEE International Conference on Web Services (ICWS), 2006, pages 249-256.
The advantages of XML for connecting heterogeneous systems are plentiful, but rendering programming-language-specific data structures and object graphs in XML presents challenges for systems that require object coherence. Achieving coherence is made difficult by a phenomenon sometimes referred to as the "impedance mismatch" between programming language data types and XML schema types. This paper examines the problem, debunks the X/O-mismatch controversy, and presents a mix of static/dynamic algorithms for accurate XML serialization. Experimental results show that the implementation in C/C++ is efficient and competitive with binary protocols. Application of the approach to other programming languages, such as Java, is also discussed.
R. van Engelen, W. Zhang, and M. Govindaraju, Toward Remote Object Coherence with Compiled Object Serialization for Distributed Computing with XML Web Services, in the proceedings of the workshop on Compilers for Parallel Computing (CPC), 2006, pages 441-455.
This paper introduces hybrid static/dynamic algorithms to support lossless serialization of programming-language specific binary-encoded object graphs to text-based XML trees, while staying within the limits imposed by XML schema validation and the XSD type system. A compiler-based approach is presented to automatically emit serialization routines for C and C++ data types to XML. Experimental results show that the presented compiler-based serialization is efficient and performance is comparable to systems that use binary protocols.
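To make the idea of lossless object-graph serialization concrete, the following is a minimal sketch in C and not the algorithm from the paper. It assumes a hypothetical `struct node` type and illustrative `id`/`ref` attribute names. The type-specific routine `put_node` stands in for code a serialization compiler might emit (the static part), while the pointer table detects shared and cyclic nodes at run time (the dynamic part), so each object is written once and later occurrences become references.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical application type; children may be shared or form cycles. */
struct node {
    int value;
    struct node *left;
    struct node *right;
};

#define MAX_SEEN 64

/* Output buffer plus the run-time table of nodes already serialized. */
struct writer {
    char *buf;
    size_t cap, len;
    const struct node *seen[MAX_SEEN];
    int nseen;
};

/* Appends formatted text, truncating silently if the buffer is full. */
static void emit(struct writer *w, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(w->buf + w->len, w->cap - w->len, fmt, ap);
    va_end(ap);
    if (n > 0) {
        size_t room = w->cap - w->len - 1;
        w->len += ((size_t)n < room) ? (size_t)n : room;
    }
}

/* Type-specific routine of the kind a serialization compiler could emit. */
static void put_node(struct writer *w, const char *tag, const struct node *p) {
    if (p == NULL)
        return;                               /* absent optional element */
    for (int i = 0; i < w->nseen; i++) {
        if (w->seen[i] == p) {                /* already emitted: refer to it */
            emit(w, "<%s ref=\"_%d\"/>", tag, i + 1);
            return;
        }
    }
    assert(w->nseen < MAX_SEEN);
    w->seen[w->nseen++] = p;                  /* register before recursing */
    emit(w, "<%s id=\"_%d\"><value>%d</value>", tag, w->nseen, p->value);
    put_node(w, "left", p->left);
    put_node(w, "right", p->right);
    emit(w, "</%s>", tag);
}

/* Serializes the graph rooted at root into buf; returns the text length. */
size_t serialize_graph(const struct node *root, char *buf, size_t cap) {
    struct writer w = { buf, cap, 0, { NULL }, 0 };
    assert(buf != NULL && cap > 0);
    buf[0] = '\0';
    put_node(&w, "node", root);
    return w.len;
}
```

Registering a node before visiting its children is what lets the routine terminate on cyclic graphs. A deserializer would need the inverse table (id to pointer) to restore the sharing.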
R. van Engelen and K. Gallivan, The gSOAP Toolkit for Web Services and Peer-To-Peer Computing Networks, in the proceedings of the 2nd IEEE International Symposium on Cluster Computing and the Grid (CCGrid), 2002, pages 128-135.
This paper presents the gSOAP stub and skeleton compiler. The compiler provides a unique SOAP-to-C/C++ language binding for deploying C/C++ applications in SOAP Web Services, clients, and peer-to-peer computing networks. gSOAP enables the integration of (legacy) C/C++/Fortran codes, embedded systems, and real-time software in Web Services, clients, and peers that share computational resources and information with other SOAP-enabled applications, possibly across different platforms, language environments, and disparate organizations located behind firewalls. Results on interoperability, legacy code integration, scalability, and performance are given.
High-Performance XML Processing
W. Zhang and R. van Engelen, High-Performance XML Parsing and Validation with Permutation Phrase Grammar Parsers, in the proceedings of the IEEE International Conference on Web Services (ICWS), 2008, pages 286-294.
M. Head, M. Govindaraju, R. van Engelen, and W. Zhang, Benchmarking XML Processors for Applications in Grid Web Services, in the proceedings of IEEE/ACM Supercomputing (SC), 2006.
In this paper we propose a standard benchmark suite for quantifying, comparing, and contrasting the performance of XML processors under a wide range of representative use cases. The benchmarks are defined by a set of XML schemas and conforming documents. To demonstrate the utility of the benchmarks and to provide a snapshot of the current XML implementation landscape, we report the performance of many different XML implementations, on the benchmarks, and draw conclusions about their current performance characteristics. We also present a brief analysis on the current shortcomings and required critical design changes for multi-threaded XML processing tools to run efficiently on emerging multi-core architectures.
W. Zhang and R. van Engelen, A Table-Driven XML Streaming Methodology for High-Performance Web Services, in the proceedings of IEEE International Conference on Web Services (ICWS), 2006, pages 197-206 (best student paper award).
This paper presents a table-driven XML streaming technique, called TDX, for implementing efficient schema-specific XML parsers. TDX utilizes a push-down automaton and a fast lookup table to encode the states of the streaming XML parser. TDX effectively combines parsing and validation into one pass to increase the performance of CPU-bound XML Web services. We developed a parser construction toolkit and applied it to an example Web services application to measure the performance impact compared to high-performance parsers written in C/C++, such as eXpat, gSOAP, and Xerces. The performance results show that TDX is several times faster than these parsers.
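The combination of a push-down automaton with a lookup table can be illustrated with a small sketch. This is not the TDX implementation: it assumes a hypothetical recursive schema in which a `<list>` contains any number of `<item>` or nested `<list>` elements, and a simplified scanner that handles neither attributes nor self-closing tags and does not validate text. Each tag is scanned and checked against the table in the same pass.

```c
#include <assert.h>
#include <string.h>

enum token { T_LIST_OPEN, T_LIST_CLOSE, T_ITEM_OPEN, T_ITEM_CLOSE, N_TOK,
             T_EOF, T_BAD };
enum state { S_START, S_LIST, S_ITEM, S_DONE, N_STATE };
enum action { A_ERR, A_GOTO, A_PUSH, A_POP };   /* A_ERR is the zero default */

struct entry { enum action act; int next; int resume; };

/* Lookup table: one row per state, one column per tag token. */
static const struct entry table[N_STATE][N_TOK] = {
    [S_START] = { [T_LIST_OPEN]  = { A_PUSH, S_LIST, S_DONE } },
    [S_LIST]  = { [T_LIST_OPEN]  = { A_PUSH, S_LIST, S_LIST },
                  [T_LIST_CLOSE] = { A_POP,  0,      0      },
                  [T_ITEM_OPEN]  = { A_GOTO, S_ITEM, 0      } },
    [S_ITEM]  = { [T_ITEM_CLOSE] = { A_GOTO, S_LIST, 0      } },
};

/* Scans the next tag at or after *p and advances *p past it. */
static int next_token(const char **p) {
    const char *s = strchr(*p, '<');
    if (s == NULL) return T_EOF;
    const char *e = strchr(s, '>');
    if (e == NULL) return T_BAD;
    int closing = (s[1] == '/');
    const char *name = s + 1 + closing;
    size_t n = (size_t)(e - name);
    *p = e + 1;
    if (n == 4 && memcmp(name, "list", 4) == 0)
        return closing ? T_LIST_CLOSE : T_LIST_OPEN;
    if (n == 4 && memcmp(name, "item", 4) == 0)
        return closing ? T_ITEM_CLOSE : T_ITEM_OPEN;
    return T_BAD;
}

#define MAX_DEPTH 32

/* Returns 1 if xml conforms to the hypothetical list schema, else 0. */
int validate_stream(const char *xml) {
    int stack[MAX_DEPTH], sp = 0, state = S_START;
    assert(xml != NULL);
    for (;;) {
        int tok = next_token(&xml);
        if (tok == T_EOF) return state == S_DONE;
        if (tok == T_BAD) return 0;
        const struct entry *t = &table[state][tok];
        switch (t->act) {
        case A_GOTO:
            state = t->next;
            break;
        case A_PUSH:                       /* remember where to resume */
            if (sp == MAX_DEPTH) return 0;
            stack[sp++] = t->resume;
            state = t->next;
            break;
        case A_POP:                        /* matching close of a <list> */
            if (sp == 0) return 0;
            state = stack[--sp];
            break;
        default:
            return 0;                      /* schema violation */
        }
    }
}
```

Because every decision is a table lookup, changing the schema means regenerating the table rather than rewriting the driver loop, which is the property a parser construction toolkit can exploit.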
W. Zhang and R. van Engelen, TDX: a High-Performance Table-Driven XML Parser, in the proceedings of the ACM SouthEast conference, 2006, pages 726-731.
This short paper introduces the table-driven XML (TDX) concept for high-performance XML parsing and validation.
M. Head, M. Govindaraju, A. Slominski, P. Liu, N. Abu-Ghazaleh, R. van Engelen, K. Chiu, M. Lewis, A Benchmark Suite for SOAP-based Communication in Grid Web Services, in the proceedings of IEEE/ACM Supercomputing (SC), 2005.
In this paper we propose a standard benchmark suite for quantifying, comparing, and contrasting the performance of SOAP implementations under a wide range of representative use cases. The benchmarks are defined by a set of WSDL documents. To demonstrate the utility of the benchmarks and to provide a snapshot of the current SOAP implementation landscape, we report the performance of many different SOAP implementations on the benchmarks, and draw conclusions about their current performance characteristics.
M. Govindaraju, A. Slominski, K. Chiu, P. Liu, R. van Engelen, and M. Lewis, Toward Characterizing the Performance of SOAP Toolkits, in the proceedings of the 5th IEEE/ACM International Workshop on Grid Computing, 2004, pages 365-372.
In this paper we compare and contrast the performance of widely used SOAP toolkits and draw conclusions about their current performance characteristics. We also provide insights into various design features that can lead to optimized SOAP implementations.
R. van Engelen, Constructing Finite State Automata for High Performance XML Web Services, in the proceedings of the International Symposium on Web Services (ISWS), 2004, pages 975-981.
This paper describes a validating XML parsing method based on deterministic finite state automata (DFA). XML parsing and validation are performed by a schema-specific XML parser that encodes the admissible parsing states as a DFA. A two-level DFA architecture is used to increase efficiency and to reduce the generated code size. The lower-level DFA efficiently parses syntactically well-formed XML messages. The higher-level DFA validates the messages and produces application events associated with transitions in the DFA. Two case studies are presented and performance results are given to demonstrate that the approach supports the implementation of high-performance Web services.
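The two-level arrangement can be sketched as follows. This is a simplified illustration under stated assumptions, not the generated code described in the paper: the schema `<point><x>..</x><y>..</y></point>`, the `struct point` target, and all function names are hypothetical, and the lower level ignores attributes, entities, and whitespace rules. The lower-level automaton walks characters and recognizes tags; the higher-level table validates the element order and triggers an application event (storing a coordinate) on selected transitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical target type for <point><x>..</x><y>..</y></point>. */
struct point { double x, y; };

enum { POINT_O, POINT_C, X_O, X_C, Y_O, Y_C, N_TOK };
enum { H_ERR = -1, H_START, H_POINT, H_X, H_AFTER_X, H_Y, H_AFTER_Y, H_DONE,
       N_HSTATE };

/* Higher-level DFA: rows are states, columns are element tokens. */
static const int hdfa[N_HSTATE][N_TOK] = {
    /*               <point>  </point> <x>    </x>       <y>    </y>      */
    [H_START]   = { H_POINT, H_ERR,   H_ERR, H_ERR,     H_ERR, H_ERR     },
    [H_POINT]   = { H_ERR,   H_ERR,   H_X,   H_ERR,     H_ERR, H_ERR     },
    [H_X]       = { H_ERR,   H_ERR,   H_ERR, H_AFTER_X, H_ERR, H_ERR     },
    [H_AFTER_X] = { H_ERR,   H_ERR,   H_ERR, H_ERR,     H_Y,   H_ERR     },
    [H_Y]       = { H_ERR,   H_ERR,   H_ERR, H_ERR,     H_ERR, H_AFTER_Y },
    [H_AFTER_Y] = { H_ERR,   H_DONE,  H_ERR, H_ERR,     H_ERR, H_ERR     },
    [H_DONE]    = { H_ERR,   H_ERR,   H_ERR, H_ERR,     H_ERR, H_ERR     },
};

/* Maps a tag name to its token, or -1 if the schema does not know it. */
static int to_token(const char *name, int closing) {
    static const char *const names[] = { "point", "x", "y" };
    for (int i = 0; i < 3; i++)
        if (strcmp(name, names[i]) == 0)
            return 2 * i + closing;
    return -1;
}

/* One higher-level transition; fires an event when a value element closes. */
static int hstep(int state, int tok, const char *text, struct point *out) {
    int next = hdfa[state][tok];
    if (state == H_X && next == H_AFTER_X) out->x = atof(text);
    if (state == H_Y && next == H_AFTER_Y) out->y = atof(text);
    return next;
}

/* Lower-level character DFA driving the higher-level DFA; 1 on success. */
int parse_point(const char *xml, struct point *out) {
    enum { L_TEXT, L_LT, L_OPEN, L_CLOSE } ls = L_TEXT;
    char name[16], text[32];
    size_t nlen = 0, tlen = 0;
    int hs = H_START;
    assert(xml != NULL && out != NULL);
    for (const char *p = xml; *p != '\0'; p++) {
        char c = *p;
        switch (ls) {
        case L_TEXT:
            if (c == '<') { ls = L_LT; nlen = 0; }
            else if (tlen + 1 < sizeof text) text[tlen++] = c;
            else return 0;
            break;
        case L_LT:
            if (c == '/') ls = L_CLOSE;
            else if (c == '<' || c == '>') return 0;
            else { name[nlen++] = c; ls = L_OPEN; }
            break;
        case L_OPEN:
        case L_CLOSE:
            if (c == '>') {
                name[nlen] = '\0';
                text[tlen] = '\0';
                int tok = to_token(name, ls == L_CLOSE);
                if (tok < 0) return 0;
                hs = hstep(hs, tok, text, out);
                if (hs == H_ERR) return 0;
                tlen = 0;
                ls = L_TEXT;
            } else if (c == '<' || nlen + 1 >= sizeof name) {
                return 0;
            } else {
                name[nlen++] = c;
            }
            break;
        }
    }
    return ls == L_TEXT && hs == H_DONE;
}
```

Keeping the character-level automaton generic and confining schema knowledge to the small higher-level table is one way the split can limit generated code size.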
XML Web Services Security
R. van Engelen and W. Zhang, An Overview and Evaluation of Web Services Security Performance Optimizations, in the proceedings of the IEEE International Conference on Web Services (ICWS), 2008, pages 137-144.
R. van Engelen and W. Zhang, Identifying Opportunities for Web Services Security Performance Optimizations, in the proceedings of the IEEE Services Computing Conference (SCC), 2008.
WS-Security is an essential component of the Web services protocol stack. WS-Security provides end-to-end security properties, thereby allowing non-secure transport intermediaries to participate in message exchanges, a key advantage in Web-based systems. However, compared to point-to-point secure messaging with TLS, WS-Security incurs a significant performance penalty. In this paper, we identify several opportunities for optimizing WS-Security.
M. Cafaro, D. Lezzi, S. Fiore, G. Aloisio, and R. van Engelen, The GSI plug-in for gSOAP: building cross-grid interoperable secure grid services, in the proceedings of the International Conference on Parallel Processing and Applied Mathematics (PPAM), workshop on Models, Algorithms and Methodologies for Grid-enabled Computing Environment (MAMGCE), 2007.
In this paper we present the GSI plug-in for gSOAP, an open source solution to the problem of securing Web services in grid environments. Our plug-in allows the development of Globus Security Infrastructure (GSI) enabled Web Services and clients, with full support for mutual authentication/authorization, delegation of credentials and connection caching.
G. Aloisio, M. Cafaro, I. Epicoco, D. Lezzi, and R. van Engelen, The GSI plug-in for gSOAP: Enhanced Security, Performance, and Reliability, in the International Conference on Information Technology (ITCC), 2005, IEEE Press, Volume I, pages 304-309.
In this paper we report on the current status of the GSI plug-in for gSOAP, an open source solution to the problem of securing Web services in grid environments.
G. Aloisio, M. Cafaro, D. Lezzi, and R. van Engelen, Secure Web Services with Globus GSI and gSOAP, in the proceedings of the EUROPAR conference, 2003.
In this paper we describe a plug-in for the gSOAP toolkit that allows development of Web Services exploiting the Globus Security Infrastructure (GSI). Our plug-in allows the development of GSI enabled Web Services and clients, with full support for mutual authentication/authorization, delegation of credentials and connection caching. The software provides automatic, transparent transport-level security for Web Services and is freely available.
XML Web Services Applications
Y. Li, M. Mascagni, R. van Engelen, and Q. Cai, A Grid Workflow-Based Monte Carlo Simulation Environment, in the Journal of Neural, Parallel, and Scientific Computations, 2004, pages 439-454.
This article presents the Grid-Computing Infrastructure for Monte Carlo Applications (GCIMCA). GCIMCA provides services specific to grid-based Monte Carlo simulation applications, including the Monte Carlo subtask schedule service using the N-out-of-M strategy, the facilities of application-level checkpointing, the partial result validation service, and the intermediate value validation service. Taking advantage of grid workflow paradigms and GCIMCA facilities, we implemented a Grid Workflow-based Monte Carlo (GWMC) simulation environment. Workflow management services are implemented to manage the Monte Carlo simulation process. We intend to provide a trustworthy and manageable grid-computing environment for large-scale and high-performance Monte Carlo simulation applications.
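The effect of the N-out-of-M strategy on completion time can be shown with a short calculation. The sketch below assumes the strategy means that M subtasks are scheduled and the simulation is considered complete as soon as any N of them have finished; that reading, the function name, and the use of `INFINITY` to mark a subtask that never finishes are assumptions made for illustration, not details taken from the article.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>

/* Orders doubles ascending; INFINITY sorts last. */
static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* finish[i] is the finish time of subtask i, or INFINITY if it never
   completes. Returns the time at which n of the m subtasks are done.
   Returns INFINITY if fewer than n complete (or if memory runs out). */
double n_out_of_m_finish(const double *finish, size_t m, size_t n) {
    assert(finish != NULL && n >= 1 && n <= m);
    double *sorted = malloc(m * sizeof *sorted);
    if (sorted == NULL)
        return INFINITY;
    for (size_t i = 0; i < m; i++)
        sorted[i] = finish[i];
    qsort(sorted, m, sizeof *sorted, cmp_double);
    double t = sorted[n - 1];              /* n-th earliest completion */
    free(sorted);
    return t;
}
```

With finish times {4, never, 2, 9, 3} and N = 3, the run completes at time 4 even though one subtask never returns and another is slow; requiring all five would never complete.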
R. van Engelen, Code Generation Techniques for Developing Web Services for Embedded Devices, in the proceedings of the 9th ACM Symposium on Applied Computing (SAC), Nicosia, Cyprus, 2004, pages 854-861.
This paper presents specialized code generation techniques and runtime optimizations for developing light-weight XML Web services for embedded devices. The optimizations are implemented in the gSOAP Web services development environment for C and C++. The system supports the industry-standard XML-based Web services protocols that are intended to deliver universal access to any networked application that supports XML. With the standardization of the Web services protocols and the availability of toolkits such as gSOAP for developing embedded Web services, new opportunities emerge to integrate embedded systems into larger frameworks of interconnected applications and systems accessing dynamic resources on the Web ranging from handheld and embedded devices to databases, clusters, and Grids.
Y. Li, M. Mascagni, and R. van Engelen, GCIMCA: A Globus and SPRNG Implementation of a Grid-Computing Infrastructure for Monte Carlo Applications, in the proceedings of the PDPTA 2003 conference, 2003.
This paper introduces the Grid-Computing Infrastructure for Monte Carlo Applications (GCIMCA). GCIMCA provides services specific to grid-based Monte Carlo simulation applications, including the Monte Carlo subtask schedule service using the N-out-of-M strategy, the facilities of application-level checkpointing, the partial result validation service, and the intermediate value validation service.
R. van Engelen, Pushing the SOAP Envelope with Web Services for Scientific Computing, in the proceedings of the International Conference on Web Services (ICWS), 2003, pages 346-354.
This paper investigates the usability, interoperability, and performance issues of SOAP/XML-based Web and Grid Services for scientific computing. Several key issues are addressed that are important for the deployment of high-performance and mission-critical SOAP/XML-based services. A successful deployment can be achieved by limiting the overhead of XML encoding through exploiting XML schema extensibility to define optimized XML data representations and by reducing message passing latencies through message chunking, compression, routing, and streaming.
R. van Engelen, K. Gallivan, G. Gupta, and G. Cybenko, XML-RPC Agents for Distributed Scientific Computing, in the proceedings of the IMACS Conference, Lausanne, Switzerland, August 2000.
This paper presents the use of XML-RPC to achieve data interoperability between scientific applications in a distributed environment. Remote procedure calling with XML-RPC is programming language independent and operates across different platforms. We have designed and implemented tools for the automatic generation of XML-RPC stub routines and XML serialization converters to support application data interoperability in component-based problem-solving environments for distributed scientific computing. Locating and indexing XML-RPC services is performed using mobile agents. The agents serve to set up problem-solving sessions and to connect remote applications.
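As an illustration of what a generated stub and its serialization converters do, here is a minimal hand-written sketch in C. It is not output of the tools described in the paper: the function names and the two-integer remote procedure are hypothetical, the method name is assumed to need no XML escaping, and the response decoder only handles a single `<int>` result. Only the message layout (`methodCall`, `methodName`, `params`, `methodResponse`) follows the XML-RPC format.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Serializes a call to a hypothetical remote procedure method(a, b).
   Returns the message length, or -1 if buf is too small. */
int stub_encode_call(char *buf, size_t cap, const char *method, int a, int b) {
    assert(buf != NULL && method != NULL);
    int n = snprintf(buf, cap,
        "<?xml version=\"1.0\"?>"
        "<methodCall><methodName>%s</methodName><params>"
        "<param><value><int>%d</int></value></param>"
        "<param><value><int>%d</int></value></param>"
        "</params></methodCall>", method, a, b);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Extracts a single <int> result from a methodResponse.
   Returns 0 on success, -1 on a fault or an unexpected message. */
int stub_decode_int_response(const char *xml, int *result) {
    assert(xml != NULL && result != NULL);
    const char *p = strstr(xml, "<methodResponse>");
    if (p == NULL || strstr(p, "<fault>") != NULL)
        return -1;
    p = strstr(p, "<int>");
    if (p == NULL)
        return -1;
    p += 5;                                   /* skip "<int>" */
    char *end;
    long v = strtol(p, &end, 10);
    if (end == p || strncmp(end, "</int>", 6) != 0)
        return -1;
    *result = (int)v;
    return 0;
}
```

A caller would pass the encoded message as the body of an HTTP POST and hand the reply to the decoder; the transport is omitted here. Generating such routines from procedure signatures spares application code from dealing with the XML directly.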