Professor Hong Zhu
Professor of Computer Science
School of Engineering, Computing and Mathematics
Role
I am the Chair of the Cloud Computing and Cybersecurity research group, which is established in 2020 and evolved from the Applied Formal Methods (AFM) research group established in 2002. I also teach on cloud computing and software engineering modules in both undergraduate and postgraduate programmes of computer science.
Areas of expertise
Software Engineering:
- Software development methdology,
- Software testing,
- Software Design,
- Software modelling,
- Software languages,
- Cloud native software
Teaching and supervision
Courses
- Computer Science for Cyber Security (BSc (Hons))
- Computer Science (BSc (Hons))
- Computing Science (MSc, PGDip, PGCert)
- Data Science and Artificial Intelligence (MSc)
Modules taught
- Requirements Specification and Software Design (Module Leader, Lecturer, Practical class tutor)
- Software Analysis and Test (Module Leader, Lecturer, Practical class tutor)
- Software Engineering (Module Leader)
- Introduction to Distributed Systems (Module Leader, Lecturer, Practical class tutor)
- Big Data and The Cloud (Module Leader, Lecturer, Practical class tutor)
Supervision
I am continually supervising several PhD students
Research
My research interests are in software engineering, which include software development methdologies, especially for cloud and web based software, software languages for modelling, design and programming, software automation for testing, etc.
Example of recent work include:
Cloud Software Development Languages and Environments
The project is a departmental REF 2020 key project. It aims at developing programming languages and integrated development environment for cloud native software. It consists of two parts:
- CAOPLE: A novel programming language for cloud native applications in the microserivces architecture. It is based on the service agent model of concurrent and distributed programming model.
- CIDE: A novel integrated DevOps environment for cloud software development. It integrates software development environment with cluster management, monitoring and operation tools in one environemnt so that staging and operation activities can be shifted left.
For more information, visit the CAOPLE site.
Formal Methods for Cloud Software
The project is also a depatmental REF 2020 key project. It aims at developing a formal method and supporting tools for develop cloud native software. The main focus is one algebraic approach. The main outcomes so far include the following:
- SOFIA: A service oriented formal specification language in algebraic theory. It is designed to formally specify services in the algebraic approach.
- Tr2Ont: A technique and tool for transformation of algebraic specifications into ontological description of web services.
- Metrics and tools for measureing the quality of algebraic specifications and ontological descriptions of web services;
- Automatic testing of web services based on algebraic specifications.
Groups
Projects
- Analysis and Testing of Artificial Intelligence Applications
- Case Studies with Analysis And Testing AI applications
Publications
Journal articles
-
Zhu H, Bayley I
, 'Discovering Boundary Values of Feature-based Machine Learning Classifiers through Exploratory Datamorphic Testing'
Journal of Systems and Software 187 (2022)
ISSN: 0164-1212AbstractPublished here Open Access on RADARTesting has been widely recognised as difficult for AI applications. This paper proposes a set of testing strategies for testing machine learning applications in the framework of the data- morphism testing methodology. In these strategies, testing aims at exploring the data space of a classification or clustering application to discover the boundaries between classes that the machine learning application defines. This enables the tester to understand precisely the be- haviour and function of the software under test. In the paper, three variants of exploratory strategies are presented with the algorithms implemented in the automated datamorphic test- ing tool Morphy. The correctness of these algorithms are formally proved. Their capability and cost of discovering borders between classes are evaluated via a set of controlled experiments with manually designed subjects and a set of case studies with real machine learning models.
-
Zhu H, Bayley I, 'Discovering and Investigating Cyberpatterns: The Road Map to Link Data Analytics with Reusable Knowledge'
IEEE Systems, Man, and Cybernetics Magazine 4 (3) (2018) pp.14-22
ISSN: 2380-1298 eISSN: 2333-942XAbstractOne of the most compelling challenges for data analytics is to obtain reusable, verifiable and transferable knowledge from data. One solution to this is the pattern-oriented approach to knowledge representation proposed by this paper. The foundation of the approach is a formal theory of patterns, including a formal language for defining them, and an algebra of operations for composing patterns and instantiating them. This paper outlines a roadmap for the study of so-called cyberpatterns: the patterns of cyberspace. It explores the scope of research, views the current state of the art and identifies the key research questions.Published here Open Access on RADAR -
Zhu H, Liu D, Bayley I, Aldea A, Yang Y, Chen Y, 'Quality Model and Metrics of Ontology for Semantic Descriptions of Web Services'
Tsinghua Science and Technology 22 (3) (2017) pp.254-272
ISSN: 1007-0214AbstractAn ontology is a conceptualisation of domain knowledge. It is employed in semantic web services technologies to describe the meanings of services so that they can be dynamically searched for and composed according to their meanings. It is essential for dynamic service discovery, composition and invocation. Whether an ontology is well constructed has a tremendous impact on the accuracy of the semantic description of a web service, the complexity of the semantic definitions, the efficiency of processing messages passed between services, and the precision and recall rates of service retrieval from service registrations. However, measuring the quality of an ontology remains an open problem. Work on the evaluation of ontologies do exist, but they are not in the context of semantic web services. This paper addresses this problem by proposing a quality model of ontology and defining a set of metrics that enables the quality of an ontology to be measured objectively and quantitatively in the context of semantic descriptions of web services. These metrics cover the contents, presentation and usage aspects of ontologies. The paper also presents a tool that implements these metrics and reports a case study on five real-life examples of web services.Published here Open Access on RADAR -
Miller JA, Zhu H, Zhang J, 'Guest Editorial: Advances in Web Services Research'
IEEE Transactions on Services Computing 10 (1) (2017) pp.5-8
ISSN: 1939-1374AbstractResearch on Web Services and Services Computing began shortly after the start of the new millennium and with the first major research conference on the subject, the International Conference on Web Services, starting in 2003. Due to the new and practical utility of Web service technology, many researchers began to work in this field. New ideas, applications and related technologies continue to invigorate this discipline. This special issue of IEEE Transactions on Services Computing considers four such topics that lead to “Advances in Web Services Research”: Cloud Services; Polices and Agreements; Services Engineering; and Service/Process Mining. It consists of the best extended papers from the two premier research conferences on services computing: the 2015 IEEE International Conference on Web Services and the 2015 IEEE Conference on Services Computing.Published here Open Access on RADAR -
Zhu H, Bayley I, 'On the Composability of Design Patterns'
IEEE Transactions on Software Engineering 41 (11) (2015) pp.1138-1152
ISSN: 0098-5589AbstractPublished here Open Access on RADARIn real applications, design patterns are almost always to be found composed with each other. It is crucial that these compositions be validated. This paper examines the notion of validity, and develops a formal method for proving or disproving it, in a context where composition is performed with formally defined operators on formally specified patterns. In particular, for validity, we require that pattern compositions preserve the features, semantics and soundness of the composed patterns. The application of the theory is demonstrated by a formal analysis of overlap-based pattern compositions and a case study of a real pattern-oriented software design.
-
Liu D, Zhu H, Bayley I, 'Transformation of Algebraic Specifications into Ontological Semantic Descriptions of Web Services'
International Journal of Services Computing 2 (1) (2014) pp.58-71
ISSN: 2330-4464 eISSN: 2330-4472AbstractThe accurate description of service semantics plays a crucial role in service discovery, composition and interaction. Most work in this area has been focused on ontological descriptions, which are searchable and machineunderstandable.
However, they do not define service functionality in a verifiable and testable manner In contrast, formal specification techniques, having evolved over the past 30 years, can define semantics in such a manner, but they have not yet been widely applied to service computing because the specifications produced are not searchable. There is a huge gap between these two methods of semantics description. This paper bridges the gap by advancing a transformation technique. It specifies services formally in an algebraic specification language, and then, extracts an ontological description of domain knowledge and service semantics as profiles in an ontology description language such as OWL-S. This brings the desired searchability benefits. The paper presents a prototype tool for performing this transformation and reports a case study to demonstrate the feasibility of our approach.
-
Zhu H, Bayley I, 'An algebra of design patterns'
ACM Transactions on Software Engineering and Methodology 22 (3) (2013)
ISSN: 1049-331XAbstractPublished hereIn a pattern-oriented software design process, design decisions are made by selecting and instantiating appropriate patterns, and composing them together. In our previous work, we enabled these decisions to be formalized by defining a set of operators on patterns with which instantiations and compositions can be represented. In this article, we investigate the algebraic properties of these operators. We provide and prove a complete set of algebraic laws so that equivalence between pattern expressions can be proven. Furthermore, we define an always-terminating normalization of pattern expression to a canonical form which is unique modulo equivalence in first-order logic.
By a case study, the pattern-oriented design of an extensible request-handling framework, we demonstrate two practical applications of the algebraic framework. First, we can prove the correctness of a finished design with respect to the design decisions made and the formal specification of the patterns. Second, we can even derive the design from these components.
-
Anand S, Burke EK, Chen TY, Clark J, Cohen MB, Grieskamp W, Harman M, Harrold MJ, McMinn P, Bertolino A, Li JJ, Zhu H, 'An Orchestrated Survey of Methodologies for Automated Software Test Case Generation'
Journal of Systems and Software 86 (2013) pp.1978-2001
ISSN: 0164-1212 eISSN: 1873-1228AbstractPublished hereTest case generation is among the most labour-intensive tasks in software testing. It also has a strong impact on the effectiveness and efficiency of software testing. For these reasons, it has been one of the most active research topics in software testing for several decades, resulting in many different approaches and tools. This paper presents an orchestrated survey of the most prominent techniques for automatic generation of software test cases, reviewed in self-standing sections. The techniques presented include: (a) structural testing using symbolic execution, (b) model-based testing, (c) combinatorial testing, (d) random testing and its variant of adaptive random testing, and (e) search-based testing. Each section is contributed by world-renowned active researchers on the technique, and briefly covers the basic ideas underlying the method, the current state of the art, a discussion of the open research problems, and a perspective of the future development of the approach. As a whole, the paper aims at giving an introductory, up-to-date and (relatively) short overview of research in automatic test case generation, while ensuring a comprehensive and authoritative treatment.
-
Zhu H, Liu Y, Lu K, Wang X, 'Self-Adaptive Management of Idle Nodes in Large Scale Systems'
International Journal of Next-Generation Computing 4 (2) (2013)
ISSN: 0976-5034AbstractThe real workload on a large scale computer system varies from time to time. Often it has many idle nodes during most operation time. These idle nodes consume energy, but do nothing useful. To save the huge amount of energy wasted by such active idle nodes, most modern compute nodes are equipped with multiple level dynamic sleep mechanisms to reduce power consumption. However, awaking sleeping nodes takes time. The deeper a node sleeps, the less energy it consumes, but the longer wakeup latency. How to balance between the systems energy consumption and the response time is a key problem in the power management of large scale systems. This paper proposes a self-adaptive approach to manage the sleep states of idle nodes to achieve low energy consumption and high performance at the same time. The proposed approach has two distinctive features. First, idle nodes are hierarchical organised. In this model, idle nodes are classified into several groups according to their sleep states. Each group contains nodes of same level of sleep depth and forms a reserve pool of a certain readiness level. When a resource is requested, nodes in the pool of highest level of readiness are preferentially allocated. When the nodes in the pool of the highest readiness level are not sufficient, the nodes in the pool(s) of next level(s) of readiness are allocated. After each allocation and reclaim of nodes, the numbers of nodes in each level of pools are adjusted by changing the sleep depth of the nodes up and down. Thus, the reserve pools can be maintained for high performance requirement. When resources are released from applications, they are placed back to reserve pools and put into different levels of sleep states to save energy. Second, the sizes of reservation pools are self-adaptive. Obviously, a key factor that affects the effectiveness of the idle node management is the sizes of the reserve pools. Fixed sizes of reserve pools would not be effective due to the time varying nature of workload on large scale systems. The proposed approach employs a self-adaptive mechanism in which the sizes of reserve pools are dynamically adjusted during the execution of the system according to how well the research pools meet the need of computation resources. Our experiments demonstrated that our approach can significantly improve energy efficiency in large scale systems without significant scarification of performance.
-
Bertolino A, Foster H, Li JJ, Zhu H, 'Special Section on Automation of Software Test'
Journal of Systems and Software 86 (2013) pp.1977-1977
ISSN: 0164-1212 eISSN: 1873-1228Published here -
Zhu H, 'An institution theory of formal meta-modelling in graphically extended BNF'
Frontiers of Computer Science 6 (1) (2012) pp.40-56
ISSN: 2095-2228AbstractMeta-modelling plays an important role in model driven software development. In this paper, a graphic extension of BNF (GEBNF) is proposed to define the abstract syntax of graphic modelling languages. From a GEBNF syntax definition, a formal predicate logic language can be induced so that meta-modelling can be performed formally by specifying a predicate on the domain of syntactically valid models. In this paper, we investigate the theoretical foundation of this metamodelling approach. We formally define the semantics of GEBNF and its induced predicate logic languages, then apply Goguen and Burstall" s institution theory to prove that they form a sound and valid formal specification language for meta-modelling.Published here -
Zhu H, Zhang Y, 'Collaborative testing of web services'
IEEE Transactions on Services Computing 5 (1) (2012) pp.116-130
ISSN: 1939-1374AbstractPublished hereSoftware testers are confronted with great challenges in testing Web Services (WS) especially when integrating to services owned by other vendors. They must deal with the diversity of implementation techniques and to meet a wide range of test requirements. However, they are in lack of software artefacts, the means of control over test executions and observation on the internal behaviour of the other services. An automated testing technique must be developed with capability of testing on-the-fly non-intrusively and non-disruptively. Addressing these problems, this paper proposes a framework of collaborative testing in which test tasks are completed through the collaboration of various test services registered, discovered and invoked at runtime using the ontology of software testing STOWS. The composition of test services are realized by using test brokers, which are also test services but specialized in the coordination of other test services. The ontology can be extended and updated through an ontology management service to support a wide open range of test activities, methods, techniques and types of software artefacts. The paper presents a prototype implementation of the framework in semantic WS and demonstrates its feasibility by a running example. The paper also reports an experimental evaluation of the scalability.
-
Shan L, Zhu H, 'Unifying the semantics of models and meta-models in the multi-layered UML meta-modelling hierarchy'
International Journal of Software and Informatics 6 (2) (2012) pp.163-200
ISSN: 1673-7288AbstractUML is defined through metamodelling in a four-layer metamodel hierarchy, where metamodels and meta-metamodels are also presented in the form of UML class diagrams. However, the meanings of models and metamodels as well as the basic concepts involved in modelling and metamodelling are not precisely defined in the OMG documentations. In the past few years, a large amount of research efforts on the formalisation of UML semantics has been reported in the literature, but how to formalise the metamodel hierarchy still remains an open problem. This paper presents a framework of unified formal semantics of the metamodel hierarchy. It is based on our previous work on the formal semantics of UML, in which we proposed the notions of descriptive semantics and functional semantics as two separate aspects of UML semantics. The former describes the structure of a model's instances, and the latter characterises the functional and behavioural properties of its instances. This paper further develops this approach by generalising it to metamodels and meta-metamodels. We prove that the semantics of models, metamodels and meta-metamodels can be defined in a unified way. The basic concepts involved in the metamodel hierarchy, such as subject domain and instance-of relation, can also be precisely defined based on the unified semantics framework. -
Bayley I, Zhu H, 'A formal language for the expression of pattern compositions'
International Journal on Advances in Software 4 (3&4) (2011) pp.354-366
ISSN: 1942-2628Published here -
Liu Y, Zhu H, 'A survey of research on power management techniques for high performance systems'
Software: Practice and Experience 40 (11) (2010) pp.943-964
ISSN: 0038-0644AbstractPublished hereThis paper surveys the research on power management techniques for high performance systems. These include both commercial high performance clusters and scientific high performance computing (HPC) systems. Power consumption has rapidly risen to an intolerable scale. This results in both high operating costs and high failure rates so it is now a major cause for concern. It is imposed new challenges to the development of high performance systems. In this paper, we first review the basic mechanisms that underlie power management techniques. Then we survey two fundamental techniques for power management: metrics and profiling. After that, we review the research for the two major types of high performance systems: commercial clusters and supercomputers. Based on this, we discuss the new opportunities and problems presented by the recent adoption of virtualization techniques, and again we present the most recent research on this. Finally, we summarise and discuss future research directions.
-
Ou S, Yang K, Zhu H, 'Dynamic algorithms for autonomic pervasive services in mobile wireless environments'
International Journal of Autonomic Computing 1 (4) (2010) pp.391-407
ISSN: 1741-8569AbstractPervasive services have attracted numerous attentions from both industries and academia. Its very high adaptability demand makes autonomic computing (AC), and in particular, personal autonomic computing (PAC), an ideal enabling technology. Some generic AC frameworks have been proposed in the literature but without considering the dynamic deployment and re-configuration of pervasive service components in a self-controlled manner. Further, very little attention has been paid to providing autonomic pervasive services in mobile wireless environments. This paper addresses above issues. In particular, a service component self-deployment algorithm using partitioning techniques is proposed and evaluated; a simple service re-configuration algorithm is also presented; and a service self-healing mechanism is proposed. In addition, an AC environment is proposed and its major building blocks being discussed. This AC environment serves as the container of the proposed service component self-deployment, self-configuration and self-healing mechanisms. A prototype of the system has been designed and implemented. Experiment results have shown the effectiveness of the proposed mechanisms.Published here -
Bayley I, Zhu H, 'Formal specification of the variants and behavioural features of design patterns'
Journal of Systems and Software 83 (2) (2010) pp.209-221
ISSN: 0164-1212AbstractPublished hereThe formal specification of design patterns is widely recognized as being vital to their effective and correct use in software development. It can clarify the concepts underlying patterns, eliminate ambiguity and thereby lay a solid foundation for tool support. This paper further advances a formal meta-modelling approach that uses first order predicate logic to specify design patterns. In particular, it specifies both structural and behavioural features of design patterns and systematically captures the variants in a well-structured format. The paper reports a case study involving the formal specification of all 23 patterns in the Gang of Four catalog. It demonstrates that the approach improves the accuracy of pattern specifications by covering variations and clarifying the ambiguous parts of informal descriptions.
-
Zhu H, Wang F, Wang S, 'On the convergence of autonomous agent communities'
Multiagent and Grid Systems 6 (4) (2010) pp.315-352
ISSN: 1574-1702AbstractCommunity is a common phenomenon in natural ecosystems, human societies as well as artificial multi-agent systems such as those in web and Internet based applications. In many self-organizing systems, communities are formed evolutionarily in a decentralized way through agents' autonomous behavior. This paper systematically investigates the properties of a variety of the self-organizing agent community systems by a formal qualitative approach and a quantitative experimental approach. The qualitative formal study by applying formal specification in SLABS and Scenario Calculus has proven that mature and optimal communities always form and become stable when agents behave based on the collective knowledge of the communities, whereas community formation does not always reach maturity and optimality if agents behave solely based on individual knowledge, and the communities are not always stable even if such a formation is achieved. The quantitative experimental study by simulation has shown that the convergence time of agent communities depends on several parameters of the system in certain complicated patterns, including the number of agents, the number of community organizers, the number of knowledge categories, and the size of the knowledge in each category.Published here -
Shan L, Zhu H, 'Generating structurally complex test cases by data mutation: a case study of testing an automated modelling tool'
The Computer Journal 52 (5) (2009) pp.571-588
ISSN: 0010-4620AbstractPublished hereGeneration of adequate test cases is difficult and expensive, especially for testing software systems whose input is structurally complex. This paper presents an approach called data mutation to generating a large number of test data from a few seed test cases. It is inspired by mutation testing methods, but differs from them in the aim and the way that mutation operators are defined and used. While mutation testing is a method for measuring test adequacy, data mutation is a method of test case generation. In traditional mutation testing, mutation operators are used to transform the program under test. In contrast, mutation operators in our approach are applied on input data to generate test cases, hence called data mutation operators. The paper reports a case study with the method on testing an automated modelling tool to illustrate the applicability of the proposed method. Experiment data clearly demonstrate that the method is adequate and cost effective, and able to detect a large proportion of faults.
-
Mao X, Shan L , Zhu H, Wang J, 'An adaptive casteship mechanism for developing multi-agent systems'
International Journal of Computer Applications in Technology 31 (1/2) (2008) pp.17-34
ISSN: 0952-8091AbstractIn this paper, we propose an adaptive casteship mechanism for modelling anddesigning adaptive Multi-Agent Systems (MAS). In our approach, caste is the modular unit andabstraction that specify agents" behaviour. Adaptive behaviours of agents are captured as the change of castes during their lifecycles by executing"join" ,"quit" ,"activate" and"deactivate" operations on castes. The formal semantics of caste operations are rigorously defined. The properties of agent" s adaptive behaviours are formally specified and proved. A graphicalnotation of caste transition diagrams and a number of rules for check consistency are designed. An example is also presented throughout the paper.Published here -
Bayley I, Zhu H, 'On the Composition of Design Patterns'
Quality Software Proceeedings on quality software 2008 (QSIC) , 12-13 (2008) pp.27-36
ISSN: 1550-6002AbstractDesign patterns are usually applied in a composed form with each other. It is crucial to be able to formally reason about how patterns can be composed and to prove the properties of composed patterns. Based on our previous work on formal specification of design patterns and formal reasoning about their properties, this paper focuses on the composition of design patterns. A notion of composition of patterns with respect to overlaps is formally defined based on two operations on design patterns, which are the specialisation of a pattern with constraints and the lifting of a pattern with a subset of components as the key. The composition of design patterns is illustrated by the composition of Composite, Strategy and Observer patterns. A case study of the formalisation of the relationship between patterns as suggested by GoF is also reported.Published here -
Bayley I, Zhu H, 'Specifying behavioural features of design patterns in first order logic'
IEEE International Computer Software and Applications (COMPSAC 2008), Proceedings. (2008) pp.203-210
ISSN: 0730-3157AbstractThe formal specification of design patterns is widely recognised as being vital to their effective and correct use in software development. It can clarify the concepts underlying patterns, eliminate ambiguity and thereby lay a solid foundation for tool support. This paper further advances an approach that uses first order predicate logic to specify design patterns by capturing the dynamic behaviour represented in sequence diagrams. A case study of all 23 patterns in the Gang of Four catalogue demonstrates that it can not only capture dynamic features but also simplify the specification of structural properties.Published here -
Zhu H, Shan L, 'Modelling Web Services in the agent-oriented modelling language and environment CAMLE'
International Journal of Simulation and Process Modelling 3 (1& 2) (2007) pp.26-44
ISSN: 1740-2123AbstractA particular difficulty in the development of Web Services applications is caused by the lack of communications between developers from different vendors. This paper investigates how modelling can help to solve the problem using the caste-centric agent-oriented modelling language and environment CAMLE, and illustrates the method by an example of online auction service. The use of CAMLE in model consistency check and specification generation is also discussed. Software engineers are enabled to specify not only the service provider's functionality and behaviour, but also the requirements and restrictions on service requesters' behaviours.Published here -
Lightfoot D, Zhu H, 'Caste: a Step Beyond Object Orientation'
Lecture Notes in Computer Science 2789 (2003) pp.59-62
ISSN: 0302-9743AbstractThe concepts of object and class have limitations in many situations, such as where inheritance does not offer natural solutions and design patterns have to be applied at a cost to polymorphism, extendibility and software reusability. Proposals have been advanced to generalize the concepts of object and class, such as in Active Oberon. We adapt the concept of agents to software engineering, propose a new concept called caste ? a sort of ‘dynamic class’ of agents, and examine how it helps solve common problems in object orientation.
-
Zhu H, He X, 'A methodology of testing high-level Petri nets'
Information and Software Technology 44 (8) (2002) pp.473-489
ISSN: 0950-5849AbstractPetri nets have been extensively used in the modelling and analysis of concurrent and distributed systems. The verification and validation of Petri nets are of particular importance in the development of concurrent and distributed systems. As a complement to formal analysis techniques, testing has been proven to be effective in detecting system errors and is easy to apply. An open problem is how to test Petri nets systematically, effectively and efficiently. An approach to solve this problem is to develop test criteria so that test adequacy can be measured objectively and test cases can be generated efficiently, even automatically. In this paper, we present a methodology of testing high-level Petri nets based on our general theory of testing concurrent software systems. Four types of testing strategies are investigated, which include state-oriented testing, transition-oriented testing, flow-oriented testing and specification-oriented testing. For each strategy, a set of schemes toobserve and record testing results and a set of coverage criteria to measure test adequacy are defined. The subsumption relationships and extraction relationships among the proposed testing methods are systematically investigated and formally proved.Published here -
Zhu H, He XD, 'A Methodology of Testing High-level Petri Nets'
Information and Software Technology 44 (2002) pp.473-489
ISSN: 0950-5849 eISSN: 1873-6025 -
Zhu H, 'Slabs: a Formal Specification Language for Agent-based Systems'
International Journal of Software Engineering and Knowledge Engineering 11 (5) (2002) pp.529-558
ISSN: 0218-1940 eISSN: 1793-6403Published here -
Zhu H, Jin L, Diaper D, Bai G, 'Software requirements validation via task analysis'
Journal of Systems and Software 61 (2) (2002) pp.145-169
ISSN: 0164-1212AbstractAs a baseline for software development, a correct and complete requirements definition is one foundation of software quality.Previously, a novel approach to static testing of software requirements was proposed in which requirements definitions are tested on a set of task scenarios by examining software behaviour in each scenario described by an activity list. Such descriptions of software behaviour can be generated automatically from requirements models. This paper investigates various testing methods for selectingtest scenarios. Data flow, state transition and entity testing methods are studied. A variety of test adequacy criteria and their combinations are formally defined and the subsume relations between the criteria are proved. Empirical studies of the testing methods and the construction of a prototype testing tool are reported.Published here -
Zhu H, Jin LZ, Diaper D, Bai GH, 'Software Requirements Validation Via Task Analysis'
Journal of Systems and Software 61 (2002) pp.145-169
ISSN: 0164-1212 eISSN: 1873-1228 -
Yan S, Zhu H, 'Metamorphic Testing on Scientific Programs for Solving Second Order Elliptic Differential Equations'
Journal of Software: Testing, Verification & Reliability [in press]
ISSN: 0960-0833 eISSN: 1099-1689AbstractOpen Access on RADARPractical problems in scientific computation that solve differential equations rarely have explicit exact solutions. Therefore, verifying the correctness of such programs has long been a challenge due to the difficulty of producing expected outputs on test cases. In this paper, the principles of metamorphic testing are applied to verify programs that solve second-order elliptic differential equations. We present a testing process specifically tailored for the verification testing of scientific computation programs and integrate it to the process of developing scientific software. Unlike existing approaches, we formally derive metamorphic relations from the numerical models of differential equations built in development process of scientific computing programs. The experimental results clearly show that our approach is effective in detecting faults commonly found in scientific computing programs. It outperforms the fault detecting ability of the trend method, which is a traditional testing method for scientific software.
Books
-
Blackwell C, Zhu H, (ed.), CyberPatterns: Unifying Design Patterns with Security Patterns and Attack Patterns, Springer (2014)
ISBN: 978-3-319-04446-0AbstractCyberspace in increasingly important to people in their everyday lives for purchasing goods on the Internet, to energy supply increasingly managed remotely using Internet protocols. Unfortunately, this dependence makes us susceptible to attacks from nation states, terrorists, criminals and hactivists. Therefore, we need a better understanding of cyberspace, for which patterns, which are predictable regularities, may help to detect, understand and respond to incidents better. The inspiration for the workshop came from the existing work on formalising design patterns applied to cybersecurity, but we also need to understand the many other types of patterns that arise in cyberspace. Includes extended and updated papers from the First International Workshop on Cyberpatterns.
-
Zhu H, (ed.), CyberPatterns: Unifying Design Patterns with Security Patterns and Attack Patterns, (2014)
Book chapters
-
Zhu H, 'Cyberpatterns: Towards a Pattern Oriented Study of Cyberspace' in CyberPatterns:Unifying Design Patterns with Security Patterns and Attack Patterns, Springer (2014)
ISBN: 978-3-319-04447-7AbstractPublished hereA pattern represents a discernible regularity in the world or in manmade designs. In the prescriptive point of view, a pattern is a template from which instances can be created; while in the descriptive point of view, the elements of a pattern that repeat in a predictable manner can be observed and recognised. Similar to theories in sciences, patterns explain and predict regularities in a subject domain. In a complicated subject domain like cyberspace, there are usually a large number of patterns that each describes and predicts a subset of recurring phenomena, yet these patterns can interact with each other and be interrelated and composed with each other. The pattern-oriented research method studies a subject domain by identifying the patterns, classifying and categorising them, organising them into pattern languages, investigating the interactions between them, devising mechanisms and operations for detecting and predicting their occurrences, and facilitating their instantiations. This chapter illustrates this research methodology through a review of the research on software design patterns as an example of successful application of the methodology. It then discusses its possible applications to the research on cyberpatterns, i.e. patterns in cyberspace. It defines the scope of research, reviews the current state of art and identifies the key research questions.
-
Zhu H, Zhang Q, Zhang Y, 'HASARD: A Model-Based Method for Quality Analysis of Software Architecture' in Relating System Quality and Software Architecture, Elsvier Inc. (2014)
ISBN: 978-0-12-417009-4AbstractPublished hereHASARD stands for Hazard Analysis of Software ARchitectural Designs. It is a model-based method for the analysis of software quality as entailed in software architectural designs. In this method, quality hazards are systematically explored and their causes and effects are identified. The results of hazard analysis are then transformed into a graphical quality model of the system under assessment. Queries about quality related properties of the system can be answered automatically through a set of algorithms with the quality model as the input. Such queries include: (1) the relationships between various quality attributes as manifested in the particular design, (2) the design decisions critical to a given quality concern, (3) the impacts of a given design decision to quality attributes, (4) trade-off points in the design that balancing conflict quality
concerns, etc. The chapter will present a prototype tool called SQUARE to support the whole process of HASARD analysis. It will also report a case study with a real software system to demonstrate the feasibility and usability of the HASARD method.
-
Hong Zhu, 'Design Space-Based Pattern Representation' in CyberPatterns: Unifying Design Patterns with Security Patterns and Attack Patterns, Springer (2014)
AbstractAs knowledge of solutions to recurring design problems, a large numberof software design patterns (DP) has been identified, catalogued and formalised in
the past decades. Tools have been developed to support the application and recognition
of patterns. However, although the notions of pattern in different subject domains
carry a great deal of similarity, we are in lack of a general pattern representation
approach that applies to all types of design patterns. Based on our previous
work on formalisation of OO DPs and an algebra of pattern compositions, this paper
proposes a generalisation of the approach so that it can be applied to other types
of DPs. In particular, a pattern is defined as a set of points in a design space that
satisfy certain conditions. Each condition specifies a property of the instances of the
pattern in a certain view of the design space. The patterns can then be composed
and instantiated through applications of operators defined on patterns. The paper
demonstrates the feasibility of the proposed approach by examples of patterns of
enterprise security architecture.
-
Clive Blackwell and Hong Zhu, 'Future Directions for Research on Cyberpatterns' in CyberPatterns: Unifying Design Patterns with Security Patterns and Attack Patterns, Springer (2014)
AbstractAs patterns in cyberspace, cyberpatterns shed light on research on thedevelopment of cyber systems from a new angle. They can help us move from an
improvised craft to an engineering discipline because they help to transfer knowledge
about proven solutions in an understandable and reusable format. They allow
innovative applications in cloud, cyber-physical and mobile systems, and novel
methods of use with data patterns for observation and analysis of ‘big data’ problems.
The ultimate aim of research on cyberpatterns is an overall framework for
cyberpatterns integrating all the cyber domains to help develop a betterunderstood
and effective cyberspace. However, there are many research questions
in cyberpatterns that remain unanswered regarding both their conceptual foundation
and practical use. This chapter concludes the book by exploring some of the
most critical and important problems needing to be addressed.
-
Jin Z, Zhu H, 'Unifying domain ontology with agent-oriented modelling of services' in Ieee 6th international symposium on service oriented system engineering (sose), 2011 , IEEE (2011)
ISBN: 9781467304115AbstractModeling plays a crucial role in model-driven development of service-oriented systems. This paper proposes a framework for service-oriented modeling that combines an agent-oriented software development methodology with an ontology-based domain analysis technique. It aims at improving the dynamic composability of services at requirements and design stages through modeling. The framework consists of an architectural structure of service models and a process of modeling. The architecture combines agent-oriented models of software systems in which service providers and requesters are regarded as autonomous entities (and called agents), and domain ontology, which specifies the entities in the application domain and their dynamic behaviors. The domain ontology extends classic ontology by including causal and symbolic entities as well as autonomous entities. The approach is illustrated by an example of online auction service.Published here -
Zhu H, Shan L, Bayley I, Amphlett R, 'A formal descriptive semantics of UML and its applications' in UML 2 Semantics and Applications, John Wiley & Sons (2009)
ISBN: 9780470409084AbstractSummary This chapter contains sections titled: Introduction Definition of Descriptive Semantics in FOPL The LAMBDES Tool Applications Using Model and Metamodel Analysis Conclusions ReferencesPublished here -
Yu B, Kong L, Zhang Y, Zhu H, 'Testing Java Components based on Algebraic Specifications' in 2008 First International conference on software testing, verification, and validation, IEEE (2008)
ISBN: 9780769531274AbstractPublished hereThis paper presents a method of component testing based on algebraic specifications. An algorithm for generating checkable test cases is proposed. A proto-type testing tool called CASCAT for testing Java En-terprise Beans is developed. It has the advantages of high degree of automation, which include test case generation, test harness construction and test result checking. It achieves scalability by allowing incre-mental integration. It also allows testing to focus on a subset of used functions and key properties, thus suit-able for component testing. The paper also reports an experimental evaluation of the method and the tool.
Conference papers
-
Paul DG, Zhu H, Bayley I, 'Benchmarks and Metrics for Evaluations of Code Generation: A Critical Review'
(2024)
AbstractPublished here Open Access on RADARWith the rapid development of Large Language Models (LLMs), a large number of machine learning models
have been developed to assist programming tasks including the generation of program code from natural language input. However, how to evaluate such LLMs for this task is still an open problem despite of the great amount of research efforts that have been made and reported to evaluate and compare them. This paper provides a critical review of the existing work on the testing and evaluation of these tools with a focus on two key aspects: the benchmarks and the metrics used in the evaluations. Based on the review, further research directions are discussed. -
Paul DG, Zhu H, Bayley I, 'ScenEval: A Benchmark for Scenario-Based Evaluation of Code Generation'
(2024)
AbstractPublished here Open Access on RADARIn the scenario-based evaluation of machine learning models, a key problem is how to construct test datasets that represent various scenarios. The methodology proposed in this paper is to construct a benchmark and attach metadata to each test case. Then a test system can be constructed with test morphisms that filter the test cases based on metadata to form a dataset.
The paper demonstrates this methodology with large language models for code generation. A benchmark called ScenEval is constructed from problems in textbooks, an online tutorial website and Stack Overflow. Filtering by scenario is demonstrated and the test sets are used to evaluate ChatGPT for Java code generation.
Our experiments found that the performance of ChatGPT decreases with the complexity of the coding task. It is weakest for advanced topics like multi-threading, data structure algorithms and recursive methods. The Java code generated by ChatGPT tends to be much shorter than reference solution in terms of number of lines, while it is more likely to be more complex in both cyclomatic and cognitive complexity metrics, if the generated code is correct. However, the generated code is more likely to be less complex than the reference solution if the code is incorrect.
-
Miah T, Zhu H, 'User Centric Evaluation of Code Generation Tools'
(2024)
AbstractPublished here Open Access on RADARWith the rapid advance of machine learning (ML) technology, large language models (LLMs) are increasingly explored as an intelligent tool to generate program code from natural language specifications. However, existing evaluations of LLMs have focused on their capabilities in comparison with humans. It is desirable to evaluate their usability when deciding on whether to use a LLM in software production. This paper proposes a user centric method for this purpose. It includes metadata in the test cases of a benchmark to describe their usages, conducts testing in a multi-attempt process that mimics the uses of LLMs, measures LLM generated solutions on a set of quality attributes that reflect usability, and evaluates the performance based on user experiences in the uses of LLMs as a tool.
The paper also reports a case study with the method in the evaluation of ChatGPT’s usability as a code generation tool for the R programming language. Our experiments demonstrated that ChatGPT is highly useful for generating R program code although it may fail on hard programming tasks. The user experiences are good with overall average number of attempts being 1.61 and the average time of completion being 47.02 seconds. Our experiments also found that the weakest aspect of usability is conciseness, which has a score of 3.80 out of 5.
-
Zhu, H, Tran TMT, Benjumea S, Bradley A
, 'A Scenario-Based Functional Testing Approach to Improving DNN Performance'
SOSE 2023 (2023) pp.199-207
ISSN: 2835-3307 eISSN: 2835-3161 ISBN: 9798350322392AbstractPublished here Open Access on RADARThis paper proposes a scenario-based functional testing approach for enhancing the performance of machine learning (ML) applica- tions. The proposed method is an iterative process that starts with testing the ML model on various scenarios to identify areas of weakness. It follows by a further testing on the suspected weak scenarios and statistically evaluate the model’s performance on the scenarios to confirm the diagnosis. Once the diagnosis of weak scenarios is confirmed by test results, the treatment of the model is performed by retraining the model using a transfer learning technique with the original model as the base and applying a set of training data specifically targeting the treated scenarios plus a subset of training data selected at random from the original train dataset to prevent the so-call catastrophic forgetting effect. Finally, after the treatment, the model is assessed and evaluated again by testing on the treated scenarios as well as other scenarios to check if the treatment is effective and no side-effect caused. The paper reports a case study with a real ML deep neural network (DNN) model, which is the perception system of an autonomous racing car. It is demonstrated that the method is effective in the sense that DNN model’s performance can be improved. It provides an efficient method of enhancing ML model’s performance with much less human and compute resource than retrain from scratch.
-
Zhu H, Bayley I, Green M, 'Metrics for Measuring Error Extents of Machine Learning Classifiers'
(2022)
ISBN: 9781665487375AbstractPublished here Open Access on RADARMetrics play a crucial role in evaluating the performance of machine learning (ML) models. Metrics for quantifying the extent of errors, in particular, have been intensively studied and widely used but only so far for regression models. This paper focuses instead on classifier models. A new approach is proposed in which datamorphic exploratory testing is used to discover the boundary values between classes and the distance of misclassified instances from that boundary is used to quantify the errors that the model makes. Empirical experiments and case studies are reported that validate and evaluate the proposed metrics.
-
Zhu H, Bayley I, Wang H, 'Continuous Debugging of Microservices'
(2020) pp.736-745
ISBN: 9781665414852AbstractPublished here Open Access on RADARDebugging is one of the most difficult tasks during the development of cloud-native applications for the microservices architecture. This paper proposes a continuous debugging facility to support the DevOps continuous development methodology. It has been implemented and integrated into the Integrated DevOps Environment CIDE for microservices written in the agent-oriented programming language CAOPLE.
The paper also reports controlled experiments with the debug facility. Experiment data show that the overhead is less than 3% of the execution time on average. -
Zheng X, Liu D, Zhu H, Bayley I, 'Pattern-based Approach to Modelling and Verifying System Security'
(2020) pp.92-102
AbstractPublished here Open Access on RADARSecurity is one of the most important problems in the engineering of online service-oriented systems. The current best practice in security design is a pattern-oriented approach. A large number of security design patterns have been identified, categorised and documented in the literature. The design of a security solution for a system starts with identification of security requirements and selection of appropriate security design patterns; these are then composed together. It is crucial to verify that the composition of security design patterns is valid in the sense that it preserves the features, semantics and soundness of the patterns and correct in the sense that the security requirements are met by the design. This paper proposes a methodology that employs the algebraic specification language SOFIA to specify security design patterns and their compositions. The specifications are then translated into the Alloy formalism and their validity and correctness are verified
using the Alloy model checker. A tool that translates SOFIA into Alloy is presented. A case study with the method and the tool is also reported. -
Zhu H, Bayley I, Liu D, Zheng X, 'Automation of Datamorphic Testing'
(2020) pp.64-72
AbstractPublished here Open Access on RADARThis paper presents an automated tool called Morphy for datamorphic testing. It classifies software test artefacts into test entities and test morphisms, which are mappings on testing entities. In addition to datamorphisms, metamorphisms and seed test case makers, Morphy also employs a set of other test morphisms including test case metrics and filters, test set metrics and filters, test result analysers and test executers to realise test automation. In particular, basic testing activities can be automated by invoking test morphisms. Test strategies can be realised as complex combinations of test morphisms. Test processes can be automated by recording, editing and playing test scripts that invoke test morphisms and strategies. Three types of test strategies have been implemented in Morphy: datamorphism combination strategies, cluster border exploration strategies and strategies for test set optimisation via genetic algorithms. This paper focuses on the datamorphism combination strategies by giving their definitions and implementation algorithms. The paper also illustrates their uses for testing both traditional software and AI applications with three case studies.
-
Zhu H, Bayley I, 'Exploratory Datamorphic Testing of Classification Applications'
(2020) pp.51-60
AbstractPublished here Open Access on RADARTesting has been widely recognised as difficult for AI applications. This paper proposes a set of testing strategies for testing machine learning applications in the framework of the datamorphism testing methodology. In these strategies, testing aims at exploring the data space of a classification or clustering application to discover the boundaries between classes that the machine learning application defines. This enables the tester to understand precisely the behaviour and function of the software under test. In the paper, three variants of exploratory strategies are presented with the algorithms as implemented in the automated datamorphic testing tool Morphy. The correctness of these algorithms are formally proved. The paper also reports the results of some controlled experiments with Morphy that study the factors that affect the test effectiveness of the strategies.
-
Zhu H, Liu D, Bayley I, Harrison R, Cuzzolin F, 'Datamorphic Testing: A Method for Testing Intelligent Applications'
(2019) pp.149-156
ISBN: 9781728104935 eISBN: 9781728104928AbstractPublished here Open Access on RADARAdequate testing of AI applications is essential to ensure their quality. However, it is often prohibitively difficult to generate realistic test cases or to check software correctness. This paper proposes a new method called datamorphic testing, which consists of three components: a set of seed test cases, a set of datamorphisms for transforming test cases, and a set of metamorphisms for checking test results. With an example of face recognition application, the paper demonstrates how to develop datamorphic test frameworks, and illustrates how to perform testing in various strategies, and validates the approach using an experiment with four real industrial applications of face recognition.
-
Zhu H, Wang H, Bayley I, 'Formal Analysis of Load Balancing in Microservices with Scenario Calculus'
(2018)
eISSN: 2159-6190AbstractLoad balancing plays a crucial role in realising the benefits of microservices, especially to achieve elastic scalability and performance optimisation. However, it is different from load balancing for virtual machines, because workloads on microservices are harder to predict and the number of services in the systems is large. In this paper, we formalise load balance as an emergent property of the microservices ecosystem, and employ scenario calculus to formally analyse the impact of scheduling on service capability and scalability. We discovered that elastic round robin scheduling is highly scalable but the service capability is limited by the slowest microservice instance. In contrast, shortest waiting queue scheduling is less scalable, but the service capability is higher.Published here Open Access on RADAR -
Zhu H, 'Software Testing as A Problem of Machine Learning: Towards a Foundation on Computational Learning Theory (Extended Abstract of Keynote Speech)'
(2018)
Open Access on RADAR -
Zhu H, Bayley I, 'If Docker Is The Answer, What Is The Question? A Case for Software Engineering Paradigm Shift Towards Service Agent Orientation'
(2018)
AbstractThe recent rise of cloud computing poses serious challenges for software engineering because it adds complexity not only to the platform and infrastructure, but to the software too. The demands on system scalability, performance and reliability are ever increasing. Industry solutions withOpen Access on RADAR
widespread adoption include the microservices architecture, the container technology and the DevOps methodology. These approaches have changed software engineering practice in
such a profound way that we argue that it is becoming a paradigm shift. In this paper, we examine the current support of programming languages for the key concepts behind the change in software engineering practice and argue that a novel programming language is required to support the new paradigm. We report a new programming language CAOPLE
and its associated Integrated DevOps Environment CIDE and demonstrate the utility of both. -
Liu D, Zhu H, Xu C, Bayley I, Lightfoot D, Green M, Marshall P, 'CIDE: An Integrated Development Environment for Microservices'
(2016) pp.808-812-
ISBN: 978-1-5090-2628-9/16AbstractMicroservices is a flexible architectural style that has many advantages over the alternative monolithic style. These include better performance and scalability. It is particularly suitable, and widely adopted, for cloud-based applications, because in this architecture a software system consisting of a large suite of services of fine granularity, each running its own process and communicating with the others. However, programming such systems is more complex. In this paper we report on CIDE, an integrated software development environment that helps with this. CIDE supports programming in a novel agent-oriented language called CAOPLE and tests their execution in a cluster environment. We present the architecture of CIDE, discuss its design based on the principles of the DevOps software development methodology, and describe facilities that support continuous testing and seamless integration, two other advantages of Microservices.Published here Open Access on RADAR -
Ogunyadeka A, Younas M, Zhu H, Aldea A, 'A Multi-key Transactions Model for NoSQL Cloud Database Systems'
(2016) pp.24-27
ISBN: 978-1-5090-2251-9AbstractPublished here Open Access on RADARNoSQL cloud database systems are new types of databases that are built across thousands of cloud nodes and are capable of storing and processing Big Data. NoSQL systems have
increasingly been used in large scale applications that need high availability and efficiency but with weaker consistency. Consequently, such systems lack support for standard transactions which provide stronger consistency. This paper proposes a new multi-key transactional model which provides NoSQL systems with standard transaction support and stronger level of data consistency. The strategy is to supplement current NoSQL architecture with an extra layer that manages transactions. The proposed model is configurable where consistency, availability and efficiency can be adjusted based on application requirements. The proposed model is validated through a prototype system using MongoDB. Preliminary experiments show that it ensures stronger consistency and maintains good performance.
-
Xu C, Zhu H, Bayley I, Lightfoot D, Green M, Marshall P, 'CAOPLE: A Programming Language for Microservices SaaS'
(2016) pp.42-52
ISBN: 978-1-5090-2253-3AbstractPublished here Open Access on RADARThe microservices architecture is widely regarded as a promising approach to service-oriented systems. However, developing applications in the microservices architecture presents three main challenges: (a) how to program systems that consists of a large number of services running in paral- lel and distributed over a cluster of computers; (b) how to reduce the communication overhead caused by executing a large number of small services; (c) how to support the flexi- ble deployment of services to a network to achieve system load balance. This paper presents a programming language called CAOPLE and reports the implementation of the lan- guage on a virtual machine called CAVM-2. The paper demonstrates how this approach meets these challenges.
-
Liu D, Wu X, Zhang X, Zhu H, Bayley I, 'Monic Testing of Web Services Based on Algebraic Specifications'
(16022376) (2016) pp.24-33
ISBN: 978-1-5090-2253-3AbstractWeb services are designed to be discovered and composed dynamically, which implies that testing must also be done dynamically. This involves both the generation of test cases and the checking of test results. This paper presents algorithms for both of these using the technique of algebraic specification. It focuses in particular on the problem that web services, when they are third-party, have poor controllability and observability, and introduces a solution known as monic floating checkable test cases. A prototype tool has implemented the proposed testing technique and it is applied to a case study with a real industry application GoGrid, demonstrating that the technique is both applicable and feasible.Published here Open Access on RADAR -
Eisa M, Younas M, Basu K, Zhu H, 'Trends and Directions in Cloud Service Selection'
(2016) pp.423-432
ISBN: 978-1-5090-2253-3AbstractWith the growing popularity of cloud computing the number of cloud service providers and services have significantly increased. Thus selecting the best cloud services becomes a challenging task for prospective cloud users. The process of selecting cloud services involves various factors such as characteristics and models of cloud services, user requirements and knowledge, and service level agreement (SLA), to name a few. This paper investigates into the cloud service selection tools, techniques and models by taking into account the distinguishing characteristics of cloud services. It also reviews and analyses academic research as well as commercial tools in order to identify their strengths and weaknesses in the cloud services selection process. It proposes a framework in order to improve the cloud service selection by taking into account services capabilities, quality attributes, level of user's knowledge and service level agreements. The paper also envisions various directions for future research.Published here Open Access on RADAR -
Liu D, Yang Y, Chen Y, Zhu H, Bayley I, Aldea A, 'Evaluating the Ontological Semantic Description of Web Services Generated from Algebraic Specifications'
(16022405) (2016) pp.211-220
ISBN: 978-1-5090-2253-3AbstractPublished here Open Access on RADARThe semantics of web services can be described using ontology or formally specified in mathematical notations. The former is comprehensible and searchable, while the latter is testable and verifiable. To take advantage of both, we proposed, in our previous work, a transformation that takes an algebraic specification of a web service to generate a domain ontology and a semantic description of the service on that ontology.
This paper investigates the quality of these two outputs by proposing a general framework of ontology evaluation that assesses them on 4 aspects of quality, which are decomposed into 8 factors and then measured by a set of 37 metrics. It reports a case study on 3 real-life examples of web services. The results show that the ontologies and semantic descriptions generated from formal specifications are of satisfactory quality.
-
Zhu H, Bayley I, Younas M, Lightfoot D, Yousef B, Liu D, 'Big SaaS: The Next Step Beyond Big Data'
(2015) pp.775-784
ISBN: 978-1-4673-7277-0AbstractPublished here Open Access on RADARSoftware-as-a-Service (SaaS) is a model of cloud computing in which software functions are delivered to the users as services. The past few years have witnessed its global flourishing. In the foreseeable future, SaaS applications will integrate with the Internet of Things, Mobile Computing, Big Data, Wireless Sensor Networks, and many other computing and communication technologies to deliver customizable intelligent services to a vast population. This will give rise to an era of what we call Big SaaS systems of unprecedented complexity and scale. They will have huge numbers of tenants/users interrelated in complex ways. The code will be complex too and require Big Data but provide great value to the customer. With these benefits come great societal risks, however, and there are other drawbacks and challenges. For example, it is difficult to ensure the quality of data and metadata obtained from crowdsourcing and to maintain the integrity of conceptual model. Big SaaS applications will also need to evolve continuously. This paper will discuss how to address these challenges at all stages of the software lifecycle.
-
Liu D, Liu Y, Zhang X, Zhu H, Bayley I, 'Automated Testing of Web Services Based on Algebraic Specification'
15252318 (2015) pp.143-152
ISBN: 978-1-4799-8355-1AbstractThe testing of web services must be done in a com- pletely automated manner when it takes place on-the-fly due to third-party services are dynamically composed to. We present an approach that uses algebraic specification to make this possible. Test data is generated from a formal specification and then used to construct and submit service requests. Test results are then extracted and checked against the specification. All these are done automatically, as required. We present ASSAT (Algebraic Specification-Based Service Automated Testing), a prototype that performs these tasks and demonstrate its utility by applying it to Amazon Web Services, a real-life industrial example.Published here Open Access on RADAR -
Zhu H, Yousef B, Younas M, 'Evaluation of a Tenant Level Checkpointing Technique for SaaS Applications'
(2015) pp.989-994
ISBN: 978-1-4673-7286-2AbstractPublished hereTenant level checkpointing is a novel fault-tolerance technique proposed in our previous work for largescale software-as-a-service applications. This paper evaluates the technique by a theoretical analysis of the technique and an empirical study using workload benchmarks of real largescale cluster systems. We propose the notion of accumulated delay as a measurement of the performance of checkpointing techniques. It reflects the effect of checkpointing on system
performance as experienced by the end users. Both theoretical analysis and empirical study show that tenant level checkpointing significantly outperforms traditional bulk checkpointing as
measured by the amount of accumulated delays.
-
Zhu H, 'JFuzz: A tool for automated java unit testing based on data mutation and metamorphic testing methods'
(2015) pp.8-15
ISBN: 978-1-4673-9580-9AbstractPublished here Open Access on RADARAutomated test framework plays a significant role in test driven software development methodologies. The XUnit family of testing tools has been widely used in the industry. However, they are weak in supporting test case generation and test result checking. In this paper we propose a new kind of test automation framework by integrating data mutation testing and metamorphic testing methods. A tool for unit testing of Java class called JFuzz is presented. Its uses are illustrated by examples.
-
Shan L, Du C, Zhu H, 'Modeling and Simulating Adaptive Multi-Agent Systems with CAMLE'
2 (2015) pp.147-152
ISSN: 0730-3157 ISBN: 978-1-4673-6563-5AbstractPublished hereWith the advent of embedded and mobile computing techniques, software systems are increasingly operated in open and dynamic environments. Such systems desire self-adaptive capabilities. This paper proposes an agent-oriented approach to the modeling and simulation of distributed adaptive systems. The approach enables to construct structural and behavioral models at a high abstraction level, and to validate the models at design stage through model simulation. The modeling language facilitates devising decentralized adaption logic at both component and system levels. Simulation demonstrates how a
system dynamically re-organizes in response to the changes in the operating context. Case studies show that our approach can support various self-adaptation mechanisms with which a multi-agent system adjusts to internal failures, unexpected environment, or user's changing requirements.
-
Liu D, Zhu H, Bayley I, 'SOFIA: An Algebraic Specification Language for Developing Services'
(2014) pp.70-75
AbstractPublished here Open Access on RADARDescribing the semantics of services accurately plays a crucial role in service discovery, execution, composition and interaction. Formal specification techniques, having evolved over the past 30 years, can define the semantics of software systems in a verifiable and testable manner. This paper presents a new algebraic specification language called SOFIA for describing the semantics of services. It unifies the approaches using algebras and co-algebras for software specifications. A case study with a real industry example, the GoGrid cloud's resource management services, demonstrates that the semantics of services can be specified in SOFIA.
-
Yousef B, Zhu H, Younas M, 'Tenant Level Checkpointing of Meta-Data for Multi-Tenancy SaaS'
(2014) pp.148-153
ISBN: 978-1-4799-2504-9AbstractPublished here Open Access on RADARTraditional checkpointing techniques are facing a grave challenge when applied to multi-tenancy software-as-a-service (SaaS) systems due to the huge scale of the system state and the diversity of users' requirements on the quality of services. This paper proposes the notion of tenant level checkpointing and an algorithm that exploits Big Data techniques to checkpoint tenant's meta-data, which are widely used in configuring SaaS for tenant-specific features. The paper presents a prototype implementation of the proposed technique using NoSQL database Couchbase and reports the experiments that compare it with traditional implementation of checkpointing using file systems. Experiments show that the Big Data approach has a significantly lower latency in comparison with the traditional approach.
-
Liu DM, Zhu H, Bayley I, 'A Case Study on Algebraic Specification of Cloud Computing'
(2013) pp.269-273
AbstractPublished hereA cloud often provides a RESTful interface with which to access its services. These are usually specified through an open but informal document in the IT industry. There is no agreed standard for the specification of RESTful web services. In this paper, we propose the application of an algebraic method to the formal specification of such services and report a case study with the GoGrid’s RESTful API, an industrial real system that provides Infrastructure-as-a-Service. The case study demonstrates that the algebraic approach can provide formal unambiguous specifications that are easy to read and write. It also demonstrates that formalisation can identify and eliminate ambiguity and inconsistency in informal documents.
-
Zhu H, 'Position Statement Can Software Design Benefit From Creative Computing?'
(2013) pp.310-311
AbstractPublished hereThe research on software design in the past decades has made some breakthroughs that may lead to a revolutionary new type of software design tools so that software developers can benefit from creative computing. These research results combined with the general theory of design methodologies may make creativity in software design benefit from automation. The following briefly examine such theories and research results in the light of creative computing.
-
Zhu H, 'Position Statement: Can Testing Prove Software Has No Bug?'
(2013) pp.255-255
AbstractPublished hereThe author presents a theorem that states that the correctness of a software system can be validated by testing on a finite number of test cases provided that the program and specification are in a learnable set of functions. Moreover, such testing can be performed without writing down a formal specification. This lays a foundation of the current practice of software testing where formal specifications are not available. This theorem implies that what current testing practice lacks is an analysis of the "complexity" of the program to determine a learnable set within which the program and the specification vary.
-
Zhu H, Bayley I, 'Laws of Pattern Composition'
6447 (2011) pp.630-645
-
Bayley I, Zhu H, 'A formal language of pattern compositions'
(2010) pp.1-6
ISBN: 9781618396419AbstractIn real applications, design patterns are almost always to be found composed with each other. Correct application of patterns therefore relies on precise definition of these compositions. In this paper, we propose a set of operators on patterns that can be used in such definitions. These operators are restriction of a pattern with respect to a constraint, superposition of two patterns, and a number of structural manipulations of the pattern's components. We also report a case study on the pattern compositions suggested informally in the Gang of Four book in order to demonstrate the expressiveness of the operators. -
Zhu H, Yu B, 'Algebraic specification of web services'
(2010) pp.457-464
AbstractPublished hereThis paper presents an algebraic specification language for the formal specification of the semantics of web services. A set of rules for transforming WSDL into algebraic structures is proposed. Its practical usability is also demonstrated by an example.
-
Zhu H, Yu B, 'An experiment with algebraic specifications of software components'
(2010) pp.190-199
ISSN: 9781424480784 eISSN: 1550-6002 eISBN: 9780769541310AbstractPublished hereA long lasting myth of formal methods is that they are difficult to learn and expensive to apply. This paper reports a controlled experiment to test if this myth is true or false in the context of writing algebraic formal specifications. The experiment shows that writing algebraic specifications for software components can be learnt by ordinary university students. It does not heavily depend on mathematical knowledge and skills. Moreover, it is not time consuming to write algebraic specifications for software components.
-
Zhu H, Bayley I, 'Laws of Pattern Composition'
6447 (2010) pp.630-645
ISBN: 9783642169007AbstractPublished hereDesign patterns are rarely used on their own. They are almost always to be found composed with each other in real applications. So it is crucial that we can reason about their compositions. In our previous work, we defined a set of operators on patterns so that pattern compositions can be represented as expressions on patterns. In this paper, we investigate the algebraic properties of these operators, prove a set of algebraic laws that they obey, and use the laws to show the equivalence of pattern compositions.
-
Zhu H, 'On the theoretical foundation of meta-modelling in graphically extended BNF and first order logic'
(2010) pp.95-104
AbstractMeta-modeling plays an important role in model driven software development methodology. In our previous work, a graphic extension of BNF (GEBNF) was proposed to define the abstract syntax of graphic modeling languages. From a GEBNF syntax definition, a first order predicate logic language can be induced so that meta-modeling can be performed formally by specifying a predicate on the domain of syntactically valid models. In this paper, we investigate the theoretical foundation of this meta-modeling approach. We first formally define the semantics of GEBNF syntax definitions as algebras that contain no junk and satisfy constraints derived from GEBNF syntax rules. The semantics of the induced logic is then formally defined by regarding such algebras as models. We then formally prove that well-formed syntax definitions together with syntax morphisms form a category, where syntax morphisms represent the translations between modeling languages. The models (i.e. algebras) in a modeling language and the homomorphisms between them also form a category. Finally, we prove that the functors from GEBNF syntax definitions to the categories of models and to sentences in the induced first order logic form an institution. Therefore, GEBNF and its induced logics form a valid formal specification language for models.Published here -
Zhu H, Wong WE, Belli F, 'Advancing Test Automation Technology to Meet the Challenges of Model-driven Software Development - Report on the 3rd Workshop on Automation of Software Test'
(2009) pp.1049-1050
ISBN: 978-1-60558-079-1Published here -
Zhu H, Bayley I, Shan L, Amphlett R, 'Tool support for design pattern recognition at model level'
(2009) pp.228-233
ISBN: 9780769537269AbstractPublished hereGiven the rapid rise of model-driven software development methodologies, it is highly desirable that tools be developed to support the use of design patterns in this context. This paper presents such a tool, called LAMBDES-DP, with UML diagrams as the software design models. Its theoretical foundation is a descriptive semantics of UML in first order logic, and the design patterns are formally specified in the same language. The tool uses the LAMBDES system to translate the UML diagrams into their descriptive semantics and invokes the theorem prover SPASS to decide whether the design conforms to a pattern. Our experiments show that our tool has significantly lower rates of false positive and false negative errors compared with existing tools.
-
Shan L, Zhu H, 'A formal descriptive semantics of UML'
(2008) pp.375-396
ISSN: 0302-9743 ISBN: 9783540881933 eISBN: 9783540881940AbstractThis paper proposes a novel approach to the formal definition of UML semantics. We distinguish descriptive semantics from functional seman-tics of modelling languages. The former defines which system is an instance of a model while the later defines the basic concepts underlying the models. In this paper, the descriptive semantics of class diagram, interaction diagram and state machine diagram are defined by first order logic formulas. A translation tool is implemented and integrated with the theorem prover SPASS to enable auto-mated reasoning about models. The formalisation and reasoning of models is then applied to model consistency checking.Published here -
Jin Z, Zhu H, 'A framework for agent-based service-oriented modelling'
(2008) pp.160-165
ISBN: 9780769534992 eISBN: 9780769534992AbstractService-oriented computing is becoming a direction of computing technology. For realising the mission of serviceoriented computing, service-oriented architecture has been proposed to facilitate the application development via discoverable services distributed on Internet. That brings out service-oriented modelling as a new technical area to provide modelling and analysis techniques of service-oriented applications. The paper proposes a framework for agentbased service-oriented modelling. It treats both the service providers and the service requesters as service agents. Domain ontology has been used to provide the sharable domain knowledge as well as terminology for allowing the agents to understand each others. A modelling process has also been illustrated with the model development of an online auction service.Published here -
Zhou B, Zhu H, 'A virtual machine for distributed agent-oriented programming'
(2008) pp.729-734
AbstractAgent-orientation has been considered as a viable solution to the development of software systems in dynamic environments such as the Internet. This paper presents a high level language virtual machine CAVM designed for distributed agent-oriented programming in the Internet environment. The main features of the virtual machine (VM) are two-fold. First, the communication between agents is separated from computation so that communication is network transparent of agent location. Second, code deployment is separated from loading so that multiple agents of the same caste can be dynamically distributed to the network and dynamical integrated into the systems by adding new agents. The paper first reviews the key features of an agent-oriented programming language called CAOPLE. It then presents the design of the virtual machine to support the implementation of the language. Experiments with the performance of the system in a network environment are also reported. -
Bayley I, Zhu H, 'Formalising Design Patterns in Predicate Logic'
(2008) pp.25-34
-
Zhang Y, Zhu H, 'Ontology for service oriented testing of web services'
(2008) pp.132-134
eISSN: 978-0-7695-3499-2AbstractThis paper presents a service oriented architecture for testing Web Services. In this architecture, various parties interoperate with each other to complete testing tasks through testing service registration, discovery and invocation. The analysis of the architecture in a typical scenario shows that it has the advantages of supporting dynamic discovery and invocation of testing services as required by the dynamic discovery and invocation of normal functional services without compromising security, privacy and intellectual property rights. It is flexibleand extendable. It also helps to reduce the risk ofunnecessary disturbances to the normal operations of services due to testing activities. The paper reports a prototype implementation of the architecture by adapting and implementing the ontology of software testing using Semantic Web Services technology. A case studywith the WS wrapping of an automated testing tool is also reported, which demonstrated that the architecture is technically feasible.Published here -
Kong L, Zhu H, Zhou B, 'Automated Testing Ejb Components Based on Algebraic Specifications'
(2007) pp.717-722
Published here -
Zhu H, Wong WE, Paradkar A, 'Automation of Software Test - Report on the Second International Workshop Ast 2007'
(2007) pp.150-151
-
Bayley I, Zhu H, 'Formalising design patterns in predicate logic'
(2007) pp.25 - 36
ISBN: 9780769528847AbstractDesign patterns are traditionally outlined in an informal manner. If they could be formalised, we could derive tools that automatically recognise design patterns and refactor designs and code. Our approach is to deploy predicate logic to specify conditions on the class diagrams that describe design patterns. The structure of class diagrams is itself described with a novel meta-notation that can be used for defining any graphical modelling language. As a result, the constraints, while based on UML, are highly readable and have much expressive power. This enables us not only to recognise design patterns in legacy code, but also to reason about them at the design stage, such as showing one pattern to be a special case of another. The paper discusses our specification of the original 23 design patterns and presents a representative sample of some of them.Published here -
Zhu H, 'A Framework for Service-oriented Testing of Web Services'
(2006) pp.145-150
-
Zhu H, Shan LJ, 'Agent-oriented Modelling and Specification of Web Services'
(2005) pp.152-159
-
Shan LJ, Zhu H, 'Camle: a Caste-centric Agent-oriented Modelling Language and Environment'
3390 (2005) pp.144-161
-
Zhang YL, Zhu H, Greenwood S, 'Empirical Validation of Website Timeliness Measures'
(2005) pp.313-318
-
Duce D, 'Agent-oriented Formal Specification of Web Services'
3252 (2004) pp.633-641
-
Zhu H, 'Cooperative Agent Approach to Quality Assurance and Testing Web Software'
(2004) pp.110-113
-
Shan LJ, Zhu H, 'Modelling Cooperative Multi-agent Systems'
3033 (2004) pp.994-1001
-
Zhang YL, Zhu H, Greenwood S, 'Website Complexity Metrics for Measuring Navigability'
(2004) pp.172-179
-
Zhu H, 'Formal Specification of Evolutionary Software Agents'
2495 (2003) pp.249-261
-
Zhu H, Zhang YL, Huo QN, Greenwood S, 'Application of Hazard Analysis to Software Quality Modelling'
(2002) pp.139-144
-
Zhang YL, Zhu H, Huo QN, Greenwood S, 'Measurement of Timeliness of Web-based Information Systems'
(2002) pp.349-354
-
Zhu H, He XD, 'An Observational Theory of Integration Testing for Component-based Software Development'
(2001) pp.363-368
-
Zhu H, He XD, 'Constructions of Behaviour Observation Schemes in Software Testing'
(2001) pp.7-16
-
Zhang YL, Zhu H, Greenwood S, Huo QN, 'Quality Modelling for Web-based Information Systems'
(2001) pp.41-47
ISBN: 0-7695-1384-0AbstractThe Internet and World Wide Web have become an important medium of software applications. Web-based information systems are involved in many areas of our everyday life. Today quality issues concerning these systems are attracting growing attention. This paper presents a framework for modelling the quality of webbased information systems. First we use information flow analysis to break down a complex system into subsystems. Then within each sub-system, we apply hazard analysis to find out the system failure mode, possible causes, and further, we uncover the relevant quality attributes. Using such methods, we develop a quality of Business-to-Customer e-business systems as an example. Three users, all having a rich experience in using ecommerce systems are interviewed in hazard analysis.Published here -
Zhu H, 'Automating Scenario-driven Structured Requirements Engineering'
24 (2000) pp.311-316
ISSN: 0730-3157AbstractPublished hereScenario analysis has been widely perceived to be able to play two important roles in requirement engineering. Firstly, it is a vehicle of separating concerns in the elicitation of users' requirements. Secondly, it is a means of validating specified requirements. However, there are a number of key activities in scenario analysis that demand automated tool support to make the method practically applicable. Such activities include: (a) the analysis of consistency and completeness among various scenarios, (b) the analysis of the consistency and completeness of a set of scenarios with respect to
requirements models, (c) the synthesis of requirements models from a set of scenarios, and (d) the generation of scenarios from requirements models for requirements validation. In this paper, we present an automatic toolkit that supports these activities. We discuss how the idea of scenario analysis can be adapted to a structured analysis method, how to describe scenarios and how to support scenario
analysis activities through automated tools. We also report some case studies with the toolkit.
Other publications
-
, '2019 IEEE International Conference On Artificial Intelligence Testing (AITest)', (2019)
-
Dranidis D, Masticola S, Zhu H, 'Automation of Software Test (ast 09)', (2011)
Published here -
Zhu H, Belli F, 'Advancing Test Automation Technology to Meet the Challenges of Model-based Software Testing - Guest Editors Introduction to the Special Section of the Third Ieee International Workshop on Automation of Software Test (ast 2008)', (2009)
Published here -
Zhu H, 'Engineering Quality Software - Guest Editors Introduction to the Special Section of the Eighth International Conference on Quality Software (qsic 2008)', (2009)
Published here -
Zhu H, Cheung SC, Horgan JR, Li JJ, 'Guest Editorial to the Special Issue on Automation of Software Testing', (2009)
Published here -
Zhu H, Kung D, 'Assuring Quality of Web-based Applications', (2006)
Published here
Professional information
Memberships of professional bodies
- Senior member of IEEE,
- Professional member of ACM,
- Member of BCS
Conferences
The following are selected recent participations in international conferences:
- PC Chair: IEEE EDGE 2019: The 3rd IEEE International Conference On Edge Computing, July 8-13, 2019, Milan, Italy.
- General Co-Chair: IEEE SOSE 2019: The 13th IEEE International Symposium on Service-Oriented System Engineering, San Francisco, USA, April 4-8, 2019.
- General Co-Chair: IEEE AITest 2019: The 1st IEEE International Conference on Artificial Intelligence Testing, San Francisco, USA, April 4-8, 2019.
- PC Chair: IEEE EDGE 2018: The 2nd IEEE International Conference On Edge Computing, San Francisco, CA, USA. July 2-7, 2018.
- General Co-Chair: IEEE SETA 2018: IEEE Symposium on Software Engineering Technologies & Applications at the 42nd IEEE International Conference on Computers, Software and Applications (COMPSAC 2018), Tokyo, Japan, July 23-27, 2018.
- General Co-Chair: IEEE EDGE 2017: The 1st IEEE International Conference on Edge Computing, Hawai, USA. 25 - 30 Jun 2017.
- General Co-Chair: IEEE SETA 2017: IEEE Symposium on Software Engineering Technologies & Applications at the 41st IEEE International Conference on Computers, Software and Applications (COMPSAC 2017), Turin, Italy, July 4-8, 2017.
- General Co-Chair: IEEE MS 2016: The 5th IEEE International Conference on Mobile Services, San Francisco, CA, USA, 27 June-2 July 2016.
- PC Co-Chair: IEEE ICSW 2015: IEEE International Conference on Web Services, New York, USA, Jun 27 - Jul 2, 2015.
Further details
Google Scholar Profile: https://scholar.google.com/citations?user=6fKIcOEAAAAJ&hl=en