Project name – A framework for learning comprehensible theories in xml document classification
XML has become the universal data format for a wide variety of information systems. The large number of XML documents existing on the web and in other information storage systems makes classification an important task. As a typical type of semistructured data, XML documents have both structures and contents. Traditional text learning techniques are not very suitable for XML document classification as structures are not considered. This paper presents a novel complete framework for XML document classification. We first present a knowledge representation method for XML documents which is based on a typed higher order logic formalism. With this representation method, an XML document is represented a s a higher order logic term where both its contents and structures are captured. We then present a decision-tree learning algorithm driven by precision/recall breakeven point (PRDT) for the XML classification problem which can produce comprehensible theories. Finally, a semi-supervised learning algorithm is given which is based on the PRDT algorithm and the containing framework. Experimental results demonstrate that our framework i s able to achieve good performance in both supervised and semi-supervised learning with the bonus of producing comprehensible learning theories.
Project name – THEMIS: A mutually verifiable billing system for cloud computing environment
With the widespread adoption of cloud computing, the ability to record and account for the usage of cloud resources in a credible and verifiable way has become critical for cloud service providers and users alike. The success of such a billing system depends on several factors: the billing transactions must have integrity and no repudiation capabilities; the billing transactions must be non obstructive and have a minimal computation cost; and the service level agreement (SLA) monitoring should be provided in a trusted manner. Existing billing systems are limited in terms of security capabilities or computational overhead. In this paper, we propose a secure and non obstructive billing system called THEMIS as a remedy for these limitations. The system uses a novel concept of a cloud notary authority for the supervision of billing. The cloud notary authority generates mutually verifiable binding information that can be used to resolve future disputes between a user and a cloud service provider in a computationally efficient way. Furthermore, to provide a forgery-resistive SLA monitoring mechanism, we devised a SLA monitoring module enhanced with a trusted platform module (TPM), called S-Mon. The performance evaluation confirms that the overall latency of THEMIS billing transactions (avg. 4.89 ms) is much shorter than the latency of public key infrastructure (PKI)-based billing transactions (avg. 82.51 ms), though THEMIS guarantees identical security features as a PKI. This work has been undertaken on a real cloud computing service called iCubeCloud.
Project name – A secure erasure code-based cloud storage system with secure data forwarding
A cloud storage system, consisting of a collection of storage servers, provides long-term storage services over the Internet. Storing data in a third partys cloud system causes serious concern over data confidentiality. General encryption schemes protect data confidentiality, but also limit the functionality of the storage system because a few operations are supported over encrypted data. Constructing a secure storage system that supports multiple functions is challenging when the storage system is distributed and has no central authority. We propose a threshold proxy re-encryption scheme and integrate it with a decentralized erasure code such that a secure distributed storage system is formulated. The distributed storage system not only supports secure and robust data storage and retrieval, but also lets a user forward his data in the storage servers to another user without retrieving the data back. The main technical contribution is that the proxy re-encryption scheme supports encoding operations over encrypted messages as well as forwarding operations over encoded and encrypted messages. Our method fully integrates encrypting, encoding, and forwarding. We analyze and suggest suitable parameters for the number of copies of a message dispatched to storage servers and the number of storage servers queried by a key server. These parameters allow more flexible adjustment between the number of storage servers and robustness.
Project name – Ensuring distributed accountability for data sharing in the cloud
Cloud computing enables highly scalable services to be easily consumed over the Internet on an as-needed basis. A major feature of the cloud services is that users data are usually processed remotely in unknown machines that users do not own or operate. While enjoying the convenience brought by this new emerging technology, users fears of losing control of their own data (particularly, financial and health data) can become a significant barrier to the wide adoption of cloud services. To address this problem, in this paper, we propose a novel highly decentralized information accountability framework to keep track of the actual usage of the users data in the cloud. In particular, we propose an object-centered approach that enables enclosing our logging mechanism together with users data and policies. We leverage the JAR programmable capabilities to both create a dynamic and traveling object, and to ensure that any access to users data will trigger authentication and automated logging local to the JARs. To strengthen users control, we also provide distributed auditing mechanisms. We provide extensive experimental studies that demonstrate the efficiency and effectiveness of the proposed approaches.