How can we protect the security and the privacy of the AIDA platform?

The evolution of the RAID platform during the AIDA project brings many benefits, but it also raises many security and privacy concerns. Discovering and implementing measures that address these new risks without degrading performance is therefore of utmost importance. The main challenges relate to the transition to the edge, which pushes computational power to the edges of the network; to the integration of 5G, with its support for multiple tenants and network slicing; and to the privacy of the data gathered and analyzed.


Given these changes to the network, communications need to be verified to ensure that they are secure and that performance is not affected. The network changes constantly as devices connect and disconnect and as their traffic rises and falls, so the platform must be monitored to allow a fast response to any change. These changes also create new potential entry points for attackers and make it harder to defend against attacks that were already possible.


Figure 1: Overview of the changes to the Architecture


Providing Secure Communication among the Components

The AIDA platform, with components running both at the edge and at the core of the network, requires secure communication channels to ensure that the exchanged information is protected against threats such as eavesdropping and man-in-the-middle attacks. A critical security aspect is support for authentication, authorization and accounting (AAA) by all services and functions of the AIDA platform, wherever they run.
Robust and secure communication approaches exist, such as the Transport Layer Security (TLS) protocol, which is widely used to assure the authentication, confidentiality and integrity of exchanged data. Components should therefore support TLS v1.2 or later, preferably v1.3, given its higher protection levels and shorter handshake.
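As an illustration of the TLS version requirement above, here is a minimal sketch using Python's standard ssl module. The function names and the optional certificate paths are hypothetical and are not taken from the AIDA code base. The sketch only shows how a component could refuse anything older than TLS v1.2 while still letting the handshake negotiate v1.3 when both peers support it.

```python
import ssl


def make_server_context(certfile=None, keyfile=None, client_ca=None):
    """Server-side context that rejects protocol versions below TLS 1.2.

    All three paths are illustrative parameters; a real deployment would
    obtain them from its own key and certificate management.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # The handshake picks the highest mutually supported version,
    # so TLS 1.3 is used whenever both peers offer it.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)
    if client_ca:
        # Require client certificates (mutual TLS) between services.
        ctx.load_verify_locations(client_ca)
        ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def make_client_context(cafile=None):
    """Client-side context with certificate and hostname verification."""
    ctx = ssl.create_default_context(cafile=cafile)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

A service would wrap its listening socket with the server context, and callers would wrap outgoing sockets with the client context. Distributing and rotating the certificates these functions load is left out of the sketch.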
Nonetheless, a plug-and-play solution is not simple. The existence of several microservices, which can run at the edge or at the core of the network, leads to issues with the management of the keys and certificates required by TLS connections. Seamless integration with federated identity management approaches such as OpenID Connect can also lead to scalability issues if not managed properly.

Assuring a Secure Operation of the Software Components

AIDA microservices need to be monitored using lightweight, fast, and efficient approaches that remain highly effective. Because deployment scenarios change constantly, with auto-scaling adaptation, the behavior profiles used to identify deviations must generalize so that the security level is not compromised in these dynamic environments. Some intrusions may still go undetected, which is why incorporating intrusion tolerance is a way to increase security levels and ensure that the system provides the intended service level even when intrusions evade the detection mechanisms.
Many security strategies are being evaluated and improved, such as the use of machine learning techniques and classifiers to detect intrusions. The goal is to construct profiles of benign behavior against which deviations from the “normal behavior” used to train the algorithms can be detected. After a configurable number of deviations, alarms are raised and suspicious activity is reported. Intrusion tolerance will be most effective when applied to the key services of the architecture, and commonly used solutions are under study to identify possible applications in the AIDA scenario. The approaches that provide tolerance range from service diversity, which requires different versions of the technologies or techniques used to develop the services, to architectural patterns that can be applied statically or dynamically according to information collected from the system while in operation.
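The idea of a benign profile with a configurable deviation budget can be sketched as follows. This is a deliberately simple stand-in, not the classifiers under evaluation in AIDA: the profile is only the mean and standard deviation of a single metric, and the class name, the z-score threshold and the cumulative counting of deviations are all assumptions made for illustration.

```python
from statistics import mean, stdev


class DeviationMonitor:
    """Flags observations far from a baseline learned on benign data."""

    def __init__(self, baseline, z_threshold=3.0, max_deviations=3):
        self.mu = mean(baseline)
        self.sigma = stdev(baseline)
        if self.sigma == 0:
            raise ValueError("baseline must contain some variation")
        self.z_threshold = z_threshold
        self.max_deviations = max_deviations  # configurable alarm budget
        self.deviations = 0

    def observe(self, value):
        """Record one observation; return True once the alarm should fire."""
        z_score = abs(value - self.mu) / self.sigma
        if z_score > self.z_threshold:
            self.deviations += 1
        # Deviations are counted cumulatively here (an assumption);
        # a sliding window or consecutive count would also be plausible.
        return self.deviations >= self.max_deviations
```

For example, a monitor trained on request rates observed during normal operation would tolerate isolated spikes and only report once the number of out-of-profile observations reaches max_deviations.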

Figure 2: High level perspective of the secure operation of the main software components

In addition, to sustain high availability, a self-adaptation mechanism can monitor the components of the architecture and apply known actions to them in response to changes in the environment. These actions aim to mitigate problems and improve the performance of the platform by directing resources to where they are needed. The mechanism also takes care of fixing security and privacy problems identified in the platform.
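One possible shape for "applying known actions in response to changes" is a table of rules that map monitored conditions to predefined actions. The sketch below assumes this rule-based form. The metric names, thresholds and action labels are hypothetical and do not describe the actual AIDA mechanism.

```python
def plan_actions(metrics, rules):
    """Return the known actions whose conditions hold for these metrics."""
    return [action for condition, action in rules if condition(metrics)]


# Hypothetical rules: each pairs a condition on monitored metrics
# with the name of a predefined adaptation action.
RULES = [
    (lambda m: m["cpu"] > 0.8, "scale_out"),
    (lambda m: m["cpu"] < 0.2 and m["replicas"] > 1, "scale_in"),
    (lambda m: m["intrusion_alarm"], "isolate_and_restart"),
]
```

Calling plan_actions with the latest metrics of a component on every monitoring cycle yields the list of actions to execute. Keeping the rules as data makes it straightforward to add actions for newly identified security or privacy problems.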

Maintaining the Privacy of the Data used

Regulations such as GDPR and HIPAA, together with the need to outsource data and computation to third-party infrastructures, make it critical to have privacy-preserving solutions that can be deployed in potentially untrusted environments. Machine learning is one example: it often involves the analysis of sensitive data, frequently unprotected, which may leak sensitive information to adversaries at the untrusted premises. Even if this information is encrypted, other types of attacks may compromise confidentiality, as depicted in Figure 3.

Figure 3: Examples of attacks that can affect ML: Adversarial Samples, Model Extraction, Model Inversion, Reconstruction Attacks, and Membership Inference


Although the use of software-based cryptographic schemes is far from coming to a halt, Trusted Execution Environments (TEEs) are increasingly sought as an alternative solution that can reduce the performance overhead associated with traditional privacy-preserving secure schemes. In AIDA we are exploring this technology to provide a privacy-preserving machine learning solution that can be used in practice, while scaling out for large datasets. SOTERIA is a system for distributed privacy-preserving machine learning, which leverages Apache Spark’s design and its MLlib APIs. Our solution was designed to avoid changing the architecture and processing flow of Apache Spark, keeping its scalability and fault tolerance properties.


Apart from cryptographic mechanisms, privacy guarantees can be provided by applying adequate anonymization mechanisms. However, selecting a privacy-preserving mechanism is quite challenging, not only because there is no standardized, universal definition of privacy, but also because mechanisms must be selected and configured according to the data types and privacy requirements. Moreover, the anonymization approach employed may affect the performance of the machine learning mechanisms considered in the project. Focusing on the data types relevant to the AIDA project, we are developing a privacy framework that allows us to test configurations and to apply and assess privacy-preserving mechanisms according to the privacy and utility levels achieved for the data.
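The trade-off between privacy and utility when configuring a mechanism can be illustrated with a small sketch. It assumes k-anonymity as the privacy metric and age generalization as the mechanism. Both are chosen only because they are well known, since the text does not say which definitions or mechanisms the AIDA framework uses. The record fields and function names are likewise hypothetical.

```python
from collections import Counter


def generalize_age(age, width):
    """Replace an exact age with a bucket of the given width, e.g. 20-29."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def anonymize(records, width):
    """Apply age generalization to every record, leaving other fields as-is."""
    return [{**r, "age": generalize_age(r["age"], width)} for r in records]


def k_anonymity(records, quasi_identifiers):
    """Size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


def distinct_values(records, field):
    """Crude utility proxy: how many distinct values of a field survive."""
    return len({r[field] for r in records})
```

Running the two measures over several bucket widths shows the tension described above: wider buckets raise k, which means more privacy, while reducing the number of distinct age values left for analysis.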