Deep learning (DL) algorithms are a primary workhorse for extracting forensic information from available data. We recently used such algorithms successfully for monitoring and improving data quality (DQ) in the Brooks-Iyengar fusion algorithm [10]. Data curation encompasses tasks such as data integration, data cleaning (error detection and repair), entity resolution, and schema matching, and researchers in this area have shown that deep learning techniques achieve state-of-the-art performance on data quality metrics.
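For concreteness, the following is a minimal sketch of the interval-fusion step at the heart of the classic Brooks-Iyengar algorithm, written in plain Python. It is an illustration of the primitive only, not the implementation used in [10]: each of N sensors reports an interval, and with at most tau faulty sensors the fused estimate is a weighted average over sub-regions on which at least N - tau intervals agree.

```python
def brooks_iyengar(intervals, tau):
    """Simplified Brooks-Iyengar fusion sketch: fuse N interval estimates,
    tolerating up to tau faulty sensors (not a reference implementation)."""
    n = len(intervals)
    # Sweep consecutive endpoint pairs to find regions where
    # at least n - tau of the reported intervals overlap.
    points = sorted({p for lo, hi in intervals for p in (lo, hi)})
    regions = []  # (midpoint, overlap count) of each accepted region
    for lo, hi in zip(points, points[1:]):
        mid = (lo + hi) / 2.0
        weight = sum(1 for a, b in intervals if a <= mid <= b)
        if weight >= n - tau:
            regions.append((mid, weight))
    if not regions:
        return None  # no region with enough sensor agreement
    total = sum(w for _, w in regions)
    return sum(m * w for m, w in regions) / total

# Example: three sensors, at most one assumed faulty.
print(brooks_iyengar([(2.7, 6.7), (0.0, 3.2), (1.5, 4.5)], tau=1))
```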
In recent years, attackers have increasingly adopted deep learning to develop sophisticated DL-based security attacks or to evade DL-based defense systems [5]. Adversarial attacks [11] are deliberately crafted inputs that exploit such vulnerabilities, causing machine learning models to make mistakes. Deepfakes are media in which a person in an existing image or video is replaced with someone else's likeness using artificial neural networks; they can corrupt crucial digital evidence in military operational decision-making systems [3, 4]. The goal of this project is to use deep learning to develop smart, evidence-based forensic techniques for detecting and defending against deepfakes. We plan to achieve that goal by building a framework, and a proof-of-concept implementation, for identifying and recording vulnerabilities of deepfake forensic algorithms, including issues arising from the use of generative adversarial networks (GANs). The project will pursue three complementary objectives: (a) explore how ML [12] is being used to produce deepfakes, including audio simulation and face swapping (a GAN sketch follows this paragraph); (b) develop a novel stochastic-PDE-based framework to detect compromised audio and video; and (c) investigate the fundamental capabilities, challenges, and limitations of ML in detecting deepfakes. We will take a first-principles approach: instead of treating an ML algorithm as a black box and simply training it to obtain good performance, we will carefully examine representative ML-based algorithms, including their assumptions about the characteristics of the input data and the problems they are intended to address. The diversity and complexity of cybersecurity-related forensic problems in military operational decision-making systems is depicted in the accompanying figure.
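To make objective (a) concrete, the sketch below shows the core adversarial training loop of a GAN in PyTorch: a generator learns to synthesize data that a discriminator cannot distinguish from real samples. The layer sizes, optimizer settings, and flattened image dimension are illustrative placeholders, not parameters of any particular deepfake system.

```python
import torch
import torch.nn as nn

LATENT_DIM, IMG_DIM = 64, 28 * 28  # placeholder dimensions

generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 256), nn.ReLU(),
    nn.Linear(256, IMG_DIM), nn.Tanh(),
)
discriminator = nn.Sequential(
    nn.Linear(IMG_DIM, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),  # real/fake logit
)

bce = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def train_step(real_batch):
    """One adversarial update: the discriminator learns to separate real
    from fake; the generator learns to fool the discriminator."""
    b = real_batch.size(0)  # real_batch: (b, IMG_DIM) flattened images
    fake = generator(torch.randn(b, LATENT_DIM))

    # Discriminator step: real -> 1, fake -> 0.
    d_opt.zero_grad()
    d_loss = (bce(discriminator(real_batch), torch.ones(b, 1)) +
              bce(discriminator(fake.detach()), torch.zeros(b, 1)))
    d_loss.backward()
    d_opt.step()

    # Generator step: push the discriminator to label fakes as real.
    g_opt.zero_grad()
    g_loss = bce(discriminator(fake), torch.ones(b, 1))
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```

The same adversarial dynamic is what makes GAN-generated deepfakes hard to detect: any fixed statistical artifact a detector keys on is exactly what further generator training tends to remove, which motivates the first-principles analysis of detector assumptions described above.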
The proposed research is organized into four phases.
In addition to a wide range of civilian applications, the problems under consideration have significant applications in military decision-making and cybersecurity systems. AI applied to perception tasks such as imagery analysis can extract useful information from raw data and equip leaders with increased situational awareness; this is the main motivation for the proposed research. To help ensure that DoD AI systems are safe, secure, and robust, our research will focus on lowering the risk of failure for existing and future DoD AI systems through (a) development of stochastic neural network (SNN) systems that account for uncertainties in the training procedure and produce random outputs (see the sketch below); (b) implementation and evaluation of the developed SNNs, focusing on scalability so as to exploit the advantages of stochastic computing; and (c) design of an ensemble method for video anomaly detection.
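As a minimal illustration of item (a), the sketch below realizes one common form of stochastic neural network, Monte Carlo dropout, in PyTorch: dropout remains active at inference, so repeated forward passes yield random outputs whose spread serves as an uncertainty estimate. The architecture, dropout rate, and sample count are assumptions for illustration; the SNN design developed in this project may differ.

```python
import torch
import torch.nn as nn

class StochasticNet(nn.Module):
    """Toy stochastic network (MC dropout). Keeping dropout active at
    inference makes each forward pass a random draw from an approximate
    predictive distribution. A sketch only, with placeholder sizes."""
    def __init__(self, d_in=32, d_hidden=64, p=0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_in, d_hidden), nn.ReLU(), nn.Dropout(p),
            nn.Linear(d_hidden, 1), nn.Sigmoid(),  # e.g., anomaly score
        )

    def forward(self, x):
        return self.net(x)

def predict_with_uncertainty(model, x, samples=30):
    """Return the mean prediction and its spread over repeated passes."""
    model.train()  # keep dropout active so outputs stay stochastic
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(samples)])
    return preds.mean(0), preds.std(0)
```

The same predictive samples can also be aggregated across several independently trained models, which is the basic mechanism an ensemble video anomaly detector of the kind named in item (c) would build on.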
Ultimately, this project will yield an ML-based framework for detecting deepfakes. The framework will be validated on published deepfake datasets and a real-world testbed. We also expect to gain insights into strategies for defending against other forms of deepfakes. We will establish a GitHub page for the project, making all algorithms, code, and manuscripts available online.