
Xiaoming Huo and Bin Yu organized an invited session at the Joint Statistical Meetings (JSM 2024) on the topic of “statistical methods in cyber security.” Participating ACTION faculty included Jie Gao, Bo Li, and Radha Poovendran.

Presentations

Enabling Asymptotic Truth Learning in a Network

Consider a network of distributed agents that all want to guess the correct value of some ground truth state. In a sequential order, each agent makes its decision using a single private signal, which has a constant probability of error, as well as observations of actions taken earlier in the order by its network neighbors. We are interested in enabling network-wide asymptotic truth learning: in a network of n agents, almost all agents make a correct prediction with probability approaching one as n goes to infinity. In this paper we study carefully crafted decision orders with respect to the graph topology, as well as sufficient or necessary conditions for a graph to support such a good ordering. We first show that asymptotic truth learning does not happen on a sparse graph with a random ordering. We then give a rather modest sufficient condition that enables asymptotic truth learning. With the help of this condition we characterize graphs generated from the Erdős-Rényi model and the preferential attachment model. In an Erdős-Rényi graph, unless the graph is super sparse (with O(n) edges) or super dense (with Ω(n^2) edges), there exists a decision ordering that supports asymptotic truth learning. Similarly, any preferential attachment network with a constant number of edges per node can achieve asymptotic truth learning under a carefully designed ordering. We also evaluate a variant of the decision ordering on different network topologies and demonstrate its clear effectiveness in improving truth learning over random orderings.

Speaker: Jie Gao, Rutgers University
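
A minimal toy simulation can build intuition for why the decision ordering matters. The sketch below is illustrative only (it is not the paper's estimator or its ordering construction): it builds an Erdős-Rényi graph, gives every agent a private signal that is correct with probability 0.7, and compares a random decision order against a heuristic "low-degree-first" order. Each agent follows the majority of its already-decided neighbors once it has seen a few of them, and otherwise trusts its private signal.

```python
# Toy simulation of sequential truth learning on an Erdos-Renyi graph.
# The decision rule and orderings below are illustrative heuristics,
# not the estimators or orderings analyzed in the paper.
import random

def simulate(n=1000, avg_deg=8, signal_acc=0.7, ordering="random", seed=0):
    rng = random.Random(seed)
    p_edge = avg_deg / (n - 1)
    nbrs = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p_edge:
                nbrs[i].add(j)
                nbrs[j].add(i)

    truth = 1
    signals = [truth if rng.random() < signal_acc else 1 - truth for _ in range(n)]

    order = list(range(n))
    if ordering == "random":
        rng.shuffle(order)
    elif ordering == "low_degree_first":
        # Heuristic: let low-degree agents decide first, so that high-degree
        # agents later aggregate many (mostly independent) observations.
        order.sort(key=lambda v: len(nbrs[v]))

    decision = [None] * n
    for v in order:
        votes = [decision[u] for u in nbrs[v] if decision[u] is not None]
        if len(votes) >= 3:               # enough social evidence: follow the majority
            decision[v] = int(sum(votes) > len(votes) / 2)
        else:                             # otherwise trust the private signal
            decision[v] = signals[v]
    return sum(d == truth for d in decision) / n

for ordering in ("random", "low_degree_first"):
    print(ordering, simulate(ordering=ordering))
```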

Adaptive Learning in Two-Player Stackelberg Games with Application to Network Security

This paper proposes an adaptive learning approach to solve two-player Stackelberg games with incomplete information. Specifically, the leader lacks knowledge of the follower's cost function but knows that the follower's response function to the leader's action belongs to a known parametric family with unknown parameters. Our algorithm simultaneously estimates these parameters and optimizes the leader's action. It guarantees that the estimates of the follower's action and the leader's cost converge to their true values within a finite time, with a preselected error bound that can be arbitrarily small. Additionally, the first-order necessary condition for optimality is asymptotically satisfied for the leader's estimated cost. Under persistent excitation conditions, the parameter estimation error also remains within a preselected, arbitrarily small bound. Even with mismatches between the known parametric family and the follower's actual response function, the algorithm converges robustly, with error bounds proportional to the size of the mismatch. Simulation examples in the domain of network security illustrate the effectiveness of the algorithm and its convergence results.

Speaker: Guosong Yang, Rutgers University
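
To make the estimate-and-optimize setup concrete, here is a minimal sketch under illustrative assumptions: a linear follower response family, a quadratic leader cost, and simple gradient-style updates that are not the algorithm from the talk. The leader observes the follower's noisy response, refines its parameter estimate, and takes a step on its estimated cost, while a small probing term keeps the input persistently exciting.

```python
# Minimal sketch of simultaneous parameter estimation and leader optimization
# in a two-player Stackelberg game. The response family, cost, and update
# rules are illustrative choices, not the algorithm from the talk.
import numpy as np

rng = np.random.default_rng(0)

theta_true = np.array([0.8, -0.5])           # unknown follower parameters
def follower_response(a, theta):             # known parametric family: r = theta0*a + theta1
    return theta[0] * a + theta[1]

def leader_cost(a, r):                        # leader wants its action and the response near targets
    return (a - 1.0) ** 2 + (r - 0.2) ** 2

theta_hat = np.zeros(2)
a = 0.0
eta_theta, eta_a = 0.2, 0.1

for t in range(200):
    r_obs = follower_response(a, theta_true) + 0.01 * rng.standard_normal()
    # Gradient step on the squared prediction error to refine the parameter estimate.
    pred_err = follower_response(a, theta_hat) - r_obs
    theta_hat -= eta_theta * pred_err * np.array([a, 1.0])
    # Gradient step on the leader's *estimated* cost, using the current theta_hat.
    r_hat = follower_response(a, theta_hat)
    grad_a = 2 * (a - 1.0) + 2 * (r_hat - 0.2) * theta_hat[0]
    a -= eta_a * grad_a
    # A small probing perturbation keeps the input persistently exciting.
    a += 0.05 * np.sin(0.5 * t)

print("theta_hat:", theta_hat, "action:", a,
      "cost:", leader_cost(a, follower_response(a, theta_true)))
```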

A Statistical Method for Safety Alignment of LLMs

As large language models (LLMs) become increasingly integrated into real-world applications such as code generation and chatbot assistance, extensive efforts have been made to align LLM behavior with human values, including safety. Jailbreak attacks, aiming to provoke unintended and unsafe behaviors from LLMs, remain a significant security threat to LLM deployment.

This talk introduces a statistical method to ensure the safety alignment of LLMs. We observe that safe and unsafe behaviors exhibited by LLMs differ in the probability distributions of tokens: an unsafe response corresponds to a distribution in which the probabilities of tokens representing harmful content outweigh those of tokens representing harmless responses. We leverage this observation and develop a lightweight safety-aware decoding strategy, SafeDecoding, for safety alignment. SafeDecoding mitigates jailbreak attacks by identifying safety disclaimers and amplifying their token probabilities, while simultaneously attenuating the probabilities of token sequences that align with the attacker's objectives, guided by the observed shifts in token distributions. We perform extensive experiments on five LLMs using six state-of-the-art jailbreak attacks and four benchmark datasets. Our results show that SafeDecoding significantly reduces the attack success rate and harmfulness of jailbreak attacks without compromising the helpfulness of responses to benign user queries. This work is supported by the NSF AI Institute for Agent-based Cyber Threat Intelligence and Operation (ACTION).

Co-Author: Radha Poovendran, University of Washington

Speaker: Zhangchen Xu, University of Washington
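
As a rough illustration of the idea, the sketch below applies one safety-aware decoding step over a tiny hypothetical vocabulary. It assumes access to next-token distributions from the original model and from a safety-aligned expert; the combination rule (shift the base distribution toward the expert's, clip, and renormalize) is one simple way to amplify safety disclaimers and attenuate harmful continuations, not the exact SafeDecoding construction.

```python
# Toy sketch of a safety-aware decoding step over a tiny vocabulary.
# It assumes next-token distributions from the original model and from a
# safety-aligned "expert" model; the combination rule is illustrative,
# not the exact SafeDecoding construction.
vocab = ["Sure", "Here", "Sorry", "I", "cannot", "bomb"]

p_base   = {"Sure": 0.35, "Here": 0.25, "Sorry": 0.10, "I": 0.10, "cannot": 0.05, "bomb": 0.15}
p_expert = {"Sure": 0.05, "Here": 0.05, "Sorry": 0.45, "I": 0.25, "cannot": 0.15, "bomb": 0.05}

def safety_aware_step(p_base, p_expert, alpha=2.0):
    # Shift the base distribution toward the expert's: tokens the safety-aligned
    # model prefers (disclaimers) gain mass, tokens it avoids (harmful content) lose mass.
    shifted = {t: max(p_base[t] + alpha * (p_expert[t] - p_base[t]), 0.0) for t in p_base}
    z = sum(shifted.values())
    return {t: v / z for t, v in shifted.items()}

p_safe = safety_aware_step(p_base, p_expert)
for t in vocab:
    print(f"{t:>7}: base={p_base[t]:.2f}  safe={p_safe[t]:.2f}")
```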

Date
August 2024
Location
Portland, OR