Securing Neural Networks with Cryptographic Backdoors
For neural network (NN) security and tracking, “cryptographic backdoors” use digital signature schemes and cryptographic circuits to hide a deliberately planted flaw (the backdoor) in a model. Used defensively, this is the “Boon”: it gives the model owner a tool against black-box NN attackers.
The Cryptographic Backdoor Mechanism
Since it uses cryptographic primitives, a cryptographic backdoor can be applied to any classifier without fine-tuning the model.
Parallel Construction: A signature verification circuit is added to the NN classifier and runs in parallel with it. Every input is interpreted as a message-signature pair. To activate the backdoor, the input is altered so that its signature part is a valid signature for its message part.
Output Control: If the verification circuit recognises a valid message-signature pair, it switches on the backdoor output branch, which overrides the NN's normal prediction.
Undetectable and Non-Replicable: The backdoor is black-box undetectable, meaning that it is computationally infeasible for an adversary with only oracle (query) access to distinguish the model from a clean, un-backdoored one. Importantly, the backdoor also cannot be replicated, because the digital signature scheme prevents an adversary from producing a new valid signature without the secret key.
Monitoring and Security Applications
The study presents three defensive applications that use cryptographic backdoors to protect intellectual property (IP) and to manage access to models offered as machine learning as a service (MLaaS).
Safe and reliable NN watermarking
Backdoor watermarking confirms model ownership. For particular trigger samples, the model owner, who holds the secret signing key ($sk$), creates valid message-signature pairs. Because the watermark sits in the separate, immutable verification circuit rather than in the model parameters, it is resistant to NN parameter changes, unlike typical NN watermarking methods.
Verification: An authorised auditor with valid signatures can query the model and get perfect trigger set accuracy, while parties without signatures get only mediocre accuracy.
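The parallel construction and the trigger-set check described above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the study: the `classifier` stub stands in for the NN, `BACKDOOR_LABEL` is a hypothetical backdoor output, and the signature scheme is toy textbook RSA over two known Mersenne primes. Toy RSA was chosen only so that the verifier holds a public key alone; at this size it is not secure, and a vetted signature library would be needed in practice.

```python
import hashlib

# Toy textbook-RSA key (illustrative assumption, NOT secure at this size).
P, Q = 2**61 - 1, 2**89 - 1              # two known Mersenne primes
N, E = P * Q, 65537                      # public verification key
D = pow(E, -1, (P - 1) * (Q - 1))        # secret signing key (sk), owner only

def _digest(message: bytes) -> int:
    """Hash the message to an integer below the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Owner side: needs the secret exponent D."""
    return pow(_digest(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Verification circuit: uses only the public values N and E."""
    return pow(signature, E, N) == _digest(message)

def classifier(message: bytes) -> int:
    """Hypothetical stand-in for the NN: maps an input to one of 10 classes."""
    return sum(message) % 10

BACKDOOR_LABEL = 7  # hypothetical output of the backdoor branch

def backdoored_model(message: bytes, signature: int) -> int:
    """Every input is a (message, signature) pair; the verifier picks the branch."""
    if verify(message, signature):
        return BACKDOOR_LABEL            # backdoor branch overrides the NN
    return classifier(message)           # normal prediction

def trigger_accuracy(pairs, expected=BACKDOOR_LABEL) -> float:
    """Fraction of trigger inputs on which the model returns the expected label."""
    hits = sum(backdoored_model(m, s) == expected for m, s in pairs)
    return hits / len(pairs)

triggers = [b"trigger-0", b"trigger-1", b"trigger-2"]
signed = [(m, sign(m)) for m in triggers]    # auditor holding valid signatures
unsigned = [(m, 0) for m in triggers]        # party without signatures
print(trigger_accuracy(signed))              # 1.0
print(trigger_accuracy(unsigned))
```

In this sketch the watermark check never touches the classifier's parameters: replacing the `classifier` stub leaves the signed trigger accuracy unchanged, which mirrors the robustness argument made for the verification circuit.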
User Authentication for Security
This protocol restricts model usage to authorised parties, making it harder for attackers to steal the model.
Mechanism: Inference requires a valid secret signing key, which is used to sign the input data as the message.
Access Control: If the signature is valid, the NN classifier produces the final prediction. If an invalid key or no key is supplied, the verifier returns “garbage”, that is, useless outputs.
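A minimal sketch of this access-control behaviour follows. It is an illustration under stated assumptions, not the study's construction: the `classifier` and `garbage` functions are hypothetical stubs, and the signature scheme is toy textbook RSA over two known Mersenne primes, which is far too small to be secure.

```python
import hashlib

# Toy textbook-RSA key pair (illustrative assumption, NOT secure at this size).
P, Q = 2**61 - 1, 2**89 - 1              # two known Mersenne primes
N, E = P * Q, 65537                      # public verification key
D = pow(E, -1, (P - 1) * (Q - 1))        # secret signing key of the authorised user

NUM_CLASSES = 10

def _digest(data: bytes) -> int:
    """Hash the input data to an integer below the modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data: bytes, secret_exponent: int) -> int:
    """User side: sign the input data with whatever key the user holds."""
    return pow(_digest(data), secret_exponent, N)

def verify(data: bytes, signature: int) -> bool:
    """Verifier side: only the public values N and E are needed."""
    return pow(signature, E, N) == _digest(data)

def classifier(data: bytes) -> int:
    """Hypothetical stand-in for the protected NN."""
    return sum(data) % NUM_CLASSES

def garbage(data: bytes) -> int:
    """Useless output: a label derived from a hash, unrelated to the classifier."""
    return hashlib.sha256(b"garbage" + data).digest()[0] % NUM_CLASSES

def authenticated_inference(data: bytes, signature: int) -> int:
    """Return the real prediction only for a valid signature on the input."""
    return classifier(data) if verify(data, signature) else garbage(data)

x = b"query image bytes"
print(authenticated_inference(x, sign(x, D)))      # authorised: classifier output
print(authenticated_inference(x, sign(x, D + 2)))  # wrong key: garbage output
```

Returning a hash-derived label rather than an error is one possible reading of "garbage" output; it keeps the response format identical for authorised and unauthorised callers.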
Unauthorised IP tracking
The model owner can use the cryptographic backdoor to trace an IP breach back to a single authorised user.
Unique Labelling: Instead of creating a separate trigger set for each user, the system generates a single trigger set whose trigger labels are unique to each user.
Cryptographic Traceability: Each user's label set is derived deterministically from that user's secret key with a hash function.
Attribution: A distributed model copy delivers perfect trigger set accuracy only when evaluated with the correct user's key and labels.
Detection: Evaluating the model copy against a different user's label set yields reduced accuracy. This distinctive performance profile ties each distributed model copy to a single secret key, which makes leak detection easier.
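The label derivation and attribution steps can be sketched as follows. This is an illustration under assumptions rather than the study's exact construction: HMAC-SHA256 is used here as one concrete way to combine a hash function with a user's secret key, `make_copy` is a hypothetical stand-in that simulates a distributed model copy as a lookup table, and the trigger set, user names, and keys are invented for the example.

```python
import hashlib
import hmac

NUM_CLASSES = 10
# One shared trigger set for all users (hypothetical sample identifiers).
TRIGGERS = [f"trigger-{i}".encode() for i in range(32)]

def user_labels(user_key: bytes) -> list:
    """Derive one label per trigger, deterministically, from the user's key."""
    labels = []
    for trigger in TRIGGERS:
        digest = hmac.new(user_key, trigger, hashlib.sha256).digest()
        labels.append(int.from_bytes(digest, "big") % NUM_CLASSES)
    return labels

def make_copy(user_key: bytes):
    """Hypothetical stand-in for the model copy handed to one user."""
    table = dict(zip(TRIGGERS, user_labels(user_key)))
    return lambda trigger: table[trigger]

def trigger_accuracy(model, user_key: bytes) -> float:
    """Accuracy of a model copy against the label set derived from user_key."""
    expected = user_labels(user_key)
    hits = sum(model(t) == y for t, y in zip(TRIGGERS, expected))
    return hits / len(TRIGGERS)

def attribute(model, user_keys: dict) -> str:
    """Name the user whose label set the model copy matches best."""
    return max(user_keys, key=lambda name: trigger_accuracy(model, user_keys[name]))

keys = {
    "alice": b"alice-secret-key",
    "bob": b"bob-secret-key",
    "carol": b"carol-secret-key",
}
leaked = make_copy(keys["bob"])              # a copy that later leaks
for name, key in keys.items():
    print(name, trigger_accuracy(leaked, key))
print(attribute(leaked, keys))               # bob
```

With ten classes, a label set derived from the wrong key is expected to agree with the leaked copy on roughly one trigger in ten, so the gap between the matching user and everyone else widens as the trigger set grows.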


















