
Synthetic Data: A New Frontier for Cyber Deception and Honeypots

Cyber Threat Intelligence

deception, threat hunting, network intelligence, threat actor, honeypot, honeytrap

While investigating numerous incidents, Resecurity has developed a unique practice of using deception technologies for counterintelligence purposes. This may include solutions, tools, models, and methods that mimic legitimate enterprise environments to mislead potential threat actors and let them conduct malicious activities in a controlled manner. Many of these concepts originate from traditional honeypots, which enable network defenders to perform threat hunting passively by deploying traps, such as misconfigured applications, network services, or dummy resources, that log intruders.

With the rapid evolution of AI and ML, deception can be accelerated by using synthetic data: purposely generated data that has the patterns and characteristics of real-world data without containing actual proprietary information. In the context of threat hunting, previously breached data can be highly effective for designing deception models that appear extremely realistic and attract threat actors. For example, a purposely planted honeypot containing realistic-looking (but practically useless) records can motivate threat actors to attempt to steal them.

On November 21, 2025, Resecurity identified a threat actor attempting to conduct malicious activity targeting our resources. The actor was probing various public-facing services and applications. Our DFIR team logged the threat actor at an early stage and documented the following Indicators of Attack (IOA):

156.193.212.244 (Egypt)
102.41.112.148 (Egypt)
45.129.56.148 (Mullvad VPN)
185.253.118.70 (VPN)

Understanding that the actor was conducting reconnaissance, our team set up a honeytrap account. This led to a successful login by the threat actor to one of the emulated applications containing synthetic data. While the successful login could have enabled the actor to gain unauthorized access and commit a crime, it also provided us with strong proof of their activity. Both Office 365 and VPN accounts are highly effective for creating honeypot (honeytrap) accounts to detect, track, and analyze hacker activity. Such accounts are widely used in enterprise environments to detect unauthorized access attempts and gather threat intelligence. The most successful honeypot deployments use realistic, well-monitored decoy accounts that mimic high-value targets but are isolated from real assets. In addition, you can use honeytrap accounts for your own applications, deployed in an emulated environment that is isolated from production resources and closely monitored.
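The core idea behind a honeytrap account can be illustrated with a minimal sketch. It assumes login events are already available as structured records. The account names, the `LoginEvent` type, and the sample addresses (taken from reserved documentation ranges) are all hypothetical and do not describe Resecurity's actual tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical decoy account names; a real deployment would load these
# from configuration rather than hard-coding them.
HONEYTRAP_ACCOUNTS = {"svc-billing-export", "j.mercer"}


@dataclass(frozen=True)
class LoginEvent:
    username: str
    source_ip: str
    timestamp: datetime
    success: bool


def honeytrap_alerts(events):
    """Return every event that touches a decoy account.

    No legitimate user should ever use a decoy account, so any attempt,
    successful or not, is treated as a signal worth recording.
    """
    return [e for e in events if e.username in HONEYTRAP_ACCOUNTS]


if __name__ == "__main__":
    sample = [
        LoginEvent("alice", "198.51.100.7",
                   datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc), True),
        LoginEvent("svc-billing-export", "203.0.113.25",
                   datetime(2025, 1, 1, 9, 5, tzinfo=timezone.utc), True),
    ]
    for alert in honeytrap_alerts(sample):
        print(f"ALERT {alert.timestamp.isoformat()} {alert.username} "
              f"from {alert.source_ip} success={alert.success}")
```

Because the decoy set is matched on any attempt rather than only on successful logins, failed guesses against a decoy name are captured as well.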

For synthetic data, we used two different datasets: over 28,000 records impersonating consumers and over 190,000 records of payment transactions. Notably, in both cases, we utilized already known breached data available on the Dark Web and underground marketplaces (potentially containing PII), making the data even more realistic for threat actors. Such data is readily available from open sources and can be used as an important element for cyber deception, especially when the threat actor is advanced and may perform various checks to verify that the data is not completely fake. If the data fails those checks, the actor may change tactics or halt their planned actions entirely. In our scenario, our goal was to allow the threat actor to conduct activity and feed them synthetic data to observe their attack path and infrastructure. This task did not involve the use of passwords or API credentials.

- Payment Information (Stripe Records)

To prepare this dataset, we used specialized synthetic data generation tools (e.g., SDV, MOSTLY AI, Faker) to create realistic, schema-compliant Stripe transaction and customer data. Our goal was to reproduce the exact structure defined by Stripe’s official API schemas for customers, transactions, and subscriptions. In the official Stripe API, a transaction is typically represented as an object with fields such as:

id: Unique identifier for the transaction 
amount: Amount of the transaction 
currency: Currency code (e.g., USD) 
created: Timestamp of when the transaction occurred 
type: Type of transaction (charge, refund, payout, etc.) 
status: Status of the transaction (succeeded, pending, failed, etc.) 
customer: Reference to the customer object 
metadata: Custom key-value pairs for additional information
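A record with the fields listed above could be generated along the following lines. This is a minimal sketch using only the Python standard library rather than the tools named earlier. The `txn_` identifier prefix, the value ranges, the lowercase currency code, and the `order_ref` metadata key are illustrative assumptions, not details taken from the dataset described in this article.

```python
import random
import string
import time

TYPES = ["charge", "refund", "payout"]
STATUSES = ["succeeded", "pending", "failed"]
_ALPHABET = string.ascii_letters + string.digits


def _fake_id(prefix, rng, length=24):
    """Build an opaque identifier such as 'txn_' plus random characters."""
    return prefix + "".join(rng.choice(_ALPHABET) for _ in range(length))


def synthetic_transaction(rng, customer_id, now=None):
    """Return one decoy transaction using the field names listed above."""
    now = int(time.time()) if now is None else now
    return {
        "id": _fake_id("txn_", rng),
        # Assumption: integer amount in the smallest currency unit (cents).
        "amount": rng.randint(100, 250_000),
        # Assumption: lowercase currency code.
        "currency": "usd",
        # Unix timestamp somewhere in the year before `now`.
        "created": now - rng.randint(0, 365 * 24 * 3600),
        "type": rng.choice(TYPES),
        "status": rng.choice(STATUSES),
        "customer": customer_id,
        # Hypothetical metadata key, for illustration only.
        "metadata": {"order_ref": _fake_id("ord_", rng, length=10)},
    }


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so repeated runs give the same data
    for _ in range(3):
        print(synthetic_transaction(rng, customer_id=_fake_id("cus_", rng)))
```

Seeding the generator keeps the decoy dataset reproducible, which makes it easier to later recognize exactly which records an intruder copied.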

- Fake Customer Records (Consumer Records)

username 
email 
firstname 
lastname 
organisation 
date
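Consumer records with these six fields can be sketched in the same way. The example below is a standard-library approximation. The name pools, the organisation names, the `example.com` email domain, and the reading of `date` as a registration date are assumptions made for illustration only.

```python
import random
from datetime import date, timedelta

# Small illustrative pools; a realistic decoy would draw on far larger lists.
FIRST_NAMES = ["Anna", "David", "Maria", "James", "Elena", "Robert"]
LAST_NAMES = ["Keller", "Novak", "Hughes", "Silva", "Bennett", "Moreau"]
ORGANISATIONS = ["Example Logistics Ltd", "Sample Retail Group", "Demo Health Partners"]


def synthetic_consumer(rng, start=date(2020, 1, 1), span_days=5 * 365):
    """Return one decoy consumer record using the field names listed above."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    username = f"{first[0]}{last}{rng.randint(1, 999)}".lower()
    return {
        "username": username,
        # example.com is a reserved domain, so no real mailbox is referenced.
        "email": f"{username}@example.com",
        "firstname": first,
        "lastname": last,
        "organisation": rng.choice(ORGANISATIONS),
        # Assumption: `date` is a registration date in ISO 8601 format.
        "date": (start + timedelta(days=rng.randrange(span_days))).isoformat(),
    }


if __name__ == "__main__":
    rng = random.Random(2025)  # seeded for reproducible output
    for _ in range(3):
        print(synthetic_consumer(rng))
```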

Combining both datasets mimics a plausible business application involving consumers and their financial transactions, which could be of interest to financially motivated threat actors.

The threat actor fell into our trap and began planning automation to dump the available data. This took some time, and they resumed activity on December 12. It is possible that the threat actor was developing a custom scraper to facilitate data dumping. By then, they were using a large number of residential IP proxies to automate their activity, which helped our DFIR team gather substantial knowledge about their TTPs and the network infrastructure they used. This data is typically called "abuse data": artifacts collected as a result of the threat actor abusing or misusing a specific application or service. When the same actor targets other enterprises, abuse data can serve as Indicators of Compromise (IOCs) for early-stage threat detection. Sharing fresh abuse data can help network defenders hunt more effectively for threat actors operating on the same infrastructure.
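One simple way to turn raw request logs into shareable abuse data is to aggregate them per source address. The sketch below assumes each request has already been reduced to an ISO 8601 timestamp and a source IP. The function name, record layout, and sample addresses (from reserved documentation ranges) are hypothetical.

```python
from datetime import datetime


def build_abuse_records(requests):
    """Aggregate (iso_timestamp, source_ip) pairs into per-IP summaries.

    Each summary holds the first and last time the address was seen and the
    number of requests, sorted with the most active address first.
    """
    records = {}
    for iso_timestamp, source_ip in requests:
        seen = datetime.fromisoformat(iso_timestamp)
        rec = records.setdefault(
            source_ip,
            {"ip": source_ip, "first_seen": seen, "last_seen": seen, "requests": 0},
        )
        rec["first_seen"] = min(rec["first_seen"], seen)
        rec["last_seen"] = max(rec["last_seen"], seen)
        rec["requests"] += 1
    return sorted(records.values(), key=lambda r: r["requests"], reverse=True)


if __name__ == "__main__":
    sample = [
        ("2025-01-01T10:00:00+00:00", "203.0.113.5"),
        ("2025-01-01T10:01:00+00:00", "198.51.100.9"),
        ("2025-01-01T10:05:00+00:00", "203.0.113.5"),
    ]
    for rec in build_abuse_records(sample):
        print(rec["ip"], rec["requests"],
              rec["first_seen"].isoformat(), rec["last_seen"].isoformat())
```

Keeping first-seen and last-seen timestamps alongside each address matters because residential proxy addresses are reassigned frequently, so an indicator without a time window is of limited use to other defenders.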

Between December 12 and December 24, the threat actor made over 188,000 requests attempting to dump synthetic data. During this period, the Resecurity team documented the activity and collaborated with relevant law enforcement authorities and ISPs to share information about it.

Notably, the actor became quite busy and, at some point, disclosed their real IP addresses due to proxy connection failures, an OPSEC failure on their part.

A similar issue occurred during new attempts, leading to another disclosure. In both cases, information about the attacker's hosts was reported to law enforcement.

Observing this activity, our team generated additional synthetic data of a different nature to give the actor more room to maneuver. This led to the disclosure of other important details that confirmed their origin.

Processing a large dataset of synthetic data led the actor to make several OPSEC mistakes, resulting in the identification of the exact servers used for automation, where they were using lists of residential IP proxies to spoof the source.

After collecting a substantial number of the residential proxy addresses, we began blocking them, which limited the actor to a smaller number of possible hosts for proxying traffic. This led to the reappearance of the same IPs identified earlier.
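The reappearance of earlier addresses can be checked with a simple set comparison. The following is a minimal sketch, assuming three lists are available: addresses logged during early reconnaissance, addresses seen after blocking began, and the blocked proxy addresses. The function name and the sample addresses (from reserved documentation ranges) are hypothetical.

```python
import ipaddress


def recurring_hosts(early_ips, later_ips, blocked_proxies):
    """Return addresses seen both before and after proxy blocking.

    Parsing with `ipaddress` normalizes the textual form, so differently
    written IPv6 strings for the same address still compare equal.
    """
    early = {ipaddress.ip_address(ip) for ip in early_ips}
    blocked = {ipaddress.ip_address(ip) for ip in blocked_proxies}
    later = {ipaddress.ip_address(ip) for ip in later_ips} - blocked
    return sorted(str(ip) for ip in early & later)


if __name__ == "__main__":
    early = ["192.0.2.10", "192.0.2.11"]
    later = ["192.0.2.10", "198.51.100.5", "203.0.113.77"]
    blocked = ["198.51.100.5"]
    print(recurring_hosts(early, later, blocked))
```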

Once the actor was located using available network intelligence and timestamps, a foreign law enforcement organization, a partner of Resecurity, issued a subpoena request regarding the threat actor.

The outcome of this operation confirms that cyber deception using synthetic data can be highly effective, not only for threat intelligence gathering but also for investigative tasks. Depending on the jurisdiction, cybersecurity teams should ensure compliance with privacy laws and consult legal counsel before deploying such measures.
