Cybersecurity experts develop a dark web-trained AI
Researchers from the Korea Advanced Institute of Science and Technology (KAIST) and data intelligence organization S2W have introduced DarkBERT, an unprecedented language model specifically trained on data extracted from the dark web.
DarkBERT aims to empower cybersecurity professionals by equipping them with a cutting-edge tool to identify and flag potential threats lurking within the depths of the internet's underbelly.
This AI system is trained on the language used in dark web environments, which is intended to improve how well AI tools understand such content. DarkBERT has the potential to become an invaluable asset for cybersecurity professionals and law enforcement agencies.
Trained on the Tor network
To ensure optimal adaptation to the language prevalent on the dark web, the research team undertook a meticulous process. They extensively crawled the Tor network, constructing a comprehensive database for DarkBERT.
The team implemented deduplication, data filtering, and thorough pre-processing techniques to address ethical concerns associated with dark web content, which often contains sensitive information.
During training, DarkBERT was exposed to two distinct sets of data over a period of 16 days. The pre-processed data was carefully scrubbed, with redactions applied to the identities of victim organizations, details of leaked data, threatening statements, and illicit images.
Notably, a significant portion of the data set, comprising over a thousand pages, was categorized as adult entertainment.
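The deduplication, filtering, and redaction steps described above can be pictured with a small Python sketch. It is only an illustration and not the researchers' actual pipeline: the function names, the minimum-length threshold, the placeholder tokens, and the two regular expressions are all assumptions made for this example.

```python
import hashlib
import re

# Hypothetical patterns: the article does not say what the team masked or how.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
ONION_RE = re.compile(r"\b[a-z2-7]{16,56}\.onion\b")

MIN_CHARS = 20  # assumed threshold for dropping near-empty pages


def redact(text):
    """Replace sensitive identifiers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return ONION_RE.sub("[ONION]", text)


def preprocess(pages):
    """Filter short pages, drop exact duplicates, then redact what remains."""
    seen = set()
    cleaned = []
    for page in pages:
        # Collapse runs of whitespace so trivially different copies match.
        normalized = " ".join(page.split())
        if len(normalized) < MIN_CHARS:
            continue  # data filtering: too little text to be useful
        digest = hashlib.sha256(normalized.lower().encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # deduplication: same text already kept
        seen.add(digest)
        cleaned.append(redact(normalized))
    return cleaned


if __name__ == "__main__":
    sample = [
        "Contact  admin@example.com for the leaked files today",
        "contact admin@example.com for the leaked files today",
        "short",
        "Mirror at abcdefghijklmnop.onion is online again now",
    ]
    for line in preprocess(sample):
        print(line)
```

In this sketch the second sample page is dropped as a duplicate of the first, the third is filtered out for being too short, and the remaining two are kept with the email address and .onion address masked.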
How can a dark web-trained AI benefit cybersecurity?
This milestone development paves the way for further exploration and refinement of dark web-trained AI systems, offering promising avenues to fortify digital defenses against cyber threats. By understanding the methods of threat actors who operate on the dark web, we can keep ourselves safe even in the darkest places on the internet.
You may find further information about DarkBERT here.
Can you access DarkBERT?
While DarkBERT represents a groundbreaking achievement, the release of this AI model to the public is not currently anticipated due to the potential risks associated with dark web materials. However, academic institutions may request access to DarkBERT for research purposes.