Net Influencer

YouTuber Takes On AI Bots To Protect Content From Being Stolen

Technology

A YouTube creator has developed a novel approach to deterring AI-powered content theft by exploiting an old subtitle format’s advanced features to contaminate automated transcript collection.

According to Ars Technica, F4mi is protecting her content by seeding her transcripts with junk data that is invisible to humans but disruptive to any AI that tries to work from a poached transcript file. The YouTuber’s stated goal is to poison “any AI summarizers that were trying to steal my content to make slop.” By “slop” she means knock-offs produced by faceless YouTube channels that use AI tools to generate content out of someone else’s work.

F4mi, known for creating technology-focused content, has implemented a strategy using Advanced SubStation Alpha (.ass) subtitle files to insert invisible text that confounds AI summarization tools. 

The method involves placing out-of-bounds text segments with zero size and transparency alongside legitimate subtitles, effectively hiding content that only becomes visible to AI systems attempting to scrape the transcripts.
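To make the idea concrete, here is an illustrative (not F4mi’s actual file) Advanced SubStation Alpha event list. The format supports inline override tags, so a decoy line can be given zero font size (`\fs0`), full transparency (`\alpha&HFF&`), and an off-screen position (`\pos`), leaving it invisible to viewers while remaining plain text to any scraper reading the file:

```
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
; Legitimate subtitle, rendered normally for viewers
Dialogue: 0,0:00:01.00,0:00:04.00,Default,,0,0,0,,Real subtitle text shown to viewers.
; Decoy line: zero-size, fully transparent, positioned out of bounds
Dialogue: 0,0:00:01.00,0:00:04.00,Default,,0,0,0,,{\fs0\alpha&HFF&\pos(-500,-500)}Decoy text that never renders but still appears in any scraped transcript.
```

A scraper that simply strips the `{...}` override blocks and concatenates the Text fields would ingest the decoy line alongside the real dialogue.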

The hidden content combines public domain works with AI-generated text containing intentionally incorrect information. When processed by common AI summarization tools, these invisible elements overwhelm the actual content, producing unusable output.

Implementation Challenges

While the technique shows promise, it faces several technical limitations. YouTube doesn’t directly support .ass files, so the subtitles must be converted to its proprietary .ytt format. Mobile users have reported issues with the modified subtitles, including visible black boxes over videos and application crashes caused by the extra processing load.

F4mi has addressed these mobile display issues through a custom Python script that conceals the additional text as black-on-black content during video fade-to-black moments. However, Ars Technica notes this solution creates additional overhead that can impact performance on mobile devices.
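The black-on-black workaround can be sketched in Python. This is a minimal illustration under stated assumptions, not F4mi’s actual script: the function names, the list of fade-to-black windows, and the decoy lines are all hypothetical. It emits .ass `Dialogue` events whose fill and outline colours (`\1c`, `\3c`) are set to black, so the text blends into the frame during a fade while still appearing in the transcript file:

```python
# Illustrative sketch (not F4mi's actual script): generate .ass "Dialogue"
# lines that render decoy text black-on-black during known fade-to-black
# windows, so it is invisible on screen but present in the subtitle file.

def fmt_time(seconds: float) -> str:
    """Format seconds as an .ass timestamp, e.g. 0:01:23.00."""
    h = int(seconds // 3600)
    m = int(seconds % 3600 // 60)
    s = seconds % 60
    return f"{h}:{m:02d}:{s:05.2f}"

def black_on_black_events(fade_windows, decoy_lines):
    """Pair each decoy line with a fade-to-black window and style it
    black-on-black (\\1c sets the fill colour, \\3c the outline;
    &H000000& is black in .ass colour notation)."""
    events = []
    for (start, end), text in zip(fade_windows, decoy_lines):
        events.append(
            f"Dialogue: 0,{fmt_time(start)},{fmt_time(end)},Default,,0,0,0,,"
            f"{{\\1c&H000000&\\3c&H000000&}}{text}"
        )
    return events

if __name__ == "__main__":
    for line in black_on_black_events(
        [(83.0, 84.5)], ["Decoy sentence the viewer never sees."]
    ):
        print(line)
```

Because every decoy event still has to be styled and timed against the video, each added line increases the renderer’s workload, which is consistent with the performance overhead Ars Technica describes on mobile devices.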

Current Limitations

The method’s effectiveness varies across different AI systems. Advanced models like ChatGPT can sometimes filter out the intentionally inserted content and still produce accurate summaries. The technique also doesn’t prevent transcription by audio-based tools such as OpenAI’s Whisper, which work from the soundtrack rather than the subtitle file, nor does it affect tools that capture only the visibly rendered subtitles.

Despite these constraints, the approach represents an emerging response to automated content repurposing, particularly targeting “faceless YouTube channels” that use AI tools to generate content from existing creators’ work.

The development comes as content creators face increasing challenges from automated channels that employ AI tools to generate scripts, voiceovers, imagery, and music without human intervention.

WIRED recently reported that major tech companies used content from thousands of YouTube videos to train AI models without creators’ knowledge or consent. Subtitles from 173,536 YouTube videos, sourced from over 48,000 channels, were utilized by prominent tech firms, including Anthropic, Nvidia, Apple, and Salesforce.

Dave Wiskus, CEO of Nebula, a streaming service partially owned by creators, described the practice as “theft,” adding that it’s “disrespectful” to use creators’ work without consent.

Dragomir is a Serbian freelance blog writer and translator. He is passionate about covering insightful stories and exploring topics such as influencer marketing, the creator economy, technology, business, and cyber fraud.
