<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Automation on DiyMediaServer</title><link>https://diymediaserver.com/tags/automation/</link><description>Recent content in Automation on DiyMediaServer</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Wed, 10 Jun 2026 10:34:53 -0600</lastBuildDate><atom:link href="https://diymediaserver.com/tags/automation/index.xml" rel="self" type="application/rss+xml"/><item><title>Automated MKV Cleanup With mkvmerge, SABnzbd &amp; Cron</title><link>https://diymediaserver.com/post/2026/automated-mkv-cleanup-mkvmerge-sabnzbd-cron/</link><pubDate>Wed, 10 Jun 2026 06:19:48 -0600</pubDate><guid>https://diymediaserver.com/post/2026/automated-mkv-cleanup-mkvmerge-sabnzbd-cron/</guid><description>&lt;img src="https://diymediaserver.com/post/2026/automated-mkv-cleanup-mkvmerge-sabnzbd-cron/featured.png" alt="Featured image of post Automated MKV Cleanup With mkvmerge, SABnzbd &amp; Cron" />&lt;p>Every media library eventually has this one issue in common. You open a movie in Jellyfin, it starts playing in a French dub you didn&amp;rsquo;t ask for, and you spend twenty seconds digging through the audio menu while the family gives you the side-eye. Multiply that by a library full of releases stuffed with five audio dubs, a commentary track, and a dozen subtitle tracks, and you have a real problem. My library had grown to roughly 2,200 MKV files, and each one was carrying around junk it did not need.&lt;/p>
&lt;p>The good news? This is automatable, and the right tool does it losslessly in seconds rather than hours. The bad news, which I learned the hard way, is that the obvious approach quietly destroys foreign films and anime. This post walks through how I built an automated MKV cleanup tool with mkvmerge, including every mistake I made before it actually worked. The whole thing is now open source: two self-contained Python scripts in the &lt;a class="link" href="https://github.com/KryptikWurm/mkv-track-stripper" target="_blank" rel="noopener"
>mkv-track-stripper repo on GitHub&lt;/a>, MIT licensed, ready to drop into your own setup. You&amp;rsquo;ll get the script logic, the SABnzbd hook, the cron sweep, and the reasoning behind each safety decision. The mistakes are where the real lessons live.&lt;/p>
&lt;div class="alert alert-tldr">
&lt;span class="alert-icon">💭&lt;/span>
&lt;div class="alert-content">
&lt;strong>TL;DR:&lt;/strong>
Use mkvmerge to losslessly remux MKVs and strip unwanted audio and subtitle tracks, then set it up at two points: a Python SABnzbd post-processing hook (&lt;code>mkv_strip_pp.py&lt;/code>) cleans new downloads before Radarr and/or Sonarr imports them, and a Python library sweep (&lt;code>mkvclean.py&lt;/code>) handles the back catalog and can run as a cron job as an unattended catch-all. The mkvclean script tracks processed files in a checkpoint file so nothing gets processed twice, every remux is verified before the original is replaced, and forced subtitles always survive. Both scripts live in the &lt;a class="link" href="https://github.com/KryptikWurm/mkv-track-stripper" target="_blank" rel="noopener"
>mkv-track-stripper&lt;/a> repo.
&lt;/div>
&lt;/div>
&lt;aside class="tested-on" aria-label="Tested configuration">
&lt;div class="tested-on-label">Tested on&lt;/div>
&lt;dl class="tested-on-list">&lt;dt>Debian&lt;/dt>&lt;dd>13 (Trixie)&lt;/dd>&lt;dt>Docker&lt;/dt>&lt;dd>29.5.2&lt;/dd>&lt;dt>Jellyfin&lt;/dt>&lt;dd>10.11.10&lt;/dd>&lt;dt>Radarr&lt;/dt>&lt;dd>6.11&lt;/dd>&lt;dt>SABnzbd&lt;/dt>&lt;dd>5.0.3&lt;/dd>&lt;dt>Mkvtoolnix&lt;/dt>&lt;dd>92.0 &amp;amp; 96.0&lt;/dd>&lt;dt>Python&lt;/dt>&lt;dd>3.13.5&lt;/dd>&lt;dt>Date&lt;/dt>&lt;dd>2026-06-10&lt;/dd>&lt;/dl>
&lt;/aside>
&lt;p>I ran this entire pipeline on Debian 13 with MKVToolNix 92.0 &amp;amp; 96.0, SABnzbd 5.0.3, and Radarr 6.11, validating the output in Jellyfin 10.11.10 before letting it loose on the full library. Everything in this post is a real lesson learned, scars included.&lt;/p>
&lt;h2 id="why-mkvmerge-beats-ffmpeg-for-stripping-tracks">Why mkvmerge Beats ffmpeg for Stripping Tracks
&lt;/h2>&lt;p>For anything video-related, most people reach for ffmpeg or Tdarr first. I did too. They&amp;rsquo;re capable tools, but for stripping tracks out of MKVs at scale, mkvmerge from the MKVToolNix suite is the better fit. The reason comes down to what each tool is built for.&lt;/p>
&lt;p>mkvmerge is a lossless remuxer, not a transcoder. It copies streams as-is into a new MKV while letting you drop, reorder, or relabel tracks, with no re-encoding involved. The operation is fast and there&amp;rsquo;s zero quality loss. ffmpeg is designed around decode and encode processes. Copying a subset of tracks usually means more verbose &lt;code>-map&lt;/code> expressions, and some Matroska features don&amp;rsquo;t copy 1:1 without surprises.&lt;/p>
&lt;p>With mkvmerge you can keep or drop tracks by ID or by language with flags like &lt;code>--audio-tracks&lt;/code>, &lt;code>--subtitle-tracks&lt;/code>, and &lt;code>--language&lt;/code>, and you set default and forced flags per track. MKVToolNix is dedicated to Matroska. It handles chapters, tags, editions, and segment linking cleanly. ffmpeg&amp;rsquo;s Matroska muxer is good but not tuned for every edge case, and users report occasional player quirks with ffmpeg-muxed MKVs on ordered chapters and dual-subtitle anime.&lt;/p>
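&lt;p>To make those flags concrete, here is a minimal sketch of how a keep-list of track IDs could be turned into an mkvmerge argument list. The helper name &lt;code>build_strip_command&lt;/code> and its parameters are hypothetical and not taken from the repo; only &lt;code>-o&lt;/code>, &lt;code>--audio-tracks&lt;/code>, &lt;code>--subtitle-tracks&lt;/code>, and &lt;code>--no-subtitles&lt;/code> are real mkvmerge options.&lt;/p>

```python
def build_strip_command(src, dst, keep_audio, keep_subs):
    """Build an mkvmerge argument list that copies only the given track IDs.

    keep_audio and keep_subs are lists of track IDs as reported by
    `mkvmerge -J`. Hypothetical helper for illustration only.
    """
    if not keep_audio:
        # Refuse to build a command that would leave the file silent.
        raise ValueError("at least one audio track must be kept")

    cmd = ["mkvmerge", "-o", str(dst)]
    # --audio-tracks takes a comma-separated list of track IDs to copy.
    cmd += ["--audio-tracks", ",".join(str(t) for t in keep_audio)]
    if keep_subs:
        cmd += ["--subtitle-tracks", ",".join(str(t) for t in keep_subs)]
    else:
        # With nothing to keep, drop every subtitle track explicitly.
        cmd += ["--no-subtitles"]
    cmd.append(str(src))
    return cmd


if __name__ == "__main__":
    print(build_strip_command("movie.mkv", "movie.tmp.mkv", [1], [3, 4]))
```

&lt;p>Passing the command as a list rather than a shell string means filenames with spaces or brackets need no quoting.&lt;/p>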
&lt;h3 id="i-started-with-tdarr-and-it-was-the-wrong-tool">I Started With Tdarr, and It Was the Wrong Tool
&lt;/h3>&lt;p>Before I discovered MKVToolNix, I reached for Tdarr. It&amp;rsquo;s the tool everyone points you at: a slick web UI, a node-and-worker architecture, and a plugin library that promises hands-off library maintenance. I spent an evening standing up the server, attaching a node, and wiring a flow to drop the tracks I didn&amp;rsquo;t want.&lt;/p>
&lt;p>It was the wrong tool for two reasons. First, Tdarr is built around transcoding, and its flows kept nudging me toward re-encoding the video to &amp;ldquo;process&amp;rdquo; a file. I didn&amp;rsquo;t want a new encode. I wanted the exact same video and one or two audio tracks copied into a fresh container, untouched. Running a lossy, hours-long transcode only to delete a subtitle track is the opposite of what this job needs.&lt;/p>
&lt;p>Second, the whole tool was wildly oversized for the task. A server, a worker node, a database, and a huge list of plugins is a lot of moving parts to own and debug when the actual operation is &amp;ldquo;remove these track IDs.&amp;rdquo; Tdarr is genuinely good at what it&amp;rsquo;s &lt;em>for&lt;/em> - bulk transcoding and health-checking a library on dedicated hardware - but for lossless track stripping it buried a one-line &lt;code>mkvmerge&lt;/code> call under a stack of infrastructure I&amp;rsquo;d have to maintain forever. I tore it down and went back to a script.&lt;/p>
&lt;h3 id="inspect-before-you-touch-anything">Inspect Before You Touch Anything
&lt;/h3>&lt;p>Before automating, you need to see what&amp;rsquo;s actually inside a file. Two commands do everything:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">mkvmerge -i movie.mkv
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This lists tracks with their IDs, types, languages, and names in human-readable form. For scripting, you want the JSON version:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">mkvmerge -J movie.mkv
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This is a machine-readable description you can parse safely. The fields that matter are &lt;code>tracks[].id&lt;/code>, &lt;code>tracks[].type&lt;/code>, &lt;code>tracks[].properties.language&lt;/code>, and &lt;code>tracks[].properties.track_name&lt;/code>. When a release is genuinely strange (ordered chapters, segment linking), &lt;code>mkvinfo movie.mkv&lt;/code> gives even deeper details for debugging.&lt;/p>
&lt;p>One thing I had to learn early on: &lt;code>mkvmerge -J&lt;/code> exits with code &lt;code>1&lt;/code> when it has warnings about a file, but it still emits perfectly valid JSON. An early version of my script treated that as a probe failure and skipped the file. Only exit codes of 2 or higher mean the probe actually failed.&lt;/p>
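&lt;p>A minimal sketch of a probe that follows this rule might look like the following. The function names &lt;code>parse_probe&lt;/code> and &lt;code>probe&lt;/code> are my own for illustration, not the ones in the repo, and the sketch assumes &lt;code>mkvmerge&lt;/code> is on the &lt;code>PATH&lt;/code>.&lt;/p>

```python
import json
import subprocess


def parse_probe(returncode, stdout):
    """Return the parsed JSON for exit codes 0 and 1, or None on real failure."""
    # 0 = success, 1 = success with warnings; anything else is a failed probe.
    if returncode not in (0, 1):
        return None
    try:
        return json.loads(stdout)
    except json.JSONDecodeError:
        return None


def probe(path):
    """Run `mkvmerge -J` on path and return its description, or None."""
    result = subprocess.run(
        ["mkvmerge", "-J", str(path)], capture_output=True, text=True
    )
    return parse_probe(result.returncode, result.stdout)
```

&lt;p>Splitting the parsing from the subprocess call keeps the exit-code rule testable without a real MKV on disk.&lt;/p>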
&lt;h2 id="the-first-script-and-the-trap-hiding-in-it">The First Script, and the Trap Hiding in It
&lt;/h2>&lt;p>My first script was embarrassingly simple: keep only English audio and English subtitles, drop everything else. It looked completely reasonable. It ran fine against a folder of Hollywood blockbusters. Then I pointed it at my anime folder.&lt;/p>
&lt;p>Every single file came out silent.&lt;/p>
&lt;p>The script had stripped the only audio track in each file because the language tag said &lt;code>jpn&lt;/code>, not &lt;code>eng&lt;/code>. There was no error and no warning. Just dead silence at movie night while everyone stared at me. That moment is when I stopped writing naive language filters and started reasoning from what tracks are actually present, rather than what I assumed should be there.&lt;/p>
&lt;p>The fix is conditional logic, and it&amp;rsquo;s conservative on purpose. Breaking one film is worse than failing to clean five.&lt;/p>
&lt;ul>
&lt;li>If a file has a single audio track, keep it no matter the language. A movie with one audio stream is never a candidate for audio removal.&lt;/li>
&lt;li>If a file has multiple audio tracks but none match your preferred languages, keep them all and log a notice rather than risk going silent.&lt;/li>
&lt;li>Never drop all subtitle tracks when the audio is in a language you don&amp;rsquo;t understand.&lt;/li>
&lt;/ul>
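&lt;p>The first two rules can be sketched as a small selection function over the &lt;code>tracks&lt;/code> array from &lt;code>mkvmerge -J&lt;/code>. This is an illustration under my own assumptions (the name &lt;code>select_audio&lt;/code> and the &lt;code>preferred&lt;/code> parameter are hypothetical), and it leaves the subtitle rule out entirely.&lt;/p>

```python
def select_audio(tracks, preferred=("eng",)):
    """Return the IDs of audio tracks to keep, erring on the side of keeping."""
    audio = [t for t in tracks if t.get("type") == "audio"]

    # Rule 1: a single audio track is kept no matter what its language is.
    if len(audio) == 1:
        return [audio[0]["id"]]

    matches = [
        t["id"]
        for t in audio
        if t.get("properties", {}).get("language") in preferred
    ]

    # Rule 2: no preferred language present, so keep everything and say so.
    if not matches:
        print("notice: no preferred audio language found, keeping all tracks")
        return [t["id"] for t in audio]

    return matches
```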
&lt;p>There&amp;rsquo;s no reliable original-language flag to lean on. Real-world files have audio mistagged as &lt;code>und&lt;/code>, fansubs with multiple &lt;code>eng&lt;/code> subtitle tracks (full subs versus signs-and-songs), and scene releases that label things inconsistently.&lt;/p>
&lt;p>That &lt;code>und&lt;/code> (undetermined) tag deserves its own rule, because it&amp;rsquo;s a coin flip: it might be the English track a sloppy release group forgot to label, or it might be a dub you&amp;rsquo;ll never play. The selection logic in both scripts treats &lt;code>und&lt;/code> as a preferred language &lt;em>only until a real English track shows up&lt;/em>. If the file has properly tagged English audio, &lt;code>und&lt;/code> tracks are no longer given the benefit of the doubt:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">has_eng_audio&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="nb">any&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">t&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;properties&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="p">{})&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;language&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">==&lt;/span> &lt;span class="s2">&amp;#34;eng&amp;#34;&lt;/span> &lt;span class="k">for&lt;/span> &lt;span class="n">t&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">audio&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">target_audio_langs&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="nb">list&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">audio_langs&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">if&lt;/span> &lt;span class="n">has_eng_audio&lt;/span> &lt;span class="ow">and&lt;/span> &lt;span class="s2">&amp;#34;und&amp;#34;&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">target_audio_langs&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">target_audio_langs&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">remove&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;und&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>
&lt;div class="alert alert-warning">
&lt;span class="alert-icon">⚠️&lt;/span>
&lt;div class="alert-content">
&lt;strong>Warning:&lt;/strong>
&lt;strong>Lessons learned:&lt;/strong> Filtering blindly by language is the single most dangerous thing you can do to a media library. A silent movie is a broken movie. Always handle the single-track, no-preferred-match, and &lt;code>und&lt;/code> cases before you let any removal logic run.
&lt;/div>
&lt;/div>
&lt;h2 id="safety-as-a-first-principle">Safety as a First Principle
&lt;/h2>&lt;p>Once the language logic was sane, the next set of mistakes was about how I handled the files themselves. Three rules emerged, and all of them are non-negotiable.&lt;/p>
&lt;p>&lt;strong>Write a new file, verify it, then and only then replace the source.&lt;/strong> Both scripts remux to a hidden temp file (via &lt;code>tempfile.mkstemp&lt;/code>) in the same directory as the original, then call &lt;code>os.replace&lt;/code> to swap it atomically onto the original path. Same-directory output is deliberate twice over: a rename within one filesystem is atomic, so Radarr, Sonarr or Jellyfin never catches a half-written file mid-swap, and it sidesteps cross-device link errors when your storage is a MergerFS or ZFS pool. Before any of that, a pre-flight check confirms the directory has at least 105% of the original&amp;rsquo;s size free, so a full disk can&amp;rsquo;t produce a truncated output.&lt;/p>
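&lt;p>As a rough sketch of that sequence (headroom check, hidden temp file in the same directory, atomic swap), assuming a hypothetical &lt;code>write_output&lt;/code> callable that stands in for the remux and verification steps:&lt;/p>

```python
import os
import shutil
import tempfile


def has_headroom(path, factor=1.05):
    """True when the directory holding path has factor x its size free."""
    needed = os.path.getsize(path) * factor
    free = shutil.disk_usage(os.path.dirname(os.path.abspath(path))).free
    return free >= needed


def replace_via_temp(path, write_output):
    """Write to a hidden temp file beside path, then swap it in atomically.

    write_output is a hypothetical callable: it receives the temp path and
    returns True only when the new file is complete and verified.
    """
    directory = os.path.dirname(os.path.abspath(path))
    if not has_headroom(path):
        return False

    # Same directory means same filesystem, so os.replace is a plain rename.
    fd, tmp = tempfile.mkstemp(prefix=".", suffix=".mkv", dir=directory)
    os.close(fd)
    try:
        if not write_output(tmp):
            return False
        os.replace(tmp, path)
        return True
    finally:
        # After a successful swap the temp path is gone; otherwise clean up.
        if os.path.exists(tmp):
            os.remove(tmp)
```

&lt;p>The &lt;code>finally&lt;/code> block is the part that keeps failed runs from littering the library with hidden temp files.&lt;/p>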
&lt;p>&lt;strong>Verify means actually verify.&lt;/strong> A non-empty output file and a happy exit code aren&amp;rsquo;t proof of a good remux. Before the swap, &lt;code>verify_remux()&lt;/code> re-probes the temp file with &lt;code>mkvmerge -J&lt;/code> and confirms it still contains a video track and &lt;em>exactly&lt;/em> the audio and subtitle track counts that were requested. If the re-probe fails or the numbers don&amp;rsquo;t add up, the original stays put and the error is logged. This is the guard against a truncated-but-nonzero output silently overwriting a perfectly good source.&lt;/p>
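&lt;p>The counting part of that check can be sketched as a pure function over the re-probe result. This is not the repo&amp;rsquo;s &lt;code>verify_remux()&lt;/code>; the name &lt;code>counts_match&lt;/code> and its parameters are assumptions, and it expects the already-parsed output of &lt;code>mkvmerge -J&lt;/code> (or &lt;code>None&lt;/code> if the re-probe failed).&lt;/p>

```python
from collections import Counter


def counts_match(info, want_audio, want_subs):
    """Check a parsed `mkvmerge -J` result against the requested track counts."""
    if not info:
        # A failed re-probe is treated as a failed verification.
        return False
    kinds = Counter(t.get("type") for t in info.get("tracks", []))
    return (
        kinds["video"] >= 1
        and kinds["audio"] == want_audio
        and kinds["subtitles"] == want_subs
    )
```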
&lt;p>&lt;strong>Read exit codes correctly.&lt;/strong> This one trips people up (including me). mkvmerge does not use the standard &amp;ldquo;0 good, anything else bad&amp;rdquo; convention. It returns &lt;code>0&lt;/code> on full success, &lt;code>1&lt;/code> on success with warnings, and &lt;code>2&lt;/code> for a genuine error. A warning often means a track had a minor inconsistency that mkvmerge handled fine, so the scripts accept both:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">result&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">subprocess&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">run&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">cmd&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">capture_output&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="kc">True&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">text&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="kc">True&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">ok&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">result&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">returncode&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="mi">1&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="ow">and&lt;/span> &lt;span class="n">os&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">path&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">exists&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">tmp&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="ow">and&lt;/span> &lt;span class="n">os&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">path&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">getsize&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">tmp&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">&amp;gt;&lt;/span> &lt;span class="mi">0&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Only exit code 2, a missing or zero-byte output file, or a failed verification re-probe means stop and leave the original intact.&lt;/p>
&lt;p>There&amp;rsquo;s one more thing the swap has to get right that I didn&amp;rsquo;t anticipate: the cleaned file is a &lt;em>new&lt;/em> file, so it arrives owned by whoever ran the script, with fresh permissions and timestamps. On a library shared between Radarr, Sonarr, Jellyfin, and an NFS export, that&amp;rsquo;s a quiet way to break things days later. After a successful remux, &lt;code>preserve_metadata()&lt;/code> copies the original&amp;rsquo;s ownership, mode, timestamps, extended attributes, and POSIX ACLs onto the cleaned file before the swap. If ownership can&amp;rsquo;t be restored (you&amp;rsquo;re not running as root, or you&amp;rsquo;re running as a user who doesn&amp;rsquo;t own the files), it logs a warning and keeps going rather than dying - the cleanup still worked, the file just has a new owner.&lt;/p>
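&lt;p>A simplified sketch of the idea, using only the standard library, could look like this. It is not the repo&amp;rsquo;s &lt;code>preserve_metadata()&lt;/code>: the name &lt;code>copy_file_metadata&lt;/code> is hypothetical, and explicit POSIX ACL handling is left out.&lt;/p>

```python
import os
import shutil


def copy_file_metadata(original, cleaned):
    """Copy ownership, mode and timestamps from original onto cleaned.

    Returns False when ownership could not be restored; the caller can
    treat that as a warning rather than a failure.
    """
    st = os.stat(original)
    owner_restored = True
    try:
        os.chown(cleaned, st.st_uid, st.st_gid)
    except PermissionError:
        print(f"warning: could not restore ownership on {cleaned}")
        owner_restored = False

    # copystat copies permission bits and timestamps, and on Linux also
    # extended attributes where possible.
    shutil.copystat(original, cleaned)
    return owner_restored
```

&lt;p>Changing ownership first and copying the mode afterwards avoids the permission bits being altered by the ownership change.&lt;/p>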
&lt;p>&lt;strong>The subtitle judgment call.&lt;/strong> Subtitles get a more aggressive policy than audio, and I want to be honest that this is a choice rather than an obvious truth. A missing subtitle doesn&amp;rsquo;t break playback the way missing audio does - worst case, you re-add a sub later. So: keep your preferred languages, drop the rest, and drop SDH/hearing-impaired tracks along the way. But there&amp;rsquo;s one exception that&amp;rsquo;s absolute. Forced subtitles - the ones that translate the single line of Elvish in an otherwise English film - are always kept, no matter what language they&amp;rsquo;re tagged as:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">is_forced&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">props&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;forced_track&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="ow">or&lt;/span> &lt;span class="s2">&amp;#34;forced&amp;#34;&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">track_name&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">is_sdh&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">props&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;flag_hearing_impaired&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="ow">or&lt;/span> &lt;span class="s2">&amp;#34;sdh&amp;#34;&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">track_name&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">if&lt;/span> &lt;span class="n">is_forced&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">keep_subs&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">append&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">t&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;id&amp;#34;&lt;/span>&lt;span class="p">])&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">elif&lt;/span> &lt;span class="n">lang&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">sub_langs&lt;/span> &lt;span class="ow">and&lt;/span> &lt;span class="ow">not&lt;/span> &lt;span class="n">is_sdh&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">keep_subs&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">append&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">t&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;id&amp;#34;&lt;/span>&lt;span class="p">])&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Note the fallback to a &lt;code>&amp;quot;forced&amp;quot;&lt;/code> substring in the track name. An earlier version trusted the Matroska flag alone, and plenty of real-world releases name the track &amp;ldquo;Forced&amp;rdquo; without ever setting the flag. If you rely on SDH subtitles, add them back by removing the &lt;code>is_sdh&lt;/code> check or adjusting the config to your needs.&lt;/p>
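&lt;p>For anyone who wants to try the policy in isolation, here is a self-contained version of the same idea. It assumes the track name is lowercased before matching, and the function name &lt;code>select_subtitles&lt;/code> and the &lt;code>keep_sdh&lt;/code> switch are hypothetical additions for illustration.&lt;/p>

```python
def select_subtitles(tracks, sub_langs=("eng",), keep_sdh=False):
    """Return the IDs of subtitle tracks to keep from a `mkvmerge -J` track list."""
    keep = []
    for t in tracks:
        if t.get("type") != "subtitles":
            continue
        props = t.get("properties", {})
        name = (props.get("track_name") or "").lower()
        lang = props.get("language")

        # Trust the Matroska flags first, then fall back to the track name.
        is_forced = bool(props.get("forced_track")) or "forced" in name
        is_sdh = bool(props.get("flag_hearing_impaired")) or "sdh" in name

        if is_forced:
            keep.append(t["id"])  # forced subs survive in any language
        elif lang in sub_langs and (keep_sdh or not is_sdh):
            keep.append(t["id"])
    return keep
```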
&lt;h2 id="the-bash-era-and-why-it-didnt-survive">The Bash Era, and Why It Didn&amp;rsquo;t Survive
&lt;/h2>&lt;p>The first working version of all this was a pair of Bash scripts gluing &lt;code>mkvmerge -J&lt;/code> to &lt;code>jq&lt;/code>, and writing them taught me lessons about problems that anyone scripting against a real library will hit.&lt;/p>
&lt;p>Filenames with spaces, apostrophes, and brackets will quickly break a naive loop. A movie called &lt;code>Amelie (2001) [1080p].mkv&lt;/code> breaks &lt;code>for f in $(find ...)&lt;/code> instantly, because the shell splits on whitespace. The fix is NUL-safe handling: &lt;code>find ... -print0&lt;/code> piped into &lt;code>IFS= read -r -d ''&lt;/code>. Then the progress counter I added always reported zero at the end, because piping &lt;code>find&lt;/code> into a &lt;code>while&lt;/code> loop runs the loop in a subshell where incremented variables die on exit - the fix is process substitution, &lt;code>done &amp;lt; &amp;lt;(find ...)&lt;/code>. And to make re-runs skip finished files, the Bash version renamed every cleaned file to &lt;code>movie.cleaned.mkv&lt;/code> and excluded that pattern from the next &lt;code>find&lt;/code>. The name was the marker.&lt;/p>
&lt;p>It all worked. I still retired the whole thing, for three reasons:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>The marker rename was the wrong kind of clever.&lt;/strong> Renaming every file in the library means renaming it out from under Radarr, Sonarr, and Jellyfin, which track files by path.&lt;/li>
&lt;li>&lt;strong>The logic outgrew Bash.&lt;/strong> Once I wanted forced-subtitle detection, junk-track flags, metadata edits, and output verification, the &lt;code>jq&lt;/code> one-liners turned into the hardest-to-read part of the system. The same selection logic in Python is named functions you can actually reason about.&lt;/li>
&lt;li>&lt;strong>Two scripts, one brain.&lt;/strong> The cron sweep and the bulk-pass script were 90% copy-paste of each other, drifting apart with every fix. The Python rewrite collapsed them into one script, &lt;code>mkvclean.py&lt;/code>, that handles both jobs.&lt;/li>
&lt;/ol>
&lt;p>The shell lessons still stand - NUL-safe loops and subshell scoping will bite you in any Bash project. But the &lt;a class="link" href="https://github.com/KryptikWurm/mkv-track-stripper" target="_blank" rel="noopener"
>shipped versions of these tools&lt;/a> are Python 3.10+, end to end, with no &lt;code>jq&lt;/code> dependency at all.&lt;/p>
&lt;div class="alert alert-warning">
&lt;span class="alert-icon">⚠️&lt;/span>
&lt;div class="alert-content">
&lt;strong>Warning:&lt;/strong>
&lt;strong>Lessons learned:&lt;/strong> Write the quick Bash version to learn the problem, but notice the moment the logic outgrows it. For me that moment was the third nested &lt;code>jq&lt;/code> expression. Every safety feature that now guards my library - verification, forced-sub detection, metadata healing - would have been miserable to bolt onto the Bash scripts and was straightforward in Python.
&lt;/div>
&lt;/div>
&lt;h2 id="the-checkpoint-that-makes-the-whole-thing-self-repeating">The Checkpoint That Makes the Whole Thing Self-repeating
&lt;/h2>&lt;p>Here&amp;rsquo;s something worth calling out: mkvmerge doesn&amp;rsquo;t skip files it already cleaned. The tool has no memory whatsoever. It will happily re-remux a file you cleaned yesterday, every single run. Your script has to do all the skipping.&lt;/p>
&lt;p>&lt;code>mkvclean.py&lt;/code> solves this with a checkpoint file instead of the old rename-marker. Every successfully handled file is appended to &lt;code>~/.mkvclean_checkpoint&lt;/code> as a JSON line recording its path, modification time, and size. On the next run, any file whose path, mtime, and size all match its checkpoint entry is skipped instantly - no probe, no remux, nothing. Filenames never change, so Radarr, Sonarr, and Jellyfin are none the wiser.&lt;/p>
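&lt;p>The mechanism can be sketched in a few lines. The field names (&lt;code>path&lt;/code>, &lt;code>mtime&lt;/code>, &lt;code>size&lt;/code>) and the function names below are my assumptions about what a JSON-lines checkpoint could look like, not the repo&amp;rsquo;s actual format.&lt;/p>

```python
import json
import os


def load_checkpoint(checkpoint_path):
    """Map each recorded path to its (mtime, size); later lines win."""
    seen = {}
    if not os.path.exists(checkpoint_path):
        return seen
    with open(checkpoint_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            seen[entry["path"]] = (entry["mtime"], entry["size"])
    return seen


def already_done(seen, path):
    """True when path, mtime and size all match the checkpoint entry."""
    st = os.stat(path)
    return seen.get(path) == (st.st_mtime, st.st_size)


def record_done(checkpoint_path, path):
    """Append one JSON line describing the file as it exists right now."""
    st = os.stat(path)
    entry = {"path": path, "mtime": st.st_mtime, "size": st.st_size}
    with open(checkpoint_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
```

&lt;p>Recording the file after the swap, not before, matters here: the entry has to describe the cleaned file, or the next run would see a mismatch and process it again.&lt;/p>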
&lt;p>The mtime-and-size part does something the rename-marker never could: it detects upgrades. When Radarr replaces a movie with a better release, the new file has a new size and mtime, the checkpoint entry no longer matches, and the sweep cleans the new file automatically on its next pass.&lt;/p>
&lt;p>The checkpoint is append-only during a run, so it compacts itself at startup - duplicate entries collapse to one line per file and entries for files that no longer exist under the scanned root are pruned. Combined with &lt;code>fcntl&lt;/code> file locking (a manual run and a cron run can physically never overlap) and an &lt;code>os.scandir&lt;/code>-based traversal that pulls file metadata during the walk instead of &lt;code>stat&lt;/code>-ing everything twice, the sweep stays fast and safe on a multi-terabyte array.&lt;/p>
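&lt;p>The locking piece is small enough to sketch on its own. This version assumes &lt;code>fcntl.flock&lt;/code> with a non-blocking exclusive lock on a dedicated lock file; the post only says &lt;code>fcntl&lt;/code> locking is used, so the exact call and the name &lt;code>acquire_lock&lt;/code> are assumptions. It is Linux/Unix only.&lt;/p>

```python
import fcntl


def acquire_lock(lock_path):
    """Return an open file holding an exclusive lock, or None if already held.

    The lock lasts until the returned file object is closed or the
    process exits, so the caller must keep a reference to it.
    """
    fh = open(lock_path, "w")
    try:
        # LOCK_NB makes this fail immediately instead of waiting.
        fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        fh.close()
        return None
    return fh
```

&lt;p>A run that gets &lt;code>None&lt;/code> back can simply log that another sweep is active and exit, which is what keeps a manual run and a cron run from colliding.&lt;/p>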
&lt;p>The &lt;code>--batch&lt;/code> flag is the right way to start cautiously and build confidence before committing to the whole library:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">./mkvclean.py /media/Storage/Movies --batch &lt;span class="m">20&lt;/span> &lt;span class="c1"># first run: 20 files only&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">./mkvclean.py /media/Storage/Movies --batch &lt;span class="m">50&lt;/span> &lt;span class="c1"># next run: 50 more (checkpointed files are skipped)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">./mkvclean.py /media/Storage/Movies --batch &lt;span class="m">0&lt;/span> &lt;span class="c1"># 0 = no limit: finish everything remaining&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Stop at any point with Ctrl+C - the script traps the signal, logs that it was interrupted, and removes the temp file it was building. Because of the checkpoint, re-running always picks up exactly where you left off.&lt;/p>
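&lt;p>The batch limit and the clean stop can be sketched together. This is an illustration only: &lt;code>run_batch&lt;/code> and &lt;code>process&lt;/code> are hypothetical names, the sketch catches &lt;code>KeyboardInterrupt&lt;/code> rather than installing a signal handler, and it assumes &lt;code>process&lt;/code> cleans up its own temp file.&lt;/p>

```python
def run_batch(files, process, batch=0):
    """Process up to batch files (0 means no limit); stop cleanly on Ctrl+C.

    files is any iterable of paths that still need work; process is a
    hypothetical callable that handles one file. Returns the number of
    files completed.
    """
    done = 0
    try:
        for path in files:
            if batch and done >= batch:
                break
            process(path)
            done += 1
    except KeyboardInterrupt:
        # Completed files are already checkpointed, so a re-run resumes here.
        print(f"interrupted after {done} file(s); re-run to resume")
    return done
```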
&lt;p>The SABnzbd hook doesn&amp;rsquo;t need a checkpoint at all. It runs once on each download&amp;rsquo;s folder right after the download completes, and it skips the remux entirely when there&amp;rsquo;s nothing to remove. A &lt;code>DRY_RUN&lt;/code> constant at the top (and a &lt;code>--dry-run&lt;/code> flag on the sweep) lets you see exactly what would happen before anything touches real files.&lt;/p>
&lt;p>While you&amp;rsquo;re still learning your own language rules, run in small batches and spot-check a handful of files in Jellyfin before continuing.&lt;/p>
&lt;h2 id="what-gets-stripped-beyond-languages">What Gets Stripped Beyond Languages
&lt;/h2>&lt;p>Language filtering was the original goal, but once the scripts were reading every track&amp;rsquo;s properties anyway, a second category of junk became impossible to ignore.&lt;/p>
&lt;p>&lt;strong>Commentary and descriptive audio.&lt;/strong> Director&amp;rsquo;s commentary, &amp;ldquo;descriptive video service&amp;rdquo; tracks, and audio descriptions are junk for most libraries, and they&amp;rsquo;re often tagged &lt;code>eng&lt;/code> - so a pure language filter keeps them. The scripts detect them through the explicit Matroska flags first (&lt;code>flag_commentary&lt;/code>, &lt;code>flag_visual_impaired&lt;/code>), which catch untitled or non-English commentary tracks, then fall back to track-name matching:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">is_junk&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">(&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">props&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;flag_commentary&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="ow">or&lt;/span> &lt;span class="n">props&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;flag_visual_impaired&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="ow">or&lt;/span> &lt;span class="nb">any&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">x&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">track_name&lt;/span> &lt;span class="k">for&lt;/span> &lt;span class="n">x&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">JUNK_AUDIO_NAME_PATTERNS&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>where the name patterns are &lt;code>commentary&lt;/code>, &lt;code>description&lt;/code>, &lt;code>director&lt;/code>, and &lt;code>dvs&lt;/code>.&lt;/p>
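&lt;p>For a version you can run on its own, here is a minimal sketch of that check. It assumes the track properties come from &lt;code>mkvmerge -J&lt;/code> as a dict, and it lowercases the track name before matching; the helper name &lt;code>is_junk_audio&lt;/code> and the case-folding are my assumptions, not necessarily what the repo does.&lt;/p>

```python
# Pattern list taken from the four names mentioned in the text.
JUNK_AUDIO_NAME_PATTERNS = ("commentary", "description", "director", "dvs")


def is_junk_audio(props: dict) -> bool:
    """Return True when a track looks like commentary or descriptive audio."""
    # Matroska flags come first: they work even for untitled tracks.
    if props.get("flag_commentary") or props.get("flag_visual_impaired"):
        return True
    # Fall back to a case-insensitive substring match on the track name.
    track_name = (props.get("track_name") or "").lower()
    return any(pattern in track_name for pattern in JUNK_AUDIO_NAME_PATTERNS)


print(is_junk_audio({"track_name": "Commentary by the director"}))
print(is_junk_audio({"track_name": "English 5.1"}))
```

&lt;p>Substring matching is deliberately blunt, so it is worth checking the dry-run output for legitimate tracks whose names happen to contain one of the patterns.&lt;/p>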
&lt;p>&lt;strong>Default-flag enforcement.&lt;/strong> This fixes the single most common Jellyfin complaint - the wrong audio track playing by default. The first kept audio track becomes the sole default, and the default flag is explicitly &lt;em>cleared&lt;/em> on every other kept track, so a stale flag carried over from the source can&amp;rsquo;t leave two defaults fighting each other:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="k">for&lt;/span> &lt;span class="n">i&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">tid&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="nb">enumerate&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">keep_audio&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">cmd&lt;/span> &lt;span class="o">+=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;--default-track-flag&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="sa">f&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">tid&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">:&lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="mi">1&lt;/span> &lt;span class="k">if&lt;/span> &lt;span class="n">i&lt;/span> &lt;span class="o">==&lt;/span> &lt;span class="mi">0&lt;/span> &lt;span class="k">else&lt;/span> &lt;span class="mi">0&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Subtitle default flags are cleared across the board, so nothing forces subs on by default.&lt;/p>
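&lt;p>Putting the audio and subtitle rules together, the argument list might be built like this. It is a sketch under assumptions: &lt;code>default_flag_args&lt;/code> is a hypothetical helper, and &lt;code>keep_audio&lt;/code> and &lt;code>keep_subs&lt;/code> are assumed to be lists of mkvmerge track IDs in the order they were kept.&lt;/p>

```python
def default_flag_args(keep_audio: list[int], keep_subs: list[int]) -> list[str]:
    """Build --default-track-flag arguments for the kept track IDs."""
    args: list[str] = []
    for i, tid in enumerate(keep_audio):
        # Only the first kept audio track gets the default flag.
        args += ["--default-track-flag", f"{tid}:{1 if i == 0 else 0}"]
    for tid in keep_subs:
        # No subtitle track is default, so players do not force subs on.
        args += ["--default-track-flag", f"{tid}:0"]
    return args


print(default_flag_args([1, 3], [4]))
```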
&lt;p>&lt;strong>Cosmetic metadata.&lt;/strong> Release groups leave a trail: the global container title set to the release filename, tag blocks, track names like &amp;ldquo;Commentary by&amp;hellip;&amp;rdquo;. The scripts strip the global title, wipe tags, and clear junk track names by default (each pass individually toggleable). They&amp;rsquo;ll also fill in an undefined track language when the track&amp;rsquo;s &lt;em>name&lt;/em> gives it away - a track named &amp;ldquo;English&amp;rdquo; tagged &lt;code>und&lt;/code> gets relabeled &lt;code>eng&lt;/code>, and crucially this inference runs &lt;em>before&lt;/em> the language filter so the corrected tag actually affects what&amp;rsquo;s kept. Attachment removal (cover art, embedded fonts) exists but is off by default, because embedded fonts can matter for styled ASS/SSA subtitles.&lt;/p>
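&lt;p>The ordering point is easier to see in code. The sketch below is illustrative only: the &lt;code>NAME_TO_LANG&lt;/code> table and the &lt;code>infer_language&lt;/code> helper are hypothetical, and the real script may recognize different names. What matters is that inference runs first and the filter sees the corrected tag.&lt;/p>

```python
# Hypothetical name-to-code table; the real script may recognize more names.
NAME_TO_LANG = {"english": "eng", "japanese": "jpn", "french": "fre", "german": "ger"}


def infer_language(props: dict) -> str:
    """Return the track language, guessing from the name when it is undefined."""
    lang = props.get("language") or "und"
    if lang != "und":
        return lang  # an explicit tag always wins
    name = (props.get("track_name") or "").lower()
    for word, code in NAME_TO_LANG.items():
        if word in name:
            return code
    return "und"


tracks = [
    {"language": "und", "track_name": "English"},   # mislabeled, should be kept
    {"language": "fre", "track_name": "English"},   # explicit tag is trusted
]
# Infer first, then filter, so the corrected tag decides what is kept.
kept = [t for t in tracks if infer_language(t) in {"eng", "jpn"}]
print(kept)
```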
&lt;p>&lt;strong>The mkvpropedit fast-path.&lt;/strong> Here&amp;rsquo;s a neat optimization that fell out of all this: when a file&amp;rsquo;s tracks are already exactly what you want but its &lt;em>header&lt;/em> is still wrong - a stale default flag, a junk title - there&amp;rsquo;s no reason to remux gigabytes. &lt;code>mkvpropedit&lt;/code> (also from MKVToolNix) edits headers in place on the existing file, in milliseconds, with no temp file, no disk-space requirement, and ownership/permissions preserved for free because the file never moves. The scripts plan header-only edits whenever track selection comes back unchanged, so a second pass over a clean library is nearly instant and still fixes metadata.&lt;/p>
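&lt;p>As a rough illustration of what a header-only edit looks like, here is a sketch that builds an &lt;code>mkvpropedit&lt;/code> command line. The helper &lt;code>propedit_cmd&lt;/code> is hypothetical and only covers the title, tags, and audio default flags; the actual scripts decide which edits to plan per file.&lt;/p>

```python
def propedit_cmd(path: str, default_audio: int, other_audio: list[int]) -> list[str]:
    """Build an mkvpropedit command that fixes headers without remuxing.

    Audio tracks are addressed by 1-based position (a1, a2, ...), which is
    mkvpropedit's selector syntax, not the track IDs mkvmerge reports.
    """
    cmd = [
        "mkvpropedit", path,
        "--edit", "info", "--delete", "title",  # drop the container title
        "--tags", "all:",                       # empty file name removes all tags
    ]
    cmd += ["--edit", f"track:a{default_audio}", "--set", "flag-default=1"]
    for n in other_audio:
        cmd += ["--edit", f"track:a{n}", "--set", "flag-default=0"]
    return cmd


print(" ".join(propedit_cmd("movie.mkv", 1, [2])))
```

&lt;p>Passing the list to &lt;code>subprocess.run&lt;/code> would apply the edits in place; printing it first is a cheap way to dry-run the plan.&lt;/p>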
&lt;h2 id="moving-cleanup-into-the-pipeline-with-sabnzbd">Moving Cleanup Into the Pipeline With SABnzbd
&lt;/h2>&lt;p>Sweeping the entire library on a schedule works, but the smarter move is to clean each file once, at download time, before Radarr or Sonarr ever imports it. SABnzbd makes this possible with post-processing scripts. That&amp;rsquo;s the job of &lt;a class="link" href="https://github.com/KryptikWurm/mkv-track-stripper/blob/main/mkv_strip_pp.py" target="_blank" rel="noopener"
>&lt;code>mkv_strip_pp.py&lt;/code>&lt;/a>: drop it into your SABnzbd scripts directory, make it executable, then assign it to your &lt;code>movies&lt;/code> (and/or &lt;code>tv&lt;/code>) category under Config → Categories in the SABnzbd UI so only those downloads trigger cleanup.&lt;/p>
&lt;p>SABnzbd exports job details as both positional arguments and environment variables. The hook reads the job folder from &lt;code>SAB_COMPLETE_DIR&lt;/code> (falling back to &lt;code>argv[1]&lt;/code> for manual testing on the command line) and the download status from &lt;code>SAB_PP_STATUS&lt;/code> (falling back to &lt;code>argv[7]&lt;/code>). If the download was already marked failed, the hook exits immediately without touching anything.&lt;/p>
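&lt;p>A minimal sketch of that lookup, assuming the environment variable wins whenever it is set. The &lt;code>read_job&lt;/code> helper and the choice to treat a missing status as &lt;code>"0"&lt;/code> (SABnzbd&amp;rsquo;s value for a good download) are my assumptions for illustration.&lt;/p>

```python
def read_job(argv: list[str], environ: dict) -> tuple[str, str]:
    """Return (job_dir, pp_status), preferring SABnzbd's environment variables."""
    job_dir = environ.get("SAB_COMPLETE_DIR") or (argv[1] if len(argv) > 1 else "")
    # Assumption: with no status available (manual test run), treat it as OK.
    status = environ.get("SAB_PP_STATUS") or (argv[7] if len(argv) > 7 else "0")
    return job_dir, status


# Called by SABnzbd: environment variables are present.
print(read_job(["mkv_strip_pp.py"], {"SAB_COMPLETE_DIR": "/downloads/Movie", "SAB_PP_STATUS": "0"}))
# Called by hand: only the job folder is passed on the command line.
print(read_job(["mkv_strip_pp.py", "/downloads/Movie"], {}))
```

&lt;p>In the real hook you would pass &lt;code>sys.argv&lt;/code> and &lt;code>os.environ&lt;/code>, then bail out early when the status is anything other than &lt;code>"0"&lt;/code>.&lt;/p>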
&lt;p>Because the cleaned file is swapped atomically onto the original path, the filename Radarr or Sonarr expects to import never changes. Neither one ever sees a seam - each imports the file it expected, already trimmed.&lt;/p>
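&lt;p>The swap itself can be sketched in a few lines. This assumes the temp file is written next to the original, on the same filesystem, which is what makes the rename atomic; the &lt;code>swap_in&lt;/code> helper and the &lt;code>.tmp&lt;/code> suffix are illustrative names, and the text files below merely stand in for real MKVs.&lt;/p>

```python
import os
import tempfile


def swap_in(cleaned_tmp: str, original: str) -> None:
    """Replace the original with the cleaned file in one atomic rename."""
    # os.replace is atomic on POSIX when both paths share a filesystem.
    os.replace(cleaned_tmp, original)


# Tiny demonstration with throwaway files standing in for real MKVs.
with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "movie.mkv")
    tmp = os.path.join(d, "movie.mkv.tmp")
    with open(original, "w") as f:
        f.write("original")
    with open(tmp, "w") as f:
        f.write("cleaned")
    swap_in(tmp, original)
    with open(original) as f:
        result = f.read()
    leftover = os.path.exists(tmp)

print(result, leftover)
```

&lt;p>Anything reading the path sees either the old file or the new one, never a half-written mix.&lt;/p>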
&lt;p>The most important design principle here: cleanup must never block an import. If stripping a file fails, leave the original intact, log it, and still let the import proceed. A working movie that didn&amp;rsquo;t get cleaned is fine. A broken import because your cleanup script choked is not.&lt;/p>
&lt;h3 id="two-different-exit-code-systems-dont-mix-them">Two Different Exit-Code Systems (Don&amp;rsquo;t Mix Them)
&lt;/h3>&lt;p>This trips people up because two separate exit-code conventions show up in the same context, and they mean different things.&lt;/p>
&lt;p>&lt;strong>SABnzbd&amp;rsquo;s exit codes&lt;/strong> tell SABnzbd what to do with the job after your post-processing script runs:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Exit code&lt;/th>
&lt;th>SABnzbd interpretation&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;code>0&lt;/code>&lt;/td>
&lt;td>Success&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>1&lt;/code>&lt;/td>
&lt;td>Job failed&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>2&lt;/code>&lt;/td>
&lt;td>Retry the download&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>&lt;strong>mkvmerge&amp;rsquo;s exit codes&lt;/strong> (covered earlier) describe the remux itself: &lt;code>0&lt;/code> success, &lt;code>1&lt;/code> success with warnings, &lt;code>2&lt;/code> real error. Same numbers, completely different meanings. If you set up your SABnzbd hook to follow mkvmerge&amp;rsquo;s convention, a remux warning (mkvmerge &lt;code>1&lt;/code>) would mark a perfectly good download as failed in SAB. That&amp;rsquo;s exactly the kind of silent breakage you don&amp;rsquo;t want.&lt;/p>
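&lt;p>One way to keep the two conventions from leaking into each other is to translate at the boundary and never pass a return code straight through. The two helpers below are hypothetical names for illustration, not functions from the repo.&lt;/p>

```python
def remux_succeeded(mkvmerge_returncode: int) -> bool:
    """mkvmerge convention: 0 = success, 1 = success with warnings, 2 = error."""
    return mkvmerge_returncode in (0, 1)


def hook_exit_code(setup_ok: bool) -> int:
    """SABnzbd convention: 0 = success, 1 = job failed.

    Per-file cleanup failures are deliberately not an input here: they are
    logged, the original is kept, and the hook still reports success.
    """
    return 0 if setup_ok else 1


print(remux_succeeded(1))      # a warning still counts as a usable remux
print(hook_exit_code(True))    # the import proceeds
```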
&lt;p>The hook keeps these rigorously separated. Any cleanup failure - mkvmerge returning &lt;code>2&lt;/code>, a failed verification, an unexpected exception on one file - is handled as &amp;ldquo;log it, leave the original, move on.&amp;rdquo; The hook still exits &lt;code>0&lt;/code> in all those cases, because a movie that wasn&amp;rsquo;t cleaned is still importable. The only time it exits &lt;code>1&lt;/code> is for genuine setup problems (mkvmerge or mkvpropedit not found, job directory missing), because those mean nothing can work at all.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">log&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">info&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;Found &lt;/span>&lt;span class="si">%d&lt;/span>&lt;span class="s2"> MKV file(s).&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nb">len&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">mkvs&lt;/span>&lt;span class="p">))&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">for&lt;/span> &lt;span class="n">path&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">mkvs&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">log&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">info&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;Inspecting: &lt;/span>&lt;span class="si">%s&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">os&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">path&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">basename&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">path&lt;/span>&lt;span class="p">))&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">try&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">result&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">process_file&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">path&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">AUDIO_LANGS&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">SUB_LANGS&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">resolved_mkvmerge&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">resolved_mkvpropedit&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">DRY_RUN&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">CLEANUP&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="n">result&lt;/span> &lt;span class="o">==&lt;/span> &lt;span class="s2">&amp;#34;nothing&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">log&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">info&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34; Nothing to strip, leaving file as-is.&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">elif&lt;/span> &lt;span class="n">result&lt;/span> &lt;span class="o">==&lt;/span> &lt;span class="s2">&amp;#34;stripped&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">log&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">info&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34; Done.&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">elif&lt;/span> &lt;span class="n">result&lt;/span> &lt;span class="o">==&lt;/span> &lt;span class="s2">&amp;#34;fixed&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">log&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">info&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34; Metadata fixed in place (no remux).&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">except&lt;/span> &lt;span class="ne">Exception&lt;/span> &lt;span class="k">as&lt;/span> &lt;span class="n">e&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="c1"># never let one bad file break the import&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">log&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">error&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;Unexpected error on &lt;/span>&lt;span class="si">%s&lt;/span>&lt;span class="s2">: &lt;/span>&lt;span class="si">%s&lt;/span>&lt;span class="s2">. Original kept.&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">os&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">path&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">basename&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">path&lt;/span>&lt;span class="p">),&lt;/span> &lt;span class="n">e&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">log&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">info&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;Finished. Exiting 0 so import proceeds.&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">return&lt;/span> &lt;span class="mi">0&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Even the logging follows the never-block rule. SABnzbd captures the script&amp;rsquo;s stdout into its job history, and the file log (defaulting to &lt;code>/config/mkv_strip_pp.log&lt;/code>, which lands on the persistent volume in Docker) is best-effort - an unwritable log path produces a warning instead of a crash, because an earlier version managed to fail an entire download over a log file it couldn&amp;rsquo;t open.&lt;/p>
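&lt;p>A sketch of what best-effort file logging can look like with the standard &lt;code>logging&lt;/code> module. The &lt;code>setup_logging&lt;/code> helper and the logger name are assumptions for illustration; the point is only that a failure to open the file is caught and downgraded to a warning.&lt;/p>

```python
import logging
import sys


def setup_logging(log_file: str) -> logging.Logger:
    """Log to stdout always; add a file handler only if the path is writable."""
    log = logging.getLogger("mkv_strip_pp_sketch")
    log.setLevel(logging.INFO)
    log.handlers.clear()
    log.addHandler(logging.StreamHandler(sys.stdout))
    try:
        log.addHandler(logging.FileHandler(log_file))
    except OSError as exc:
        # Never fail the job over a log file: warn and carry on with stdout.
        log.warning("Cannot open log file %s (%s); logging to stdout only.", log_file, exc)
    return log


# The directory does not exist, so the file handler is skipped, not fatal.
log = setup_logging("/nonexistent-dir/mkv_strip_pp.log")
log.info("Still running.")
```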
&lt;p>Configuration is a block of constants at the top of the script - preferred languages, dry-run, and toggles for each cleanup pass:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">AUDIO_LANGS&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;eng&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;jpn&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;und&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">SUB_LANGS&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;eng&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;und&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">LOG_FILE&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;/config/mkv_strip_pp.log&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">DRY_RUN&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="kc">False&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">STRIP_TITLE&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="kc">True&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">STRIP_TAGS&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="kc">True&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">INFER_LANGUAGE&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="kc">True&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">CLEAR_JUNK_TRACK_NAMES&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="kc">True&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">STRIP_ATTACHMENTS&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="kc">False&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="installing-mkvmerge-inside-a-docker-container">Installing mkvmerge Inside a Docker Container
&lt;/h2>&lt;p>If you run SABnzbd in Docker, you&amp;rsquo;ll hit an error the first time the hook tries to call mkvmerge inside it: the binary isn&amp;rsquo;t there. The tempting fix is to &lt;code>docker exec&lt;/code> into the running container and install it by hand. Don&amp;rsquo;t do this. The moment you pull a new image and the container is recreated, your install vanishes.&lt;/p>
&lt;p>The correct fix is a tiny custom Dockerfile that extends the base image, and the repo &lt;a class="link" href="https://github.com/KryptikWurm/mkv-track-stripper/blob/main/Dockerfile" target="_blank" rel="noopener"
>ships one&lt;/a>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-dockerfile" data-lang="dockerfile">&lt;span class="line">&lt;span class="cl">&lt;span class="k">FROM&lt;/span>&lt;span class="s"> lscr.io/linuxserver/sabnzbd:latest&lt;/span>&lt;span class="err">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="err">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="err">&lt;/span>&lt;span class="c"># Add mkvmerge (from MKVToolNix) and the optional ACL so the post-processing script can run.&lt;/span>&lt;span class="err">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="err">&lt;/span>&lt;span class="c"># The LinuxServer.io image is Alpine-based, so use apk.&lt;/span>&lt;span class="err">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="err">&lt;/span>&lt;span class="k">RUN&lt;/span> apk add --no-cache mkvtoolnix acl&lt;span class="err">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>That &lt;code>apk&lt;/code> line hides my actual debugging moment. My first Dockerfile used &lt;code>apt-get&lt;/code>, and the build died with &lt;code>apt-get: not found&lt;/code>. The base image was Alpine, not Debian. Alpine uses &lt;code>apk&lt;/code>, not &lt;code>apt&lt;/code>.&lt;/p>
&lt;p>In &lt;code>docker-compose.yaml&lt;/code>, point the service at the Dockerfile with &lt;code>build:&lt;/code> instead of a plain &lt;code>image:&lt;/code>, and mount the hook into the container&amp;rsquo;s scripts directory:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-yaml" data-lang="yaml">&lt;span class="line">&lt;span class="cl">&lt;span class="nt">services&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">sabnzbd&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">build&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">. &lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># builds the Dockerfile in this repo&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">container_name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">sabnzbd&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">environment&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="l">PUID=1000&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="l">PGID=1000&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="l">TZ=Etc/UTC&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">ports&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="s2">&amp;#34;8080:8080&amp;#34;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">volumes&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="l">/path/to/appdata/sabnzbd:/config&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="l">./mkv_strip_pp.py:/config/scripts/mkv_strip_pp.py&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="l">/path/to/downloads:/downloads&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">restart&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">unless-stopped&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>One habit changes after adding packages to a custom Dockerfile. Updating the container is now &lt;code>docker compose up -d --build&lt;/code> (with &lt;code>--pull always&lt;/code> if you want the freshest base) rather than a plain &lt;code>docker compose pull&lt;/code>, so your added packages get rebuilt on top of the fresh base.&lt;/p>
&lt;p>The sweep script, &lt;code>mkvclean.py&lt;/code>, runs directly on the host, so the host needs Python 3.10+ and MKVToolNix installed, plus the optional &lt;code>acl&lt;/code> package if you want POSIX ACLs preserved on cleaned files:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">apt-get install -y mkvtoolnix python3 acl
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>
&lt;h2 id="the-cron-sweep-or-why-you-still-need-one">The Cron Sweep, or Why You Still Need One
&lt;/h2>&lt;p>With per-download cleanup in place, you might think the sweep is redundant. It&amp;rsquo;s not. The SABnzbd hook only sees files SABnzbd downloads. It misses manual rips, files copied in from elsewhere, media you reorganized, and anything from a category the hook isn&amp;rsquo;t attached to. A scheduled sweep is your catch-all for eventual consistency across the whole library.&lt;/p>
&lt;p>The sweep is &lt;a class="link" href="https://github.com/KryptikWurm/mkv-track-stripper/blob/main/mkvclean.py" target="_blank" rel="noopener"
>&lt;code>mkvclean.py&lt;/code>&lt;/a> again - the same checkpoint-aware script from the bulk pass, run unattended. Install it somewhere on the &lt;code>PATH&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">sudo cp mkvclean.py /usr/local/bin/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">sudo chmod +x /usr/local/bin/mkvclean.py
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Locking is built in: the script takes a kernel-level &lt;code>fcntl&lt;/code> lock on &lt;code>/tmp/mkvclean.lock&lt;/code> at startup and exits quietly if another run already holds it, so a cron job firing while a manual run is still chewing through the library can never cause two sweeps to step on each other. Logging is built in too - everything goes to the path given by &lt;code>--log&lt;/code> (default &lt;code>~/mkvclean.log&lt;/code>) as well as stdout - so the crontab line stays clean with no redirection:&lt;/p>
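&lt;p>For the curious, a non-blocking lock of this kind takes only a few lines with Python&amp;rsquo;s &lt;code>fcntl&lt;/code> module. This is a sketch, not the script&amp;rsquo;s actual code: it assumes &lt;code>fcntl.flock&lt;/code> specifically, uses a throwaway lock path instead of &lt;code>/tmp/mkvclean.lock&lt;/code>, and only works on Unix-like systems.&lt;/p>

```python
import fcntl
import os
import tempfile


def try_lock(lock_path: str):
    """Return an open file holding an exclusive lock, or None if already held."""
    handle = open(lock_path, "w")
    try:
        # LOCK_NB makes this fail immediately instead of waiting for the holder.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle  # keep it open: the lock lasts as long as the file is open


lock_path = os.path.join(tempfile.mkdtemp(), "mkvclean-sketch.lock")
first = try_lock(lock_path)    # the run that gets there first
second = try_lock(lock_path)   # simulates a second run starting meanwhile
print(first is not None, second is None)
```

&lt;p>Because the kernel drops the lock when the holding process exits, even a crashed run cannot leave a stale lock behind, which is the main advantage over a PID file.&lt;/p>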
&lt;pre tabindex="0">&lt;code class="language-cron" data-lang="cron">0 4 * * * /usr/local/bin/mkvclean.py /media/Storage/Movies --batch 0 --log ~/mkvclean-cron.log
&lt;/code>&lt;/pre>&lt;p>Thanks to the checkpoint, a typical nightly run skips almost everything instantly and only touches whatever&amp;rsquo;s new or upgraded since the last pass. Safe to run as often as you like.&lt;/p>
&lt;h2 id="the-one-time-bulk-pass">The One-Time Bulk Pass
&lt;/h2>&lt;p>The third job is the one you run once: cleaning the back catalogue that existed before any of this automation. Same script, run by hand, starting with a dry run.&lt;/p>
&lt;p>Always do the dry run first on a new library. It reports exactly what would be stripped from every file without modifying a single byte:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">./mkvclean.py /media/Storage/Movies --dry-run
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Read the log. Confirm it&amp;rsquo;s keeping what you expect - especially on anime and foreign films. Then start small, with a batch spanning different languages and sources:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">./mkvclean.py /media/Storage/Movies --batch &lt;span class="m">20&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Spot-check a handful of files in Jellyfin to confirm audio and subtitle selection looks right. If everything looks good, continue in larger batches:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">./mkvclean.py /media/Storage/Movies --batch &lt;span class="m">100&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">./mkvclean.py /media/Storage/Movies --batch &lt;span class="m">0&lt;/span> &lt;span class="c1"># 0 = no limit: finish everything remaining&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The language preferences are flags rather than edits to the script - &lt;code>--audio eng,jpn,und --subs eng,und&lt;/code> are the defaults. If you catch a bad rule, stop with Ctrl+C, adjust the flags, and re-run; checkpointed files won&amp;rsquo;t be touched again. There&amp;rsquo;s also a &lt;code>--prefer-audio-channels&lt;/code> flag for the hoarder special: releases carrying both a lossless 7.1 track and an AC3 5.1 track in the same language. Set it and the sweep keeps only the best English track (exact channel-count match wins, otherwise most channels) instead of all of them.&lt;/p>
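&lt;p>The selection rule in parentheses can be sketched as follows. I am assuming the flag carries a preferred channel count and that each track dict has the &lt;code>audio_channels&lt;/code> property reported by &lt;code>mkvmerge -J&lt;/code>; &lt;code>pick_best_audio&lt;/code> is a hypothetical helper, and how the real script breaks ties is not covered here.&lt;/p>

```python
def pick_best_audio(tracks: list[dict], preferred_channels: int) -> dict:
    """Pick one track: exact channel-count match wins, otherwise most channels."""
    exact = [t for t in tracks if t["audio_channels"] == preferred_channels]
    if exact:
        return exact[0]
    return max(tracks, key=lambda t: t["audio_channels"])


same_language = [
    {"id": 1, "audio_channels": 8},  # lossless 7.1
    {"id": 2, "audio_channels": 6},  # AC3 5.1
]
print(pick_best_audio(same_language, 6)["id"])  # exact match on 5.1
print(pick_best_audio(same_language, 2)["id"])  # no stereo track, most channels wins
```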
&lt;p>Files a bad run already processed may need re-downloading; that&amp;rsquo;s exactly why you dry-run first and verify in batches rather than letting it run unattended overnight.&lt;/p>
&lt;script type="application/ld+json">{"@context":"https://schema.org","@type":"Product","description":"MINISFORUM MS-01 Mini Workstation. The MS-01 i5 is a tiny mini PC with plenty of cores, multiple NVMe slots, and real homelab networking (dual 10G SFP+ plus 2.5 GbE), which makes it perfect for a Proxmox compute node. It has more than enough power for Jellyfin, the *arr stack, downloads, and a few VMs or LXCs, without turning your closet into a jet engine or space heater.","image":["https://diymediaserver.com/images/products/ms-01.jpg"],"name":"MINISFORUM MS-01 Mini Workstation.","offers":[{"@type":"Offer","availability":"https://schema.org/InStock","price":"0","priceCurrency":"USD","seller":{"@type":"Organization","name":"Amazon"},"url":"https://amzn.to/4p3HhTI"},{"@type":"Offer","availability":"https://schema.org/InStock","price":"0","priceCurrency":"USD","seller":{"@type":"Organization","name":"Newegg"},"url":"https://click.linksynergy.com/link?id=plNXx%2aS0a%2a8\u0026offerid=1786142.4458318191324330626506341\u0026type=2\u0026murl=https%3a%2f%2fwww.newegg.com%2fminisforum-barebone-systems-mini-pc-intel-core-i5-12600h%2fp%2f2SW-002G-000K9%3fitem%3d9SIBJ6VKBD4204"}],"sku":"B0D454DQSP"}&lt;/script>
&lt;div class="product-box" data-asin="B0D454DQSP">
&lt;div class="product-box-image">&lt;img src="https://diymediaserver.com/images/products/ms-01_hu_17a6cbd7910305fd.webp" width="600" height="265" alt="MINISFORUM MS-01 Mini Workstation" loading="lazy" decoding="async">&lt;/div>
&lt;div class="product-box-content">
&lt;div class="product-box-description">
&lt;strong>MINISFORUM MS-01 Mini Workstation.&lt;/strong>
The MS-01 i5 is a tiny mini PC with plenty of cores, multiple NVMe slots, and real homelab networking (dual 10G SFP+ plus 2.5 GbE), which makes it perfect for a Proxmox compute node. It has more than enough power for Jellyfin, the *arr stack, downloads, and a few VMs or LXCs, without turning your closet into a jet engine or space heater.
&lt;/div>
&lt;div class="product-meta-row">
&lt;div class="product-price">
&lt;strong>Amazon Price:&lt;/strong>
&lt;span class="price-loading">Loading...&lt;/span>
&lt;span class="price-value" style="display:none;">&lt;/span>
&lt;/div>
&lt;div class="product-availability">
&lt;strong>Availability:&lt;/strong>
&lt;span class="availability-loading">Checking...&lt;/span>
&lt;span class="availability-value" style="display:none;">&lt;/span>
&lt;/div>
&lt;/div>
&lt;/div>
&lt;div class="product-box-links">
&lt;a href="https://amzn.to/4p3HhTI" class="affiliate-button" target="_blank" rel="noopener nofollow sponsored">Amazon&lt;/a>
&lt;a href="https://click.linksynergy.com/link?id=plNXx%2aS0a%2a8&amp;amp;offerid=1786142.4458318191324330626506341&amp;amp;type=2&amp;amp;murl=https%3a%2f%2fwww.newegg.com%2fminisforum-barebone-systems-mini-pc-intel-core-i5-12600h%2fp%2f2SW-002G-000K9%3fitem%3d9SIBJ6VKBD4204" class="affiliate-button" target="_blank" rel="noopener nofollow sponsored">Newegg&lt;/a>
&lt;/div>
&lt;div class="product-affiliate-disclaimer">
&lt;small>&lt;em>Contains affiliate links. I may earn a commission at no cost to you.&lt;/em>&lt;/small>
&lt;/div>
&lt;/div>
&lt;h2 id="how-jellyfin-sees-the-result">How Jellyfin Sees the Result
&lt;/h2>&lt;p>A quick note on the playback side, because it&amp;rsquo;s where you confirm the whole thing worked. Jellyfin transcodes with ffmpeg, and ffmpeg reads any spec-compliant MKV the same way regardless of whether it was muxed with ffmpeg or mkvmerge. So cleaning with mkvmerge causes Jellyfin no trouble.&lt;/p>
&lt;p>The two issues Jellyfin users hit most are the wrong default audio track and missing subtitles. The default-track enforcement described earlier handles the first one directly: every cleaned file comes out with exactly one default audio track (the first kept one) and no default-flagged subtitles, so Jellyfin&amp;rsquo;s track picker starts from a sane baseline instead of whatever flags the release group left behind. And when a file needs &lt;em>only&lt;/em> that flag fixed, the &lt;code>mkvpropedit&lt;/code> fast-path patches the header in place without remuxing at all, which is instant.&lt;/p>
&lt;p>Missing subtitles are guarded by the forced-track rule - the subtitle that translates the alien dialogue in an otherwise-English film survives every cleanup, even when it&amp;rsquo;s mistagged.&lt;/p>
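&lt;p>To make those guards concrete, here is a minimal Python sketch of the selection rules. It is not the code from the repo: the function name &lt;code>select_tracks&lt;/code> and its parameters are hypothetical, the input is assumed to be the &lt;code>tracks&lt;/code> list from &lt;code>mkvmerge -J&lt;/code>, and it leaves out the &lt;code>und&lt;/code> handling and the default-flag fix-up.&lt;/p>

```python
def select_tracks(tracks, audio_langs=("eng", "jpn", "und"), sub_langs=("eng", "und")):
    """Return (audio_ids, subtitle_ids) to keep.

    Hypothetical helper: `tracks` is assumed to look like the "tracks"
    list in `mkvmerge -J` output (id, type, properties).
    """

    def lang(track):
        return track.get("properties", {}).get("language", "und")

    def is_forced(track):
        props = track.get("properties", {})
        name = (props.get("track_name") or "").lower()
        # Forced by flag, or by a "forced" marker in the track name.
        return bool(props.get("forced_track")) or "forced" in name

    audio = [t for t in tracks if t["type"] == "audio"]
    subs = [t for t in tracks if t["type"] == "subtitles"]

    keep_audio = [t for t in audio if lang(t) in audio_langs]
    if not keep_audio:
        # Fail safe: nothing matches the preferred languages, so keep
        # every audio track rather than leave the file silent.
        keep_audio = audio

    # Forced subtitles survive regardless of language.
    keep_subs = [t for t in subs if lang(t) in sub_langs or is_forced(t)]

    return [t["id"] for t in keep_audio], [t["id"] for t in keep_subs]
```

&lt;p>The fallback branch is the important part of the sketch: a French-only film has no audio in the preferred list, so the function keeps its French track instead of stripping it.&lt;/p>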
&lt;h2 id="the-finished-system">The Finished System
&lt;/h2>&lt;p>What started as a one-off cleanup grew into two scripts covering three jobs, now public as &lt;a class="link" href="https://github.com/KryptikWurm/mkv-track-stripper" target="_blank" rel="noopener"
>mkv-track-stripper on GitHub&lt;/a> under the MIT license:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Per-download cleanup&lt;/strong> via the SABnzbd hook (&lt;code>mkv_strip_pp.py&lt;/code>, runs at download time, keeps the original filename so Radarr imports cleanly).&lt;/li>
&lt;li>&lt;strong>Scheduled catch-all sweep&lt;/strong> via cron (&lt;code>mkvclean.py --batch 0&lt;/code>, locked with &lt;code>fcntl&lt;/code>, checkpoint-aware, catches everything the download path misses).&lt;/li>
&lt;li>&lt;strong>One-time bulk pass&lt;/strong> over the back catalogue (the same &lt;code>mkvclean.py&lt;/code>, dry-run first, batched, resumable, verified in Jellyfin).&lt;/li>
&lt;/ol>
&lt;p>The repo carries the Dockerfile for the SABnzbd image, a changelog, and a README covering setup and troubleshooting - including the one warning everyone asks about, &lt;code>Could not preserve ownership ... Operation not permitted&lt;/code>, which means the script isn&amp;rsquo;t running as root and couldn&amp;rsquo;t chown the cleaned file back to its original owner. The cleanup itself still succeeded; only the ownership change was skipped.&lt;/p>
&lt;p>In practice the logs show three patterns: a file with nothing to strip and a clean header is checkpointed in milliseconds, a file needing only a flag or title fix gets an in-place &lt;code>mkvpropedit&lt;/code> edit, and a bloated release with 30 subtitle tracks gets trimmed down in about 30 seconds, losslessly.&lt;/p>
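&lt;p>The checkpoint skip is simple enough to sketch. The version below is an illustration under my own assumptions, not the repo&amp;rsquo;s implementation: the JSON layout and the function names (&lt;code>file_signature&lt;/code>, &lt;code>load_checkpoint&lt;/code>, &lt;code>already_handled&lt;/code>, &lt;code>mark_handled&lt;/code>) are hypothetical. It keys each file by path and treats it as handled only while its size and modification time still match.&lt;/p>

```python
import json
import os


def file_signature(path):
    """Size and mtime identify one version of a file on disk."""
    st = os.stat(path)
    return {"size": st.st_size, "mtime_ns": st.st_mtime_ns}


def load_checkpoint(checkpoint_path):
    """Load the checkpoint, or start empty on the first run."""
    try:
        with open(checkpoint_path, "r", encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}


def already_handled(checkpoint, path):
    """True only if the recorded size and mtime still match the file."""
    return checkpoint.get(path) == file_signature(path)


def mark_handled(checkpoint, path, checkpoint_path):
    """Record the file, then write the checkpoint atomically."""
    checkpoint[path] = file_signature(path)
    tmp_path = checkpoint_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(checkpoint, fh)
    # os.replace swaps the file in one step, so an interrupted run
    # never leaves a half-written checkpoint behind.
    os.replace(tmp_path, checkpoint_path)
```

&lt;p>Because the comparison uses size and mtime rather than the path alone, an upgraded release at the same path no longer matches its entry and gets picked up on the next sweep.&lt;/p>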
&lt;h2 id="frequently-asked-questions">Frequently Asked Questions
&lt;/h2>&lt;details class="collapse md" >
&lt;summary>➤ Will this re-encode my video and lose quality?&lt;/summary>
&lt;div class="collapse-content">No. mkvmerge is a remuxer, not a transcoder. It copies the video and chosen audio and subtitle streams into a new container untouched, so there&amp;rsquo;s zero quality loss and the operation finishes in seconds rather than the hours a re-encode would take.&lt;/div>
&lt;/details>
&lt;details class="collapse md" >
&lt;summary>➤ Will it ever leave a movie silent?&lt;/summary>
&lt;div class="collapse-content">Not if you build the safety logic in. The rules that prevent it are simple: never strip the only audio track, and when no audio matches your preferred languages, keep everything and log a notice instead of dropping anything. Undetermined (&lt;code>und&lt;/code>) tracks only get dropped when a properly tagged English track exists alongside them. The output is also re-probed before the original is replaced, so a broken remux can never overwrite a good file.&lt;/div>
&lt;/details>
&lt;details class="collapse md" >
&lt;summary>➤ Will it delete forced subtitles?&lt;/summary>
&lt;div class="collapse-content">No. Forced subtitles are always kept regardless of language, detected through the Matroska &lt;code>forced_track&lt;/code> flag or a &amp;ldquo;forced&amp;rdquo; marker in the track name. Those are the subs that translate the one line of Elvish in an otherwise-English film, and losing them genuinely breaks the movie.&lt;/div>
&lt;/details>
&lt;details class="collapse md" >
&lt;summary>➤ Does the sweep re-process files it already cleaned?&lt;/summary>
&lt;div class="collapse-content">No. &lt;code>mkvclean.py&lt;/code> records every handled file&amp;rsquo;s path, size, and modification time in a checkpoint file and skips matching files instantly on later runs. If Radarr upgrades a movie, the new size and mtime no longer match, so the replacement gets cleaned automatically on the next sweep.&lt;/div>
&lt;/details>
&lt;details class="collapse md" >
&lt;summary>➤ Does this work with Sonarr too, not just Radarr?&lt;/summary>
&lt;div class="collapse-content">Yes. You bind the SABnzbd post-processing script to a &lt;code>tv&lt;/code> category in the SABnzbd UI the same way you bind it to &lt;code>movies&lt;/code>. The library sweep doesn&amp;rsquo;t know or care about Radarr or Sonarr at all - it walks the filesystem.&lt;/div>
&lt;/details>
&lt;details class="collapse md" >
&lt;summary>➤ Why does apk add mkvtoolnix fail in my Docker image?&lt;/summary>
&lt;div class="collapse-content">&lt;code>apk&lt;/code> is Alpine&amp;rsquo;s package manager. If your base image is Debian or Ubuntu, you need &lt;code>apt-get install mkvtoolnix&lt;/code> instead. The reverse trips people up too: &lt;code>apt-get: not found&lt;/code> means your base is Alpine. Always check the base image distro before picking a package manager.&lt;/div>
&lt;/details>
&lt;details class="collapse md" >
&lt;summary>➤ How do I list MKV tracks and their languages from the command line?&lt;/summary>
&lt;div class="collapse-content">Run &lt;code>mkvmerge -i file.mkv&lt;/code> for a quick human-readable list of track IDs, types, and languages. For scripting, use &lt;code>mkvmerge -J file.mkv&lt;/code> to get machine-readable JSON you can parse safely, reading &lt;code>tracks[].properties.language&lt;/code> for each track.&lt;/div>
&lt;/details>
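&lt;p>For the scripting case in that last answer, a short Python sketch of reading the JSON might look like this. The helper names &lt;code>identify&lt;/code> and &lt;code>list_tracks&lt;/code> are hypothetical, and it assumes &lt;code>mkvmerge&lt;/code> is on the &lt;code>PATH&lt;/code>.&lt;/p>

```python
import json
import subprocess


def identify(path):
    """Run `mkvmerge -J` on a file and return its JSON output as text."""
    result = subprocess.run(
        ["mkvmerge", "-J", path], capture_output=True, text=True
    )
    # mkvmerge exits with 1 for warnings and 2 for errors; only treat
    # errors as fatal here.
    if result.returncode >= 2:
        raise RuntimeError("mkvmerge failed: " + result.stdout.strip())
    return result.stdout


def list_tracks(identify_json):
    """Return (id, type, language) for every track in the JSON text."""
    data = json.loads(identify_json)
    rows = []
    for track in data.get("tracks", []):
        props = track.get("properties", {})
        # Treat a missing language as undetermined.
        rows.append((track["id"], track["type"], props.get("language", "und")))
    return rows
```

&lt;p>Calling &lt;code>list_tracks(identify("file.mkv"))&lt;/code> gives one tuple per track, which is easier to filter on than the text output.&lt;/p>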
&lt;h2 id="wrapping-up">Wrapping Up
&lt;/h2>&lt;p>A few themes carried through every layer of this build, and they&amp;rsquo;re worth restating: verify before you replace, make every operation idempotent, let failures fail safe, and choose your tools for where the project is heading rather than where it started. The single most important lesson, the one that cost me a movie night, is that you reason from the tracks a file actually contains, never from what you assume it should contain.&lt;/p>
&lt;p>A good automation is rarely the first script you write. It&amp;rsquo;s the system you arrive at after the task has taught you what it actually needs. Mine started as a ten-line &amp;ldquo;keep only English&amp;rdquo; filter, went through a Bash-and-&lt;code>jq&lt;/code> era, and ended as two Python scripts with checkpointing, verification, and safety defaults baked in - now maintained in the open at &lt;a class="link" href="https://github.com/KryptikWurm/mkv-track-stripper" target="_blank" rel="noopener"
>github.com/KryptikWurm/mkv-track-stripper&lt;/a>.&lt;/p>
&lt;p>If you want to adopt this, you don&amp;rsquo;t have to rebuild it: clone the repo, run &lt;code>mkvclean.py --dry-run&lt;/code> against a test folder, read the log, then layer in the SABnzbd hook and the cron sweep once you trust the selection rules. Keep the verify-before-replace defaults in place, and you can roll this out across thousands of files without ever losing a movie night.&lt;/p></description></item></channel></rss>