Sep 7, 2025

My Video Production Stack (for YouTube)

There are two barriers to publishing videos on YouTube. The first one is getting over thinking you are cringe and the second one is video editing. You have to battle the first one on your own, but I’m hoping that with what I share below, you can ease the burden of the latter.

No-hands editing, processing and publishing

The goal of this video production stack is to take a raw video, then automatically edit it, transcribe it, come up with timestamped chapter markers for the YouTube description, and upload it to my YouTube channel. Basically, everything after I hit stop on the recording is fully automatic.

This, of course, is very specific to the kinds of videos that I make, which are talking-head videos with a screen share. But that is a large enough genre on YouTube that this setup might be useful to others out there.

I have several different tools across multiple git repos to accomplish this. Today I am making all those repos public. Let me explain below what each one does.

The Complete Workflow

My YouTube publishing workflow consists of four main components, each handling a different part of the video processing pipeline. They’re written in Python and designed to work seamlessly together:

color_edit - Intelligent video editing based on visual cues
yt_chapter_maker - AI-powered chapter generation and title suggestions
yt_upload - YouTube API integration for uploading
video_upload_workflow - The orchestrator that ties everything together

Let me walk you through each component and how they work internally.

color_edit: Visual Cue-Based Video Editing

The first and most crucial tool in my stack is color_edit. This is the tool that automatically creates a fully edited video from my raw recording. Before I created this tool for myself, I used to spend multiple hours editing each video.

The key innovation here is using colored frames as editing markers. When recording, I display:

Green frames to mark sections I want to keep
Red frames to mark sections I want to remove
Regular content is neither red nor green

This is much easier to demonstrate than to explain. Check out this video where I visually explain how this works.
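To make the marker idea concrete, here is a minimal sketch of how a single frame could be labeled as a green marker, a red marker, or regular content. This is not the actual color_edit code: the function name classify_frame, the plain (r, g, b) pixel representation, and the 0.8 dominance threshold are all assumptions chosen for illustration.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each in 0-255


def classify_frame(pixels: Iterable[Pixel], dominance: float = 0.8) -> str:
    """Label one frame as "green", "red", or "content".

    A frame counts as a marker when a single channel carries at least
    `dominance` of the frame's total color intensity. The threshold is
    a hypothetical value, not one taken from color_edit.
    """
    total_r = total_g = total_b = 0
    for r, g, b in pixels:
        total_r += r
        total_g += g
        total_b += b

    total = total_r + total_g + total_b
    if total == 0:
        return "content"  # an all-black frame is not a marker

    if total_g / total >= dominance:
        return "green"
    if total_r / total >= dominance:
        return "red"
    return "content"


if __name__ == "__main__":
    print(classify_frame([(0, 255, 0)] * 4))        # solid green frame
    print(classify_frame([(120, 110, 100)] * 4))    # ordinary footage
```

In a real tool the pixels would come from decoded video frames, and it would probably be enough to sample a small fraction of each frame rather than scan every pixel.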

Another important function performed by this tool is the removal of silent parts of the video, also known as “dead air”.
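Dead-air removal can be sketched in a similar way. The snippet below assumes the audio has already been reduced to one loudness value per fixed-size window (for example an RMS level), and it reports the spans that stay below a threshold for long enough to be worth cutting. The function name find_dead_air and all default values are hypothetical; color_edit may well detect silence differently.

```python
from typing import List, Sequence, Tuple


def find_dead_air(
    levels: Sequence[float],
    window_s: float,
    threshold: float = 0.02,
    min_silence_s: float = 1.0,
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of silent spans.

    `levels` holds one loudness value per window of `window_s` seconds.
    A span is reported only if it stays below `threshold` for at least
    `min_silence_s` seconds, so short natural pauses are left alone.
    """
    spans: List[Tuple[float, float]] = []
    start = None  # index of the first window in the current quiet run

    for i, level in enumerate(levels):
        if level < threshold:
            if start is None:
                start = i
        elif start is not None:
            if (i - start) * window_s >= min_silence_s:
                spans.append((start * window_s, i * window_s))
            start = None

    # Close a quiet run that lasts until the end of the recording.
    if start is not None and (len(levels) - start) * window_s >= min_silence_s:
        spans.append((start * window_s, len(levels) * window_s))

    return spans
```

The minimum duration matters: cutting every tiny pause makes speech sound unnatural, so only longer gaps are treated as dead air here.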

The final output is a tightly edited video with no dead air and only the sections I explicitly marked as keepers.

yt_chapter_maker: AI-Powered Chapter Generation

Once the video is edited, yt_chapter_maker takes over to generate metadata. This tool is pretty simple. It prompts an LLM to look at the transcript for the video and then suggest a few options for titles as well as timestamped chapter markers that can go in the video description.
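Whatever the LLM returns, the chapter markers eventually have to become plain "timestamp title" lines in the description. Here is a small sketch of that last step, assuming the chapters arrive as (start_seconds, title) pairs. The helper names are made up for this example and are not taken from yt_chapter_maker.

```python
from typing import List, Tuple


def format_timestamp(seconds: float) -> str:
    """Render seconds as M:SS, or H:MM:SS once the video passes an hour."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def format_chapters(chapters: List[Tuple[float, str]]) -> str:
    """Turn (start_seconds, title) pairs into description lines."""
    ordered = sorted(chapters)
    # YouTube only picks up chapters when the first one starts at 0:00.
    if not ordered or int(ordered[0][0]) != 0:
        raise ValueError("the first chapter must start at 0:00")
    return "\n".join(f"{format_timestamp(t)} {title}" for t, title in ordered)


if __name__ == "__main__":
    print(format_chapters([(0, "Intro"), (95.4, "Setup"), (3725, "Wrap-up")]))
```

Validating the list before upload is cheap insurance, since an LLM can easily return chapters that are out of order or that skip the 0:00 entry.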

yt_upload: YouTube API Integration

The yt_upload tool handles the actual YouTube upload process using the YouTube API. You will need to get client secrets using the Google Cloud console in order to invoke the API. Also, the very first time you run it, you will have to do an OAuth dance through the browser.

The uploaded video is kept private so that you can review the video and its metadata before publishing it.
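For reference, the private-by-default behavior comes down to one field in the request body that the YouTube Data API v3 expects for an upload. The sketch below only builds that body. The function name build_upload_body is hypothetical, and the real yt_upload code may assemble its request differently.

```python
from typing import Any, Dict, Iterable, Optional


def build_upload_body(
    title: str,
    description: str,
    tags: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Build a videos.insert request body for a private upload."""
    return {
        "snippet": {
            "title": title,
            "description": description,
            "tags": list(tags or []),
        },
        # "private" keeps the video hidden until it is reviewed by hand.
        "status": {"privacyStatus": "private"},
    }


# With google-api-python-client, a body like this is typically passed as
#   youtube.videos().insert(part="snippet,status", body=body, media_body=...)
# where `youtube` is an authorized API client. That call is not made here,
# so the sketch runs without credentials.
if __name__ == "__main__":
    print(build_upload_body("My video", "0:00 Intro", tags=["python"]))
```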

video_upload_workflow: The Orchestrator

Of course I don’t want to have to invoke all of the above steps individually every time. So I have a meta-orchestrator that runs the entire pipeline. It also takes care of transcribing the video using OpenAI’s whisper tool.

There’s some caching of intermediate outputs to make retries faster in case something breaks midway.
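One simple way to get this kind of caching is to treat each step's output file as the cache: if the file already exists, the step is skipped. The sketch below shows that pattern with a hypothetical cached_step helper. It illustrates the idea and is not the code in video_upload_workflow.

```python
from pathlib import Path
from typing import Callable


def cached_step(output: Path, produce: Callable[[Path], None]) -> Path:
    """Run `produce` only if `output` does not exist yet.

    `produce` writes to a temporary ".partial" path that is renamed on
    success, so a step that crashes midway never leaves behind a file
    that looks like a finished result.
    """
    if output.exists():
        return output  # a previous run already produced this output

    partial = output.with_name(output.name + ".partial")
    produce(partial)
    partial.replace(output)  # move the finished file into place
    return output


if __name__ == "__main__":
    import tempfile

    def fake_transcribe(path: Path) -> None:
        # Stand-in for a slow step such as transcription.
        path.write_text("hello world")

    with tempfile.TemporaryDirectory() as workdir:
        transcript = Path(workdir) / "transcript.txt"
        cached_step(transcript, fake_transcribe)  # runs the step
        cached_step(transcript, fake_transcribe)  # skipped on the retry
        print(transcript.read_text())
```

The trade-off is that a stale file is reused silently, so forcing a step to rerun means deleting its output first.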

Audio

Getting good audio is way more important than high-resolution video. People can watch a potato webcam video if it has decent audio, but will not stay for 2 seconds on a video shot on an 8K RED camera if the audio sounds like it is coming out of a tin can.

Fortunately, it is very hard to go wrong with any decent modern microphone. These are the mics I’m currently using (depending on where I am recording).

Experience

Various parts of the above setup came about organically over time.

color_edit is the oldest one. I’ve been using it for more than 4 years now and every video on my channel in that time frame has been automatically edited using it. I haven’t even been making any changes to it recently. It does its job well and without any fuss.

The other parts are more recent and were mostly cobbled together in the last year or so. All the same, at this point they have been used to publish many videos on my channel and I intend to keep using this pipeline.