Goal
I wanted to make a program that can record |Inkscape timelapses.| You can do that by just recording the screen, but that also captures every pan and zoom of the canvas, so it's not a good viewing experience. The timelapse program should be able to record the canvas in place.
The result looks like this:
Compared to just recording the screen:
I had an idea of how this could work: |Make an Inkscape extension that captures the current SVG periodically and make a custom format that records the diffs between each capture.| However, I was not sure how feasible this idea was, because the functionality of extensions is very limited. This is the main reason why I decided to try vibe-coding on this program specifically.
The other reason is to keep myself from falling out of the loop. As much as I don't like how capitalists are ruining this technology, I have to be realistic if I want to survive in this world.
Process
I set up "OpenCode" because some of my university friends have been suggesting it to me. I hooked it up to "DeepSeek" as the provider, mainly because I still have credits left over from a previous university course project. The first thing I did after installing OpenCode was to set the permission of "bash" to "ask".
The repository already had files in it, but they were just Inkex (an Inkscape extension library) templates downloaded from the Inkex website with one or two modifications by me. The first thing I did was run `/init` in OpenCode, which sets up the `AGENTS.md` file. The agent immediately recognized that the repository was an Inkex template.
The way I prompt is to give the agent a lower-level implementation idea instead of letting it figure things out. Is this a good idea? I don't know. So I asked it to finish the extension. It immediately recognized the flaws in my idea, because it has more familiarity with Inkscape code than I do. After about 8 minutes and a few questions, it had created a working extension that could be loaded into Inkscape.
The timelapse extension works like this:
- User starts recording
- The extension runs a separate Python script
- The script triggers the extension via the command-line to capture SVGs
It doesn't work.
Not because the AI made a non-functional extension, but because the approach I came up with was wrong. Turns out it's really hard to make your extension periodically grab the SVG code from Inkscape, because each Inkscape command-line call runs in a separate instance, so the script never reaches the document open in the window you are drawing in.
I had to change the approach. I thought, "What if we try to utilize the auto-save feature of Inkscape?" Unfortunately, the shortest auto-save interval Inkscape supports is 1 minute. Not granular enough for a timelapse. But I asked the agent to implement it anyway.
While the agent was thinking, I was also thinking. I thought the auto-save approach sucked. So I stopped the agent before it made any changes. But then I recalled something I saw in the agent's thinking. It kept mentioning an approach that used D-Bus.
Here's where it really shines.
|I didn't even know Inkscape had a D-Bus interface.| Heck, I don't even know how to call D-Bus. It would've probably taken me at least a week to discover Inkscape supports D-Bus. The agent was also very good at writing commands to test the D-Bus interface and figuring out ways we can capture a timelapse using this.
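For anyone else who has never called D-Bus, here is a rough sketch of what such a test command can look like. It is not the exact command from the session. It assumes Inkscape exports its actions the way GTK applications usually do, through the `org.gtk.Actions` interface, and that the bus name and object path follow from the application ID; both constants below are assumptions, and `gdbus_activate` is a hypothetical helper. It only builds the `gdbus call` command line and does not run it.

```python
import shlex

# Assumptions, not confirmed by this post: the names a GTK application with the
# ID org.inkscape.Inkscape would normally register on the session bus.
BUS_NAME = "org.inkscape.Inkscape"
OBJECT_PATH = "/org/inkscape/Inkscape"


def gvariant_string(text):
    """Quote a Python string as a GVariant text-format string literal."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def gdbus_activate(action, param=None):
    """Build the argv for `gdbus call` that activates one action.

    org.gtk.Actions.Activate takes three arguments: the action name, an array
    of variants holding the parameter (empty if there is none), and a dict of
    platform data (left empty here).
    """
    if param is None:
        params = "[]"
    else:
        params = "[<" + gvariant_string(param) + ">]"
    return [
        "gdbus", "call", "--session",
        "--dest", BUS_NAME,
        "--object-path", OBJECT_PATH,
        "--method", "org.gtk.Actions.Activate",
        action, params, "{}",
    ]


# Print a copy-pasteable shell command for an action without a parameter.
print(shlex.join(gdbus_activate("export-do")))
```

An action that takes a string, such as the one that sets the export file name, would pass that string as `param`. To actually send the call, hand the list to `subprocess.run` on a machine where Inkscape is running.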
By the second day, I already had a working prototype. However, I was catching edge cases myself that the AI hadn't quite thought of. For example, "What happens when I have multiple Inkscape windows open?" Then the agent spent more time understanding the multi-window behaviour, and we ended up with a function that injects a marker into the document.
After everything was understood and implemented, I asked it to change the recording format. It was using JSON before. I know from my programming experience that I want it to be append-only. So the agent changed it to JSONL. And then I thought it could use good compression. So the agent changed it to a custom format, TLD.
Just to reassure you, I did read all the code the agent generated every time it changed something.
Here's how the final program works:
- User starts recording
- Program takes an initial capture using D-Bus (`export-plain-svg`, `export-filename`, `export-do`)
- Every "interval", capture the SVG again, and append the diff (compared to last capture) to the file
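The diff step can be sketched like this. It is a minimal illustration using Python's standard `difflib`, not the actual TLD format: `make_diff` and `apply_diff` are hypothetical names, and storing line-level replace operations is an assumption about what a "diff" looks like here. The point is that a small list of operations is enough to rebuild each capture from the previous one.

```python
import difflib


def make_diff(old_lines, new_lines):
    """Return the changes as [start, end, replacement_lines] operations.

    Each operation means: replace old_lines[start:end] with replacement_lines.
    Unchanged regions are not stored at all.
    """
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            ops.append([i1, i2, new_lines[j1:j2]])
    return ops


def apply_diff(old_lines, ops):
    """Rebuild the newer capture from the older one plus its diff."""
    lines = list(old_lines)
    # Apply from the end backwards so earlier indices stay valid.
    for start, end, replacement in reversed(ops):
        lines[start:end] = replacement
    return lines


previous = ["<svg>", "  <rect id='a'/>", "</svg>"]
current = ["<svg>", "  <rect id='a'/>", "  <circle id='b'/>", "</svg>"]

ops = make_diff(previous, current)
rebuilt = apply_diff(previous, ops)
```

Playback then starts from the initial full capture and applies one diff per frame.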
For funsies, I also asked it to rewrite it in Go, and added some utilities and stuff. The program is basically done.
Review
After this experience, I think I have a lot more to say about the pros and cons of vibe-coding, instead of just going off my guesses and hating it for no reason.
|Pro: Efficiency|
You cannot deny this one. AI codes really fast. Together with reasoning capabilities, it generates code with pretty decent quality. The other factors depend on the agent. The structure of OpenCode gives enough context to the model that it can code what the programmer actually wants.
|A little caveat: This project is NOT BIG by any means.| The generated code was pretty consistent throughout, but the structure of the program is quite simple. I wonder how it would perform if the project had many more components that logically integrate with each other. |This project also uses common languages (Python, Go) and features (D-Bus) that have existed for decades.| The model had plenty of data involving them to be trained on. Although, at some point, I did give it the link to the new Inkscape source code so it could learn some of the new D-Bus interfaces, so maybe it's fine with new stuff?
|Pro: Feasibility Study|
Earlier in this post, I mentioned that one of the main reasons I chose to vibe-code this project is that I didn't know if my approach was good. And as it turns out, it wasn't.
But if I were to figure this out on my own, it would've probably taken me a few days just to code up an Inkscape extension, a day or two to find out this approach doesn't work, and maybe a week to discover the D-Bus interface. Then I'd have to learn how to call D-Bus to achieve what I want. This program could've taken me a month, with half of that time spent coding an application that won't work.
Instead, using AI, I can pivot to a completely different approach in about 10 minutes.
|Con: Experience-based|
I am a programmer myself. I have implemented programs and I gained my experience and intuition in software development from them. |In order to write good code with AI, you have to provide your own experience to create something efficient.|
The big example in this project is the timelapse file format. Before I told the AI about the format I wanted, it was keeping one JSON object in memory in Python, and every capture rewrote that whole JSON to the file. Most if not all video formats work in an append-only style, since videos often grow quite big, and if you need to hold all the data in memory, you need as much memory as the video itself. Also, once the data gets that big, re-serializing it on every capture takes very long as well. Therefore, rewriting the file is inefficient.
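To make the difference concrete, here is a minimal sketch of the append-only idea with JSONL, one JSON object per line. The function names and the record fields (`frame`, `ops`) are made up for illustration and are not the real recording format. Writing a capture only touches the end of the file, and reading goes line by line, so neither side needs the whole recording in memory.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Append one capture as a single JSON line; earlier lines are never rewritten."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def read_records(path):
    """Yield records one at a time instead of loading the whole file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


# Demo in a throwaway directory: three captures, then read them back.
path = os.path.join(tempfile.mkdtemp(), "demo.jsonl")
for i in range(3):
    append_record(path, {"frame": i, "ops": []})

frames = [record["frame"] for record in read_records(path)]
```

With the rewrite-everything approach, the cost of each capture grows with the length of the recording; here it stays proportional to the size of the new record.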
Now say, if you are not an experienced programmer (or in my case, a tech-savvy guy), you won't have this intuition. That's why companies that fired their engineers are facing big trouble in their codebase. If the beginning of the codebase is bad, later developments will only get more complicated. We call this "tech debt" and I have been on that path (not AI tho, it's just because I was stupid).
|Con: No Learning|
Vibe-coding has basically turned the (start of a) project into a no-failure project, for the user, at least. Consider the Inkscape extension approach. Even if it was the wrong approach, if I had coded the extension by hand, I would've at least learnt how to make another extension in the future. Instead, the AI took the learning from me. Sure, you can argue "just read the code bro," but as a student who grinded for a good Math Module 2 (Calculus and Algebra) grade in HKDSE, I can guarantee you that it's much less effective. |The only way to get good at a skill is to practice it.| Just reading the code makes you think far less than implementing it on your own.
|Con: Messiness|
This project has not gone on long enough or grown big enough to face this issue, but I can see the problem starting to show. Late in development (of the week), I asked it to change the way the capture interval gets calculated. It did it no problem, but the solution doesn't look very clean to me. Again, it may only become a problem later, so I can't really tell.
Cost
Most people probably care about this: |How much did it cost?| DeepSeek is so cheap, you won't believe it.
The entire project cost |8.27 CNY = 9.57 HKD = 1.22 USD.| Cheaper than your McDonald's burger. Here's one more fun thing: I topped up my billing account with 10 CNY (1.47 USD) last year for a university course project. To this day I'm still using that 10 CNY.
Future
So... What now? Am I gonna keep using AI for vibe-coding? |Most likely not.|
I have always found making software fun. It's like a puzzle, and I enjoy solving every part of it. |I enjoy using my brain.| Frustration is normal, and giving up is easy. But every time, it's when you actually plow through it that you gain some new knowledge.