I had been thinking about writing this thought for a while, almost since we started publishing the vibe-coded advent calendar1 for DigiLab at IMAFO ÖAW. However, I was not sure what my stance on the use of LLMs for coding is, and, to be honest, I am still not sure. The technology advances so fast that it is hard to keep pace with the development. There are very different opinions in the developer community, and the results seem to vary a lot, too. Still, I have started noticing that more and more scholars are leaning towards vibe-coding as a way of taking the development of the tools they want out of the hands of slow and expensive software engineers and back into their own. And I think that is a dangerous mindset.
Even if we completely ignore the most obvious reason why you should employ a software engineer,2 we should still think about what output an LLM produces and under which conditions. Many people talk about the democratisation of coding, about how LLMs allow us to build websites quickly and easily. But few consider that, instead of knowledge of the markup, stylesheets and potentially JavaScript, you now need to know how to set up your agents and skills files and folders, which isn't the easiest task either. And you still need the same architectural knowledge of software engineering; otherwise you just build a massive monolith.3
Then there is the boogeyman of supply chain attacks, which most of these scholars don't even realise exists (and it's real).4 These scholar hobby coders don't know where to look. Often they don't even know which versions of the libraries they're using. And LLMs are not making the task much easier: sometimes they hallucinate a non-existent library that attackers later registered and used in a supply chain attack. Often they pick a random version of an existing library.5 This can further endanger the code base of the project, as the recent, rather nasty issues in Next.js and React.js showed.6
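The first step out of that blind spot is simply knowing what you have installed. As a minimal sketch, assuming nothing beyond the Python standard library, this lists every installed distribution with its exact version, in a format you could pin in a requirements file and later audit against security advisories:

```python
from importlib.metadata import distributions


def installed_packages() -> dict[str, str]:
    """Return a sorted {name: version} map of every installed distribution."""
    return dict(sorted(
        (dist.metadata["Name"], dist.version)
        for dist in distributions()
        if dist.metadata["Name"]  # skip broken metadata entries
    ))


if __name__ == "__main__":
    # Prints pin-able lines in the "requests==2.32.3" style, so you know,
    # and can check, exactly which versions your project depends on.
    for name, version in installed_packages().items():
        print(f"{name}=={version}")
```

It is deliberately trivial; the point is that if you cannot produce such a list for your vibe-coded project, you cannot audit it either.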
It's simply dangerous to assume that, just because an LLM says "I implemented this according to your requirements", the implementation is right. Sometimes, if you spend time reading the thinking process, you will find: "for the purpose of this project I simplified this implementation". Not even an agent file filled with "think hard, never take shortcuts" will help. In software engineering, this is called technical debt: the LLM is borrowing against the future health of your app to give you a quick result today, introducing hidden flaws and structural weaknesses that you will eventually have to pay for. There is no easy way out of it. I have noticed, in the forums and on the websites of the wise people of the internet,7 that there are approaches that can limit this tendency of LLMs to simplify: orchestrate your agents,8 and make sure that different agents are used for writing code and for reviewing it. But can you see the complexity? This is not democratisation; we have merely moved from writing code to overseeing it, and we still need the same knowledge to spot suspicious behaviour.
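The writer/reviewer pattern mentioned above can be sketched in a few lines. Everything here is hypothetical: `ask_model` stands in for whatever LLM API you actually use (it is injected as a parameter precisely so the loop stays provider-agnostic), and "APPROVE" is an arbitrary convention for the reviewer's verdict:

```python
from typing import Callable


def generate_with_review(
    task: str,
    ask_model: Callable[[str, str], str],
    max_rounds: int = 3,
) -> str:
    """Have a 'writer' agent draft code, and refuse to accept it until a
    separate 'reviewer' agent approves it; the reviewer's critique is fed
    back to the writer on every round."""
    feedback = "none yet"
    for _ in range(max_rounds):
        code = ask_model(
            "writer", f"Task: {task}\nReviewer feedback: {feedback}"
        )
        verdict = ask_model(
            "reviewer", f"Check this code for hidden shortcuts:\n{code}"
        )
        if verdict.strip().upper().startswith("APPROVE"):
            return code
        feedback = verdict  # the critique drives the next draft
    raise RuntimeError("reviewer never approved the code")
```

Even this toy version shows the problem: someone still has to write the reviewer's criteria, decide when to stop looping, and judge whether "APPROVE" actually means the code is sound. That someone needs engineering knowledge.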
Furthermore, it is becoming much clearer that LLMs can repeat things they were trained on verbatim.9 This brings us to another issue. Although, for code, the law seems to be more permissive in terms of use and reuse, there can still be problems. If you decide to write your own HTR,10 what makes you sure you are not stealing code from Transkribus or eScriptorium? Or, even worse, from some other scholar who is just like you, except that he actually spent the time developing the tool by hand?11 Granted, in programming you often have to solve similar algorithmic problems, and the space of possible solutions is more limited than when you are writing a story, but let's consider it anyway. The reason I am mentioning this is primarily that it sets us up for a paradoxical situation. On the one hand, we (as scholars) complain that our work is not properly credited when it is used to train models (LLMs or single-purpose ones like Transkribus). On the other, we are happily stealing the work of other people through LLM output. We don't even consider that there could be real projects, and people spending their own time, behind the code the LLMs try to replicate.12 Many apps list the open-source libraries they use in a licences section, to give recognition to the people who built the necessary code.13 Similarly, we do that in academia via footnotes. But can you do the same for your vibe-coded project? I doubt it.
You might want to say, "But I use this tool just for myself." Okay, but that does not solve what developers call the "Day 2 problem". Vibe-coding is amazing for Day 1: getting a prototype running looks like magic. But it is notoriously terrible for Day 2: fixing a bug six months later, updating a deprecated library, or adding a complex feature. Scholars who bypass engineers eventually hit a wall where their vibe-coded app breaks, and they won't know how to fix it because they never understood the underlying code. Isn't it a pity, then, that we are building tons of disposable tools? We are spending hours of our work and burning compute power (often paid for from public money) to build fragile tools that will inevitably break and that no one else can benefit from.14
Okay, so by now you may think this is just another AI-hate thought. Don't get me wrong: I think LLMs can be useful. I have used them to develop some tools myself, and I am impressed with how quickly and easily you can build a functional app. But there is a massive difference between generating code and building software. The democratisation of coding that so many scholars praise doesn't mean we need less expertise; the expertise has just shifted. Vibe-coders need to know how to set up skills and agent files, and how to spin up an orchestration of agents that work together. What still sets true software engineers apart is their understanding of architecture, their ability to foresee the financial and technical costs of "Day 2" maintenance, and their willingness to take responsibility for the ethical grey areas of licensing. If you possess that foundational knowledge, agentic coding can safely supercharge your research. But if we, as scholars, treat LLMs as a way of blindly bypassing the expertise of software engineers, we aren't innovating; we are just automating the creation of technical debt and disposable waste.