The New Programming

In working with a set of PDF files I needed to distinguish areas by typeface name and size. I found PDFBox, which offers very fine-grained capability to alter PDFs programmatically. But it’s a Java library, and my Java is less than fluent.

Using Claude and GPT-4o I managed to implement what I wanted. The proof of concept is a program that applies a distinct color to each fontname,size pair in the document. This was evidently too big for either tool to handle directly, but a combination of decomposition and multiple tools got me close.
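The color-per-pair idea can be sketched without any PDF library. This is a minimal illustration, not the program described here: `FontColorRegistry`, `colorFor`, and `hueToRgb` are hypothetical names. It assumes colors are RGB triples in the 0 to 1 range, which is the form the PDF `rg` operator takes for non-stroking color.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox: hands out one RGB color
// per (font name, font size) pair and reuses it on later lookups.
class FontColorRegistry {
    // Stepping the hue by this fraction spreads successive colors around the wheel.
    private static final double GOLDEN_RATIO_CONJUGATE = 0.61803398875;

    private final Map<String, float[]> colors = new LinkedHashMap<>();

    // Returns the color for this pair, assigning a new one on first sight.
    float[] colorFor(String fontName, float fontSize) {
        String key = fontName + "," + fontSize;
        float[] rgb = colors.get(key);
        if (rgb == null) {
            double hue = (colors.size() * GOLDEN_RATIO_CONJUGATE) % 1.0;
            rgb = hueToRgb(hue);
            colors.put(key, rgb);
        }
        return rgb;
    }

    int size() {
        return colors.size();
    }

    // Converts a hue in [0, 1) to RGB at full saturation and brightness.
    static float[] hueToRgb(double hue) {
        double h = hue * 6.0;
        int sector = (int) Math.floor(h) % 6;
        float f = (float) (h - Math.floor(h));
        switch (sector) {
            case 0: return new float[] {1f, f, 0f};
            case 1: return new float[] {1f - f, 1f, 0f};
            case 2: return new float[] {0f, 1f, f};
            case 3: return new float[] {0f, 1f - f, 1f};
            case 4: return new float[] {f, 0f, 1f};
            default: return new float[] {1f, 0f, 1f - f};
        }
    }
}
```

Calling `colorFor("Helvetica", 12f)` twice returns the same array, while `colorFor("Helvetica", 9f)` gets its own color. Keying on a string keeps the sketch short; a small value class would be tidier in real code.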

Claude did a decent job creating a program that set colors per font name, and applied them correctly across the document. The main difficulty was exhausting the chat and having to start a new one. This happened numerous times.

I also used Claude to start a program to list all of the fontname,size pairs that appeared in the document. The way that fonts are handled in my document set makes tracking sizes surprisingly difficult. The font size can be calculated correctly only immediately before a Tj operator renders text. Dealing with this was somewhat different in versions 2 and 3 of the library, and fixing it relied on GPT-4o searching the library for the correct library calls.
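One plausible reason the size is only known at the Tj operator is that the rendered size depends on more than the operand of Tf: the text matrix set by Tm scales it too, and some PDFs set the Tf size to 1 and carry the real size in the matrix. The sketch below models just that bookkeeping. `TextStateTracker` and its methods are hypothetical, not PDFBox classes, and it deliberately ignores the current transformation matrix and other text-state parameters.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical tracker, not a PDFBox class: mimics the text-state bookkeeping
// needed to know the rendered font size at the moment text is shown.
class TextStateTracker {
    private String fontName = "";
    private float tfSize = 0f;
    // Second row of the text matrix [a b c d e f]; identity by default.
    private float c = 0f;
    private float d = 1f;
    private final Set<String> pairs = new LinkedHashSet<>();

    // Models the Tf operator: select a font and its nominal size.
    void setFont(String name, float size) {
        fontName = name;
        tfSize = size;
    }

    // Models the Tm operator: only c and d affect the vertical scale used here.
    void setTextMatrix(float a, float b, float c, float d, float e, float f) {
        this.c = c;
        this.d = d;
    }

    // Nominal size times the vertical scale of the text matrix
    // (assumption: the page-level transformation is the identity).
    float effectiveSize() {
        return tfSize * (float) Math.hypot(c, d);
    }

    // Models the Tj operator: record the pair that is in force right now.
    void showText(String text) {
        pairs.add(fontName + "," + effectiveSize());
    }

    Set<String> pairs() {
        return pairs;
    }
}
```

With `setFont("F1", 1f)` followed by `setTextMatrix(10f, 0f, 0f, 10f, 72f, 700f)`, `effectiveSize()` works out to 10 even though the Tf size is 1, which is why reading the size at Tf time alone can mislead.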

The final task was to combine these different mechanisms to color fonts by fontname,size pairs. This was too much for either program to handle. Each ended up going in circles, often removing functionality despite explicit instructions not to.

Ultimately I merged the two functions by hand. With slightly more Java knowledge I probably could have asked for a sequence of smaller changes that added up to the merge. What I learned from manually merging likely equips me to try that next time.

Takeaways from this exercise exploring a new library with the tools include:

I haven’t tried yet, but wonder whether the tools can work effectively with diffs and thereby reduce the accumulation of chat content and allow for more iteration within a session.

Most of this is unsurprising. The first two points amount to laying out your architectural guidance. The third point means that if a tool has online access, give it the URL of the open-source repo and insist that it read the code there. The final takeaway was more surprising.

This short exercise taught me enough to become comfortable in the Java code without having to work through every picayune detail. Learning this way seemed far easier and faster than battling through every detail myself. In the end it was easy enough to merge the two programs.

The biggest surprise was how easy iteration became, so easy that it was difficult to keep track of which version exhibited a specific behavior when experimenting. I’ve long approved of numerous small check-ins on a dev branch, possibly to squash later for merge into an upstream branch. But this was more extreme. In 30 minutes I was easily able to test 10 different ways of addressing a particular issue. At that point, if you like the behavior of the fourth version, but with a variation discovered in the ninth, what do you do?

With such fast iteration it seems appropriate to commit every single update to a project file. Record everything now, squash later. Maybe Jujutsu is the right direction. Rapid iteration has always been advantageous for experimentation. At these new speeds the problem shifts from formulating the new experiment to managing the results of many experiments. Capturing full history seems like the obvious first step. What’s the next step?