A few weeks ago, I was asked to review a body of work for an engineer who was up for promotion. The body of work consisted of roughly a dozen individual pieces of content to review – Word docs, PDFs, diagrams, and HTML files. Each document served as evidence that the candidate had mastered the complex systems and architectures required for promotion.
I wanted to use an LLM to help me analyze their work. After uploading a few documents to a model, I thought “I need a tool that can convert all these different file formats into a single markdown document that I can feed to an LLM.” It’s not a complex problem: convert files, join them together, and ensure the output doesn’t exceed token limits. But instead of writing the code myself, I decided to run an experiment: could I create a command line utility to recursively go through a directory, convert binary files like Word and PDF into text, and concatenate those with all the other text files into one LLM-friendly file – without writing a single line of code myself?
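To make the shape of that utility concrete, here is a minimal Python sketch of the walk, convert, join, and cap-the-size steps. It is an illustration under stated assumptions, not the code the experiment produced: the names `build_bundle`, `estimate_tokens`, and `TEXT_SUFFIXES` are hypothetical, the four-characters-per-token estimate is only a rough heuristic, and Word/PDF conversion is left as a pluggable `converters` mapping rather than tied to any particular library.

```python
from pathlib import Path

# Assumption: which extensions count as plain text. Adjust to taste.
TEXT_SUFFIXES = {".md", ".txt", ".html", ".csv", ".json"}


def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about four characters per token.
    # A real tokenizer for the target model would be more accurate.
    return len(text) // 4


def build_bundle(root, max_tokens=100_000, converters=None):
    """Walk `root` recursively and join readable files into one markdown string.

    `converters` maps a lowercase suffix (e.g. ".pdf") to a function that takes
    a Path and returns extracted text. Files that cannot be read, or that would
    push the bundle over `max_tokens`, are returned in the skipped list.
    """
    root = Path(root)
    converters = converters or {}
    parts, skipped, used = [], [], 0

    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix in converters:
            body = converters[suffix](path)
        elif suffix in TEXT_SUFFIXES:
            body = path.read_text(encoding="utf-8", errors="replace")
        else:
            skipped.append(path)  # unknown binary format, no converter supplied
            continue

        # Give each file its own heading so the LLM can tell sources apart.
        section = f"## {path.relative_to(root)}\n\n{body.strip()}\n"
        cost = estimate_tokens(section)
        if used + cost > max_tokens:
            skipped.append(path)  # would exceed the token budget
            continue
        parts.append(section)
        used += cost

    return "\n".join(parts), skipped
```

Calling `build_bundle("review_docs", max_tokens=50_000)` on a hypothetical `review_docs` folder would return the combined markdown plus a list of files that were left out, which makes it easy to see what the LLM never got to read.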
The Plan/Act Paradigm
Right now (and this will change soon), many coding agents (e.g., Cline, Cursor) let a coder flip between distinct modes of interaction. These modes sometimes go by different names, but they do roughly the same things.