Further updates on LLM-assisted coding

30/04/2025

As part of the day job I've been doing a fair bit of LLM-assisted coding. Some general thoughts on the current state of play (for posterity, including me in a month!):

In the wacky world of LLMs, old and stable is often good. Even recently trained models (e.g. Claude Sonnet 3.7) struggle with newer versions of libraries. Where a library has seen a lot of churn, I often get a mixture of old and new code styles that doesn't compile or run, even with heavy prompting. Microsoft libraries are particularly prone to this. For example, the .NET Microsoft Graph SDK changed significantly between v4 and v5, and it's very hard to get v5 code out of the various LLMs. I had a similar experience with the SharePoint PnP libraries.
The only model that got me out of that particular hole was o4-mini-high, which I've been impressed with. I still prefer Claude for most coding tasks, but the combination of reasoning and being able to search the current documentation got me working code without major issues.
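One cheap guard against the mixed v4/v5 output described above is to scan the generated C# for patterns that should only appear in v4 code. The following is a minimal sketch in Python, not a complete checker. It assumes just two markers: the `.Request()` call that v4 request builders needed and the `DelegateAuthenticationProvider` class, both of which (as far as I know) were dropped in v5. The function name, the marker table, and the sample snippet are all made up for illustration.

```python
import re

# Assumption: these patterns only show up in v4-style Graph SDK code.
# The list is illustrative and certainly not exhaustive.
V4_MARKERS = {
    r"\.Request\(\)": "v4 request builder call (.Request())",
    r"\bDelegateAuthenticationProvider\b": "v4 authentication provider",
}


def find_v4_markers(csharp_source: str) -> list[tuple[int, str]]:
    """Return (line number, reason) for each line that looks like v4 code."""
    hits = []
    for line_no, line in enumerate(csharp_source.splitlines(), start=1):
        for pattern, reason in V4_MARKERS.items():
            if re.search(pattern, line):
                hits.append((line_no, reason))
    return hits


# Hypothetical LLM output mixing a v5-style call with a v4-style call.
MIXED_SAMPLE = """\
var me = await graphClient.Me.GetAsync();
var messages = await graphClient.Me.Messages.Request().GetAsync();
"""

if __name__ == "__main__":
    for line_no, reason in find_v4_markers(MIXED_SAMPLE):
        print(f"line {line_no}: {reason}")
```

Any hit is a prompt to regenerate or to paste the current documentation into the context. A clean scan proves nothing on its own, since the marker list is so short.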
I've been playing with the latest OpenAI models a bit recently, and there have been a fair few released in a short space of time. It's not always obvious which model to pick for a particular job, so I often end up starting with the default 4o and then going up the scale as things get tricky. I do prefer the Claude approach of having one good model!
Search in ChatGPT seems a lot more integrated than it is in Microsoft Copilot (both Chat and M365). I'd like better citations, but the feel of it is better. Anthropic are lagging here.
I've been looking at the MCP ecosystem a bit with the release of the remote authentication RFC last week. I can see this unlocking a bunch of new interoperability between SaaS applications and LLMs. I'm hoping to spend some time getting a basic MCP server up and running locally with Claude Desktop using the Python SDK. After that I want to see if I can get the .NET SDK working with Azure Functions and EasyAuth, now that remote auth uses OAuth 2.1 and should therefore be interoperable with Entra ID.
I also want to see if I can get OpenAI Codex working with Azure OpenAI and see how that compares with Claude Code (good but expensive!). It's nice to see OpenAI do more with open source software, but I'm waiting for a pull request to be merged that will add Azure OpenAI to the supported providers.

I'm keen to get back to doing some more proofs of concept, including looking more at MCP, Pydantic AI, and some of the other agent frameworks. It's an interesting time - and I wish I had a bit more of that myself!