Owen Morris

Building, developing... and growing.

By day I'm the CTO for Doherty Associates, but I love diving into the tech and can't stop programming.

This is my outlet covering the things that interest me technically in Cloud, AI, Data and Dev but are far too in-depth for corporate use!

Occasional references to being husband and Dad, very rarely runner.

Diving (back) into Python - Notebooks

21/09/2024

As I've continued to spend more time bringing myself up to speed with Python, I've found myself using notebooks more and more - but with a twist. I'm sure most readers are probably familiar with notebooks by now, but I don't recall them being around when I was last using Python heavily - from memory, IPython was around, but Jupyter hadn't been invented.

Notebooks

For the uninitiated, notebooks were originally pioneered by Mathematica, but came to prominence as part of the Jupyter project (originally IPython Notebook). Here's an early description of it from Linux Weekly News (a site I thoroughly recommend, by the way, and have been reading since I was at university, which is mumble years ago now).

They work by hosting code (in this case Python) in individual blocks called cells, which can be executed individually within an interpreter that keeps running (the kernel). The code runs in a shared Python namespace, so anything a cell defines remains accessible for as long as the kernel session lasts. Cells can contain Markdown as well as runnable code. The output from executing a cell is written back into the notebook UI, so you can use it a bit like the REPL in Python (or F#, or Lisp, etc.). The output can be plain text (e.g. from print statements), but can also be visualisations, arrays displayed as HTML tables, and so on.
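A minimal illustration of how state persists between cells - written here as one script, with comments marking the cell boundaries; in Jupyter each block would be its own cell:

```python
# Cell 1: define some data; the kernel keeps `numbers` alive after the cell runs
numbers = [1, 2, 3, 4, 5]

# Cell 2: a later cell can use names defined earlier in the session
total = sum(numbers)
print(total)  # prints 15, echoed back into the notebook's output area
```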

This makes Jupyter notebooks into a literate programming environment, with all the affordances of the modern stack. Having the interim state saved and available for inspection can make interactive development a really pleasant experience - and you can document as you go in Markdown, so you can express the why as well as the how of the code.

I first started using them in earnest in F#, as they support the bottom-up functional approach really well, and they've become where I start most projects in Python or other languages. They're great for exploratory programming when you're trying to get an idea up and running: the workflow is to build up short snippets of code in a cell, check that they produce the right output, promote them to a function, then build up the program as you go in the same way. In Python you also have access to libraries such as Matplotlib and Pillow, so you can see graphs or image output from your scripts live in the output cells as you go.
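As a sketch of that promotion workflow (the names here are purely illustrative):

```python
# Step 1: an exploratory cell - try the transformation on a literal value
words = "the quick brown fox".split()
print([w.upper() for w in words])

# Step 2: once the output looks right, promote the snippet to a function
def shout(sentence):
    """Upper-case each word in a sentence."""
    return [w.upper() for w in sentence.split()]

# Step 3: the next cell can then build on the function
print(shout("jumps over the lazy dog"))
```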

The only real downside is that a notebook is stored as a single JSON object, and because the output can change between runs, notebooks don't play very well with Git - virtually any interaction will change the file, so commits can be noisy unless the output is cleared first.
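Because the format is just JSON, clearing the output before committing is straightforward. Jupyter's own tooling can do this (e.g. `jupyter nbconvert --clear-output --inplace`), but the sketch below shows the idea by working directly on the file, assuming the standard nbformat 4 layout:

```python
import json

def clear_outputs(path):
    """Strip execution results from a notebook so its JSON is stable for Git."""
    with open(path) as f:
        nb = json.load(f)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered output
            cell["execution_count"] = None  # drop run counters
    with open(path, "w") as f:
        json.dump(nb, f, indent=1)
```

Run before committing, this keeps diffs limited to the code and Markdown you actually changed.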

Notebooks in VSCode

By default, the Jupyter environment is accessed via a browser, but my preferred method is Visual Studio Code (which is where most of my development happens outside of any C# I happen to be doing). Extensions are available both for Jupyter notebooks and, under the name Polyglot Notebooks, for .NET languages such as F# (plus others). The .NET one also supports SQL as a bonus, and can even share data between languages - so, for example, you can execute a query and then plot the result in a JS library like ECharts. Another nice feature of Polyglot Notebooks is that you can load NuGet packages and DLL files from other projects via a magic command. The only real prerequisite is that you need Python and Jupyter (to run the kernel) installed for the Jupyter extension; the Polyglot one requires .NET 8.

Here's VSCode running a Python and an F# notebook - in this instance, Python was probably the better experience, as Matplotlib did what I wanted out of the box, whereas I had to try several F# libraries before I got something simple running locally.

Python Notebook
F# Notebook

How I use it

As I mentioned earlier, I start a lot of my projects in a notebook, building up short snippets of code that help me flesh out the idea, then gradually making it more structured (for example, creating functions and modules in F#, or building up classes in Python), and using the Markdown documentation to leave a trail of what I did and why, so that I can pick the experiment up days or weeks later when I have some free time. Once I've proved the concept, I'll use the nbconvert utility (or the 'Export notebook to script' command in VSCode) to generate a Python or F# file that I can carry forward as a normal project. This approach really works for me, and it's helped me work productively in fits and starts as time allows.
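The export step is conceptually simple: nbconvert walks the notebook's JSON and emits the code cells in order. A simplified sketch of the idea (not the real implementation, which also handles magics, headers, and so on):

```python
import json

def notebook_to_script(path):
    """Concatenate a notebook's code cells into a single script string."""
    with open(path) as f:
        nb = json.load(f)
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            # cell source is stored as a list of lines in nbformat JSON
            chunks.append("".join(cell["source"]))
    return "\n\n".join(chunks)
```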

Bonus - Google Colab

The Google Colab service also uses Jupyter notebooks as its main interface, so I've been using it for some AI-related experiments. It's a great place to start, as there's a generous free tier of GPU-enabled instances, giving you simple access to libraries such as HuggingFace's Transformers, which makes experimenting with most open-source large language models very easy - more on this in a later post.

Diving (back) into Python - Django

04/09/2024

As part of a couple of personal projects I've been needing to write some code for a web application outside of the work context, so I've been able to make some different choices than I normally would, and have been working under some different constraints:

  1. Speed of delivery is important and needs to support quick working
  2. I need to be able to drop and resume development as quickly as possible, as I've only been able to work on this project in short bursts - even more important at the moment, as life has been very busy recently and there's been less time than previously to work on it
  3. The ecosystem has been quite important
  4. It needs to be something I'm at least familiar with, as I wanted to concentrate on building the thing rather than learning on the job

This forced me a little bit outside my comfort zone when choosing which stack to use.

Technology Choices

The choices were really .NET (probably F#), a full-stack JS framework, or something in Python - the languages I'm most familiar with. I also wanted to stay outside the Azure space to avoid any overlap with my day job.

I narrowed down the options to:

  1. F# with Giraffe and Dapper, plus Fable on the frontend - but I eventually discounted this, as I hadn't done much with Giraffe or Dapper at all, so it felt like too much of a learning curve. The ecosystem of F#-only packages is small, but you also have the wider set of .NET libraries to use. I'd like to spend more time on this stack in the future, but I wanted to spend as little time as possible getting something running end-to-end, with frontend and backend working together.
  2. Next.js - as it's React-based there's quite a bit of overlap with the day job, but I needed things for the backend too. I tried a couple of different ORMs, and Drizzle was the one I liked most. Next does look nice and handles things like auth with its own integrations; I was also using it as a static site generator for this blog. As with most things JS, though, I felt like I had to assemble a lot of the stack by hand.
  3. C# and ASP.NET/Blazor - I've done a couple of experiments with this and have since been involved with a couple of things at work that use this stack. I discounted it at the time but, again, I'd actually like to spend more time with it myself.
  4. Django - I'd used this briefly when it first came out, and looking at the docs it still felt very familiar. I did a couple of spikes on it and the ORM was easy to use. The ecosystem of libraries around it is pretty comprehensive, so I settled on this as the choice.

First thoughts

Python was my main language from the end of university, but I hadn't programmed in it to any great extent since around 2016, so I needed to revisit the current state of the art. The great thing about Python is that it hasn't changed that much, so even though I felt like a bit of a caveman, it was familiar. There are a few things I'm still coming up to speed with, like tooling, but I was able to make fairly fast progress.

Django has been quite similar - they got so much right at the start that the fundamentals haven't changed much since the beginning, and the newer features have mainly refined things.

The admin system is the main thing that's saved me time while building - having CRUD screens scaffolded automatically for your models means data can be managed as needed without hand-building a frontend for every operation. It saves so much time that I'm amazed other frameworks haven't followed suit.
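For anyone who hasn't seen it, registering a model with the admin takes only a few declarative lines - the `Project` model and its fields below are made up for illustration, and this fragment assumes an existing Django project:

```python
# app/admin.py - a minimal, illustrative registration (model name is hypothetical)
from django.contrib import admin
from .models import Project

@admin.register(Project)
class ProjectAdmin(admin.ModelAdmin):
    # columns and search box shown in the scaffolded list view
    list_display = ("name", "created_at")
    search_fields = ("name",)
```

With just that, Django generates list, create, edit, and delete pages for the model.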

The ecosystem of packages is wide, but there's usually one main third-party package for a given function, rather than several overlapping ones (unlike the JavaScript ecosystem).

I ended up adding JS, in the form of React, to the frontend as well in the interests of build speed, although getting this wired up correctly was frustratingly slow! I used normal Django templates for many pages, and then used React for the interactive ones in an 'MPA' style. I'm pretty happy with the results, but am looking at things like HTMX with a curious eye...

Libraries

In addition to the out-of-the-box functionality, I've pulled in the following packages, which I've generally been pleased with:

  • django-tailwind (for backend CSS)
  • django-ninja (for API creation). I really love how easy this library has been to use and how little boilerplate it uses.
  • allauth - for social logins. I tried a couple of other libraries for auth with varying levels of success.
  • django-webpack-loader (for loading multiple React bundles - I tried and failed to get django-vite working for this, something to retry at some point)
  • Flowbite and the React bindings to get the same UI look and feel between frontend in React and Django templates
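As an example of why django-ninja's low boilerplate appeals to me, a minimal API looks roughly like this - the schema and route are made up for illustration, and the fragment assumes an existing Django project:

```python
# api.py - an illustrative django-ninja API (names are hypothetical)
from ninja import NinjaAPI, Schema

api = NinjaAPI()

class ProjectOut(Schema):
    id: int
    name: str

@api.get("/projects/{project_id}", response=ProjectOut)
def get_project(request, project_id: int):
    # in a real app this would query the ORM, e.g. Project.objects.get(...)
    return {"id": project_id, "name": "example"}
```

The API is then mounted in `urls.py` with `path("api/", api.urls)`, and the type annotations drive both request validation and the generated OpenAPI docs.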

Some backend tasks were written as Django management commands; I haven't found a good scheduling library yet.

Things that I need to spend more time on

All in all, I got things up and running to a certain point, but there's work to do before finishing. I want to understand allauth better, as I've only got the basics working and authentication is always a big requirement for modern apps. I also spent far too much time getting the Django webpack integration running alongside Tailwind, and I really want to see if I can transition to django-vite in the future.

Resources I used

Aside from Copilot & ChatGPT, plus the READMEs for the libraries above, I found the following articles useful:

Current thoughts

Overall, I'm relatively happy with the choice of stack at the moment, and I think it's got me up and running relatively quickly. I'm sure I'll add libraries as I go, but my experience so far has been quite good.