I began writing weeknotes as part of learning generously when I started at Recurse. But the distinction between Recurse and non-Recurse has increasingly blurred, and I’d like to keep making weeknotes after I never graduate. To that end, I’m trying out a new format for my weeknotes. Tell me what you think!

See previous weeknotes via the weeknotes tag.


Updates

  • Decided to extend to a 12-week batch at the Recurse Center. You can continue following my progress by filtering for posts with the Recurse tag.
  • “QS Rubicon”: After 8 years of doing extensive self-tracking and ML-powered Quantified Self projects, I’ve decided to just stop collecting most data. I have plenty to say about the whole experience, so it should probably be its own blog post or talk. But the immediate feeling of being “out of control” is still something I’m trying to reflect on.

Inputs / outputs

  • Wrote and published 4 posts
  • AAAHHH! The Creative Coding prompt at RC this week was “Aaaahhhh!” I whipped up something between a click trainer game and a psychological torture device. It also works on mobile if you like torturing yourself on the go.
  • Participated in an intense Rock Paper Scissors tournament at RC. It involved multiple rounds of submitting a Python bot (with no dependencies) that played 200 rounds against some simple bots as well as the other participants’ bots. I managed to finish 3rd using a third-order Markov model with a stop-loss heuristic to switch back to the Game Theory optimal random play style (rough sketch below).
  • Made massive improvements to my dotfiles: better macOS package management with some advanced Homebrew wizardry, and better LSP configuration for Python and Markdown in Neovim.
  • Improved PR to llm with tests.

Try AAAHHH full screen at aaahhh.vercel.app
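
Since I’ll forget how the RPS bot worked, here’s a minimal sketch of the approach: a third-order Markov model over the opponent’s history, with a stop-loss that falls back to uniform random play when the running score drops too far. The class name, threshold, and bookkeeping details here are illustrative rather than copied from my actual submission:

    import random
    from collections import defaultdict

    BEATS = {"R": "P", "P": "S", "S": "R"}   # the move that beats each move

    class MarkovBot:
        """Third-order Markov predictor with a stop-loss fallback to random play."""

        def __init__(self, order=3, stop_loss=-5):
            self.order = order
            self.stop_loss = stop_loss       # running score at which we stop predicting
            self.history = []                # opponent's past moves
            self.counts = defaultdict(lambda: defaultdict(int))
            self.score = 0                   # wins minus losses so far

        def play(self):
            # Too far behind, or too little history: game-theory-optimal random play.
            if self.score <= self.stop_loss or len(self.history) < self.order:
                return random.choice("RPS")
            context = tuple(self.history[-self.order:])
            if not self.counts[context]:
                return random.choice("RPS")
            predicted = max(self.counts[context], key=self.counts[context].get)
            return BEATS[predicted]          # counter the opponent's predicted move

        def observe(self, my_move, opp_move):
            # Update the running score and the (last 3 moves) -> next-move counts.
            if BEATS[opp_move] == my_move:
                self.score += 1
            elif BEATS[my_move] == opp_move:
                self.score -= 1
            if len(self.history) >= self.order:
                self.counts[tuple(self.history[-self.order:])][opp_move] += 1
            self.history.append(opp_move)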

Ideas

  • Industrial techno is more powerful than nootropics for high-octane coding.
  • My hunch: The selective pressure of gradient descent optimises the explainability out of the model.
    • Based on implications of representational superposition.
    • LLMs favour almost-orthogonal representations (pairwise angles of roughly 89°–91°), which makes interpretability harder but adds a ton more “capacity” to the model. My hunch is that gradient descent is such a strong pressure on the network that it’s akin to evolutionary pressure on gene networks. The chaos and difficulty of interpretation come from aggressively optimising for performance.
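    • A toy numerical check of the almost-orthogonal point (my own illustration, not from any of the sources linked here; the dimensions and counts are made up): random directions in a high-dimensional space are already very nearly orthogonal, which is why far more almost-orthogonal features fit than there are dimensions.
      import numpy as np

      rng = np.random.default_rng(0)
      dim, n_vectors = 4096, 1000           # dimensions and number of random "features"

      # Random unit vectors in a high-dimensional space.
      vectors = rng.normal(size=(n_vectors, dim))
      vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

      # Pairwise angles (in degrees) between distinct vectors.
      cosines = (vectors @ vectors.T)[np.triu_indices(n_vectors, k=1)]
      angles = np.degrees(np.arccos(np.clip(cosines, -1.0, 1.0)))

      print(f"mean angle: {angles.mean():.1f} degrees")                   # ~90
      print(f"min/max angle: {angles.min():.1f} / {angles.max():.1f}")    # roughly 86-94
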
  • “There are no AI-shaped holes” via Matt Clifford

    “There are no AI-shaped holes lying around” -> this is how I reconcile the facts that (a) AI is already powerful and (b) it’s having relatively little impact so far. Making AI work today requires ripping up workflows and rebuilding for AI. This is hard and painful to do…

  • Leave something for tomorrow: “One of my favorite things to do: Stop working right in the middle of something and leave the unfinished work for the next day. On the next day, you know exactly where to pick up and can start right away. Hemingway, apparently, had the same habit. Here’s the crucial bit: ‘stop when you are going good and when you know what will happen next.’” via Thorsten Ball
  • Google’s NotebookLM turns massive quantities of text into almost-perfect podcast discussions. I tried it on my MSc Thesis. It happily gobbled up the entire PDF and produced an accessible and almost entirely correct discussion about it in around 5 minutes. I’m shocked to be saying this, but Google absolutely cooked!
  • macmon, a sudoless CPU/GPU/power monitor for Apple Silicon. Vlad pipped me to implementing this before I could learn enough Rust. So good, it killed my leading RC project idea.
  • This interview with DHH (creator of Ruby on Rails). Deeply cracked and charismatic. This got me pumped!
  • Vale - A linter for prose (which I’m using while writing this post in neovim)
  • An S3 doc about 1X Technologies’ robot for the home
  • Glow is a handy command line tool for rendering markdown. It doesn’t support streaming, but is nevertheless great for viewing READMEs etc.
  • Pieter Levels (@levelsio) on the Lex Fridman podcast. The personification of a cracked and scrappy developer doing things simply and rapidly.
  • 3Blue1Brown video on LLM interpretation. As usual for Grant, it’s an excellent video full of intuition pumps. But the section on superposition was both mind-shattering and brilliantly visualised.
  • Dylan Beattie’s video on UTF-16 edge cases, told in the most lovely and brilliant storyteller style.

This week I learned

(Copy-pasted from my #TIL-tagged notes in Obsidian from the past week.)

  • From one of Karpathy’s NN:ZtoH videos
    • #TIL in transformers, a decoder block masks out future inputs with a lower triangular matrix, but an encoder block does not. Thus an encoder can “see the future.” Via Let’s build GPT: from scratch, in code, spelled out. - YouTube
    • #TIL Karpathy has a nice set of intuitions on why residual / skip connections work well: a “gradient superhighway.” Because the addition operation makes backprop distribute gradients equally across branches, you maintain a direct input-to-output pathway throughout training, so the complex branches don’t choke training early on. Let’s build GPT: from scratch, in code, spelled out. - YouTube
    • BatchNorm is normalising activations (μ=0, σ=1) across the batch dimension.
    • LayerNorm is normalising activations across the layer dimension.
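    • A tiny PyTorch sketch of these points, for future me (shapes and numbers are made up for illustration, not taken from the video):
      import torch

      B, T, C = 4, 8, 32                      # batch, time (tokens), channels

      # Decoder-style causal mask: a lower triangular matrix hides future positions;
      # an encoder block skips this masking and can "see the future".
      scores = torch.randn(B, T, T)           # raw attention scores
      mask = torch.tril(torch.ones(T, T, dtype=torch.bool))
      scores = scores.masked_fill(~mask, float("-inf"))
      weights = scores.softmax(dim=-1)        # each position attends only to the past

      # Residual / skip connection: the addition routes gradients straight through,
      # a "gradient superhighway" alongside the learned branch.
      x = torch.randn(B, C)
      y = x + torch.nn.Linear(C, C)(x)

      # BatchNorm normalises each channel across the batch dimension;
      # LayerNorm normalises each example across its own (layer) dimension.
      bn = torch.nn.BatchNorm1d(C)(x)         # per-channel stats over the batch
      ln = torch.nn.LayerNorm(C)(x)           # per-example stats over the layer
      print(bn.mean(dim=0).abs().max(), ln.mean(dim=1).abs().max())   # both ~0
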
  • #TIL about this trick with /dev/tty. My current workaround for streaming an LLM response while still getting rich rendering (with glow) at the end: llm "write some markdown" | tee /dev/tty | glow
  • #TIL cmd+option+H hides others on macOS. I use cmd+H to hide the focussed app all the time, but being able to hide everything but the current app is super helpful for quickly cleaning up. Via GitHub - dharmapoudel/hammerspoon-config: My configuration files for Hammerspoon.
  • #TIL you can use mas and cu alongside homebrew to manage and automatically update non-homebrew apps installed via the Mac App Store. Also, instead of brew cask uninstall <caskname> you can do brew cask zap <caskname>, which may also remove preferences, caches, updaters, etc. stored in ~/Library. — via this gist, notes in [[Brew-Bundle-Brewfile-Tips-·-GitHub]]
  • #TIL (via Simon Willison) that with a uv shebang and some header material, you can make a great deployable Python script / app:
    #!/usr/bin/env -S uv run
    # /// script
    # requires-python = ">=3.12"
    # dependencies = [
    #     "flask==3.*",
    # ]
    # ///
    print("Hello world")
    
  • #TIL if you see ^M line endings in a CLI tool, that’s the carriage return from Windows-style \r\n line endings as unix-like systems display it. A find-and-replace on the \r (e.g. :%s/\r//g in vim) gets rid of them; then use :set fileformat=unix in vim and save.
  • #TIL with GNU strings you can quickly print all the strings found in a binary. It didn’t seem to work on Rust binaries, so I presume it’s just C. Handy for reverse engineering stuff.
  • #TIL I can use the following workflow to find misspellings in my text files:
    • cat index.md | aspell --lang=en_GB list | sort | uniq
    • cat content/posts/**/*index.md | aspell --lang=en_GB list | sort | uniq gave me an alphabetical wordlist to consider for my ignore.txt list.
  • #TIL with nvim-lspconfig, [d and ]d map to vim.diagnostic.goto_prev() and vim.diagnostic.goto_next() (respectively). A good way to linearly clear warnings and errors in neovim.