Swift challenge mode – Dynamic Data

This isn’t something I’ve solved, more of something I’m working on, but I thought there were some interesting things to share with anyone else “walking this path”.

The Swift programming language is statically and strongly typed, with a huge emphasis on leveraging types to provide programmatic safety. It’s not a mindset I always remember to apply, as I spent the better part of two decades solving programming problems primarily with C and with dynamic languages (Python, JavaScript, Objective-C, etc). The challenge that I’m faced with is interoperability with these dynamic languages. In this case, it’s through the lens of data – and more specifically cross-platform CRDT libraries.

Both Automerge and Y-CRDT started out with implementations in JavaScript, and both have recently rewritten their cores in the Rust language, with an eye towards using that high-performance core as an underpinning for a variety of platforms and languages. I’ve been contributing to both libraries and their Swift language bindings as these cores progress. In the APIs that these libraries expose, types that represent lists and dictionaries don’t have the same constraints that Swift imposes – and mapping a potentially very dynamic data structure into Swift isn’t a straightforward task. At least, it’s not straightforward to provide a mapping that’s developer-driven and also ends up fairly ergonomic and “Swift idiomatic”.

I found two patterns, pretty consistent within the Swift ecosystem, for handling this sort of problem space. The two patterns end up layering together, depending on how far you need to go.

The first pattern is to provide a wrapper type that fully describes all the potential variations that could exist. It’s surprisingly powerful, and does a great job of mapping simple data types into the strongly-typed world of Swift. For a simple, one-to-one mapping, you can do this with an enum – and I’ve spotted exactly that used for encoding and decoding JSON in the swift-extras-json repository (JSONValue.swift). The other part that makes this convenient is a matching protocol that has functions for converting into, and out of, this type. Then for any type, you can pretty readily provide functions that do the conversion mapping, even potentially throwing an error if something doesn’t work out. The limitation of this technique is that it’s a one-to-one mapping, and it doesn’t really compose – at least not by itself. It is most effective for the simplest data structures, or data that you deal with as an effectively atomic unit.
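
To make that concrete, here’s a minimal sketch of the wrapper-enum shape and the matching conversion protocol. (This is my own sketch of the pattern, not the actual swift-extras-json code.)

enum DynamicValue {
    case null
    case bool(Bool)
    case int(Int)
    case double(Double)
    case string(String)
    case array([DynamicValue])
    case object([String: DynamicValue])
}

// The companion protocol: any type that wants to play along describes how it
// converts into, and out of, the wrapper – throwing if the shape doesn't line up.
protocol DynamicValueConvertible {
    init(dynamicValue: DynamicValue) throws
    func toDynamicValue() -> DynamicValue
}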

The downside of using a single wrapper type is that it doesn’t easily compose. As soon as you want to group multiple, different things together – it kind of falls down.

The second pattern is far more complex, but well established in Swift – the Codable protocol. This protocol setup is a pretty genius idea, but understanding how Codable works under the covers is not at all straightforward.

My brain immediately went to asking questions like “how can I self-inspect types to be able to encode or decode instances?” with the idea that “Surely, the Swift language already has this…”, mostly thinking of Mirror and type reflection. I use the Codable protocol myself, but never knew how it worked. So I took to digging, thinking “there’s got to be something that provides the introspection details for Codable to work”. I was right – there is – but it wasn’t at all where I thought it would be. Turns out the answer is “Either you provide it, or the Swift compiler – in some cases – can provide the default code”.

The reference that explained it for me is The Flight School Guide to Swift Codable. (Side note: ALL of the Flight School guides are excellent resources – if you don’t have these in your library, take the time to get them and thumb through them all. I guarantee you’ll learn something.) The “magic” that looks at the type and knows about its stored properties is the bit of Swift compiler goodness that synthesizes Codable conformance.
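
To make that synthesis a little more concrete, here’s roughly what the compiler generates for a simple (hypothetical) struct, written out by hand – a CodingKeys enum naming the stored properties, plus keyed decode and encode functions built on top of it:

struct Note: Codable {
    var title: String
    var starred: Bool

    // The compiler synthesizes the equivalent of everything below.
    enum CodingKeys: String, CodingKey {
        case title
        case starred
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        title = try container.decode(String.self, forKey: .title)
        starred = try container.decode(Bool.self, forKey: .starred)
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(title, forKey: .title)
        try container.encode(starred, forKey: .starred)
    }
}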

In hindsight, I thought “Huh. Maybe it should have been clear to me that you’d want to give developers their own control over encoding and decoding their types.” I have to admit I was disappointed that the blunt-force hammer I was hoping to find didn’t exist. The effort of investigating how it worked got me to change how I was thinking about the problem, which is definitely good.

After I understood a bit more about how Codable works with encoders and decoders, I found a thread on the Swift Forums from 2019 asking about how the compiler provides default conformance. Interesting to me – in the thread Itai Ferber talks about how “ideally this kind of functionality could be exposed in the standard library using Macros” – but at the time macros didn’t exist within Swift. As I’m writing this, the review for SE-0397 Freestanding Declaration Macros is just starting, which I suspect has the capabilities that Itai referred to in that Forums post. The new macro capability is something I haven’t even started to grasp, but I’m looking forward to doing so. Since it isn’t available currently (with Swift 5.8), using macros to solve this challenge was outside the pool of possible solutions to investigate.

I like to nail down a specific goal for myself when I’m trying to solve these kinds of issues – a specific problem that exemplifies what I’d like to be able to do. In this case, it is reading and writing from a CRDT backing store into “something that I define” that provides a schema, and which I can in turn use with SwiftUI. My specific use case is a data model that has a list of composed objects – each object having an image and text, with the text leveraging the collaborative capabilities so that multiple people could work together to provide descriptions for images that were added. This particular use case hits several corners at once – it expects nested CRDTs (the text inside a list), and provides the interesting challenge of exposing a list (or list-like thing) of a concrete type, extracted from the model that houses the CRDTs, that also allows for accessing the various text pieces through a SwiftUI TextField (which expects a Binding<String> to be able to update).
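
For reference, here’s a sketch of the shape of that model – the names are made up, and the caption stands in for whatever collaborative text type the CRDT library ultimately exposes:

import Foundation

struct ImageDescription: Identifiable {
    let id: UUID
    var image: Data      // the image data, stored once
    var caption: String  // collaboratively edited text, backed by a Text CRDT
}

struct TravelNotesModel {
    // mapped from a List CRDT in the backing document
    var descriptions: [ImageDescription]
}

Editing one of those captions in SwiftUI means producing a Binding<String> into nested, CRDT-backed text – which is exactly the awkward part.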

In the newer Automerge bindings, Alex Good cobbled together some nifty property wrappers that work really nicely for exposing wrapped properties that map to simpler scalar values. I’ve been working on refining those a smidge, but also trying to come up with how to represent or reflect an array of something – maybe a struct, maybe a class – that is mapped from a List CRDT type and the data in it. I suspect I need to leverage a custom decoder to get there, reading from the CRDT data store and decoding into some form that can in turn provide those wrapped properties that enable read-only values as well as bindings that are ever-so-useful with SwiftUI. I spent a bunch of time painting myself into corners with various class structures that acted like collections, and while I got some basic things working, the end result was that creating them – or more specifically defining the structure of what you’d get back – felt super-awkward. If you want to follow my trials (and mistakes) as I learn, I’m working in a public repository (AMTravelNotes) on GitHub.
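
To give a flavor of the property-wrapper approach – and to be clear, this is my own sketch of the shape, not the actual Automerge binding API – the wrapped value reads and writes a scalar in some backing store, and the projected value hands SwiftUI the Binding<String> it wants:

import SwiftUI

// Hypothetical protocol standing in for the CRDT-backed document.
protocol ScalarDocumentStore: AnyObject {
    func string(forKey key: String) -> String
    func setString(_ value: String, forKey key: String)
}

@propertyWrapper
struct StoredText {
    let key: String
    let store: ScalarDocumentStore

    // Reads and writes go straight through to the backing store.
    var wrappedValue: String {
        get { store.string(forKey: key) }
        nonmutating set { store.setString(newValue, forKey: key) }
    }

    // $property exposes a Binding<String> for SwiftUI controls like TextField.
    var projectedValue: Binding<String> {
        let store = store
        let key = key
        return Binding(
            get: { store.string(forKey: key) },
            set: { store.setString($0, forKey: key) }
        )
    }
}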

At this point, I think the path forward involves leveraging Codable; likely making a custom Encoder/Decoder that’s specific to reading and writing from Automerge documents. Alex built a very similar thing called AutoSurgeon that implements the same rough pattern in Rust, using Rust’s rough equivalent of Codable, called Serde. Fortunately, I’ve found a number of examples of custom encoder and decoder libraries. The Flight School Codable book includes an example that reads and writes MsgPack. While digging around for other implementations, I found the library PotentCodables, which has a whole slew of interesting implementations, including CBOR and even ASN.1. Now that it’s been released, I think I’d like to take some time to read through and try to understand the JSON Codable code that’s included in the new Swift Foundation repository. It looks to be a notable step up in complexity, but with that complexity comes some impressive performance.

The current iteration of the Y-CRDT swift language bindings doesn’t yet support nested CRDTs, so I can’t quite yet attempt the same scenario with those bindings.

ML Understanding

OpenAI announced the release of GPT-4 a few days ago, with the not-surprising flurry of news reports, quite a few hyperbolic, and a few good. I watch RSS feeds (with the amazing open-source NetNewsWire app) of various individuals, forums, and a few online sources for my technology news – so it was a fairly notable deluge. One of the most common words in the opening statements describing what this new update was about is “Understands”. Bullshit and poppycock! Using the word understanding is pissing me off.

Some better journalists put the word in quotes in their article, which I read as “appears to understand”. Others don’t even bother to do that. Yes, I’m being really judge-y. For the better journalists, I wish they wouldn’t take the techy, pop-culture-focused journalistic shorthand. It’s lazy, and worse – incredibly misleading. Misleading enough that I annoyed myself into writing this blog post.

Large Language Models (LLMs) don’t understand a damn thing. They blindly, probabilistically map data. There’s no reasoning, deduction or analytic decisions in the process. They’re consummate mimics and predictors of expected output, trained with obscenely huge amounts of data. Sufficiently so that when they reply to questions asked, they do a darned impressive job of seeming to know what they’re talking about. Enough to fake out tests, in the latest GPT-4 announcement. That probably should say something (negative) about how well those tests actually judge understanding, not how good GPT systems are getting.

To my horror, the most popular showboat of this pattern is that they confidently assert any number of things which have no basis in fact. It’s the worst kind of toadying lickspittle, saying what you want to hear. A person, who’s been told to expect understanding from the system, is being given words utterly without meaning – “making shit up” from its internal mappings. Some happen to align with facts, but there’s zero validation of those facts.

There are two sides to an LLM. There’s data coming in, called “encoding”, which maps the input the model expects into where that input fits within the model itself. The other side is “inference”, which is the generative side of the equation – it’s trying to pick the most likely representation coming out. In the case of models such as ChatGPT, that’s in the form of text and sentences. With other models, the input is text and the output is images, or the input is audio and the output is text.

One of the properties that I love about LLMs is that with the volume of data, it turns out that languages – even fairly radically different languages – end up mapping in very similar ways at large enough volumes. When we can apply the correspondences for a few words, we quickly get sufficient mappings to use the LLM as a very reasonable translator, even for words we don’t know. And not just words, but phrases and whole sentences. That very capability was included in a recent release of Whisper (another project by OpenAI), with a model that was small enough to fit on a single computer. This model works off audio streams and converts them into text. And one of the more magical capabilities of Whisper is the ability to automatically recognize the language and translate the results into English on the fly. So you can speak in Japanese or Spanish, and it returns the text in an English translation.

Just like the generative properties, it’s also random – but fortunately our perception of translation is that it’s lossy and potentially wrong. It’s viewed as “likely close, but may not be entirely accurate”. For the most part, people don’t take a translation as an absolute statement of fact – although that seems to be the reaction to the generative models of something like Bing or ChatGPT. The value is in the “it’s close enough that we can close the gaps”. That’s a whole different world from taking the “generative text” as gospel truth of answers and facts, as if it knew something, or looked it up in some knowledge base. It didn’t – it just mapped information from an obscenely large number of patterns into what it guessed was a reasonable response. Generative text from LLMs isn’t reporting facts; it’s reporting what maps most closely to what you’re prompting it with. There’s no understanding, or reasoning, involved.

If you made it this far, I should mention that there IS some really cool research happening that uses large language models (and related machine-learning systems) to do data extraction – to refine information into a knowledge base, or to predict basic physics reactions based on past experience. That’s the cool stuff, and where the magic is happening. Our local UW ML research groups are doing some amazing work on that front, with the end result being systems able to deduce and induce reasoning to answer questions.

CRDT work

What are you working on these days?

I’m all over the place at the moment, but one topic keeps standing out – working on CRDTs. If you don’t know this crazy arsed geek acronym, it stands for Conflict-free Replicated Data Types, and it’s a means to enable eventually consistent data replication – “sync” in a word, or the “lots of sync” that comes with interactive collaboration. CRDTs evolved out of the world of Google Docs (and examples before that, but everyone seems to know of Google Docs), which provides a way for multiple people to interactively collaborate on a document, seeing what each other is typing at the same time. My first experiences with anything like this date back to SubEthaEdit (which is still out there and available), and a browser-based thing called EtherPad, used extensively in the early, early days of OpenStack conferences.

A few years ago, a friend asked for a bit of help with a programming project that sent me down the rabbit hole of “Okay, how does this stuff actually work?”. While doing that research, I found the article by Alexei Baboulevitch (from March 2018) Data Laced with History: Causal Trees & Operational CRDTs. It’s a brutally dense article, but goes into depth that I hadn’t seen earlier. After reading it (and re-reading it, multiple times) some of it sunk in. Alexei even made source code available (written in Swift 4). At the time, the code my friend was working in was C++, so I helped port some of those concepts into a C++ implementation. (Yes, I was an absolute glutton for punishment.)

Since then, I found several places where I wanted to use the same techniques, but there wasn’t much in terms of Swift-native libraries available, and surprisingly little conversation about it. More recently, the topic has popped up in a couple of places – Objc.io did a series introducing the basic concepts in Swift Talk #294: CRDTs – Introduction (a really well done introduction). And I found a thread on the Swift Forums that was (more or less) talking about proposals for standard types that might fit into the Swift standard library. I’d also be remiss if I didn’t point out the walk-through article Replicating Types with Swift at AppDecentral, which has great code examples on GitHub as well.

Most of the “interesting” heavy lifting in advancing the state of the art was primarily being done in the browser space (aka JavaScript first), which is pretty heavily dominated by two really good projects: Automerge and Yjs. Around 2021 both started tackling ports of the core algorithms into the Rust language, prodded a bit, I think, by the post 5000x faster CRDTs: An Adventure in Optimization, which makes some really compelling points.

I did find some Swift-native CRDT implementations – but what had been done wasn’t as deep as I’d like, and didn’t support features that I wanted to use. One of them (bluk/CRDT) was close, but constrained such that you had to pass the entire history of the data structure in order to make it work effectively. I’d learned enough that I could make a variant that did what I was after, so I created my own CRDT library (heckj/CRDT). The most meaningful difference is that my version is a delta-CRDT. That’s a fancy way of saying you don’t need to send the entire history of a data structure, only the “recent bits since you last looked”, in order to sync things up. It’s a darned useful optimization given how CRDTs trade off memory for the ability to be eventually consistent. So if you don’t have to transfer the whole history, it’s a win.
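
To illustrate the delta idea with a toy example (this is not the heckj/CRDT implementation): a grow-only counter where a full merge takes the max count per collaborator, and the delta only carries the entries the other replica hasn’t caught up on.

struct GCounter {
    // per-collaborator counts, keyed by a unique actor identifier
    private(set) var counts: [String: UInt] = [:]

    mutating func increment(actor: String) {
        counts[actor, default: 0] += 1
    }

    var value: UInt { counts.values.reduce(0, +) }

    // Full-state merge: take the max of each actor's count.
    mutating func merge(_ other: GCounter) {
        for (actor, count) in other.counts {
            counts[actor] = max(counts[actor] ?? 0, count)
        }
    }

    // Delta: only the entries the remote replica hasn't seen, or has an
    // older value for – this is what goes over the wire.
    func delta(since remote: [String: UInt]) -> GCounter {
        var d = GCounter()
        for (actor, count) in counts where count > (remote[actor] ?? 0) {
            d.counts[actor] = count
        }
        return d
    }
}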

In truth, I did the “easy parts” version – enough to have something out there that I could use and build on. But there are several areas where my current implementation is weak. Martin Kleppmann has a great set of slides and a video explaining the intricacies of these weak areas called CRDTs: The Hard Parts. If you’re into implementation details, I recommend watching that video. Martin’s a good speaker, and a prolific researcher in the CRDT space. Oh – and he’s fairly deeply involved in one of those JavaScript libraries I mentioned earlier: Automerge. The TL;DR of that talk is that “just having eventually consistent data” can still end up with a seriously shitty user experience. (He says it more politely.) There are some things you can do to provide that better user experience – so that the end result is not only “eventually consistent” (a baseline requirement for CRDTs), but also more in line with “what a user expects”. Unsurprisingly, these additions come with a bit of complexity.

I looked at implementing that complexity in my heckj/CRDT library, and so far I’ve held off. Instead, I went and looked at getting involved in potentially using either Automerge or Yjs in Swift through the lens of those Rust language rewrite efforts. And that’s where I’m at currently – helping out and collaborating a bit with Aidar Nugmanoff, who’s establishing the Swift language bindings over Yrs (the Rust-based rewrite of Yjs by Bartosz Sypytkowski). That effort, by the way, was the source for my recently popular article: Creating an XCFramework.

If you’re interested in the algorithms, Bartosz has a whole series of blog posts that beautifully outline how Yrs tackles their implementation of the CRDT algorithm “YATA”. I seriously thought about implementing the YATA algorithms (which I liked a bit better than RGA from an academic perspective) in my CRDT library, and I might still do so down the road.

Both Automerge and Yjs have very well established communities around them, so it seemed like a bigger win to be able to end up with something that was cross-platform AND cross-language compatible. So for now, Aidar is doing the heavy lifting for a Swift front-end to Y-CRDT, and I’m helping out. The extended Y-CRDT dev team (which covers quite a variety of languages: Python, Ruby, Rust, JavaScript, WASM, etc) is fun to collaborate with, and has been very welcoming. For my part, I’d love to have an end result that works on both iOS and macOS and is binary-compatible with existing platforms, and I’m learning a lot along the way.

Automerge had earlier Swift bindings, but they were awkward as all get out to use. (I contributed documentation to that effort a while back.) I think they took that as “lessons learned” and put that feedback into their own Rust-port/rewrite effort. Automerge just released their 2.0 – which I suspect solidifies their Rust implementation to the point that they could re-create new Swift language bindings. At least I’m hoping that’s the case.

If you want to see what Aidar and I are up to (it’s definitely a work in progress right now), the Y-CRDT Swift language bindings work is being done as open source at https://github.com/y-crdt/y-uniffi. Right now everything is in a single repository, but that’ll need to expand out a bit in order to effectively support the layers of Swift packages that we’re planning. We’ll have one layer which is the Rust side of this exposed through the Mozilla-supported tooling UniFFI, and another which is all in Swift, providing the Swift idiomatic interfaces. In the meantime, I’m getting a deeper-than-anticipated introduction to the Rust language.

For now, I’m helping the Y-CRDT project rather than re-implementing the bits on my own. It feels like there’s more of a win that way. Last year, about this time, I was fiddling with a library that was a dirt-simple version of what Apple released as the (amazing) Swift Charts. To be honest, I wouldn’t be entirely surprised if Apple again published something like a CRDT library – at least for their own platforms. They’re clearly already using the technology in some of their apps: Notes at first, and I’d almost be willing to bet it’s being used in the “Collaborate with…” stuff that was released for Pages, Keynote, and Numbers last year.

Circling back to my own library

There is more work that I’d like to do within heckj/CRDT, even outside of the complex lifting to support really effective user experiences in collaborative string editing (let alone collaborative editing of attributed strings). So for now, I’m also getting the benefit of seeing how other folks are building atop core CRDT libraries, and getting more familiar with the considerations and choices that brings.

There are some interesting “quirks” to using CRDTs that any low-level library expects you to manage: the algorithms can provide “unexpected” results if you don’t correctly identify the different instances that are collaborating. Without a correct identity assertion in place, the CRDT algorithm is unable to achieve the “consistent” part of eventually consistent. In academic terms, most CRDT algorithms are very susceptible to “byzantine attacks”, although that term seems an unfair label to pin on the ancient empire.

For the use case of “syncing between my own iPhone, Mac, and iPad”, you need to come up with a means to uniquely identify where you’re syncing from, for the corner case where you make different, conflicting changes on devices that are disconnected from each other and then later sync them up. The algorithms frequently use a technique such as “last write wins”, but if you have two change histories that report they’re from the exact same instance, there’s no easy way to get a consistent end result.
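
A toy last-write-wins register makes the identity problem visible (a sketch, not any particular library’s implementation): when timestamps collide, the actor ID is the tiebreaker – and if two devices claim the same ID, there’s nothing left to break the tie with.

struct LWWRegister<Value> {
    var value: Value
    var timestamp: UInt64   // e.g. a Lamport clock, incremented on each change
    var actorID: String     // unique per device/instance doing the writing

    mutating func merge(_ other: LWWRegister<Value>) {
        // Later timestamp wins; identical timestamps fall back to comparing
        // actor IDs so that every replica picks the same winner.
        if other.timestamp > timestamp ||
            (other.timestamp == timestamp && other.actorID > actorID) {
            value = other.value
            timestamp = other.timestamp
            actorID = other.actorID
        }
    }
}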

There’s a use case that Automerge (and Martin) focus on called Local-first, one part of which is entirely skipping the whole “hosted service middleman” thing. That leads to either peer-to-peer interactive sessions to collaborate, or transferring files that maintain their own history – so you can sync up any time down the road, and it’s still ultimately consistent. There’s a whole encoding scheme that Martin Kleppmann pioneered that compresses the history-overhead data, so that a document that has “1MB” of text in it doesn’t end up as a “20MB” file representing that data and all its changes. That’s one of the topics he talks about in his CRDTs: The Hard Parts presentation. That’s not something I’ve enabled in my own library, but it’s an obvious addition I’d like to make.

The other use case is real-time collaborative editing – whether that’s editing text or something else. The whole idea of authenticated edits and allowing collaborators (or not) is not something that the raw, underlying CRDT libraries really support. I want to look a bit more in depth at potential ways of tackling that, and supporting “inviting someone to collaborate”. There’s also a whole space of “what does this library look like as idiomatic Swift” and “how composable is it” – both of which I don’t have solid answers to.

Getting back into doing work with CRDTs was inspired by Distributed Actors in Swift, released last year, which looks like a super-interesting substrate to support sync, collaboration, and more. The examples from last year’s WWDC (Tic Tac Fish) included a Bonjour-based actor implementation as well as a WebSocket example – and it’s all just been stewing in my rear brain since.

Creating an XCFramework

In the past couple of years, I’ve had occasion to want to make an XCFramework – a bundle that’s used by Apple platforms to encapsulate binary frameworks or libraries – a couple of times. In both cases, the reason wasn’t that I didn’t want to ship the source, but that the source was written in a language that isn’t directly supported by Xcode: Rust. The goal of both of these efforts was basically the same – to expose and use library code written in Rust from the Swift language.

Important bits to know about creating an XCFramework

As of this article (2023 for future-me), the best path for exposing a Rust library to Swift is through a C-based FFI interface. Swift knows how to talk to C libraries – both static and dynamic. To use these libraries in an iOS or macOS app, what appears to me to be the “best path” is to use that C FFI interface and extend the low-level C-based API with idiomatic Swift. This article aims to walk through some of the specifics of getting from the static library space into Swift. The details of what goes into making an XCFramework are rather sparse, or perhaps more appropriately “terse”, in Apple’s documentation. The relevant documents there (definitely worth reading) include Creating a multiplatform binary framework bundle and Distributing binary frameworks as Swift packages.

The key pieces to know when tackling this are embedded in the core of the article Creating a multiplatform binary framework bundle:

  1. For a single library, use the xcodebuild -create-xcframework command with the -library option. There’s also a -framework option, but reserve that for when you need to expose multiple static, or dynamic, libraries as a binary deliverable.
  2. Avoid using dynamic libraries when you want to support iOS and the iOS simulator, as only macOS supports dynamic linking for these using a Framework structure. Instead, use static libraries.
  3. Use the lipo command to merge libraries when you’re building for x86 and arm architectures, but otherwise DO NOT combine the static libraries for the different platforms. You’ll want, instead, to have separate binaries for each platform you’re targeting.
  4. These days, the iOS simulator libraries need to support BOTH x86 and arm64 architectures, so yep – that’s where you use lipo to merge those two into a single “fat” static library – at least if you’re targeting the iOS simulator on macOS. Same goes for supporting the macOS platform itself.
  5. Get to know the codes called “triples” that represent the platforms you’re targeting. In the world of Rust development, three Apple platforms are “supported” without having to resort to nightly builds: iOS, the iOS simulator, and macOS. The “triples” are strings (yep – no type system here to double-check your work). A triple ostensibly encodes “CPU”, “vendor”, and “platform” – but like any fairly dynamic thing, it’s been extended a bit to support “platform variants”.

The triple codes you’ll likely want to care about, and their platforms:

  • x86_64-apple-ios – the original iOS Simulator on an Intel Mac
  • aarch64-apple-ios-sim – the iOS simulator on an M1/arm based Mac.
  • aarch64-apple-ios – iOS and iPadOS (both are only arm architectures)
  • aarch64-apple-darwin – M1/arm based Macs
  • x86_64-apple-darwin – Intel based Macs

Building libraries for platform and architecture combinations

If you’re following along wanting to generate Rust libraries, then you’ll want to make sure you tell Rust that you want to be able to compile to those various targets. The following commands enable the targets (iOS, iOS simulator, and macOS) for the Rust compiler:

rustup target add x86_64-apple-ios
rustup target add aarch64-apple-ios
rustup target add aarch64-apple-darwin
rustup target add x86_64-apple-darwin
rustup target add aarch64-apple-ios-sim

There are also the variant triples aarch64-apple-ios-macabi and x86_64-apple-ios-macabi that represent Mac Catalyst, but those aren’t easily built with the released, supported version of Rust today. Likewise, there’s work to do the same for watchOS and tvOS. All of those, as far as I know, are only available in “Rust nightlies”, so I’m going to skip over them. That final string extension of -sim or -macabi on the triple is relevant when it comes to XCFrameworks – the command that builds the XCFramework understands that detail and encodes it into the XCFramework’s Info.plist as a platform variant key.

When you go to compile your library, specify the target you want to generate. If you’re using cargo from the Rust ecosystem, the --target option is the key. For example, the command cargo build --target aarch64-apple-ios --package myPackage --release tells Rust to compile your package for the aarch64-apple-ios target as a release build.

Another suggestion I’ve seen is to include that same target in the CFLAGS environment variable. I’m not sure where and how this does anything additional, but given how “interesting” compilers are about picking up environment variables for compilation, I’ve taken the example to heart (fair warning – I admit I’m giving you potentially poor advice, blindly). The pattern I’ve seen used there: export CFLAGS_x86_64_apple_ios="-target x86_64-apple-ios" before invoking the cargo build command. (For reference, I spotted this in This Week in Glean from January 2022, which provides some examples of how they do this kind of Rust library inclusion.)

If you’re coming from another language, I’ll leave it to you – but there are likely equivalent command options for specifying a “target triple” for the code you’re compiling, and possibly additional compiler flags to specify.

The end result of all this should be a static library – typically a “.a” file – dropped in a directory somewhere. DEFINITELY DOUBLE CHECK that you’ve compiled it for the architectures you expect. Apple’s documentation suggests you use the file command in the terminal to inspect the architecture of your static library, but I’ve found that detail to be pretty useless. In macOS 13.2, file returns the message “current ar archive” and no detail about the architecture. Instead, use the command lipo -info with the path to your library to make sure you’ve got the right architecture(s) in there. A couple of examples:

$ lipo -info target/aarch64-apple-ios/release/libuniffi_yniffi.a

Non-fat file: target/aarch64-apple-ios/release/libuniffi_yniffi.a is architecture: arm64

$ lipo -info target/x86_64-apple-ios/release/libuniffi_yniffi.a

Non-fat file: target/x86_64-apple-ios/release/libuniffi_yniffi.a is architecture: x86_64

For those platforms where you need to support more than one architecture (iOS simulator and macOS), you’ll want to take those single-architecture static library binaries and merge them together – correctly. The key here is, again, the lipo command. The form of the command is:

lipo -create path_to_first_architecture_static_binary \
path_to_second_architecture_static_binary \
-output path_to_new_location_with_combined_binary

In practical terms, I usually generate a new directory in which I’ll hold these “fat” static libraries – one for each of those platforms: iOS-simulator and apple-darwin.

Once that’s merged together, use lipo -info to double check your work:

$ lipo -info target/ios-simulator/release/libuniffi_yniffi.a

Architectures in the fat file: target/ios-simulator/release/libuniffi_yniffi.a are: x86_64 arm64

At this point, we only need a few more pieces to be able to assemble the final XCFramework.

Gathering the headers and defining a module map

The first thing you’ll need to find is the C header file that matches the static library you just built. Different libraries and frameworks have these in different places, but you’ll need at least one to expose the library over to Swift.

In the case of the project I’ve been helping with, we’re using the Mozilla library UniFFI to generate both the header and a bit of associated Swift code to make consumption a bit nicer. In its case, it generates the C header for us from the Rust library code and declarations that we made. (UniFFI is probably worth its own blog article – it’s been really nice to work with.)

The other thing you critically need is a modulemap file. A module map is a text file that defines what, from the header files, should be exposed to Swift – and the name of the module that’s being exposed. This file is not only critical, but unfortunately a bit arcane as well. The documentation for the structure of a modulemap is in the Clang Modules documentation under “module map language”. If the modulemap file is missing, everything appears to work – but when you attempt to import the module into Swift, you’ll get the error “no such module”, with very little additional detail to go on to understand WHY it’s not available.

If you’re exposing a static binary with a single header, the “easy” path is to expose everything in that header – functions, types, etc – through to Swift. The module map file for that goal looks something like the following:

module yniffiFFI {
    header "yniffiFFI.h"
    export *
}

In the example above, the module that is being exposed to Swift is yniffiFFI. It references a single header file yniffiFFI.h, expected to be in the same directory as the modulemap.

In the case of UniFFI, the tools helpfully generate a modulemap file and name it after the Rust library. The UniFFI documentation for modules shows an example of providing an explicit link to the modulemap file via a command-line compilation. When you’re creating an XCFramework, however, you can’t specify the name of the module map. The XCFramework and compiler expect the file to have the far more generic (and default) name module.modulemap. So if you’re using UniFFI, make sure you rename the file to module.modulemap before you provide it to a command assembling an XCFramework.

Aside: It took me a week and a half to hunt down that the module wasn’t being picked up when the file was named anything other than module.modulemap. Don’t do that to yourself – rename the file.

Building the XCFramework

To create the XCFramework on Apple platforms, the only “supported” route is to use the xcodebuild -create-xcframework command. The other option is to hand-assemble the directory structure of the XCFramework bundle and create the appropriate Info.plist manifest within it. Unfortunately, Apple doesn’t provide any real documentation of the structure(s) expected in an XCFramework. The script that Mozilla uses when assembling their Rust libraries for iOS follows the hand-assembly option, and uses a larger “framework” (vs library) structure as well.

To learn about the -create-xcframework command, use the following command to show the details of how to use xcodebuild to create an XCFramework:

$ xcodebuild -create-xcframework -help

which (in my version) shows:

OVERVIEW: Utility for packaging multiple build configurations of a given library or framework into a single xcframework.
USAGE:
xcodebuild -create-xcframework -framework <path> [-framework <path>...] -output <path>
xcodebuild -create-xcframework -library <path> [-headers <path>] [-library <path> [-headers <path>]...] -output <path>
OPTIONS:
-archive <path>                 Adds a framework or library from the archive at the given <path>. Use with -framework or -library.
-framework <path|name>          Adds a framework from the given <path>.
                                When used with -archive, this should be the name of the framework instead of the full path.
-library <path|name>            Adds a static or dynamic library from the given <path>.
                                When used with -archive, this should be the name of the library instead of the full path.
-headers <path>                 Adds the headers from the given <path>. Only applicable with -library.
-debug-symbols <path>           Adds the debug symbols (dSYMs or bcsymbolmaps) from the given <path>. Can be applied multiple times. Must be used with -framework or -library.
-output <path>                  The <path> to write the xcframework to.
-allow-internal-distribution    Specifies that the created xcframework contains information not suitable for public distribution.
-help                           Show this help content.

For each platform that you’re including in your XCFramework, you’ll want to include a pair of options on the command line: -library pointing to the static binary file (the one you made into a fat static binary if multiple architectures need to be supported), and -headers pointing to a directory which contains the module.modulemap file and the header file(s) that it references. I code this in a shell script, so this part of the script ends up looking something like:

xcodebuild -create-xcframework \
    -library "./${BUILD_FOLDER}/ios-simulator/release/${LIB_NAME}" \
    -headers "./${BUILD_FOLDER}/includes" \
    -library "./$BUILD_FOLDER/aarch64-apple-ios/release/$LIB_NAME" \
    -headers "./${BUILD_FOLDER}/includes" \
    -output "./${XCFRAMEWORK_FOLDER}"

where the variable XCFRAMEWORK_FOLDER is a directory named for the module that I specified in module.modulemap followed by .xcframework. Using the yniffiFFI module example above, the resulting XCFramework name is yniffiFFI.xcframework.

Structure of an XCFramework

The structure of the library-focused XCFramework is fortunately pretty simple. You can use a command on the terminal such as tree to dump it out:

$ tree yniffiFFI.xcframework
yniffiFFI.xcframework
├── Info.plist
├── ios-arm64
│   ├── Headers
│   │   ├── module.modulemap
│   │   └── yniffiFFI.h
│   └── libuniffi_yniffi.a
└── ios-arm64_x86_64-simulator
    ├── Headers
    │   ├── module.modulemap
    │   └── yniffiFFI.h
    └── libuniffi_yniffi.a
5 directories, 7 files

Each platform-triplet that is supported will have its own directory under the xcframework. In the example above, the framework has support for ios-arm64 (iOS and iPadOS) and the combined static binary file for the iOS simulator. Alongside each of the platform directories is a manifest file: Info.plist.

The module that’s exported within module.modulemap is expected to match the name of the XCFramework. The name of that module is not, however, encoded into the Info.plist manifest.

You can dump the manifest using the plutil command with the -p option in the terminal. In my example: plutil -p yniffiFFI.xcframework/Info.plist results in:

{
  "AvailableLibraries" => [
    0 => {
      "HeadersPath" => "Headers"
      "LibraryIdentifier" => "ios-arm64_x86_64-simulator"
      "LibraryPath" => "libuniffi_yniffi.a"
      "SupportedArchitectures" => [
        0 => "arm64"
        1 => "x86_64"
      ]
      "SupportedPlatform" => "ios"
      "SupportedPlatformVariant" => "simulator"
    }
    1 => {
      "HeadersPath" => "Headers"
      "LibraryIdentifier" => "ios-arm64"
      "LibraryPath" => "libuniffi_yniffi.a"
      "SupportedArchitectures" => [
        0 => "arm64"
      ]
      "SupportedPlatform" => "ios"
    }
  ]
  "CFBundlePackageType" => "XFWK"
  "XCFrameworkFormatVersion" => "1.0"
}

You can expect the top level to always have the keys CFBundlePackageType, XCFrameworkFormatVersion, and AvailableLibraries. AvailableLibraries is an array of dictionaries, each of which has the following keys:

  • HeadersPath – a string that represents the directory path to the include files that are shared with this binary
  • LibraryIdentifier – a string that represents the directory name that identifies the platform “triplet” within the XCFramework
  • LibraryPath – a string that represents the name of the binary being exposed
  • SupportedPlatform – a string that represents the core supported platform
  • SupportedArchitectures – an array of architectures that this static binary supports, such as arm64 and/or x86_64.

And optionally the following key

  • SupportedPlatformVariant – a string that represents the variant of the platform, such as simulator.

This structure is matched by open-source code in SwiftPM: XCFrameworkMetadata.swift. It was interesting to see how it’s used in SwiftPM, but ultimately not very helpful for any debugging – swift build only builds for the current platform (which is what invokes SwiftPM), and when I was debugging, I was trying to sort out an iOS-only framework.

If you look at those values, they should match exactly to the tree structure and file names you see in the .xcframework directory. If something is awry – a directory misnamed for example – the XCFramework will simply appear to not be loaded.

Debugging the creation of an XCFramework

I hate to say it, but this is a right pain in the ass. There’s no real tooling available to “vet” if an XCFramework is correct, although depending on how it’s screwed up – you might get some semi-useful messages from development tools like Xcode.

If you’re using swift and Swift Package Manager on the command line to build using a binary target (and referencing an XCFramework), then the failures are mostly – intentionally – silent. The reason is that the build systems (Xcode, SwiftPM, etc) do some work to find, and sometimes download, an XCFramework and expose it somewhere on your local system. Then they add the paths to the binary, and the headers, to the swift compiler invocation.

Does not contain the expected binary

If you get the message “... does not contain the expected binary artifact” (an error that I got from Xcode while “learning the ropes”), then the issue is that the name of the XCFramework doesn’t match the name of the module that’s exposed within it. The naming convention is pretty darned tight – and definitely case sensitive. If you renamed the xcframework directory without doing all the work to rename its internals, you’ll be hitting this.

In addition to the name of the exported module needing to be reflected in the name of the XCFramework, it also should be the name of the binary target in Package.swift.

Oh – and if you’re using a compressed binary target (such as a compressed zip file hosted somewhere), absolutely make sure you compress the xcframework using ditto with the option --keepParent. This same error can crop up if you fail to include the --keepParent option, because as Xcode unpacks the XCFramework, it’ll expand into a different name, and you’ll be right back at the same “does not contain the expected binary” error message.

No such module …

If you’re attempting to load a module from an XCFramework and you’re getting “no such module” as the error, then something has gone awry with the loading of the modules. There’s unfortunately very little you can do to see WHAT has been loaded. If you’re in Xcode, then the most convenient way to see what’s available to import is to use the code completion capability, and start typing “import …” into a Swift file, and see what shows up in the editor.

Another suggestion, if you’re building for your local platform with Swift Package Manager, was to use the swift build --verbose command. You can then repeat one of the underlying commands that it invokes with an additional option, -Rmodule-loading, which prints additional debugging messages from the compiler that let you know what’s being loaded – or at least from where. You’ll see messages akin to the following:

<unknown>:0: remark: loaded module at /Users/heckj/src/y-uniffi/lib/.build/arm64-apple-macosx/debug/ModuleCache/2C5HQS9YK727C/SwiftShims-2DA6NLEWJC11R.pcm
<unknown>:0: remark: loaded module at /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX13.1.sdk/usr/lib/swift/Swift.swiftmodule/arm64e-apple-macos.swiftinterface
<unknown>:0: remark: loaded module at /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX13.1.sdk/usr/lib/swift/_StringProcessing.swiftmodule/arm64e-apple-macos.swiftinterface
<unknown>:0: remark: loaded module at /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX13.1.sdk/usr/lib/swift/_Concurrency.swiftmodule/arm64e-apple-macos.swiftinterface

These are all the locations which are actively getting loaded through the module cache. The notable thing isn’t what’s there – but what’s not there. In the event the XCFramework isn’t loaded, you won’t see it – or its expanded location – listed.

Aside: Exploring the swift module cache

If you’re really curious and want to see what’s in those files, you can directly look at anything with the .swiftinterface extension as a plain text file. That’ll show you not only the module name, but compilation flags and quite a lot of detail about the Swift modules. Binary modules are exposed as .pcm files, which are a binary format. You can look at those by using the Swift compiler through the swiftc -dump-pcm command. It dumps a LOT of detail, so I recommend piping the results through a pager, such as less:

swiftc -dump-pcm /Users/heckj/src/y-uniffi/lib/.build/arm64-apple-macosx/debug/ModuleCache/2C5HQS9YK727C/SwiftShims-2DA6NLEWJC11R.pcm | less

It’ll share all the details of HOW the PCM was compiled (options, targets, etc) as well as include links to the inputs that generated it – including system module map files and headers. While interesting – and critical for a compiler and what it needs to do – I didn’t find it explicitly helpful in debugging why an XCFramework wasn’t loaded, so I’m including this detail only for completeness.

I’m going to guess the most common reason is that the XCFramework doesn’t contain the target that you’re attempting to build for. For example, swift build on macOS builds for the architecture and platform of your local machine – nothing else. In my case, with an M1 Mac, it looks for and builds for the architecture arm64-apple-macosx11.0. If you failed to include any binaries for the aarch64-apple-darwin target, that’s exactly why you’d see this error.

Probably your best bet is to investigate the structure and architecture of the binaries included in the XCFramework – as well as look for the module.modulemap file and verify the module you’re expecting is there – and matches the name of the XCFramework. The commands tree and lipo -info are invaluable, both of which I showed above.

Compressing and using a remote XCFramework

If you’re using the XCFramework locally and directly, then you can reference it by location on your local machine. The binaryTarget stanza in a Package.swift file looks something like:

.binaryTarget(
    name: "yniffiFFI",
    path: "./yniffiFFI.xcframework"
),

The name for this target should match the name of the module being exposed, and the name of the XCFramework as well.

But a common way to want to use this – the whole “distributed binaries” point of binaryTarget – is to use an external, hosted version. Services such as GitHub or GitLab, which let you attach binary artifacts to a tag or release, work well for this sort of thing.

In order to do this, you’ll need to do two things:

  1. Compress the XCFramework into a .zip file.
  2. Compute a signature to use when referencing that .zip file.

Both of these are well defined in Distributing binary frameworks as Swift packages. You can use ditto to compress an XCFramework. As noted earlier, make sure to use the --keepParent option. As an example:

ditto -c -k --sequesterRsrc --keepParent "$XCFRAMEWORK_FOLDER" "$XCFRAMEWORK_FOLDER.zip"

Once the XCFramework is compressed, compute the checksum. The Apple documentation offers the command:

swift package compute-checksum path/to/MyFramework.zip

But the checksum is a SHA256 digest of the zip, so you can also use openssl to get the same result:

openssl dgst -sha256 "$XCFRAMEWORK_FOLDER.zip"

Use the resulting string where you want to define a remote binary target within a Package.swift:

.binaryTarget(
    name: "yniffiFFI",
    url: "https://github.com/heckj/yniffi/releases/download/0.0.1/YniffiFFI.xcframework.zip",
    checksum: "098a5bc1f62dd2efa3b316daa14e08cb584515c9ae866cc78e0ad4c3154ab6f2"
),

To be clear, the example above will not function as is – I made up the values. So feel free to copy it, but don’t expect it to work until you replace the values with your own.

Large language models and Search

Microsoft’s made an aggressive and delightful splash in the search market by deeply integrating the guts of OpenAI’s large language model ChatGPT with Bing Search. There’s an impressive interview by Joanna Stern on the topic (hat tip to Daring Fireball for the link). There’s potential there that’s amazing, and some that’s truly frightening.

The part that’s super-cool to me is that the model, under the covers, is a sort of translation from a text string to a point in a multi-dimensional space (_really_ big freaking numbers) that includes the surrounding text to build a context, and tries to make a guess at the concept that the word is representing. But most interestingly, these models aren’t only for a single language. They can be constructed from concepts from a LOT of languages (which is how you get the amazing translation capabilities these days). One of the benefits of this, in the context of “trying to find something”, is that you don’t have to be constrained to content from a single language. A sort of multi-language WordNet database, which would be super-cool, if challenging.
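
To make the “point in a multi-dimensional space” idea a bit more concrete, here’s a toy sketch. It assumes you already have embedding vectors from some model (which is the hard part I was hunting for); once you do, ranking candidates is just nearest-neighbor math over those vectors, regardless of the language the original text was written in.

// Cosine similarity: how closely two embedding vectors point in the same direction.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    precondition(a.count == b.count, "embeddings must share a dimension")
    let dot = zip(a, b).map(*).reduce(0, +)
    let magnitudeA = a.map { $0 * $0 }.reduce(0, +).squareRoot()
    let magnitudeB = b.map { $0 * $0 }.reduce(0, +).squareRoot()
    return dot / (magnitudeA * magnitudeB)
}

// Rank candidate documents by how close their embeddings sit to the query's.
func rank(query: [Double], documents: [(name: String, embedding: [Double])]) -> [String] {
    documents
        .sorted { cosineSimilarity(query, $0.embedding) > cosineSimilarity(query, $1.embedding) }
        .map { $0.name }
}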

I’ve no idea if Bing (or Google’s vague-ish LaMDA/Bard response) does this at all, but it was something even I briefly looked into. When I was doing a bit of work to help improve search at Swift Package Index, one of the things I noticed was that while _most_ of the documents and READMEs were in English, there was a notable set in Mandarin. For the most part, those packages published a snippet or such in English to assist with any sort of “Is this what I want?” research, but a LOT of the detail was buried and opaque – you pretty much need to cut and paste a bunch of content into a translator to have a hope of a loose translation. I took a bit of time and dug around in the multi-language models to see if there was something desktop-class that I could use to transform the content into something indexable that might be searched. Most of the latest hotness in language models takes clusters of computers to host, let alone encode with or run inference on. It turns out that’s a field of active research, but I didn’t find anything obviously available for use in that kind of format.

I’m not nearly as much of a fan of the synthesis based on a query though. The results, well – can lie with extreme confidence, and without any critical thinking about the results, you’re right in a pile of muck. And critical thinking is something that a whole bunch of people don’t seem to be too adept at these days. So suffice to say, I think that’s a bit of an existential failure. And fundamentally, it feels like glorified computer-assisted plagiarism that’s somehow been anointed as acceptable.

So hopefully the good parts will be retained and used – and with any luck, there’ll be a corpus and model that’s published somewhere, by someone, that’s not JUST in the hands of a mega-corporation that can be used to bolster search and information finding in some of these smaller corner areas. I’d love to see something like that working to help me find relevant libraries for a specific kind of task or technique within the Swift Package Index.

Native support for USD tools on an M1 Mac

I saw USD release 22.08 drop a few weeks ago, and notably within its release notes is the sentence: “Added support for native builds on Apple Silicon.”

I struggled quite a bit with getting USD both installed and operational because, as it turns out, there’s a bit of a quirk to Python that made things more difficult. macOS has stopped including Python 3 in its default install, although it appears that if you install Xcode (or the developer command-line tools), you’ll get a version of Python 3 (3.8.9) installed on the OS. Since I wasn’t sure what the path for Python 3 support looked like, and it took a while to get an M1-native build of Python in the first place, I’d switched over to installing Python using the packaging tool `conda`, which has been fairly popular and prevalent for ML folks to use.

To get started, I’d installed miniforge, downloading https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh and then running it to install and update:

sh ~/Downloads/Miniforge3-MacOSX-arm64.sh
source ~/miniforge3/bin/activate
conda update conda -y

With conda installed (and activated), I started installing the various dependencies:

conda install pyopengl
pip install pyside6

And then I grabbed the open source distribution of USD from GitHub and set it to building, with the results installed at /opt/local/USD:

git clone https://github.com/PixarAnimationStudios/USD
cd USD
git checkout release
git pull
python build_scripts/build_usd.py /opt/local/USD

It compiled and installed fine, but any tool I attempted to use crashed immediately with the following error message:

python crashed. FATAL ERROR: Failed axiom: ' Py_IsInitialized() '
in operator() at line 149 of /Users/heckj/src/USD/pxr/base/tf/pyTracing.cpp
writing crash report to [ Sparrow.local:/var/folders/8t/k6nw7pyx2qq77g8qq_g429080000gn/T//st_python.29962 ] ... done.

I opened an issue on USD’s GitHub project, and quickly got some super helpful support from their team. It turns out that an additional build configuration option is needed, specifically for Python installed through conda, because of the way it links to Python. The same issue happens regardless of architecture – so it’s a quirk for x86 as much as for aarch64.

The key to getting it working is adding a build configuration detail:

PXR_PY_UNDEFINED_DYNAMIC_LOOKUP=ON

I originally thought that meant just exporting it as an environment variable, but that doesn’t do the trick with USD’s build and installation process. Sunya Boonyatera provided the critical knowledge on how to add it in to get it to work:

python build_scripts/build_usd.py /opt/local/USD --build-args USD,"-DPXR_PY_UNDEFINED_DYNAMIC_LOOKUP=ON"

Nick Porcino was kind enough to jump in as well and provide an explanation:

Conda distributed Pythons are statically linked as a way to reduce unexpected system and installation coupling, but Python isn’t, to my knowledge, built in such a way that an external build system can robustly discover whether you are targeting a static or dynamically linked Python so we are forced to deal with it manually.

https://github.com/PixarAnimationStudios/USD/issues/1996#issuecomment-1217039438

Getting to this point wasn’t at all straightforward for me, so I’m hoping that by pushing this detail out there with my blog, Google will have it included in its index for the next individual, whether they’re searching on the error message or searching for how to install USD on an M1 Mac.

Side note – once it’s all installed, you still need to set some paths on the CLI to make the tools available:

export PATH=/opt/local/USD/bin:$PATH
export PYTHONPATH=$PYTHONPATH:/opt/local/USD/lib/python

SwiftUI Field Notes: DocumentGroup

When you build a SwiftUI-lifecycle-based app for macOS or iOS, you’re expected to define one or more Scenes – the first of which generally defines what (and how) that App displays when it is launched.

DocumentGroup is one of the types of Scene (for macOS and iOS) that is focused on an app lifecycle built around viewing, creating, and editing documents. The common alternative, WindowGroup, doesn’t prevent you from reading, editing, or saving documents, but it does change what happens when an app launches. The difference in “how it works” is that when you launch an app using DocumentGroup, the app expects that the first thing you’ll want to do is either open a document or create a new one. It does this by kicking off an “open” dialog on macOS, or displaying a document browser on iOS, configured based on the types your app says it knows how to create (and/or open). WindowGroup, by comparison, just tosses up whatever view structure you define when the app starts – not expecting you to want to start from the concept of “I want to open this document” or “I want to create an XXX document”.

You can define more than one DocumentGroup within the SceneBuilder for your app. You might do this, for example, to support opening, viewing, or editing more than one file type. Only the first DocumentGroup that you define (and its associated file type) is linked to the “create new document” action of the system-provided document browser (or the File > New... menu within a macOS app). You’re responsible for exposing and/or providing an interface to create new documents of any additional file types.

Document Model Choices

When you define a DocumentGroup, you provide it with a view that’s either the editor or viewer for a document type, and a document model that conforms to one of two protocols: FileDocument or ReferenceFileDocument. The gist of these protocols is that your editor (or viewer) view gets passed a configuration object that wraps your data model – with a binding to it or a reference to an ObservableObject, depending on which protocol you choose. Use FileDocument when you have a struct-based data model, and ReferenceFileDocument when your data model is based on a class (and conforms to ObservableObject).

Apple provides sample code entitled Building a Document-Based App with SwiftUI that uses ReferenceFileDocument, and Paul Hudson has the article How to create a document-based app using FileDocument and DocumentGroup that shows using FileDocument.
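For reference, here’s a minimal sketch of a struct-based model conforming to FileDocument, assuming a hypothetical plain-text document. ReferenceFileDocument follows the same general shape, but with a class, ObservableObject conformance, and an additional snapshot step for saving.

import SwiftUI
import UniformTypeIdentifiers

struct TextFile: FileDocument {
    // The content types this document can open (and, by default, save).
    static var readableContentTypes: [UTType] { [.plainText] }

    var text: String

    init(text: String = "") {
        self.text = text
    }

    // Decode the document's contents from the file wrapper when opening.
    init(configuration: ReadConfiguration) throws {
        guard let data = configuration.file.regularFileContents,
              let string = String(data: data, encoding: .utf8)
        else {
            throw CocoaError(.fileReadCorruptFile)
        }
        text = string
    }

    // Encode the document's contents back into a file wrapper when saving.
    func fileWrapper(configuration: WriteConfiguration) throws -> FileWrapper {
        FileWrapper(regularFileWithContents: Data(text.utf8))
    }
}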

The Default Call To Action

Be aware when designing and building an App using DocumentGroup that there’s no easy place to tie into the life cycle to provide a clear call to action before the system attempts to open a file as your app launches. The place where your code (and any control you have) kicks off is after the app has presented a file open/new view construct – either a DocumentBrowser (iOS) or the system Open dialog box (macOS). I find that experience can be confusing for anyone new to using the app, as there’s no easy way to introduce what the app is, what’s new, or provide instruction on how to use the app until after the person has chosen a document to open or created a new document.

If you need (or want) to provide a call to action, or to establish some critical information that you need for creating a new document, you have to capture that detail after the new document is passed to you, but potentially before you display any representation of the document. That in turn implies that a “new” document model that requires explicit parameters won’t work for conforming to FileDocument or ReferenceFileDocument. You are responsible for making sure any critical details get established (and presumably filled in) after the new document is created.

For example, if a new document requires some value (or values), add a function or computed property that returns a Boolean indicating whether those values exist. Then, in your editor view, conditionally return a view that either 1) captures and stores those initial values into the new document when they don’t yet exist, or 2) displays the document if those values are already defined.
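A sketch of that conditional approach might look like the following. The ProjectDocument model, its hasRequiredValues property, and the SetupView and ProjectDetailView types are all hypothetical names for illustration.

struct EditorView: View {
    @Binding var document: ProjectDocument

    var body: some View {
        if document.hasRequiredValues {
            // The required values exist – show the real editor.
            ProjectDetailView(document: $document)
        } else {
            // Capture the initial, required values into the new document first.
            SetupView { name, owner in
                document.name = name
                document.owner = owner
            }
        }
    }
}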

NavigationView (or not)

An interesting side effect of DocumentGroup is that the views you provide to it get very different environments based on platform. iOS (and Mac Catalyst-based apps) display the view you provide enclosed within a NavigationView that the document browser provides. On macOS, there is no default NavigationView. If you want iOS and macOS to behave with a similar design (for example, a two-column navigational layout), you need to provide your own NavigationView for the macOS version.
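One way to even out that platform difference is to wrap your content in your own NavigationView only on macOS, since the iOS document browser already supplies one. A sketch, with SidebarView and ContentView as hypothetical stand-ins for your own views:

struct DocumentRootView: View {
    var body: some View {
        #if os(macOS)
        // macOS: no NavigationView is provided, so supply your own.
        NavigationView {
            SidebarView()
            ContentView()
        }
        #else
        // iOS / Mac Catalyst: the document browser's NavigationView encloses
        // this view, so return the two columns directly (see below).
        SidebarView()
        ContentView()
        #endif
    }
}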

I find NavigationView to be a quirky, somewhat confusing setup (I’m glad it’s deprecated in iOS 16 and macOS 13 in favor of the lovely newer navigation types). When you’re working with iOS, it may not be obvious that you can create a tuple view to provide the two separate views you need to enclose within NavigationView to get that two-column layout. You do this by providing a view that returns two separate views without an enclosing grouping or container view. For example, the following view declaration provides a two-view tuple (SidebarView and ContentView) that the NavigationView provided by the built-in document browser will use when displaying your document.

var body: some View {
    // Two views with no enclosing container produce a tuple view; the
    // NavigationView supplied by the document browser uses them as the
    // sidebar and detail columns.
    SidebarView()
    ContentView()
}

If you provide your own NavigationView inside the view that the DocumentBrowser vends, you get a rather unfortunate looking “double navigation” view setup. Instead, provide a tuple of views based on the navigation style you’d like your editor (or viewer) to display.

If you’re used to NavigationView, but ready to target these later platform versions, definitely take the time to read Migrating to New Navigation Types and watch related WWDC22 talks.

I wish contextual SwiftUI was more predictable

I’m not certain how to phrase this. It’s either that I wish I was better at predicting what a SwiftUI view would look like, or that I wish SwiftUI was more predictable at how views render in different contexts. I recently built a multi-platform SwiftUI utility app, and was struck by how often I ran into unexpected results.

Contextual rendering of a view is a feature, perhaps a key feature, of the SwiftUI framework. The most obvious example is how views render differently by platform. A default button renders differently on macOS as compared to iOS. But this “different contextual representation” goes much deeper. Some views render differently based on whether they are enclosed by another view, such as Form or NavigationView.

If you take the example in Apple’s reference documentation for Form and replace the Form with a VStack, the result is notably different. For the following two screenshots illustrating the differences, I stripped out the enclosing Section views. The view on the left shows the content enclosed within a Form; the view on the right shows the content enclosed within a VStack:

Example code at https://gist.github.com/heckj/72f713faf28014e560449f82b7ab357d
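If you don’t want to follow the gist link, here’s a minimal illustration (not the gist’s exact code) of the same content hosted in a Form versus a VStack – the two variants render quite differently, particularly on macOS:

struct SettingsContent: View {
    @State private var notificationsEnabled = true
    @State private var username = ""

    var body: some View {
        Toggle("Enable notifications", isOn: $notificationsEnabled)
        TextField("User name", text: $username)
    }
}

struct FormVariant: View {
    var body: some View {
        // Grouped, inset, platform-styled rows.
        Form { SettingsContent() }
    }
}

struct StackVariant: View {
    var body: some View {
        // Plain vertical layout with default control styles.
        VStack { SettingsContent() }
    }
}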

I feel obliged to point out that these effects, at least in the case of Form, are documented in Form’s reference documentation. That said, I’m fairly certain that not all view variations are explicitly documented.

In general, the combined effects of views get complicated — and hard to predict — as you compose them, primarily within container views. I do wonder if the struggle to predict how a view will render (and that they render contextually) is responsible for some of the feelings I’ve heard expressed by developers as “SwiftUI isn’t ready for production” or “It still feels beta to me”.

The productivity enhancement of being able to preview a view “live within Xcode” has gone a long way toward mitigating this complexity for me, especially when constructing views that I want to operate across multiple platforms (macOS, iOS, iPadOS). Using the Xcode 14 beta, the automatic live rendering is notably reducing the time for each iteration loop (barring a few odd bugs here and there – live rendering for macOS seems to come with notably more crashes while attempting to generate the preview).

Still, I wish it were easier to predict – or barring that, look up – how a view renders in different contexts. The web/JavaScript development world has Storybook (an amazing resource, by the way), and the closest SwiftUI equivalent is the A Companion for SwiftUI app, which provides an interactive visual catalog. If you’re developing with SwiftUI, it’s a worthy purchase and investment. Be aware that even this app doesn’t always help with understanding the contextual variations for a platform or when views are composed.

The best advice I have to counter this difficulty in prediction is to lean into the previewing – be it for macOS, iOS, tvOS, or watchOS. Use Xcode’s previewing capability, and get to your combined views as quickly as possible. While it can be surprising that an enclosing view can have such an impact on how something renders, seeing the combination so easily while developing at least lets you verify and react when things don’t go as you expect.
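As a sketch of what “leaning into previews” can look like, here’s a preview provider that renders the (hypothetical) Form and VStack variants from the earlier example side by side, plus one pinned to a specific device. The device name is just an example:

struct SettingsContent_Previews: PreviewProvider {
    static var previews: some View {
        Group {
            FormVariant()
                .previewDisplayName("Inside Form")
            StackVariant()
                .previewDisplayName("Inside VStack")
            FormVariant()
                .previewDevice(PreviewDevice(rawValue: "iPhone 13"))
                .previewDisplayName("Form on iPhone 13")
        }
    }
}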

What Apple might do with distributed computing

I’m excited to see not only async-await making it thoroughly into the language (in Swift 5.6), but also the extensions that enable actors and distributed actors as part of this general sweep of the Swift language embracing concurrency. It’s been several years in the making, and the past year has seen many of these base pieces built up through the open-source side of Swift, which has been fantastic to watch. The next version of Swift (version 5.7, I presume) is likely coming next week, but even that doesn’t yet contain all of the outlined goals for concurrency. There’s clearly more to come – so where will it go from here? And how will it get used?

I have some background with distributed computing, back from the early days when OpenStack was starting to become a thing, several years before Kubernetes rose up to provide (in my opinion) even better abstractions, even if it stopped short in just the right way to bring with it a hell-scape of complexity. I’ve been programming in Swift recently, primarily for delivery on Apple platforms, but I keep my fingers in the Linux distribution of Swift, and a close eye on “server-side Swift” in general. I see a huge amount of opportunity in distributed computing and the capabilities it could provide.

With more hope than any real knowledge, I’d love to see a Swift-focused, distributed-across-devices successor to XPC. Expanding the cross-process communication capabilities to cross-device, along with supporting infrastructure in Apple’s operating systems for devices to identify and broadcast to each other, could enable some interesting scenarios. XPC has long been critical for sandboxing processes and generally enabling cross-process communication. It’s used all over the place in macOS and iOS, in the background where we don’t see it. Enable that across devices… and the cluster of Apple devices I have in the house becomes more powerful, because they could leverage each other and what they can individually do.

To me, the obvious starting point is Siri interactions. I have a HomePod mini in my kitchen, and usually more than one other device nearby. When I say “Hey Siri”, everything lights up and tries to respond, and one of them usually runs with the request – but sometimes several make the same attempt. I’ll ask Siri on my phone to activate a shortcut, but the HomePod grabs it – and because it doesn’t have the installed apps or intents, it says it can’t fulfill my request. Wouldn’t it be nice if Siri could know all the intents of all the active devices (where you have permissions, naturally) in the area where the request was made? That kind of thing is only possible when you have a robust and dynamic clustering solution that can let devices come and go, and share information about what’s nearby and what their capabilities are. In short: distributed computing.

Another place that would be useful is data collection and presentation with HomeKit. It feels dumb to me that each device and some iOS app has to manage the history of data for the in-house sensors I have installed. It’s making a silo of every sensor, and restricting my ability to even see, let alone use, the information together. The most effective solution to this today is cloud service integration, but the “advertising freemium” model that has dominance is primarily focused on extracting value from me, not providing it – so no thank you! Instead, I’d love to have a device (such as a HomePod with a bit of storage) stashed in the house that just collected the data from those sensors and made it available. A little, local persistent cache that all the HomeKit things could send to, and iOS apps could read from. A collection center akin to what Apple developed for health, and that doesn’t need to be stashed in a secure enclave. Imagine that as a HomeKit update!

Those two ideas are just based on what’s available today with a bit of evolution. My wildest dreams go full 2001: A Space Odyssey and imagine a rack where you can plug in (or remove) additional modules to add compute, storage, and memory capability. Make that available to other devices with something distributed, authenticated, and XPC-like. It could expand the capability of small devices by making progressively larger amounts of offloaded compute and storage available to devices that are getting lighter, smaller, and more invisible. In short, it’s a place where you could do the offload/tether that used to exist for iPhones to Macs, or watches to iPhones. If the current equivalent of those little white boxes in 2001 had the internals of today’s low-end iPhones, that’s a truly amazing amount of compute you could make available.

Yeah, I know that last one is getting pretty out there, but I love the idea of it anyway. I know it’s possible – maybe not practical – but definitely possible. I helped create something akin to that idea about a decade ago, based on the lessons from NASA’s Nebula Compute (but with a LOT less integration of hardware and software – we were trying to ride it all over commodity gear). That company failed, but a similar vision was replicated later by CoreOS, and a variation was included within Canonical’s Ubuntu. I haven’t seen it really come to fruition, but maybe…

In any case, I’m looking forward to WWDC starting next week, the announcements, and mostly seeing the work a number of friends have quietly been doing for a while.

What being an Open Source Developer means to me

I’ve periodically described myself as an “open source developer” in those pithy biography blocks that you sometimes get (or have) to create. Developing and providing open source software means something specific to me, and I’d like to share how I think about it.

At its core, it’s only a little about solving problems with code – it’s as much about the community for me. When I provide open source software, it means I’m both willing to and want to contribute knowledge that others can use, with few constraints. I’m not seeking payment, nor am I seeking to become a servant. I build and share open source software because I enjoy having, being a part of, and building a community that helps each other. The software solution itself is useful, but almost secondary. I enjoy, and search for, the benefits of having a community with diverse skills: people I can go to with questions to get answers, or at least opinions on directions to search, and warnings about steep cliffs or ravines that may lie along the path – at least if they know of them.

I’ll happily host software on GitHub or GitLab, take the time to make it into a package that others can use, and submit it to a relevant package index. I often do that as a side effect of solving a problem for myself. Sometimes I like to share how I tackled it for anyone else coming along with the same questions. I get a huge amount of value from reading other people’s work solving problems, so I try to share my thinking as well. The solutions may not be optimal, or even correct in all cases, but they’re shared – and that’s the main thing. Because it isn’t easy to find solutions just by looking at source (or reading GitHub discussions), I also write about some of my searching for solutions on this blog. It helps to get at least some of the details into the global search indices. A blog isn’t a great place to share code, but it’s a decent place to talk about a problem or solution – and point to the code.

I’ve run into a number of folks who don’t share my opinion of what’s suitable to open source. I’ve heard them give their reasons for not publishing as open source as if they feel guilty for not doing so. I’m not sure that’s what they’re feeling, but I kind of read that into some of what’s said. I’d really like to tell them there’s no guilt or shame in not contributing, or in keeping your learning and explorations to yourself.

I think some of that guilt has evolved from technical recruiters using (and sometimes expecting) a person’s public GitHub repositories as a reflection of their capabilities, or a kind of portfolio of work. I’ve looked myself when I’ve hired people, to see if there’s something there – in the past, I thought it could show interests that are useful to know about, or at least give a starting point for a conversation. The downside of this practice is that people end up thinking that whatever they might “open source” has to speak with perfection about the cleverness, clarity, or robustness of their solutions. I get it – that concern for what others might think based on what I’ve done. While I personally value the sharing, I respect the desire not to be put in that position.

There are also plenty of people who legally can’t share. I think it sucks that corporate policies with these legal constraints exist, and from a corporate policy perspective I think it’s brutally short-sighted. That said, I’m also someone who believes that ideas are nearly free – that it’s execution that matters in building and running a company. Still, it’s common, especially for technology companies, to legally constrain their employees from participating in open source without explicit, prior approval – or in some cases to deny it altogether, shitty as that can be.

I like to learn and I like to teach – or more specifically, I like to share what I’ve learned and learn from others in the process. (I’ve never learned so much as when I taught.) I get a lot of value out of being part of a community – more so one that shares, sympathizes during the rough spots, and celebrates the successes. I’ve been repeatedly fortunate to find and be involved in great communities, and I’ve learned to seek them out – or to help foster and build them. I’ve also run from “so-called” communities with individuals who delighted in mocking others, or attacked an opinion because it was different from theirs (I’m looking at you, orange hell-site).

While I love sharing, I’ve learned (the hard way) to set boundaries. Just because I like and want to share doesn’t mean I’m beholden to someone else’s opinion of what I should do or how I solve a problem, or that I have any obligation to them beyond what I choose to give. I’ve had people demand support from me, or from a team I was a part of that was collaboratively developing some open source software. I’ve even had a few experiences where I was targeted in personal, verbal attacks. It’s unfortunate that this comes with the territory of sharing with the world, but it’s also been (fortunately) rare. Most of the time the people being abusive, or selfish jerks, aren’t part of the community that I value. In my experience, they often stumbled onto code I helped create while looking for a solution to some problem they had, and thought that because I shared, they were owed something. Sometimes they’re just people having a bad day and relieving their frustrations in the wrong place. There was a time when I’d offer to help the folks who’d been abrasive – but only for money, as a contract or consulting fee. These days I shunt such ravings into a trash can, shut the lid, and don’t look back. It’s not worth the time, or the money, to deal with assholes.

The most valuable thing is the people you can connect with. These may be folks you’ve known for years, or they might be people you’ve never met or heard from. One of the best parts of “being an open source developer” is putting solutions out there that someone can pick up and use, and alongside them implying “hey – there’s a place for you here. Drop in, have a seat, and chat with us.” They might stop in for a few minutes, a couple of days, or maybe longer. I expect everyone moves on at some point, and my hope is that they take away a spark of being part of a positive community – a seed of how it could work that they can grow.

Back at the beginning of this, I said open source meant something specific to me – sharing knowledge with “few constraints”. That’s not the same as no constraints. The formal mechanism in software around these constraints is licensing. I watched the legal battles of the ’80s and ’90s, felt (and dealt with) their fallout, and avidly followed the evolution of a formal definition of open source and the blooming of the various licenses that provide a legal backing. I get the licensing, and I take it seriously. While I understand the foundational desires behind the free-software movement and its various licenses – such as GPL and LGPL – that’s not for me; I’m not a fan. I’m not going to, and don’t want to, take that hard a line and impose those additional legal constraints on the product that represents the knowledge I’m sharing. About the only constraint I want to put on something is attribution. The gist of which is “don’t be a jerk and take credit for my work”, but really it boils down to “Recognize that you, like I, stand on the shoulders of our peers and those that came before us.”
