Iterating through Strings in Swift

I recently decided to dive into a new bit of learning: creating my own software language interpreter. No, I’ve not gone stark raving mad due to COVID isolation; it is an interesting challenge that I wanted to understand better. Over a year ago, I remember Gus mentioning the process of creating an online book in his blog. That book was Crafting Interpreters, and I kept a reference to that site.

I’ve started working through the book, except that instead of doing the example code in Java, I’m converting the examples and content on the fly and implementing them in Swift. I just worked through implementing the scanner portion of the code. That code required me to read through a text file (or string) and get tokens from it. As it turns out, this is super easy to do in Swift, but how to do it was quite non-obvious to me.

Since its inception, the Swift language has changed how it deals with strings, I think a couple of times. Because Swift strings are Unicode-aware, and a single character can take more than one byte, you can’t just iterate through a string a byte at a time and get what you want. A lot of early (pre-Swift 3) examples did some of this, or variations on the theme, but that’s no longer valid. So a number of examples on Stack Overflow tackling this kind of thing are dead code and very out of date.
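
To make the byte-versus-character point concrete, here is a minimal sketch. The sample strings are my own choice for illustration, written with Unicode escapes so the result doesn’t depend on how the source file is encoded:

```swift
// "café" with a precomposed é (U+00E9), which takes two bytes in UTF-8.
let word = "caf\u{E9}"
let characterCount = word.count      // counts user-visible characters
let byteCount = word.utf8.count      // counts UTF-8 bytes

// A flag emoji is one Character built from two Unicode scalars.
let flag = "\u{1F1E8}\u{1F1E6}"
let flagCharacters = flag.count
let flagScalars = flag.unicodeScalars.count

print(characterCount, byteCount, flagCharacters, flagScalars)
```

The character count and the byte count disagree for both strings, which is why stepping through a string by a fixed number of bytes can land in the middle of a character.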

The first, and most obvious, way to iterate through the characters in a string is to use it within a for loop. This is the pattern that you’ll see directly in The Swift Programming Language Guide on Strings and Characters:

for char in yourString {
    print(char)
}

Works great, super efficient… except that the loop doesn’t give you a position in the string, so you can’t do some of the tricks that parsers and scanners want to do, such as peeking to see what the next character might be.

The way you can interact with strings in this fashion involves a specific type called String.Index. The details are in a section just a bit further down in that same guide: Accessing and Modifying a String.

Don’t make the mistake of thinking a String.Index is just a number that you can add to or subtract from to move around the string. Unicode makes it significantly more tricky than that, and the language represents String.Index as its own separate type, I think partially to keep me (and you) from making that mistake. Fortunately, the standard library offers a couple of properties on each string to give us reference points: startIndex and endIndex. Note that endIndex is the position just past the last character, so it isn’t a valid subscript. You step forward to the next index with a method on the string itself, not on the index: yourString.index(after: someIndex). There’s also a way to step back: yourString.index(before: someIndex). These let you walk forward (or back) through a string character by character, or shift the index forward a step (or two), which is the kind of thing you need when you’re making a scanner for a language interpreter.
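
Here is a rough sketch of how stepping and peeking with String.Index can look. The peek helper and the sample input are hypothetical names of mine, not something from the book or the standard library; only startIndex, endIndex, index(after:), and the subscript are standard library API:

```swift
let source = "a+b"                 // sample input, made up for illustration
var current = source.startIndex

// Hypothetical helper: return the character after `index` without advancing,
// or nil when there is no next character.
func peek(in text: String, after index: String.Index) -> Character? {
    guard index < text.endIndex else { return nil }
    let next = text.index(after: index)
    return next < text.endIndex ? text[next] : nil
}

var seen: [Character] = []
var lookahead: [Character?] = []

// Walk the string one character at a time, recording what peek reports.
while current < source.endIndex {
    seen.append(source[current])
    lookahead.append(peek(in: source, after: current))
    current = source.index(after: current)
}

print(seen, lookahead)
```

The loop stops when the index reaches endIndex, and the helper checks against endIndex before subscripting, since subscripting at endIndex is a runtime error.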

One last mechanism that’s helpful to know: you can retrieve a substring from the characters between two index locations, using a range expression made up of the two indices. A substring isn’t quite a string; it’s a different type in Swift (Substring). Swift plays some tricks with referencing the original string’s storage when you’re using this type, but you can easily make a full string to use as an argument for another function:

let newString = String(firstString[indexA...indexB])
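
For a version you can paste into a playground, here is a small sketch with made-up sample input. It uses index(_:offsetBy:) to get the second index, and shows the closed range (which includes the character at indexB) next to the half-open range (which stops before it):

```swift
let firstString = "print(42)"                        // sample input, made up
let indexA = firstString.startIndex                  // position of "p"
let indexB = firstString.index(indexA, offsetBy: 4)  // position of "t"

let closedSlice = firstString[indexA...indexB]       // Substring, includes "t"
let halfOpenSlice = firstString[indexA..<indexB]     // Substring, stops before "t"
let newString = String(closedSlice)                  // a full String copy

print(newString, halfOpenSlice)
```

For a scanner, the half-open form is often the handier one: with a start index at the beginning of a token and a current index just past its end, source[start..<current] is the token text.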

As I mentioned earlier, this is all in the Swift language guide. I hate to admit that it hasn’t been the first place I looked for help and guidance, but it probably should have been. It’s all there, and more.

Published by heckj

Joe has broad software engineering development and management experience, from startups to large companies. Joe works on projects ranging from mobile to multi-cloud distributed systems, has set up and led engineering teams and processes, as well as managing and running services. Joe also contributes and collaborates with a wide variety of open source projects, and writes online at https://rhonabwy.com/.
