Parsing HTML in Swift 3 (and RTF, etc.)

The other day, a friend and I were discussing how to parse HTML and turn it into an NSAttributedString. That is, to take a tagged string such as

<h1><i>This</i> is <b>NPR</b></h1>

and produce an NSAttributedString that renders as:

This is NPR

[Image: xkcd comic. Caption: *This* is HTML]
  • My friend, who works at a well-known dating app startup, had a fairly limited problem (no font or color variations, for example), so he could hard-code much of his solution and use NSScanner for the rest.
  • I came up with an algorithm built around a struct that holds a range and the attribute name to apply. The structs get filled in while walking through the original string, using a regex pattern (something like <\/?\w[\w\d]?\w{0,9}>) to catch the tags.
  • But before we could have the fun of comparing and combining the two solutions, I had a facepalm moment: NSAttributedString already has built-in support for this. So I wrote the following extension.
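
For comparison, the regex idea from the second bullet might be sketched like this. It is only an illustration, not the extension mentioned above and not the original algorithm: the names TagSpan and extractSpans are hypothetical, the pattern is a simplified stand-in for the one quoted in the bullet, and it assumes well-nested tags with no attributes. Mapping each tag to an actual font or color attribute is left out.

```swift
import Foundation

// Hypothetical record: a tag name plus the range it covers in the
// tag-free output text (UTF-16 offsets, as NSRange expects).
struct TagSpan {
    let tag: String
    let range: NSRange
}

// Walks `html` once, drops simple tags, and records what each tag pair covered.
// Assumption: tags are well nested and carry no attributes.
func extractSpans(from html: String) -> (text: String, spans: [TagSpan]) {
    // Simplified pattern (an assumption): "<", optional "/", a letter,
    // up to nine more letters or digits, ">".
    let pattern = "</?[A-Za-z][A-Za-z0-9]{0,9}>"
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else {
        return (html, [])
    }

    let source = NSString(string: html)
    let fullRange = NSRange(location: 0, length: source.length)
    var text = ""
    var textLength = 0                            // UTF-16 length of `text` so far
    var openTags: [(tag: String, start: Int)] = [] // stack of tags not yet closed
    var spans: [TagSpan] = []
    var cursor = 0                                // position in `source` after the last tag

    for match in regex.matches(in: html, options: [], range: fullRange) {
        let tagRange = match.range

        // Copy the plain text that sits between the previous tag and this one.
        let gap = NSRange(location: cursor, length: tagRange.location - cursor)
        text += source.substring(with: gap)
        textLength += gap.length
        cursor = tagRange.location + tagRange.length

        // Split "<b>" or "</b>" into a closing flag and a bare name.
        let raw = source.substring(with: tagRange)
        let isClosing = raw.hasPrefix("</")
        let name = raw.trimmingCharacters(in: CharacterSet(charactersIn: "</>")).lowercased()

        if !isClosing {
            openTags.append((tag: name, start: textLength))
        } else if let last = openTags.last, last.tag == name {
            openTags.removeLast()
            let covered = NSRange(location: last.start, length: textLength - last.start)
            spans.append(TagSpan(tag: name, range: covered))
        }
        // A closing tag that does not match the top of the stack is ignored.
    }

    // Copy whatever follows the final tag.
    text += source.substring(from: cursor)
    return (text, spans)
}

let parsed = extractSpans(from: "<h1><i>This</i> is <b>NPR</b></h1>")
print(parsed.text)   // This is NPR
for span in parsed.spans {
    print(span.tag, span.range.location, span.range.length)
}
```

For the sample string, the spans come out in closing order: i covering "This", b covering "NPR", then h1 covering the whole text. Each one could then be turned into an addAttribute call on an NSMutableAttributedString, which is exactly the bookkeeping the built-in route makes unnecessary.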
