Rust and NSString

Rust and NSString

So yeah. You want to learn Rust and have a small project in mind. Like one hour adventure. In and out.

My small project was Clink. The idea is pretty simple, get data from the clipboard, check if there link, clean the link from tracking params, put the cleaned link back to the clipboard. The first version of it (0.1.0) was done maybe in two hours. This is not because I am smart, but because Rust is an awesome programming language. Then I posted a link to the GitHub repo on the Rust subreddit and gathered some feedback. Project stabilized, aka I run out of ideas, at version 0.6.0. All that time, I was a happy user, until I looked at Activity Monitor and saw that:

Activity Monitor

That is when real adventure starts.

Usually, you write software for mac using Objective C or Swift. Sometimes you are crazy enough to do it in C, and the Objective C runtime exposes the C interface for you. But sometimes, for no particular reason, you want to learn Rust. And for that occasion, Steven Sheldon created a Rust wrapper around Objective C runtime. If you are interested in details here is the link to his blog post where he explains what and why is going on.

How do you approach tasks like finding and fixing a memory leak in a program written in a language you barely know, which calls API you don't know? Good question.

First things first - tools. After some googling, I found that nowadays everything is easy.

installation is easy:

xcode-select --install
brew install cargo-instruments

usage also easy:

cargo instruments -t Allocations

And I get nothing. After some thinking, I figured out that Clink is never stopping, and that's why I do not get any traces. So I come up with this:

use clipboard::{ClipboardContext, ClipboardProvider};

fn main() {
    let mut ctx: ClipboardContext = ClipboardProvider::new().unwrap();
    for _ in 0..20_000 {
        match ctx.get_contents() {
            Ok(str) => println!("{}", str.chars().count()),
            Err(_e) => println!("{:?}", _e),
        }
    }
}

The idea is simple - get 20 000 times the value from the clipboard and, to be sure that there is something, print the count of chars. I know that I should count graphemes, but in this case, it does not matter. Here is what I got:

full glory

Yeah. It produces a nice graph (there are no values on the axis, but the last value is 2.67 GB). But how to find what is actually leaking? I did not come up with a better idea than check all repeating allocations if there is corresponding free. Here is how good allocation looks like:

good allocation

And this is bad allocation:

bad allocation

The stack trace is on the right side of the window. So now, finding where the memory leak is not a problem. In my case, the Clipboard crate calls NSPasteboard.readObjectsForClasses to get current clipboard content, but not Rust nor Objective C runtime knows who is responsible for freeing memory, so nobody frees it. Surprise! Here you can suggest - choose another crate to work with the clipboard. But the problem is that every other rust clipboard implementation for mac, in one way or another, copies Avi Weinstock's implementation and leaks :(

So my first step was to simplify the code a little bit. Since the get_contents method from the ClipboardProvider trait returns a String, we can use NSPasteboard.stringForType, which produces NSString, and this is as close to the Rust String as it can get. For calling this method, and by calling method, I mean sending the message, we need to pass an enum value NSPasteboardTypeString as a param. I spent at least a couple of evenings trying to figure out where I can get this enum value. As always, everything is simple when you know it:

#[link(name = "AppKit", kind = "framework")]
extern "C" {
  pub static NSPasteboardTypeString: Sel;
}

Now we got our NSString, convert it to Rust string, and send the release to NSString after.

disappointment

This is how disappointment looks like. But if you look at this screenshot long enough, you will see that leak is more than three times lower. The first allocations dump showed more than two and a half gigs of RAM, but this one is only 775 megabytes. So this means two things:

First - this is a huge win.

Second - there is at least one more leak.

The next leak was found in the INSString.as_str() method. You may ask WTF is INSString, and this would be a trait from the obj-foundation crate. We convert NSString to C string, by reading data from a pointer which we get from NSString.UTF8String. This pointer, according to Apple documentation should be freed at the time when NSString itself will be released or sooner. In reality, it is mostly true, except when you have characters outside the ASCII range. So it is leaking on the Objective C runtime level, or I again don’t understand how things are working. The solution that worked looks like that:

fn nsstring_to_rust_string(nsstring: *mut NSString) -> Result<String, Box<dyn Error>> {
  unsafe {
      let string_size: usize =
          msg_send![nsstring, lengthOfBytesUsingEncoding: NSUTF8StringEncoding];
      //we need +1 because getCString will return null terminated string
      let char_ptr = libc::malloc(string_size + 1);
      let res: bool = msg_send![nsstring, getCString:char_ptr  maxLength:string_size + 1 encoding:NSUTF8StringEncoding];
      if res {
          let c_string = CStr::from_ptr(char_ptr as *const i8);
          libc::free(char_ptr);
          Ok(c_string.to_string_lossy().into_owned())
      } else {
          libc::free(char_ptr);
          Err(err("Casting from NSString to Rust string has failed"))
      }
  }
}

What is going on:

  1. We get the size of a string in NSUTF8StringEncoding. I did not find a way to import NSUTF8StringEncoding from anywhere, but according to ,Apple docs, it equals 4
  2. We are allocating memory of a size we got on a first step plus one. Plus one is needed because getCString returns a null-terminated string
  3. By calling getCString, we ask to write a null-terminated string in UTF8 encoding to memory we allocated on step two
  4. If the third step was successful
  5. We creating a C string from a pointer we got on a second step
  6. Free memory we allocated on a second step
  7. Convert C string to Rust string and return it
  8. If the third step was unsuccessful
  9. Free memory we allocated on a second step
  10. Return error

And here is the result:

final result

Now, this is how the win looks like. And Clink, after running a couple of days, still uses one MB of ram. If you are using Clink on a mac, which is very unlikely, update to version 0.6.1. I hope it is finally fixed :)