How to build a flexible multiple type data system in Rust without cloning strings?

Submitted by 穿精又带淫゛_ on 2019-12-11 02:04:15

Question


I want to build a system where data of different types (i32, String, ...) flows between functions that modify the data. For example, I want to have an add function that gets "some" data and adds it.

The add function gets something of type Value and if Value is an i32, it adds the two i32 values, if it is of type String, it returns a string that combines both strings.

I know that this would be almost perfect for template programming (or whatever this is called in Rust, I'm coming from C++) but in my case I want to have small code blocks that handle the stuff.

As an example, with f64 and String, using Float and Text as the names, I have:

pub struct Float {
    pub min: f64,
    pub max: f64,
    pub value: f64,
}

pub struct Text {
    pub value: String,
}

pub enum Value {
    Float(Float),
    Text(Text),
}

Now I want to implement a function that gets a value that is supposed to be a string and does something to it, so I implement the to_string() method for Value:

impl std::string::ToString for Value {
    fn to_string(&self) -> String {
        match self {
            Value::Float(f) => f.value.to_string(),
            Value::Text(t) => t.value.clone(),
        }
    }
}

Now the function would do something like:

fn do_something(value: Value) -> Value {
    let s = value.to_string();
    // do something with s, which probably leads to creating a new string
    let new_string = s.to_uppercase(); // placeholder for the real transformation

    let new_value = Text { value: new_string };
    Value::Text(new_value)
}

In the case of a Value::Float this would create a new String, then a new String with the result and return it, but in the case of a Value::Text this would clone the string, which is an unnecessary step, and then create the new one.

Is there a way for the to_string() implementation to create a new String for Value::Float but return a reference to the existing value for Value::Text?


Answer 1:


The "standard" way to deal with the possibility of either a String or a &str is to use a Cow<str>. COW stands for clone-on-write (or copy-on-write) and you can use it for other types besides strings. A Cow lets you hold either a reference or an owned value, and only clone a reference into an owned value when you need to mutate it.
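As a minimal sketch of that behaviour, the two helpers below (normalize_tabs and shout are hypothetical names, not part of the question's code) return a borrowed Cow when nothing has to change and only allocate when a modification is actually needed:

```rust
use std::borrow::Cow;

/// Hypothetical helper: replaces tabs with spaces, allocating only if needed.
fn normalize_tabs(input: &str) -> Cow<'_, str> {
    if input.contains('\t') {
        // A change is required, so build and return an owned String.
        Cow::Owned(input.replace('\t', " "))
    } else {
        // Nothing to change: hand back the borrowed data without allocating.
        Cow::Borrowed(input)
    }
}

/// Hypothetical helper: appends '!' in place.
/// `to_mut` clones into an owned String only if `text` is currently borrowed.
fn shout(mut text: Cow<'_, str>) -> Cow<'_, str> {
    text.to_mut().push('!');
    text
}
```

Callers can treat both results as a &str and never need to know which variant they received.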

There are a couple of ways you can apply this to your code:

  1. You can just add an Into<Cow<str>> implementation and keep the rest the same.
  2. Change your types to hold Cow<str>s throughout, to allow Text objects to hold either an owned String or a &str.

The first option is the easiest: you just implement the trait. Note that Into::into takes self by value, so you need to implement it for &Value rather than Value; otherwise the borrowed data would refer to a value that into has already consumed and that is therefore invalid.

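use std::borrow::Cow;

impl<'a> Into<Cow<'a, str>> for &'a Value {
    fn into(self) -> Cow<'a, str> {
        match self {
            // A float has no string to borrow, so this arm allocates.
            Value::Float(f) => Cow::from(f.value.to_string()),
            // Borrows the existing String; nothing is cloned.
            Value::Text(t) => Cow::from(&t.value),
        }
    }
}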

Implementing this for &'a Value lets us tie the lifetime in the Cow<'a, str> back to the source of the data. This wouldn't be possible if we implemented it for Value directly, which is a good thing, because the data would be gone by the time the Cow was used!
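For illustration, here is a self-contained sketch of how do_something from the question could use this implementation. It assumes the function can take &Value instead of Value, and the upper-casing step is only a placeholder for whatever the real transformation is:

```rust
use std::borrow::Cow;

pub struct Float {
    pub min: f64,
    pub max: f64,
    pub value: f64,
}

pub struct Text {
    pub value: String,
}

pub enum Value {
    Float(Float),
    Text(Text),
}

impl<'a> Into<Cow<'a, str>> for &'a Value {
    fn into(self) -> Cow<'a, str> {
        match self {
            Value::Float(f) => Cow::from(f.value.to_string()),
            Value::Text(t) => Cow::from(&t.value),
        }
    }
}

// Assumption: taking &Value is acceptable for the caller.
fn do_something(value: &Value) -> Value {
    // Borrowed for Value::Text (no clone), owned for Value::Float.
    let s: Cow<str> = value.into();
    // Placeholder transformation that produces the one new String.
    Value::Text(Text { value: s.to_uppercase() })
}
```

With this shape, the Text case allocates only the result string, while the Float case still allocates once for the formatted number and once for the result.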


An even better solution might be to use Cow in your Text struct too:

use std::borrow::Cow;

pub struct Text<'a> {
    pub value: Cow<'a, str>,
}

This will let you hold a borrowed &str:

let string = String::from("hello");

// same as Cow::Borrowed(&string)
let text = Text { value: Cow::from(&string) };

Or a String:

// same as Cow::Owned(string)
let text = Text { value: Cow::from(string) };

Since Value now can indirectly hold a reference, it will need a lifetime parameter of its own:

pub enum Value<'a> {
    Float(Float),
    Text(Text<'a>),
}

Now the Into<Cow<str>> implementation can be for Value itself, because consuming it just moves the Cow out: a borrowed string stays borrowed, and an owned one is handed over without copying:

impl<'a> Into<Cow<'a, str>> for Value<'a> {
    fn into(self) -> Cow<'a, str> {
        match self {
            Value::Float(f) => Cow::from(f.value.to_string()),
            Value::Text(t) => t.value,
        }
    }
}
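A possible end-to-end sketch of this second variant follows. The trim_value function is a hypothetical example of a processing step, chosen because trimming can reuse the input: borrowed text is returned as a sub-slice and owned text is truncated in place, so neither path clones the string.

```rust
use std::borrow::Cow;

pub struct Float {
    pub min: f64,
    pub max: f64,
    pub value: f64,
}

pub struct Text<'a> {
    pub value: Cow<'a, str>,
}

pub enum Value<'a> {
    Float(Float),
    Text(Text<'a>),
}

impl<'a> Into<Cow<'a, str>> for Value<'a> {
    fn into(self) -> Cow<'a, str> {
        match self {
            Value::Float(f) => Cow::from(f.value.to_string()),
            Value::Text(t) => t.value,
        }
    }
}

// Hypothetical step: strips trailing whitespace from the textual form.
fn trim_value(value: Value<'_>) -> Value<'_> {
    let s: Cow<str> = value.into();
    // Byte length of the string once trailing whitespace is removed.
    let trimmed_len = s.trim_end().len();
    let out = match s {
        // Borrowed input: return a shorter slice of the same data.
        Cow::Borrowed(b) => Cow::Borrowed(&b[..trimmed_len]),
        // Owned input: shorten the existing String in place.
        Cow::Owned(mut o) => {
            o.truncate(trimmed_len);
            Cow::Owned(o)
        }
    };
    Value::Text(Text { value: out })
}
```

Transformations that must build a new string anyway would simply return Cow::Owned for both cases.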

Just like String, Cow<str> implements Deref<Target = str>, so it can be used anywhere that a &str is expected, just by passing a reference to it. This is another reason why you should always try to accept &str in a function argument, rather than String or &String.
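A small sketch of that deref coercion, using a hypothetical word_count helper that only asks for &str and is then called with a literal, a &String and a &Cow<str>:

```rust
use std::borrow::Cow;

// Hypothetical helper: counts whitespace-separated words.
fn word_count(s: &str) -> usize {
    s.split_whitespace().count()
}

// Hypothetical caller: one function signature serves all three sources.
fn count_all() -> (usize, usize, usize) {
    let owned: String = String::from("one two");
    let cow: Cow<str> = Cow::Borrowed("one two three");
    (
        word_count("one"),  // a &str literal is passed as-is
        word_count(&owned), // &String coerces to &str
        word_count(&cow),   // &Cow<str> coerces to &str
    )
}
```

Had word_count taken &String instead, the literal and the Cow would each need an extra allocation or conversion at the call site.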


Generally, you can use Cows as conveniently as Strings, because they have many of the same impls. For example:

let input = String::from("12.0");
{
    // This one is borrowed (same as Cow::Borrowed(&input))
    let text = Cow::from(&input);
}
// This one is owned (same as Cow::Owned(input))
let text = Cow::from(input);

// Most of the usual String/&str trait implementations are also there for Cow
let num: f64 = text.parse().unwrap();


Source: https://stackoverflow.com/questions/52281496/how-to-build-a-flexible-multiple-type-data-system-in-rust-without-cloning-string
