One thing some newer programming languages do well is having materialised data and data views from day one.
First, what are materialised data and data views?
Before anything else: I made those names up, but the idea is not new.
Materialised data is a struct/class that owns some allocated data. For instance, C++ std::string / std::vector and Rust String / Vec own the data they provide.
A data view, on the other hand, is a struct/class that indexes into materialised data but does not own it. It's trivial to convert materialised data to a data view, but not the other way around. Examples of data views are std::string_view and std::span in C++ and &str and &[T] in Rust.
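To make the two directions concrete, here is a minimal sketch, assuming C++17. The helper names `first_chars` and `materialise` are made up for illustration: going from owned data to a view is implicit and copies nothing, while going back has to be spelled out because it allocates and copies.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: borrows at most the first `n` characters of `text`.
// The result still points into the caller's buffer; nothing is copied.
std::string_view first_chars(std::string_view text, std::size_t n) {
    return text.substr(0, n);
}

// Hypothetical helper: turning a view back into materialised data must be
// written explicitly, because it allocates and copies the characters.
std::string materialise(std::string_view view) {
    return std::string(view);
}
```

A std::string can be passed straight to `first_chars`, and the view it returns is only valid for as long as that string is alive and unmodified.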
The reasons below are not in order of importance!
First: Performance
size_t length(const std::string &buffer); // C++
fn length(buffer: String) -> usize; // Rust
Every time I invoke those functions with something that is not already a std::string / String, I am allocating. This is easy to see in Rust, where the conversion is explicit, but not so easy in C++, where it happens implicitly.
// No extra allocation
std::string my_val = "Some data to be used";
size_t count = length(my_val);
// Extra allocation to count something that is static
size_t count2 = length("More data to be counted");
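For comparison, a minimal sketch of the same function taking a view, assuming C++17. The body is my assumption, since only the signature appears above.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// View-based variant of `length`: accepts a std::string, a string literal
// or another view without constructing a temporary std::string.
std::size_t length(std::string_view buffer) {
    return buffer.size();
}
```

With this signature both `length(my_val)` and `length("More data to be counted")` compile unchanged, and the literal is only measured, never copied.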
It might look silly, but let's look at a more real-world example, where it matters more:
// Declared somewhere
std::vector<uint8_t> compress(const std::vector<uint8_t> &chunk);
std::vector<uint8_t> my_big_data;
// I need this extra allocation if I am not using std::span, a view.
std::vector<uint8_t> slice_of_data = {my_big_data.begin(), my_big_data.end() - my_big_data.size() / 2};
std::vector<uint8_t> res = compress(slice_of_data);
And another interesting example:
std::unordered_map<std::string, my_class> my_hashmap;
// Oops! A temporary std::string (and possibly a heap allocation) just for the lookup!
my_hashmap.find("something");
Second: Memory ownership
When you receive a view, you know you must not move the data out of it:
void do_something(std::span<const uint8_t> subset_of_data);
// It's clear the memory cannot be moved out of subset_of_data
C++ has std::string_view and std::span!
That is true, and it's great that it does. However, tons and tons of code prefers to pay the price of const std::string &val rather than use const char *, which makes sense. Also, tons of libraries will have a hard time moving from storing std::string to accepting std::string_view, as a lot of metaprogramming relies on the stored type and there is no clear way to bridge the two.
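For the simple case, where no metaprogramming depends on the stored type, the bridge can be sketched like this, assuming C++17. `named_item` is a made-up class: it owns its data as std::string and only uses views at the API boundary.

```cpp
#include <string>
#include <string_view>

// Hypothetical class: stores materialised data, accepts and returns views.
class named_item {
public:
    // The single copy happens here, when the view is materialised.
    explicit named_item(std::string_view name) : name_(name) {}

    // Callers borrow the name; nothing is copied.
    std::string_view name() const noexcept { return name_; }

private:
    std::string name_;
};
```

Callers can pass a literal, a std::string or another view, and the object stays valid after the caller's buffer goes away because it keeps its own copy.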
None of this is a complaint. On the contrary, I am grateful new languages are getting it right and old languages are catching up.
Rust is a very well-designed language, and there are two interesting ways it does that:
fn length(buffer: &str) -> usize;
let chars1 = length("My data"); // No extra allocation
let val: String = socket.recv();
let chars2 = length(val.as_str());
// or
let chars2 = length(&val);
// In Rust, this conversion (deref coercion) is defined by the Deref trait. Again, no extra allocation
Many other posts of mine have no conclusion, especially when having one would be silly, but this one almost deserves one. That said, let's pay attention to the ownership of data!