Rust Roadmap 2021: Allowing for arbitrary size integer primitives
In this blog entry, following the Call for Rust 2021 Roadmap items, I shall lay out my vision for giving Rust the ability to use arbitrarily sized integer primitives.
Motivation
This request has multiple motivations, but following the rules of the Call for Rust 2021 Roadmap items, I'm writing them down as agile user stories:
- As a system developer I want to be able to use automatically range limited types to prevent incorrect use and eliminate error checking
- As an embedded developer I want to be able to safely represent memory-mapped registers as bitfields
So let me give you a bit more background about the status quo and what I mean by those two stories before giving you my ideas on how to address them.
Range limited types
In a lot of areas it is quite common to have the desire to provide range
limited types whose allowed ranges do not coincide with the natural primitive
but still are a power of 2. E.g. say you want to provide an imaging library
which is supposed to work with 3 components (R, G and B). The fact that
we have three components might already provide a hint at what I'm trying to get
at...
So let's continue our exploration... A typical representation is RGB888 which
means that each component consists of 8 bits (which is great since we can
simply use 3 u8s for that and by keeping the components separate we should be
fine, and not have to worry about the combined representation of 24 bits).
However, a lot of embedded displays only have a color depth of 16 bits, in which case the representation is usually something like RGB565. That leaves us with a number of options for handling those values, none of which is ideal.
Today we could do:
/// Values outside of the range will be clamped to the highest allowed
/// component value
pub fn set_pixel_rgb565(&mut self, r: u8, g: u8, b: u8) {
let (r, g, b) = (min(r, 31), min(g, 63), min(b, 31));
...
}
or:
/// Supplying component values outside the designated range is a
/// grave user error and will panic
pub fn set_pixel_rgb565(&mut self, r: u8, g: u8, b: u8) {
assert!(r < 32);
assert!(g < 64);
assert!(b < 32);
...
}
or:
/// Supplying component values outside the designated range will
/// yield an Err which needs to be handled by the caller
pub fn set_pixel_rgb565(&mut self, r: u8, g: u8, b: u8) -> Result<(), E> {
if r >= 32 || g >= 64 || b >= 32 {
return Err(E::OutOfRange);
}
...
}
None of the three options is good; each one really just makes a tradeoff of some kind.
What we really want to do instead is:
/// Rust will do the type-checking for us
pub fn set_pixel_rgb565(&mut self, r: u5, g: u6, b: u5) {
...
}
This ensures that only correct types will be passed in, and they behave exactly the same as all other "natural" primitives.
Before people scream at me "why don't you use the ux crate for this?", I'd like to point out that it is rather incomplete and outdated. I'm also convinced that the compiler can do a much better job of catching incorrect uses of such types at compile time instead of panicking at runtime.
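For comparison, a library-only emulation in today's Rust might look roughly like the sketch below. The names U5, U6, OutOfRange and pack_rgb565 are made up for illustration (they are not the ux crate's API), and the bit layout assumes the usual RGB565 ordering with red in the top bits. Note that the range check still happens at runtime inside try_from, which is exactly the part a built-in type could move to compile time.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a 5-bit unsigned integer (values 0..=31).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct U5(u8);

/// Hypothetical stand-in for a 6-bit unsigned integer (values 0..=63).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct U6(u8);

/// Error returned when a value does not fit into the target width.
#[derive(Debug, PartialEq, Eq)]
pub struct OutOfRange;

impl TryFrom<u8> for U5 {
    type Error = OutOfRange;

    fn try_from(v: u8) -> Result<Self, Self::Error> {
        if v < 32 { Ok(U5(v)) } else { Err(OutOfRange) }
    }
}

impl TryFrom<u8> for U6 {
    type Error = OutOfRange;

    fn try_from(v: u8) -> Result<Self, Self::Error> {
        if v < 64 { Ok(U6(v)) } else { Err(OutOfRange) }
    }
}

/// Packs the components into a 16-bit value, assuming the layout
/// red = bits 11..=15, green = bits 5..=10, blue = bits 0..=4.
pub fn pack_rgb565(r: U5, g: U6, b: U5) -> u16 {
    (u16::from(r.0) << 11) | (u16::from(g.0) << 5) | u16::from(b.0)
}
```

A caller would write something like pack_rgb565(U5::try_from(31)?, U6::try_from(63)?, U5::try_from(31)?), so the error handling has moved to the call site but has not gone away.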
But let's move on to the other use case...
Bitfields
One glaring omission in the Rust type system is the lack of bitfields,
which are arguably a MUST HAVE for any systems programming language.
Especially in embedded Rust we have thousands of memory-mapped input/output
(MMIO) registers which cram a number of different functions, identified by one
or more bits, into a value located at a specific memory address. To access the
individual bits, other languages allow the (very unsafe) specification of
structures as bitfields, while we in the Rust world have to use either gnarly
macros or gobs of Rust code, which also turn into gobs of binary code in dev
mode (but I'll leave that to another Roadmap blog post maybe) and are
superslow to compile.
I won't bore you with details or pages of Rust code here, but if you're
interested in seeing real life code in use today for one single register(!)
of an STM32F042 MCU generated by svd2rust, here's a gist: I2C1 CR1 svd2rust code.
You're also invited to browse the STM32F0 svd2rust crate docs, which
constantly get into conflict with our docs.rs infrastructure due to the
massive amounts of code required.
It would be really cool if we could express the basic structure of such an MMIO
register by using the new primitive types in a regular struct:
#[repr(u32)]
struct CR1 {
#[doc = "Bit 0 - Peripheral enable"]
pe: u1,
#[doc = "Bit 1 - TX Interrupt enable"]
txie: u1,
#[doc = "Bit 2 - RX Interrupt enable"]
rxie: u1,
#[doc = "Bit 3 - Address match interrupt enable (slave only)"]
addrie: u1,
#[doc = "Bit 4 - Not acknowledge received interrupt enable"]
nackie: u1,
#[doc = "Bit 5 - STOP detection Interrupt enable"]
stopie: u1,
#[doc = "Bit 6 - Transfer Complete interrupt enable"]
tcie: u1,
#[doc = "Bit 7 - Error interrupts enable"]
errie: u1,
#[doc = "Bits 8:11 - Digital noise filter"]
dnf: u4,
#[doc = "Bit 12 - Analog noise filter OFF"]
anfoff: u1,
// Bit 13 is not described here; declared as padding so that the fields add up to 32 bits
_reserved13: u1,
#[doc = "Bit 14 - DMA transmission requests enable"]
txdmaen: u1,
#[doc = "Bit 15 - DMA reception requests enable"]
rxdmaen: u1,
#[doc = "Bit 16 - Slave byte control"]
sbc: u1,
#[doc = "Bit 17 - Clock stretching disable"]
nostretch: u1,
#[doc = "Bit 18 - Wakeup from STOP enable"]
wupen: u1,
#[doc = "Bit 19 - General call enable"]
gcen: u1,
#[doc = "Bit 20 - SMBus Host address enable"]
smbhen: u1,
#[doc = "Bit 21 - SMBus Device Default address enable"]
smbden: u1,
#[doc = "Bit 22 - SMBUS alert enable"]
alerten: u1,
#[doc = "Bit 23 - PEC enable"]
pecen: u1,
// Need to always declare all bits according to `repr` size
_unused : u8,
}
Looks very nice and tidy, doesn't it? Of course we'll still need the accessor functions to differentiate between read-only and read-write fields, but this gets rid of all the error-prone bit-shifting and masking code.
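To make "bit-shifting and masking code" concrete, here is a minimal hand-written sketch for just the DNF field (bits 8:11) of the register above. The names DNF_SHIFT, DNF_MASK, read_dnf and write_dnf are hypothetical and this is not what svd2rust generates; it only illustrates the kind of code every single field needs today.

```rust
/// Position of the DNF field (bits 8:11) within the 32-bit register value.
const DNF_SHIFT: u32 = 8;
/// Mask selecting the four DNF bits in place.
const DNF_MASK: u32 = 0xF << DNF_SHIFT;

/// Extracts the 4-bit DNF value from a raw register value.
pub fn read_dnf(cr1: u32) -> u8 {
    ((cr1 & DNF_MASK) >> DNF_SHIFT) as u8
}

/// Returns `cr1` with the DNF field replaced by the low 4 bits of `dnf`.
/// Any higher bits of `dnf` are silently dropped, which is the kind of
/// mistake a `u4` parameter would rule out at compile time.
pub fn write_dnf(cr1: u32, dnf: u8) -> u32 {
    (cr1 & !DNF_MASK) | ((u32::from(dnf) << DNF_SHIFT) & DNF_MASK)
}
```

Multiply this by every field of every register and it becomes clear where the gobs of code come from.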
Another use case is real (and space- and execution-efficient!) bitmaps:
// size_of::<[bool; 10]>() == 10
let arr: [bool; 10] = [false; 10];
// size_of::<[u1; 10]>() == 2
let arr: [u1; 10] = [0; 10];
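Until something like that exists, the same density has to be built by hand. The following is a minimal sketch of a 10-bit bitmap stored in 2 bytes; the type name Bitmap10 and its methods are hypothetical and only meant to show the manual index arithmetic a packed [u1; 10] would hide.

```rust
/// Hypothetical fixed-size bitmap holding 10 one-bit values in 2 bytes.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct Bitmap10([u8; 2]);

impl Bitmap10 {
    /// Number of bits stored.
    pub const LEN: usize = 10;

    /// Returns the bit at `index`; panics if `index >= 10`.
    pub fn get(&self, index: usize) -> bool {
        assert!(index < Self::LEN);
        (self.0[index / 8] >> (index % 8)) & 1 == 1
    }

    /// Sets the bit at `index` to `value`; panics if `index >= 10`.
    pub fn set(&mut self, index: usize, value: bool) {
        assert!(index < Self::LEN);
        let mask = 1u8 << (index % 8);
        if value {
            self.0[index / 8] |= mask;
        } else {
            self.0[index / 8] &= !mask;
        }
    }
}
```

Note that get and set work on copies of the bit rather than references to it, which already hints at the pointer problem discussed in the implementation ideas.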
Implementation ideas
Obviously we cannot simply add a plethora of new primitives and be done with it; there are some problems with non-natural sized primitives:
- We cannot (easily) calculate and use a pointer to such a type, so taking a pointer would have to be forbidden
- The compiler might (or might not) be bogged down if we add dozens or even
more than a hundred new types (depending on how far we want to take this), so we
might have to leave those types disabled by default and require explicit use
statements instead, which should be fine
- Using a bitfield for MMIO might need some additional rules or some more and easier to find documentation than what's currently hidden in the nomicon
- ... add your own here
Some random to-dos I can think of off the top of my head:
- Allow #[repr(T)] on structs iff the sizes match
- Add ux and ix types to the primitives
- Properly reflect them internally in the compiler so we can eliminate unnecessary bound checks
- It might be possible to hook those types directly into LLVM to use by code generation backends
To keep this blog post from getting too long, I'm going to turn this over to the core team now for initial consideration.
Thanks for your attention and hope to blog to y'all soon!