Rust 2019: security
Introduction
I’ve decided to write this blog post because this is one of Rust’s main selling points and the most important to me: memory safety without garbage collection.
The truth of that statement relies on writing purely safe Rust. Unfortunately, that’s not a realistic scenario: some crates use unsafe directly, and most of the rest depend on at least one crate that does.
While these unsafe cases could be completely safe, the compiler cannot validate them, so they rely on the developer’s good judgment. This brings us back to the same problem as a C codebase, but with bugs that are easier to find (grep unsafe).
The problem now is that if we measure the safety of Rust in terms of safe vs. unsafe, the most obvious metric to optimize for is the number of unsafe lines in the code.
And this is why I think that security should be part of Rust’s stable ecosystem development. And 2019 should be the year for improving the processes around it.
This is not something new: working groups are tackling this problem from different angles, and you should join them if you are interested.
Unsafe Rust
A big part of the unsafe code I usually see can fit into these categories:
- Integrating Rust with other components through FFI
- Functionality not available in core or std (standard library)
- Optimizations that cannot be expressed through the type system
- Implementing Send and Sync (I’m not discussing it here)
Rust+FFI
The community has already learned that Rewrite it in Rust doesn’t scale well for big or fast-moving projects.
On the other hand, one of the things that I learned is that gradually replacing C/C++ with Rust code works quite well. The same happens with encapsulating C code with safe Rust abstractions.
For mixing code, we use FFI bindings, usually generated by bindgen.
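As a minimal sketch of what "encapsulating C code with safe Rust abstractions" can look like, the snippet below wraps the C standard library’s strlen. The extern declaration is written by hand here and stands in for what bindgen would normally generate; the wrapper name c_string_length is hypothetical and chosen only for illustration.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

// Hand-written declaration of libc's `strlen`, standing in for
// bindgen-generated bindings. Recent toolchains accept `unsafe extern`;
// older ones spell this as a plain `extern "C"` block.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Safe wrapper (hypothetical name): callers never touch a raw pointer.
/// Returns `None` if `text` contains an interior NUL byte, because such
/// a string cannot be represented as a C string.
pub fn c_string_length(text: &str) -> Option<usize> {
    let owned = CString::new(text).ok()?;
    // SAFETY: `owned` is a valid, NUL-terminated buffer that stays alive
    // for the whole call, and `strlen` only reads up to the terminator.
    Some(unsafe { strlen(owned.as_ptr()) })
}
```

The unsafe block is confined to one line, and the invariants it relies on (valid pointer, NUL terminator, lifetime) are established by the safe code right above it, which is what makes this kind of boundary reviewable.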
Isolation
A problem that has no practical solution yet is how we can guarantee safety in these cases of mixed code. This happens in almost every programming language that supports FFI, so an implementation here might be reusable in others.
The way I see it being tackled is through isolation at the FFI level: either we impose a serialization barrier to a process inside a sandbox, or we use more modern compartmentalization technologies (such as CHERI).
While isolating unsafe code (C/C++) in a sandbox might solve the problem and can probably be implemented “easily” through bindgen and other existing libraries, it will incur a performance penalty very quickly and could also open a Pandora’s box of issues.
Memory Corruption Mitigations
Furthermore, to avoid degrading the security of existing components when mixing other (unsafe) code with Rust, we must support all the memory corruption mitigations supported by C and C++ compilers, such as control flow integrity. At Microsoft, security is a priority when components are shipped, and not being able to apply the full spectrum of security mitigations to a component that uses unsafe or mixed code might be a blocker.
Control flow guard, for example, requires not only support from the linker to generate the tables and inject the right code (which both LLVM and MSVC Linker already do), but also metadata emitted by Rust’s frontend.
Functionality not available in std
It’s very hard to track these cases, and even harder to decide which are worth adding to core or std so external libraries don’t have to implement them themselves. There is also the question of whether it is really worth putting everything in the standard library.
An example I can think of now is casting between memory representations of the same size, like u64 to [u8; 8], or [f32] to [u32], as used in byteorder.
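To make the casting example concrete, here is a small sketch contrasting safe standard-library conversions with the kind of unsafe reinterpretation a crate might write for speed. The function names are hypothetical, and the unsafe variant is only an illustration of the pattern, not byteorder’s actual implementation.

```rust
/// Safe: `u64::to_le_bytes` yields the little-endian byte representation.
pub fn u64_to_bytes(value: u64) -> [u8; 8] {
    value.to_le_bytes()
}

/// Safe: `f32::to_bits` converts each element, at the cost of a new Vec.
pub fn f32s_to_bits(values: &[f32]) -> Vec<u32> {
    values.iter().map(|v| v.to_bits()).collect()
}

/// Unsafe, zero-copy variant: views the same memory as `u32`s.
pub fn f32s_as_u32s(values: &[f32]) -> &[u32] {
    // SAFETY: `f32` and `u32` have the same size and alignment, every
    // bit pattern is a valid `u32`, and the returned slice borrows from
    // `values`, so it cannot outlive the original data.
    unsafe { std::slice::from_raw_parts(values.as_ptr() as *const u32, values.len()) }
}
```

The first two functions need no unsafe at all, which is the argument for moving such conversions into core or std: once a safe primitive exists, downstream crates can drop their own unsafe blocks.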
The Secure Code WG is already looking for similar patterns so I’m very positive about seeing a lot of progress in this area in 2019.
A different approach to putting it all in the standard libraries is making a database of components audited by third parties which developers can trust. This is the approach being taken by cargo-crev. In my opinion, the only way that this is gonna work is if it is integrated with Cargo once the idea is validated.
Optimizations
I often see that, in hot paths, developers tend to do nasty things to avoid a performance impact. More often than not, the code is “safe” but uses unsafe.
One example: back when I reviewed edgelet’s source code, I found that the only unsafe code outside of FFI was for copying data into uninitialized buffers [[EDIT: This is imposed when implementing one of Tokio’s APIs and is not a decision by the IoTEdge team]]. While reading from these buffers is undefined behavior, writing to them isn’t, but it’s not possible to express that with the current types.
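As a hedged illustration of "writing is fine, reading is not", the sketch below uses std::mem::MaybeUninit, a standard-library type that is my assumption here and not something this post or the reviewed code refers to. The function name fill_uninit is hypothetical.

```rust
use std::mem::MaybeUninit;

/// Copies as many bytes from `src` as fit into the uninitialized `dst`
/// and returns the initialized prefix (hypothetical helper).
pub fn fill_uninit<'a>(dst: &'a mut [MaybeUninit<u8>], src: &[u8]) -> &'a [u8] {
    let n = dst.len().min(src.len());
    // Writing into a `MaybeUninit` slot is safe: no old value is read.
    for (slot, byte) in dst.iter_mut().zip(src) {
        slot.write(*byte);
    }
    // SAFETY: the first `n` slots were written by the loop above, and
    // `MaybeUninit<u8>` has the same layout as `u8`.
    unsafe { std::slice::from_raw_parts(dst.as_ptr() as *const u8, n) }
}
```

Only the final step, claiming that the written prefix is initialized, needs unsafe; the writes themselves are expressed in the type system.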
This particular example might fall inside the previous point but I think it deserves a distinction between “unsafe because faster” and “unsafe because I can’t safe”. The solution, though, is very similar to what I proposed for the former.
Auditability
While optimizing for 0 unsafe is a great thing, there will always be unsafe somewhere.
I like these tools and I think they will have greater impact when more companies start adopting Rust and have to fit it into the security development life-cycle (SDL):
- cargo-audit has a DB of crates with vulnerabilities and you can run it to check if your projects are using them
- cargo-fuzz makes fuzzing programs really easy through the help of LLVM’s libFuzzer
- cargo-geiger detects unsafe code usage from the dependencies
- cargo-crev is (an interface to) a database of audited components
Leveraging all these tools in CI will surely help with SDL, and I can see a 2019 where they become an integral part of the development process.
Others
Fallible allocations
While detecting memory allocation failure is a hard problem, the current situation of succeed or crash is an issue for places where you can predict that the allocation might fail. I worked on implementing the try_alloc RFC for the different collections but haven’t submitted a stabilization PR yet, because of concerns that efforts to improve the Allocator API might break backwards compatibility if it is stabilized.
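For a sense of what fallible allocation looks like from the caller’s side, here is a minimal sketch using Vec::try_reserve as exposed by recent versions of the standard library. Treat the exact API as an assumption relative to this post, since it was not stable when this was written; the function name copy_fallible is hypothetical.

```rust
use std::collections::TryReserveError;

/// Copies `src` into a new Vec, reporting allocation failure as an
/// error instead of aborting the process (hypothetical helper).
pub fn copy_fallible(src: &[u8]) -> Result<Vec<u8>, TryReserveError> {
    let mut out = Vec::new();
    // Returns Err if the capacity overflows or the allocator fails.
    out.try_reserve(src.len())?;
    // The capacity is already reserved, so this does not reallocate.
    out.extend_from_slice(src);
    Ok(out)
}
```

This fits the case described above: when the size comes from untrusted input, the caller can reject the request gracefully rather than crash.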
Async/await
I’d like to see the whole async/await story improve with tokio adopting futures-0.3 and porting many of the derived crates to it as well. This year I tried using the new syntax with futures-0.3 but the ecosystem wasn’t there yet.
A better IDE experience
Others have already mentioned it, but I think we need RLS to keep improving, building on the great work it has received over the past years. Not being able to use, in my case, VSCode the same way I do for C# or TypeScript is a little frustrating.
Conclusion
While many interesting things happened during 2018, I see 2019 as a period for improving the ecosystem and tooling, and for continuing to get feedback on adoption blockers.
In my case, we are using it at Microsoft with a lot of very interesting Rustaceans, and I think security should be a top priority if we want to go after not-quite-memory-safe languages like C and C++.
Thanks for reading.
- Follow me on Twitter: @snfernandez
- Contact me at Gmail: sebanfernandez
- Secure is better: GPG Key