COMP6991 - Solving Modern Programming Problems with Rust
Prac Exam for 22T3
With my own solution (might be incorrect)
Question
A C programmer who is starting to learn Rust has asked: "Aren't match statements just complicated if statements?". Give a specific example of a situation where you believe a match statement would significantly improve code quality, instead of a series of if/else statements.
In some situations, a match expression behaves just like a more elaborate if statement. However, it is cleaner and more expressive, especially when working with enums or handling many cases, because the compiler checks that every case is covered.
For example, suppose I am at a crossroads and a sign tells me where to go. The sign can only say one of three things, so we can model it as an enum:
```rust
enum SignText {
    TurnLeft,
    TurnRight,
    GoStraight,
}
```
In Rust, we could use match to handle multiple cases:
```rust
fn action(text: SignText) -> &'static str {
    match text {
        SignText::TurnLeft => "Turn Left",
        SignText::TurnRight => "Turn Right",
        SignText::GoStraight => "Go Straight",
    }
}
```
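The if/else version is more verbose. It also needs SignText to implement PartialEq so that == works, and it needs a final else branch that can never run, because the compiler cannot check that a chain of comparisons is exhaustive:
```rust
// Requires #[derive(PartialEq)] on SignText for == to compile.
fn action(text: SignText) -> &'static str {
    if text == SignText::TurnLeft {
        "Turn Left"
    } else if text == SignText::TurnRight {
        "Turn Right"
    } else if text == SignText::GoStraight {
        "Go Straight"
    } else {
        // Can never run, but the compiler still demands this branch.
        unreachable!()
    }
}
```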
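The gap widens once variants carry data. The following is a minimal sketch, assuming a hypothetical Sign enum (not part of the exam question) whose variants hold values: match can destructure the payload and apply guards in one place, which an if/else chain cannot do without extra helper methods.

```rust
/// Hypothetical sign whose variants carry data (illustrative only).
enum Sign {
    Turn { left: bool },
    SpeedLimit(u32),
    Stop,
}

/// Describes a sign by destructuring its payload directly in the patterns.
fn describe(sign: &Sign) -> String {
    match sign {
        Sign::Turn { left: true } => "turn left".to_string(),
        Sign::Turn { left: false } => "turn right".to_string(),
        // A guard refines a pattern with an extra condition.
        Sign::SpeedLimit(limit) if *limit > 100 => format!("fast road: {limit} km/h"),
        Sign::SpeedLimit(limit) => format!("limit: {limit} km/h"),
        Sign::Stop => "stop".to_string(),
    }
}
```

If a new variant is later added to Sign, this match stops compiling until the new case is handled.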
Question
The following Rust code fails to compile, but equivalent code in other popular programming languages (e.g. C, Java, Python) compiles and/or works correctly. Explain what issue(s) prevent the Rust compiler from building this code, and the philosophy behind this language decision.
```rust
struct Coordinate {
    x: i32,
    y: i32,
};
let coord1 = Coordinate {x: 1, y: 2};
let coord2 = coord1;
let coord_sum = Coordinate { x: coord1.x + coord2.x, y: coord1.y + coord2.y };
```
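The problem is ownership. Coordinate does not implement Copy, so let coord2 = coord1; moves the value: ownership is transferred to coord2 and coord1 can no longer be used. The expression for coord_sum then reads coord1.x and coord1.y after the move, which the compiler rejects.
Rust uses ownership so that every value has exactly one owner responsible for freeing it. This rules out use-after-free and double-free bugs at compile time without needing a garbage collector, and the same rules help prevent data races in concurrent code.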
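Two common ways to make the code compile are sketched below: deriving Copy so that assignment copies instead of moving, or borrowing coord1 instead of taking ownership. The function names are illustrative, not from the exam.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Coordinate {
    x: i32,
    y: i32,
}

/// Fix 1: with Copy derived, `let coord2 = coord1;` copies the value.
fn sum_with_copy() -> Coordinate {
    let coord1 = Coordinate { x: 1, y: 2 };
    let coord2 = coord1; // coord1 is still usable afterwards
    Coordinate { x: coord1.x + coord2.x, y: coord1.y + coord2.y }
}

/// Fix 2: borrow instead of moving (this would also work without Copy).
fn sum_with_borrow() -> Coordinate {
    let coord1 = Coordinate { x: 1, y: 2 };
    let coord2 = &coord1; // shared reference, ownership stays with coord1
    Coordinate { x: coord1.x + coord2.x, y: coord1.y + coord2.y }
}
```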
Question
In other languages, the operation: "first_string" + "second_string" produces a new string, "first_stringsecond_string". This particular operation does not work in Rust.
Why does Rust not implement this operation on the &str type?
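Because concatenation has to allocate new memory to hold the combined string. A &str is only a borrowed view of string data that lives somewhere else: it owns no buffer and cannot grow. Rust prefers to make heap allocations explicit, so + is only implemented with an owned String on the left-hand side (String + &str), which reuses the left operand's buffer.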
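Would it be possible for the Rust language developers to implement this? What rust feature would they use to implement it?
Yes. Operator overloading in Rust is done through traits, so the standard library could implement the Add trait for &str with String as the Output type. Adding two &str values would then allocate and return a new String. Only the standard library could add this impl, because the orphan rule stops other crates from implementing a foreign trait for a foreign type.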
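As a rough sketch of what such an impl could look like, the example below uses a hypothetical newtype wrapper Text around &str, since user code cannot implement Add for &str itself. It is an illustration of the trait mechanism, not the standard library's code.

```rust
use std::ops::Add;

/// Hypothetical wrapper around a borrowed string slice (illustrative only).
#[derive(Clone, Copy)]
struct Text<'a>(&'a str);

impl<'a, 'b> Add<Text<'b>> for Text<'a> {
    // Adding two borrowed strings must produce an owned, freshly allocated String.
    type Output = String;

    fn add(self, rhs: Text<'b>) -> String {
        let mut joined = String::with_capacity(self.0.len() + rhs.0.len());
        joined.push_str(self.0);
        joined.push_str(rhs.0);
        joined
    }
}
```

With this in place, Text("first_string") + Text("second_string") evaluates to the String "first_stringsecond_string", at the cost of one allocation hidden behind the + operator.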
Do you think the Rust language developers should implement this operation? Give one reason to justify your answer.
No, I do not think so. Rust's priorities are safety, reliability and predictable performance, built on ownership, borrowing and explicit memory management. Implicitly turning two &str values into a String would hide a heap allocation behind an innocent-looking +. Hidden allocations like this can cause subtle problems across a whole project, and they can be costly in a production environment.
Question
In this activity, you will be building a small text searching system. It should search a large string for sentences that contain a particular search term. Another function will then look through all the search results to determine how often each sentence was found.
You have been given starter code which does not yet compile. Your task is to fill in both todo!() statements, as well as to add lifetimes where required in order to build your code.
You are not permitted to change the return type of functions, the names of structs, or the types of structs. You may also not change the main function, and you should expect that the main function could be changed during testing. You will, however, have to add lifetimes to existing types in order to successfully compile your code.
This is an example of the expected behaviour:
```
$ 6991 cargo run test_data/test_data.txt
    Finished dev [unoptimized + debuginfo] target(s) in 0.36s
     Running `target/debug/prac_q2`
there
very
prove
the universe
Ctrl - D
Found 1 results for 'there'.
Found 9 results for 'very'.
Found 1 results for 'prove'.
Found 11 results for 'the universe'.
'8 billion years ago, space expanded very quickly (thus the name "Big Bang")' occured 1 times.
'According to the theory the universe began as a very hot, small, and dense superforce (the mix of the four fundamental forces), with no stars, atoms, form, or structure (called a "singularity")' occured 2 times.
'Amounts of very light elements, such as hydrogen, helium, and lithium seem to agree with the theory of the Big Bang' occured 1 times.
'As a whole, the universe is growing and the temperature is falling as time passes' occured 1 times.
'Because most things become colder as they expand, scientists assume that the universe was very small and very hot when it started' occured 2 times.
'By measuring the redshift, scientists proved that the universe is expanding, and they can work out how fast the object is moving away from the Earth' occured 2 times.
'Cosmology is the study of how the universe began and its development' occured 1 times.
'Other observations that support the Big Bang theory are the amounts of chemical elements in the universe' occured 1 times.
'The Big Bang is a scientific theory about how the universe started, and then made the stars and galaxies we see today' occured 1 times.
'The Big Bang is the name that scientists use for the most common theory of the universe, from the very early stages to the present day' occured 2 times.
'The more redshift there is, the faster the object is moving away' occured 1 times.
'The most commonly considered alternatives are called the Steady State theory and Plasma cosmology, according to both of which the universe has no beginning or end' occured 1 times.
'The most important is the redshift of very far away galaxies' occured 1 times.
'These electromagnetic waves are everywhere in the universe' occured 2 times.
'This radiation is now very weak and cold, but is thought to have been very strong and very hot a long time ago' occured 1 times.
'With very exact observation and measurements, scientists believe that the universe was a singularity approximately 13' occured 2 times.
```
```rust
use std::fs;
use std::env;
use std::io::{self, BufRead};
use std::error::Error;
use std::collections::HashMap;

// NOTE: You *may not* change the names or types of the members of this struct.
//       You may only add lifetime-relevant syntax.
pub struct SearchResult<'a, 'b> {
    pub matches: Vec<&'a str>,
    pub contains: &'b str
}

/// Returns a [`SearchResult`] struct, where the matches vec is
/// a vector of every sentence that contains `contains`.
///
/// A sentence is defined as a slice of an `&str` which is the first
/// character of the string, or the first non-space character after
/// a full-stop (`.`), all the way until the last non-space character
/// before a full-stop or the end of the string.
///
/// For example, In the string "Hello. I am Tom . Goodbye", the three
/// sentences are "Hello", "I am Tom" and "Goodbye"
fn find_sentences_containing<'a, 'b>(text: &'a str, contains: &'b str) -> SearchResult<'a, 'b> {
    let mut sentences = Vec::new();
    let mut start = 0;
    for (index, character) in text.char_indices() {
        // `index` is a byte offset, so compare against the end of this
        // character rather than `text.len() - 1`; this also handles a
        // multi-byte final character.
        let is_last_char = index + character.len_utf8() == text.len();
        let is_end_of_sentence = character == '.' || is_last_char;
        if is_end_of_sentence {
            let end = if character == '.' { index } else { text.len() };
            let sentence = text[start..end].trim();
            if sentence.contains(contains) {
                sentences.push(sentence);
            }
            // Skip over the full-stop (one byte) to the next sentence.
            start = end + 1;
        }
    }
    SearchResult {
        matches: sentences,
        contains
    }
}

/// Given a vec of [`SearchResult`]s, return a hashmap, which lists how many
/// times each sentence occurred in the search results.
fn count_sentence_matches<'a, 'b>(searches: Vec<SearchResult<'a, 'b>>) -> HashMap<&'a str, i32> {
    let mut counts = HashMap::new();
    for search_result in searches {
        for sentence in search_result.matches {
            let count = counts.entry(sentence).or_insert(0);
            *count += 1;
        }
    }
    counts
}

/////////// DO NOT CHANGE BELOW HERE ///////////

fn main() -> Result<(), Box<dyn Error>> {
    let args: Vec<String> = env::args().collect();
    let file_path = &args[1];
    let text = fs::read_to_string(file_path)?;
    let mut sentence_matches = {
        let mut found = vec![];
        let stdin = io::stdin();
        let matches = stdin.lock().lines().map(|l| l.unwrap()).collect::<Vec<_>>();
        for line in matches.iter() {
            let search_result = find_sentences_containing(&text, line);
            println!("Found {} results for '{}'.", search_result.matches.len(), search_result.contains);
            found.push(search_result);
        }
        count_sentence_matches(found).into_iter().collect::<Vec<_>>()
    };
    sentence_matches.sort();
    for (key, value) in sentence_matches {
        println!("'{}' occured {} times.", key, value);
    }
    Ok(())
}
```
Question
In this question, your task is to complete two functions, and make them generic: zip_tuple and unzip_tuple. Right now, the zip_tuple function takes a Vec<Coordinate> and returns a tuple: (Vec<i32>, Vec<i32>). The unzip_tuple function performs the inverse of this.
This code currently does not compile, because q3_lib (i.e. lib.rs) does not know what the type of Coordinate is. Rather than telling the functions what type Coordinate is, in this exercise we will make the functions generic, such that it works for both q3_a (i.e. main_1.rs) and q3_b (i.e. main_2.rs). This is to say, tuple_unzip should work for any Vec<T> such that T implements Into into a 2-tuple of any 2 types, and tuple_zip should work for any Vec<(T, U)> such that (T, U) implements Into into any type.
Once you have modified your function signatures for tuple_unzip and tuple_zip, you should find that the only concrete type appearing within the signature is Vec. In other words, the functions should work for any type which can be created from a 2-tuple and which can be converted into a 2-tuple.
```rust
pub fn tuple_unzip<T, A, B>(items: Vec<T>) -> (Vec<A>, Vec<B>)
where
    T: Into<(A, B)>,
{
    let mut first = Vec::new();
    let mut second = Vec::new();
    for item in items {
        let (a, b) = item.into();
        first.push(a);
        second.push(b);
    }
    (first, second)
}

pub fn tuple_zip<T, A, B>(items: (Vec<A>, Vec<B>)) -> Vec<T>
where
    T: From<(A, B)>,
{
    items.0.into_iter().zip(items.1.into_iter()).map(|(a, b)| T::from((a, b))).collect()
}
```
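To check that the signatures really are generic, here is a small usage sketch. The Coordinate type and its two From impls are assumptions standing in for what main_1.rs might define (the exam's actual files are not shown), and the two functions are repeated so that the block compiles on its own.

```rust
pub fn tuple_unzip<T, A, B>(items: Vec<T>) -> (Vec<A>, Vec<B>)
where
    T: Into<(A, B)>,
{
    let mut first = Vec::new();
    let mut second = Vec::new();
    for item in items {
        let (a, b) = item.into();
        first.push(a);
        second.push(b);
    }
    (first, second)
}

pub fn tuple_zip<T, A, B>(items: (Vec<A>, Vec<B>)) -> Vec<T>
where
    T: From<(A, B)>,
{
    items.0.into_iter().zip(items.1.into_iter()).map(|(a, b)| T::from((a, b))).collect()
}

/// Hypothetical caller-side type (an assumption, not the exam's definition).
#[derive(Debug, PartialEq)]
struct Coordinate {
    x: i32,
    y: i32,
}

// Conversions in both directions; From gives the matching Into for free.
impl From<Coordinate> for (i32, i32) {
    fn from(c: Coordinate) -> Self {
        (c.x, c.y)
    }
}

impl From<(i32, i32)> for Coordinate {
    fn from((x, y): (i32, i32)) -> Self {
        Coordinate { x, y }
    }
}

/// Splits coordinates into two vectors and then rebuilds them.
fn round_trip() -> Vec<Coordinate> {
    let coords = vec![Coordinate { x: 1, y: 2 }, Coordinate { x: 3, y: 4 }];
    let (xs, ys): (Vec<i32>, Vec<i32>) = tuple_unzip(coords);
    tuple_zip((xs, ys))
}
```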
Question
Steve is writing some Rust code for a generic data structure, and creates a (simplified) overall design like the following:
```rust
struct S {
    // some fields...
}

impl S {
    fn my_func<T>(value: T) {
        todo!()
    }
}
```
He soon finds that this design is not sufficient to model his data structure, and revises the design as such:
```rust
struct S<T> {
    // some fields...
}

impl<T> S<T> {
    fn my_func(value: T) {
        todo!()
    }
}
```
Give an example of a data-structure that Steve could be trying to implement, such that his first design would not be sufficient, and instead his second design would be required for a correct implementation. Furthermore, explain why this is the case.
An example is any container that stores its elements, such as a stack or list built on a Vec<T>. In the first design the struct S is not generic, so every S has the same fields regardless of the data type it is supposed to operate on. my_func can accept a value of any type T, but it cannot store that value, because S has no field of type T to put it in.
In the second design, S<T> can declare fields that mention T, so it can store values of whichever type it is instantiated with, and my_func is tied to that same T.
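```rust
// First design: the element type is fixed.
struct S {
    elements: Vec<i32>, // only handles i32 values
    // ...
}

// Second design: the element type is a parameter of the struct.
struct S<T> {
    elements: Vec<T>, // handles values of any type T
    // ...
}
```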
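A minimal sketch of such a container, assuming a hypothetical Stack<T> as the data structure Steve has in mind. The push method can only store its argument because T is a parameter of the struct itself, not just of the method.

```rust
/// Hypothetical generic stack (illustrative only).
struct Stack<T> {
    elements: Vec<T>,
}

impl<T> Stack<T> {
    fn new() -> Self {
        Stack { elements: Vec::new() }
    }

    /// Stores `value`; possible only because the field type mentions T.
    fn push(&mut self, value: T) {
        self.elements.push(value);
    }

    /// Removes and returns the most recently pushed value, if any.
    fn pop(&mut self) -> Option<T> {
        self.elements.pop()
    }
}
```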
Question
Emily is designing a function that has different possibilities for the value it may return. She is currently deciding what kind of type she should use to represent this property of her function.
She has narrowed down three possible options:
An enum
A trait object
A generic type (as fn foo(...) -> impl Trait)
For each of her possible options, explain one possible advantage and one possible disadvantage of that particular choice.
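For orientation, the three options could look roughly like this in code. The Shape trait and the Square and Circle types are hypothetical names chosen for this sketch and are not part of the question.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Circle(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.0 * self.0
    }
}

// Option 1: an enum lists every possible return value up front.
enum AnyShape {
    Square(Square),
    Circle(Circle),
}

fn make_enum(round: bool) -> AnyShape {
    if round { AnyShape::Circle(Circle(1.0)) } else { AnyShape::Square(Square(2.0)) }
}

// Option 2: a trait object can hold different types behind one pointer,
// at the cost of a heap allocation and dynamic dispatch.
fn make_boxed(round: bool) -> Box<dyn Shape> {
    if round { Box::new(Circle(1.0)) } else { Box::new(Square(2.0)) }
}

// Option 3: impl Trait hides the type, but every return path must
// produce the same concrete type (here always Square).
fn make_impl() -> impl Shape {
    Square(2.0)
}
```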
A generic type (impl Trait):
Advantage: the function can return any type that implements the trait without naming the concrete type, which is very flexible.
Disadvantage: a function returning impl Trait can only return one concrete type, which is limiting when different branches need to return different types.
Question
Rust's macro system offers an extremely flexible method for code generation and transfiguring syntax, but this language feature comes with certain costs. Identify 3 downsides to the inclusion, design, or implementation of Rust's macro system.
(Note that your downsides may span any amount and combination of the categories above. e.g. you could write all 3 on just one category, or one on each, or anything in-between.)
Question
In many other popular programming languages, mutexes provide lock() and unlock() methods which generally do not return any value (i.e. void).
What issues could this cause?
How does Rust differently implement the interface of a Mutex, and what potential problems does that help solve?
What issues could this cause?
A programmer may forget to call unlock() after every lock(), so the resource is never released, or the program even deadlocks.
Calls to lock() and unlock() can also be mismatched, which again means the resource may never be released or the program may deadlock.
How does Rust differently implement the interface of a Mutex, and what potential problems does that help solve?
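As a sketch of the standard library's design: Mutex<T> owns the data it protects, lock() returns a guard that is the only way to reach that data, and the lock is released automatically when the guard is dropped. There is no unlock() call to forget, and the data cannot be touched without holding the lock. The function name and counter below are illustrative assumptions.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increments a shared counter from several threads and returns the total.
fn count_in_parallel(threads: usize, per_thread: usize) -> usize {
    // The counter lives inside the Mutex; there is no separate unprotected copy.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    let mut guard = counter.lock().unwrap();
                    *guard += 1;
                    // The guard is dropped here, which releases the lock.
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}
```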
Question
In Rust, locking a Mutex returns a Result, instead of simply a MutexGuard. Explain what utility this provides, and why a programmer might find this important.
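For context, lock() returns an Err when the mutex is poisoned, meaning another thread panicked while holding the guard. The sketch below, with illustrative names, provokes that situation and shows one way a caller might decide to recover the data instead of panicking too.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Poisons a mutex on purpose, then recovers the value it protects.
/// Returns whether the mutex was poisoned and the recovered value.
fn recover_after_poison() -> (bool, i32) {
    let shared = Arc::new(Mutex::new(1));
    let worker = Arc::clone(&shared);

    // The worker panics while holding the guard, which poisons the mutex.
    let result = thread::spawn(move || {
        let mut guard = worker.lock().unwrap();
        *guard = 2;
        panic!("worker failed while holding the lock");
    })
    .join();
    assert!(result.is_err());

    let was_poisoned = shared.is_poisoned();
    let value = match shared.lock() {
        Ok(guard) => *guard,
        // The caller chooses to keep going with the possibly half-updated data.
        Err(poisoned) => *poisoned.into_inner(),
    };
    (was_poisoned, value)
}
```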
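lock() returns a Result because a Mutex becomes poisoned when a thread panics while holding its guard, and the data it protects may have been left half-updated. Returning a Result lets the caller decide what to do if the lock operation fails, for example recover the data, work on other resources, or propagate the error, rather than being forced to panic.
Note that lock() still blocks until the mutex is free. For time-limited situations where the caller should learn that the resource is in use rather than wait a long time, try_lock() returns an Err immediately.
Question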
While reviewing someone's code, you find the following type: Box<dyn Fn() -> i32 + Send>.
Explain what the + Send means in the code above?
Explain one reason you might need to mark a type as Send, and what restrictions apply when writing a closure that must be Send.
Explain what the + Send means in the code above?
+ Send is an extra trait bound on the trait object: whatever closure is stored in the Box must implement the Send marker trait, which means it is safe to transfer it, including its ownership, from one thread to another.
Explain one reason you might need to mark a type as Send, and what restrictions apply when writing a closure that must be Send.
Reason: You might want to use this type in a multithreaded environment, for example by handing the closure to another thread. The Send bound ensures that it can be moved across threads without breaking Rust's safety rules.
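A small sketch of that use case, with the alias Job and both function names being illustrative assumptions: the boxed closure is moved into a new thread, which thread::spawn only accepts because of the Send bound.

```rust
use std::thread;

/// The type from the question, given a hypothetical alias.
type Job = Box<dyn Fn() -> i32 + Send>;

/// Moves the job to a new thread, runs it there, and returns its result.
fn run_on_thread(job: Job) -> i32 {
    thread::spawn(move || job()).join().unwrap()
}

/// Builds a job that captures only an i32, which is Send.
/// Capturing an Rc<i32> instead would be rejected at compile time.
fn make_job(base: i32) -> Job {
    Box::new(move || base + 1)
}
```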
Restriction:
If a closure must be Send, it cannot capture any non-Send values (such as an Rc<T>).
Every value the closure captures must itself satisfy the Send constraint.
Question
Your friend tells you they don't need the standard library's channels, since they've implemented their own alternative with the following code:
```rust
use std::collections::VecDeque;
use std::sync::Mutex;
use std::sync::Arc;
use std::thread;

#[derive(Clone, Debug)]
struct MyChannel<T> {
    internals: Arc<Mutex<VecDeque<T>>>
}

impl<T> MyChannel<T> {
    fn new() -> MyChannel<T> {
        MyChannel {
            internals: Arc::new(Mutex::new(VecDeque::new()))
        }
    }

    fn send(&mut self, value: T) {
        let mut internals = self.internals.lock().unwrap();
        internals.push_front(value);
    }

    fn try_recv(&mut self) -> Option<T> {
        let mut internals = self.internals.lock().unwrap();
        internals.pop_back()
    }
}

fn main() {
    let mut sender = MyChannel::<i32>::new();
    let mut receiver = sender.clone();
    sender.send(5);
    thread::spawn(move || {
        println!("{:?}", receiver.try_recv())
    }).join().unwrap();
}
```
Identify a use-case where this implementation would not be sufficient, but the standard library's channel would be. Furthermore, explain why this is the case.
The standard library's channels provide a blocking receive operation: when the queue is empty, recv() puts the thread to sleep until data is available. This implementation can only return None immediately when the queue is empty, so a consumer that has to wait for data must call try_recv() in a loop. That busy-waiting wastes CPU time.
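For comparison, here is a minimal sketch of the blocking behaviour using std::sync::mpsc. The function name, the delay and the values sent are illustrative. The receiving side sleeps inside the iterator until each value arrives, and the loop ends once the sender is dropped.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Receives values from a slow producer without busy-waiting.
fn blocking_receive() -> Vec<i32> {
    let (sender, receiver) = mpsc::channel();

    let producer = thread::spawn(move || {
        for value in 1..=3 {
            thread::sleep(Duration::from_millis(10));
            sender.send(value).unwrap();
        }
        // `sender` is dropped here, which closes the channel.
    });

    // iter() blocks on each receive and stops when the channel is closed.
    let received: Vec<i32> = receiver.iter().collect();
    producer.join().unwrap();
    received
}
```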
Question
The "Read Copy Update" pattern is a common way of working with data when many sources need to be able to access data, but also to update it. It allows a user to access a value whenever it's needed, achieving this by never guaranteeing that the data is always the latest copy. In other words, there will always be something, but it might be slightly old. In some cases, this trade-off is one that's worth making.
In this task, you will be implementing a small RCU data-structure. You should ensure that:
You have been given some starter code for the type RcuType<T>, including some suggested fields, and the required interface. Ensure you first understand the requirements of this task, and then implement the methods described in the starter code.
```rust
use std::sync::{RwLock, Arc, atomic::{AtomicUsize, Ordering}};

pub struct RCUType<T> {
    data: Arc<RwLock<Arc<T>>>,
    generation: Arc<AtomicUsize>,
}

impl<T> RCUType<T> {
    /// Creates a new `RCUType` with a given value.
    pub fn new(value: T) -> RCUType<T> {
        RCUType {
            data: Arc::new(RwLock::new(Arc::new(value))),
            generation: Arc::new(AtomicUsize::new(0)),
        }
    }

    /// Will call the closure `updater`, passing the current
    /// value of the type; allowing the user to return a new
    /// value for this to store.
    pub fn update(&self, updater: impl FnOnce(&T) -> T) {
        let mut data_guard = self.data.write().unwrap();
        let new_value = updater(&data_guard);
        *data_guard = Arc::new(new_value);
        self.generation.fetch_add(1, Ordering::SeqCst);
    }

    /// Returns an atomically reference counted smart-pointer
    /// to the most recent copy of data this function has.
    pub fn get(&self) -> Arc<T> {
        Arc::clone(&self.data.read().unwrap())
    }

    /// Return the number of times that the RCUType has been updated.
    pub fn get_generation(&self) -> usize {
        self.generation.load(Ordering::SeqCst)
    }
}

impl<T> Clone for RCUType<T> {
    fn clone(&self) -> Self {
        Self {
            data: self.data.clone(),
            generation: self.generation.clone(),
        }
    }
}
```
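The key property of this layout is that a reader's Arc keeps pointing at the value it originally saw, even after a writer swaps in a new one. The sketch below isolates that idea with a bare RwLock<Arc<i32>>; the function name and values are illustrative and it is not a replacement for the RCUType above.

```rust
use std::sync::{Arc, RwLock};

/// Takes a snapshot, publishes an update, and returns (old, new).
fn snapshot_survives_update() -> (i32, i32) {
    let cell = RwLock::new(Arc::new(1));

    // Reader: clone the inner Arc; the read lock is released right after.
    let old: Arc<i32> = Arc::clone(&*cell.read().unwrap());

    // Writer: build a new value from the current one and swap the pointer.
    {
        let mut guard = cell.write().unwrap();
        let next = **guard + 1;
        *guard = Arc::new(next);
    }

    // The earlier snapshot is unchanged; a fresh read sees the update.
    let new: Arc<i32> = Arc::clone(&*cell.read().unwrap());
    (*old, *new)
}
```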
Question
Gavin writes a blog post critical of Rust, especially with respect to unsafe. In his blog post, he claims that it's not possible to have any confidence in the overall safety of a Rust program since "even if you only write safe Rust, most standard functions you call will have unsafe code inside them".
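I partially disagree. It is true that there is a lot of unsafe code in the Rust standard library; however, that does not mean that Rust code built on it is unsafe. Each unsafe block is wrapped in a safe API that upholds the required invariants, so the code that needs careful auditing is small and clearly marked rather than spread across the whole program.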
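A minimal sketch of that encapsulation, using a hypothetical helper rather than real standard library source: the unsafe operation is only reached after a check that makes it sound, so no caller of the safe function can trigger undefined behaviour.

```rust
/// Hypothetical helper: returns the first element, or None for an empty slice.
fn first_or_none(values: &[i32]) -> Option<i32> {
    if values.is_empty() {
        None
    } else {
        // SAFETY: the slice was just checked to be non-empty,
        // so index 0 is in bounds.
        Some(unsafe { *values.get_unchecked(0) })
    }
}
```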
Question
Hannah writes a Rust program that intends to call some C code directly through FFI. Her C function has the following prototype:
```c
int array_sum(int *array, int array_size);
```
Note that you can assume that this C code is written entirely correctly, and the below extern "C" block is an accurate translation of the C interface.
Her Rust code is currently written as follows:
```rust
use std::ffi::c_int;

#[link(name = "c_array")]
extern "C" {
    fn array_sum(array: *mut c_int, array_size: c_int) -> c_int;
}

fn test_data() -> (*mut c_int, c_int) {
    let size = 10;
    let array = vec![6991; size].as_mut_ptr();
    (array, size as c_int)
}

fn main() {
    let sum = {
        let (array, size) = test_data();
        // Debug print:
        let message = format!("Calling C function with array of size: {size}");
        println!("{message}");
        unsafe { array_sum(array, size) }
    };
    println!("C says the sum was: {sum}");
}
```
She expects that if she runs her code, it should print that the C code summed to 69910. To her surprise, she runs the program and finds the following:
```
$ 6991 cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `target/debug/ffi`
Calling C function with array of size: 10
C says the sum was: -2039199222
```
Hannah correctly concludes that there must be a problem with her Rust code.
Identify the issue that is causing the program to misbehave.
In vec![6991; size].as_mut_ptr(), the Vec is a temporary that nothing owns, so it is dropped and its memory freed at the end of that statement, before test_data() even returns. The raw pointer outlives the buffer it points to. By the time it reaches array_sum it is a dangling pointer, and the C code sums whatever happens to be in the freed memory, which produces the wrong output.
Describe a practical solution Hannah could use to fix the bug.
The Vec must live in a scope that lasts at least as long as the call to array_sum, so that the pointer is still valid when it is used.
For example, we could change test_data() to return the Vec itself alongside the pointer:
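```rust
fn test_data() -> (Vec<c_int>, *mut c_int, c_int) {
    let size = 10;
    let mut array = vec![6991; size];
    // Moving the Vec out of this function does not move its heap buffer,
    // so the pointer stays valid for as long as the Vec is alive.
    let ptr = array.as_mut_ptr();
    (array, ptr, size as c_int)
}

fn main() {
    // '_array' keeps the vector alive until the end of main.
    // (A bare '_' pattern would drop it immediately.)
    let (_array, ptr, size) = test_data();
    let sum = unsafe { array_sum(ptr, size) };
    println!("C says the sum was: {sum}");
}
```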
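An alternative is to return only the Vec and take the pointer at the call site, so the pointer is never stored separately from its owner. The sketch below is runnable without any C code because it assumes a Rust stand-in for array_sum with the same signature. In Hannah's program the extern "C" declaration would be used instead.

```rust
use std::ffi::c_int;

/// Stand-in for the C function (an assumption so the sketch runs on its own).
unsafe fn array_sum(array: *mut c_int, array_size: c_int) -> c_int {
    let mut total = 0;
    for i in 0..array_size as usize {
        // Reads element i through the raw pointer.
        total += unsafe { *array.add(i) };
    }
    total
}

/// Returns the owned data instead of a pointer into a temporary.
fn test_data() -> Vec<c_int> {
    vec![6991; 10]
}

fn sum_test_data() -> c_int {
    let mut array = test_data();
    let size = array.len() as c_int;
    // SAFETY: `array` is alive for the whole call and holds `size` elements.
    unsafe { array_sum(array.as_mut_ptr(), size) }
}
```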
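Explain why Rust wasn't able to catch this issue at compile-time.
Because the bug goes through a raw pointer. Raw pointers carry no lifetime, so the borrow checker does not track how long the data behind them lives, and creating one with as_mut_ptr() is allowed in safe code. The pointer is only used inside the unsafe block, where Rust does not check memory safety: it trusts that the developer understands what they are doing, so it reports no memory-safety error.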
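The difference between references and raw pointers can be seen in a small sketch (the function name is illustrative). Returning a reference to a local variable is rejected by the borrow checker, while returning a raw pointer to the same local compiles, because the pointer type records nothing about how long its target lives.

```rust
/// Compiles, even though the returned pointer dangles as soon as the
/// function returns. Dereferencing it would be undefined behaviour.
fn dangling() -> *const i32 {
    let local = 5;
    &local as *const i32
}

// The reference version does not compile, because the borrow checker
// knows that `local` does not live long enough:
//
// fn dangling_ref() -> &'static i32 {
//     let local = 5;
//     &local
// }
```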
Question
The final question of the exam will be a more open-ended question which will ask you to perform some analysis or make an argument. Your argument will be judged like an essay (are your claims substantiated by compelling arguments). Remember that you will not get any marks for blindly supporting Rust.
A friend of yours has just read this article, and thinks that it means they shouldn't learn Rust.
Read through the article, and discuss the following prompt:
Rust is not worth learning, as explained by this article.
The overall structure of your answer is not marked. For example, your answer may include small paragraphs of prose accompanied by dot-points, or could instead be posed as a verbal discussion with your friend. Regardless of the structure / formatting you choose, the substance of what you write is the most important factor, and is what will determine your overall mark for this question.
Author: Jeff Wu
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please credit the source when reposting!