Reference for the incomplete beginner task Installation Guide
Racing Riverboats is the third task of the Think Cairo Cairo Data Structures and Algorithms task, mainly focusing on the problem of finding duplicate values in an array using dictionary queries. There are two sub-tasks.
Index of Duplicate Values in a List#
The first sub-task, "Finding the Crocodile," essentially involves finding duplicate items in an array, marking their indices, and storing them in another array.
The array contains sub-elements that are structs composed of u32, u32, and felt252. The initial data is an Array<Obstacle>
.
#[derive(Drop)]
struct Obstacle {
length: u32,
width: u32,
description: Array<felt252>
}
The initial approach is to use Felt252Dict, loop through the Obstacles, and pop them into the Felt252Dict, similar to a mapping(index => struct)
in Solidity. If the value exists, the index is recorded; otherwise, it is inserted.
After writing the initial version of the code, it was discovered that since the sub-element Array<felt252>
of Obstacle cannot be copied, felt252Dict.get(index) -> Obstacle
is not valid. Therefore, a different approach is needed.
If a dictionary is not used, a nested loop can be used to compare the Obstacles and add the duplicates to a temporary list. Then, the indices can be found. However, during testing, it was found that the code ran out of gas, consuming 3.2 million gas instead of the target 1.8 million. Optimization is needed.
Analyzing the gas limit of the test case, the cost of comparing two Obstacles once is 26,100 gas. The test case contains 13 Obstacles, so calculating 1,800,000 gas allows for a maximum of 69 comparisons. The above algorithm requires a total of 78 comparisons (1+2+...+12). Therefore, the number of comparisons needs to be reduced.
The approach taken was to use the test cases of the core library dict method, using Felt252Dict to store indices. If a value is obtained, it is skipped, reducing unnecessary comparisons. However, it still exceeded the gas limit, and reducing the number of comparisons could not be further optimized. Therefore, the cost of a single comparison needs to be optimized, specifically the comparison of Obstacle.description.
After being stuck for a while without any ideas, I reached out to the NG mentor and he provided a solution to hash the Obstacle, achieving the goal with a complexity of O(n).
So how do we hash a struct? Initially, I changed #[derive(Drop)]
of Obstacle to #[derive(Drop, Serde)]
and used the following code to obtain the hash:
let mut raw_data = Default::default();
obs.serialize(ref raw_data);
let hash = poseidon_hash_span(raw_data.span());
Later, it was found that it couldn't pass the online test because Obstacle cannot be modified.
If Obstacle cannot be modified, all the attributes of the struct can be manually packed and then hashed.
After successfully hashing, it is only necessary to insert (index, hash)
and get(index)
in the dictionary to obtain the duplicate indices. The code can then pass the tests smoothly.
There are also many conversions between u_size and felt252, which require the use of into
and unwarp
. You can refer to the documentation to solve them.
Another pitfall here is that if the index is 0 and the value is also 0, both dict.get(0)
and dict.get(nonexistent index)
will return 0, leading to incorrect judgments and failing the tests related to the second sub-task.
This can be resolved by using the library use nullable::{NullableTrait, match_nullable, FromNullableResult};
. It is recommended to study the source code of nullable for this part.
To summarize, the key points of the first sub-task are:
- How to hash a struct
- Conversions between u_size and felt252
- Felt252Dict's new, get, and insert methods
- Usage of nullable
Option Return in Conditional Statements in Loops#
This sub-task involves returning the minimum number of steps needed to surpass an array (river) without exceeding a step length of 3. If the first and last cells are crocodiles, None is returned.
The approach is to use nested loops, jumping 3, 2, or 1 step at a time if possible, and incrementing the step count. If it is not possible to jump any of these distances, Option::None is returned. Multiple jumps are made to achieve the goal.
The implementation of this sub-task is relatively simple. After obtaining the index array from the first sub-task, it can be solved using two loops. The main challenge is encountering out of gas issues during testing, which means that checking for an empty river and checking if the start and end points are crocodiles should be done in advance to save gas.
It is recommended to compare the test cases and make use of print statements to see where each step jumps to.
Some pitfalls have already been mentioned in the first sub-task, so if they are addressed in advance, there is no need to make further changes.
After submitting the verification, there was a small bug that prevented passing the tests. It is important to note whether the condition for breaking the loop is river.len
or river.len-1
.
This sub-task involves many return values in loops, and since Cairo does not have a return statement, it is easy to make mistakes and requires multiple modifications.
It also provides practice with various operations related to Option.
Conclusion#
Since Cairo's memory is immutable and does not allow slot overriding, some techniques are needed to achieve the effect of a dictionary. Familiarity with these operations will be helpful for future development.