Project address: github.com/netwarps/ru…

Preface

In Rust, async-std and Tokio are two of the more widely used asynchronous runtimes, and each has its own strengths. rust-ipfs is the Rust implementation of IPFS; it runs on Tokio, and its networking layer is built on rust-libp2p. To try replacing rust-libp2p with libp2p-rs, we forked the original repository and ported the code, and that port is now complete. Here I share a hang we ran into during the migration.

Problem description

First, I started a go-ipfs daemon and obtained its multiaddress with the ipfs id command. Then I ran simple, the rust-ipfs example program, and confirmed that it connected to go-ipfs successfully. In IPFS, a DAG node supports at most 174 blocks, or 43.5 MiB. I stored a file of about 77 MiB through go-ipfs. When I fetched it through rust-ipfs, I received at most 128 blocks, after which the test code stopped responding.

Search time too long?

Bitswap throws an error if no block is returned for a CID within 30 seconds, so if the lookup were merely slow, an error should eventually appear. I set that possibility aside for the time being. But after more than ten minutes the console had still not printed a single BitswapError, and I realized things might not be that simple.

Blockstore hang

By adding log output layer by layer to narrow down the fault, I finally traced the problem to the call to get_block().

get_block() first looks for the block matching the CID in the local blockstore; if it is not found there, it uses bitswap to search for it. In the test, the blockstore was a HashMap wrapped in a Tokio Mutex, and the hang occurred at the step that reads the block from the HashMap (line 383 in the figure).
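The lookup order described above can be sketched roughly as follows. This is a minimal, synchronous illustration and not the rust-ipfs code: `Cid`, `Block`, `BlockStore`, and the `fetch_remote` callback (standing in for bitswap) are hypothetical names, the standard library's Mutex replaces Tokio's so the sketch has no dependencies, and caching the fetched block locally is an assumption.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical stand-ins; rust-ipfs has its own Cid and Block types.
type Cid = String;
type Block = Vec<u8>;

/// Toy blockstore: a HashMap behind a mutex (std's Mutex, not Tokio's).
pub struct BlockStore {
    blocks: Mutex<HashMap<Cid, Block>>,
}

impl BlockStore {
    pub fn new() -> Self {
        BlockStore { blocks: Mutex::new(HashMap::new()) }
    }

    pub fn put(&self, cid: Cid, block: Block) {
        self.blocks.lock().unwrap().insert(cid, block);
    }

    pub fn get(&self, cid: &str) -> Option<Block> {
        self.blocks.lock().unwrap().get(cid).cloned()
    }
}

/// Check the local store first; on a miss, fall back to `fetch_remote`,
/// which stands in for a bitswap request.
pub fn get_block<F>(store: &BlockStore, cid: &str, fetch_remote: F) -> Option<Block>
where
    F: FnOnce(&str) -> Option<Block>,
{
    if let Some(block) = store.get(cid) {
        return Some(block);
    }
    let block = fetch_remote(cid)?;
    // Assumption: keep a local copy so the next lookup is served locally.
    store.put(cid.to_string(), block.clone());
    Some(block)
}
```

In the real program the local lookup is an asynchronous lock acquisition, which is exactly the step where the hang showed up.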

Tokio resource limits

With the help of an article published by the Tokio project, we found the answer to this problem.

Tokio does not schedule preemptively, so a single task can keep running indefinitely, leaving other tasks unscheduled and starved. Some languages can interrupt execution by injecting yield points, but Rust's generators do not seem to offer a comparable mechanism.

Tokio therefore introduced the concept of a budget in version 0.2.14. It can be thought of as a quota that every Tokio resource is aware of. The default budget is 128, a value the Tokio developers settled on after testing. Each asynchronous operation decrements the budget; when it reaches 0, the task yields back to the scheduler, and the budget is reset the next time the task is scheduled.
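The quota idea can be modelled with a plain counter. The sketch below is a toy model and not Tokio's internal implementation; `Budget`, `try_consume`, `reset`, and `run_task` are hypothetical names chosen for illustration, and only the value 128 comes from the text above.

```rust
/// Default quota mentioned in the article.
const INITIAL_BUDGET: u32 = 128;

/// Toy per-task budget. Illustrative only, not Tokio's internal type.
pub struct Budget {
    remaining: u32,
}

impl Budget {
    pub fn new() -> Self {
        Budget { remaining: INITIAL_BUDGET }
    }

    /// Try to pay for one operation; `false` means the task must yield.
    pub fn try_consume(&mut self) -> bool {
        if self.remaining == 0 {
            false
        } else {
            self.remaining -= 1;
            true
        }
    }

    /// Called by the (toy) scheduler when the task is scheduled again.
    pub fn reset(&mut self) {
        self.remaining = INITIAL_BUDGET;
    }
}

/// Run `total_ops` operations and count how often the task had to yield.
pub fn run_task(total_ops: u32) -> u32 {
    let mut budget = Budget::new();
    let mut yields = 0;
    let mut done = 0;
    while done < total_ops {
        if budget.try_consume() {
            done += 1;
        } else {
            // Budget exhausted: yield, then continue with a fresh budget.
            yields += 1;
            budget.reset();
        }
    }
    yields
}
```

In this model a task that performs 300 operations yields twice, once after operation 128 and once after operation 256, while a task with 128 or fewer operations never yields.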

Budget checks are performed in Tokio's block_on:

coop::budget(), shown in the figure, initializes the budget to 128 and then polls the future passed in, which in this case is the Acquire future inside Mutex. Acquire implements Future and checks the budget first whenever it is polled. If enough budget remains, or if the budget is unconstrained, the check returns Ready and the remaining operations proceed; otherwise it returns Pending.

There is one more point to note: the budget is only reset on Tokio worker threads. Executors from other libraries are unaware of the budget and never reset it. For example, if the futures crate's block_on is called inside Tokio's executor, the code running inside that block_on stops making progress after a certain number of operations, which shows up as a hang.
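The difference between an executor that resets the budget and one that does not can be reproduced with a small model built only on the standard library. Everything here is an assumption made for illustration: `Op` plays the role of a budget-aware operation such as a lock acquisition, `BUDGET` imitates the per-thread counter, and `run` is a deliberately naive executor whose `resets_budget` flag switches between the two behaviours. It is not Tokio's or the futures crate's code.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

const INITIAL_BUDGET: u32 = 128;

thread_local! {
    // Toy per-thread budget, standing in for the runtime's internal counter.
    static BUDGET: Cell<u32> = Cell::new(INITIAL_BUDGET);
}

/// Toy budget-aware operation (think of a lock acquisition).
pub struct Op;

impl Future for Op {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        BUDGET.with(|b| {
            let left = b.get();
            if left == 0 {
                // Out of budget: ask to be polled again and yield.
                cx.waker().wake_by_ref();
                Poll::Pending
            } else {
                b.set(left - 1);
                Poll::Ready(())
            }
        })
    }
}

/// A task that performs `n` budget-aware operations in a row.
pub async fn do_ops(n: u32) -> u32 {
    for _ in 0..n {
        Op.await;
    }
    n
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll `fut` at most `max_polls` times. With `resets_budget == true` the
/// budget is refilled before every poll; with `false` it is granted once
/// and never refilled, like an executor that knows nothing about it.
pub fn run<F: Future>(fut: F, resets_budget: bool, max_polls: usize) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    // The surrounding runtime grants one budget before handing over control.
    BUDGET.with(|b| b.set(INITIAL_BUDGET));
    for _ in 0..max_polls {
        if resets_budget {
            BUDGET.with(|b| b.set(INITIAL_BUDGET));
        }
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return Some(out);
        }
    }
    None // still pending; an unbounded executor would keep spinning here
}
```

With these definitions, `run(do_ops(200), true, 10)` completes on the second poll, while `run(do_ops(200), false, 1000)` is still pending after a thousand polls because the task is stuck at operation 129. That mirrors the symptom of receiving 128 blocks and then nothing.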

In our code, we actually encounter the above problem:

First, the main function uses #[tokio::main], which automatically generates a Tokio executor:

Second, the test code runs under futures::executor::block_on, so it is driven only by the futures executor. When get_block() looks up the local blockstore, it hangs once it has run more than 128 times:

Conclusion

In summary, the fix is to avoid calling another library's executor::block_on() inside Tokio's executor.


Netwarps is made up of a senior cloud computing and distributed technology development team in China with extensive experience in the finance, power, telecommunications, and Internet industries. It has research and development centers in Shenzhen and Beijing and a team of more than 30 people, most of them engineers with over 10 years of development experience drawn from fields such as the Internet, finance, cloud computing, blockchain, and scientific research institutions. Netwarps focuses on developing and applying secure storage technology. Its main products are a decentralized file system (DFS) and a decentralized computing platform (DCP), and it is committed to providing distributed storage and distributed computing platforms built on decentralized network technology, characterized by high availability, low power consumption, and low network demands. These suit scenarios such as the Internet of Things and the industrial Internet. Public account: Netwarps