Background
I created my first crate, rust-canto, even though I didn’t know
Rust, so that I could bring automatic Cantonese word segmentation and
romanization into Typst.
This is a great idea. Moving from a complex xtask workspace to a streamlined
build.rs and build.sh workflow is a common evolution for Rust projects,
especially when targeting WebAssembly.
Five lessons learnt from a chat bot
1. Avoid the “Crates.io Path Dependency” Trap
- The Problem: Using a workspace with a local helper crate.

      [dependencies]
      xtask = { path = "xtask" }

  The above code worked locally but prevented me from publishing to Crates.io. The registry cannot resolve local paths.

- The Fix: Move the xtask logic into a local folder (like build_deps/) that is not its own crate. Reference it in build.rs or src/bin/ files using the #[path] attribute:

      #[path = "build_deps/mod.rs"]
      mod codegen;
2. Master the OUT_DIR Handshake
- The Golden Rule: Never allow build.rs to write files into the src/ or data/ folders. If cargo publish detects that the source directory was modified during the build, it will fail verification.

- Writing: Always use the OUT_DIR environment variable provided by Cargo:

      // Inside build.rs, in a function that returns a Result.
      let out_dir = std::env::var("OUT_DIR").unwrap();
      let dest = std::path::Path::new(&out_dir).join("trie.dat");
      std::fs::write(dest, data)?;

- Reading: Use the include_bytes! macro combined with env! to “bake” that data into your binary at compile time:

      const DATA: &[u8] = include_bytes!(concat!(env!("OUT_DIR"), "/trie.dat"));
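
The write side of that handshake can be sketched as a small helper that build.rs would call. This is only a minimal sketch, not the actual build script of rust-canto: `write_artifact` and `cargo_out_dir` are hypothetical names, and the output directory is passed in as a parameter so the helper can also be exercised outside of Cargo.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Writes a generated artifact into `out_dir` and returns its full path.
/// `file_name` and `data` stand in for whatever the build script produces.
fn write_artifact(out_dir: &Path, file_name: &str, data: &[u8]) -> io::Result<PathBuf> {
    let dest = out_dir.join(file_name);
    fs::write(&dest, data)?;
    Ok(dest)
}

/// Looks up the directory Cargo provides for build-script output.
/// Cargo sets OUT_DIR while a build script runs, so this returns None
/// when the variable is absent (for example, in a plain `rustc` run).
fn cargo_out_dir() -> Option<PathBuf> {
    env::var_os("OUT_DIR").map(PathBuf::from)
}
```

In a real build.rs, `main` would pass the result of `cargo_out_dir()` to `write_artifact`, and the library would read the file back with the include_bytes! line from the Reading step.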
3. Respect Rust 2024 Keyword Changes
- The Conflict: Rust Edition 2024 introduced gen as a reserved keyword (for generator blocks).
- The Mistake: Naming your code-generation module mod gen;. This results in an “expected a name” compiler error.
- The Fix: Use a descriptive name like mod codegen; or mod generate;.
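
As a small illustration of the rename, the sketch below uses inline modules so that it fits in one file. The function names are hypothetical. The second module shows the raw-identifier form `r#gen`, which is not mentioned above but is a standard Rust escape hatch if the old name has to stay.

```rust
// `gen` is reserved in the 2024 edition, so the module gets a descriptive name.
mod codegen {
    /// Hypothetical stand-in for real code-generation logic.
    pub fn header_comment(source: &str) -> String {
        format!("// @generated from {}", source)
    }
}

// A raw identifier lets the old name keep compiling where a rename is not possible.
mod r#gen {
    /// Hypothetical placeholder so the module has something to call.
    pub fn version() -> u32 {
        1
    }
}
```

Callers then write `codegen::header_comment(...)` or `r#gen::version()`; the descriptive name is usually the more readable of the two.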
4. Solve the WASM C++ Toolchain Error
- The Error: fatal error: 'algorithm' file not found.

- The Cause: Some Rust crates (like zstd) compile bundled C or C++ code by default. The wasm32-unknown-unknown target does not have a C++ standard library, so the build fails.

- The Fix: In Cargo.toml, disable default features for these crates so that fewer native components are compiled (zstd itself remains a binding to the C library, so this trims the build rather than making it pure Rust):

      zstd = { version = "...", default-features = false }
5. Optimize for WASM Size and Speed
- Pre-computing: Instead of building complex structures (like tries) from raw text at runtime, build them in build.rs, serialize them with postcard, and compress them with zstd.

- Binary Tools: Avoid adding heavy optimization crates like wasm-opt to your [build-dependencies]. They often require C++ toolchains. Instead, use a shell script (build.sh) to call the wasm-opt system binary:

      wasm-opt -Oz input.wasm -o output.wasm --strip-debug

- The Result: This keeps your WASM plugin small (usually 1–2 MB) and lets it load quickly in environments like Typst without panicking.
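
The pre-computing idea can be sketched with the standard library alone. To keep the example self-contained, a hand-rolled length-prefixed format stands in for postcard, and compression is left out entirely, so this is not the format rust-canto actually uses. `encode_entries` and `decode_entries` are hypothetical names, and the input is assumed to be one tab-separated "word, reading" pair per line.

```rust
/// Build-time step: turn raw "word<TAB>reading" lines into a compact, sorted blob.
/// Lines without a tab are skipped.
fn encode_entries(raw: &str) -> Vec<u8> {
    let mut entries: Vec<(&str, &str)> = raw
        .lines()
        .filter_map(|line| line.split_once('\t'))
        .collect();
    entries.sort();

    let mut blob = Vec::new();
    blob.extend_from_slice(&(entries.len() as u32).to_le_bytes());
    for (word, reading) in entries {
        for field in [word, reading] {
            // Each field is stored as a little-endian u32 length plus UTF-8 bytes.
            blob.extend_from_slice(&(field.len() as u32).to_le_bytes());
            blob.extend_from_slice(field.as_bytes());
        }
    }
    blob
}

/// Runtime step: borrow the entries straight out of the blob without re-parsing text.
/// Returns None if the blob is truncated or contains invalid UTF-8.
fn decode_entries(blob: &[u8]) -> Option<Vec<(&str, &str)>> {
    fn read_u32(blob: &[u8], pos: &mut usize) -> Option<usize> {
        let bytes = blob.get(*pos..*pos + 4)?;
        let mut buf = [0u8; 4];
        buf.copy_from_slice(bytes);
        *pos += 4;
        Some(u32::from_le_bytes(buf) as usize)
    }
    fn read_str<'a>(blob: &'a [u8], pos: &mut usize) -> Option<&'a str> {
        let len = read_u32(blob, pos)?;
        let bytes = blob.get(*pos..*pos + len)?;
        *pos += len;
        std::str::from_utf8(bytes).ok()
    }

    let mut pos = 0;
    let count = read_u32(blob, &mut pos)?;
    let mut entries = Vec::new();
    for _ in 0..count {
        let word = read_str(blob, &mut pos)?;
        let reading = read_str(blob, &mut pos)?;
        entries.push((word, reading));
    }
    Some(entries)
}
```

In the workflow described above, `encode_entries` would run inside build.rs and its output would be written to OUT_DIR, while `decode_entries` would run in the plugin on the bytes embedded with include_bytes!. Sorting at build time means the runtime side could use binary search without any setup cost.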