Problem
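Node::utf16_text seems to index the &[u16] source with byte offsets, even though each UTF-16 code unit occupies two bytes. The correct implementation might be: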
&source[self.start_byte() / 2..self.end_byte() / 2]
But the current implementation (the Go binding does the same) is:
&source[self.start_byte()..self.end_byte()]
Or maybe I misunderstood this API?
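The mismatch can be illustrated without tree-sitter. The following is only a minimal sketch: utf16_slice is a hypothetical helper (not part of the tree-sitter API), and the byte offsets 0 and 4 are assumed to be what the root node would report for this input, which is consistent with the panic message in the result below.

```rust
/// Hypothetical helper, not part of tree-sitter: slice a UTF-16 buffer
/// using byte offsets by converting them to code-unit indices first.
fn utf16_slice(source: &[u16], start_byte: usize, end_byte: usize) -> &[u16] {
    // Each UTF-16 code unit is 2 bytes wide, so halve the byte offsets.
    &source[start_byte / 2..end_byte / 2]
}

fn main() {
    // "你好" encodes to 2 UTF-16 code units, i.e. 4 bytes.
    let text: Vec<u16> = "你好".encode_utf16().collect();

    // Assumed byte offsets for a node spanning the whole input.
    let (start_byte, end_byte) = (0, text.len() * 2);

    // Using the byte offsets directly runs past the end of the slice;
    // `get` returns None instead of panicking like `&text[0..4]` would.
    assert!(text.get(start_byte..end_byte).is_none());

    // Halving the offsets yields the expected code units.
    let slice = utf16_slice(&text, start_byte, end_byte);
    println!("{}", String::from_utf16(slice).unwrap());
}
```

Until this is confirmed either way, the same division could be applied by hand to a node's byte offsets as a workaround.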
Steps to reproduce
Create a new grammar and a test crate:
mkdir utf16le
cd utf16le
tree-sitter init
tree-sitter generate
cargo new --lib hello
Add dependencies to crate hello:
tree-sitter = "0.25.8"
tree-sitter-utf16le = { path = "../" }
Write the Rust test:
#[cfg(test)]
mod tests {
    use tree_sitter::Parser;
    use tree_sitter_utf16le::LANGUAGE;

    #[test]
    fn it_works() {
        let mut parser = Parser::new();
        parser.set_language(&LANGUAGE.into()).unwrap();
        let text = "你好".encode_utf16().collect::<Box<_>>();
        let tree = parser.parse_utf16_le(&text, None).unwrap();
        let root = tree.root_node().utf16_text(&text);
        //                         ^~~~~~~~~~~~~~~~^ fails here
        let hello = String::from_utf16(root).unwrap();
        assert_eq!(hello, "你好");
    }
}
Run the test:
cd hello
cargo test
Result:
---- tests::it_works stdout ----
thread 'tests::it_works' panicked at C:\Users\XXX\.cargo\registry\src\index.crates.io-XXX\tree-sitter-0.25.8\binding_rust\lib.rs:2065:16:
range end index 4 out of range for slice of length 2
Expected behavior
Get the "你好" text via String::from_utf16.
Tree-sitter version (tree-sitter --version)
tree-sitter 0.25.4
Operating system/version
Windows 10 22H2