-
-
Notifications
You must be signed in to change notification settings - Fork 931
Description
While investigating fixes for #8682 I discovered that rbByteEncode does not properly handle the case of source and destination encoding being the same.
As described in #8682 (comment):
IOOutputStream calls RubyIO.write and passes only the incoming bytes, offsets, and encoding:
jruby/core/src/main/java/org/jruby/util/IOOutputStream.java
Lines 152 to 163 in 14ec399
@Override | |
public void write(final byte[] b,final int off, final int len) throws IOException { | |
ThreadContext context = runtime.getCurrentContext(); | |
RubyIO realIO = this.realIO; | |
if (realIO != null) { | |
realIO.write(context, b, off, len, encoding); | |
} else { | |
IRubyObject io = this.io; | |
writeAdapter.call(context, io, io, RubyString.newStringLight(runtime, new ByteList(b, off, len, encoding, false))); | |
} | |
} |
This path was added as an optimization in 236f7ba to avoid constructing string objects just to immediately unwrap them for IO writes.
Unfortunately it hits a bug eventually in EncodingUtils.rbByteEncode where two identical encodings will not no-op, but instead will trigger an encoding error because it rejects encoding from and to the same encoding.
The short-circuit checks here:
jruby/core/src/main/java/org/jruby/util/io/EncodingUtils.java
Lines 983 to 989 in 844c151
if (encoding.isAsciiCompatible() && to.isAsciiCompatible()) { | |
if (cr == StringSupport.CR_7BIT) { | |
return null; | |
} | |
} else if (encodingEqual(sname, dname)) { | |
return null; | |
} |
...were never updated when the sister logic for RubyString was updated in 621369c. The correct logic, with additional refactoring, looks like this:
jruby/core/src/main/java/org/jruby/util/io/EncodingUtils.java
Lines 1094 to 1101 in 844c151
if (senc != null && senc == denc) { | |
return strTranscodeScrub(context, forceEncoding, str, ecflags, ecopts, result, explicitlyInvalidReplace, denc, senc); | |
} else if (is7BitCompat(str, denc, senc)) { | |
return result.apply(context, str, denc, str); | |
} else if (encodingEqual(sname, dname)) { | |
if (forceEncoding.isNil()) denc = null; | |
return result.apply(context, str, denc, str); | |
} |
Ignoring the problems with IOOutputStream only being able to specify a single encoding for all String writes, we need to fix this issue in rbByteEncode and ensure same-encoding calls no-op the same way as rbStrEncode.