Skip to content

Conversation

@octo
Copy link
Contributor

@octo octo commented Apr 3, 2018

This PR fixes two issues when parsing heredoc strings:

  • The regular expression was not anchored to the beginning of the line, allowing arbitrary garbage in front of the delimiter.

  • The regular expression did not accept an arbitrary number of cartridge returns. One optional cartridge return (\r) and one newline (\n) were considered a line break. If a line ends in multiple cartridge returns, e.g. EOF\r\r\n, it was not considered to end the heredoc string with delimiter EOF. However, the formatter removes one cartridge return, so that a repeated parsing would consider the same line to end the heredoc string, resulting in a different interpretation of the input.

    In most cases parsing the output results in a syntax error, but it is fairly easy to create inputs that change semantic due to formatting. For example, "x=<<_\n_\r\r\ny=1\nz=<<_\n_\n" evaluates to x = <string> initially, but after reformatting it evaluates to x = <string>, y = <int>, z = <string>.

Kudos to dvyukov/go-fuzz for finding this!

octo added 4 commits April 3, 2018 16:16
When there are multiple cartridge returns at the end of the line, the regular expression will consider n-1 of them to be part of the string. Later, the last `\r` is removed. That may mean that a line that did previously *not* terminate a heredoc string may now terminate it, changing the meaning of the HCL file.
@mitchellh mitchellh merged commit c247bd0 into hashicorp:master Apr 3, 2018
@mitchellh
Copy link
Contributor

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants