Skip to content

feature: Added (generate_libtoken_dict.sh) script for simplified work with libtokencap #2106

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Jun 4, 2024

Conversation

sigdevel
Copy link
Contributor

Prepared a script (make_dict.sh) for simplified work with libtokencap. The description of the work libtokencap is given in the file README.md , but to simplify the startup, it will probably be more convenient to add a separate script that can be used at once

@vanhauser-thc vanhauser-thc changed the base branch from stable to dev May 30, 2024 13:18
@vanhauser-thc
Copy link
Member

Hi, I do not mind merging this, however
a) add a description to libtokens README.md
b) a user should not need to edit the file, just make it command line parameters

@sigdevel
Copy link
Contributor Author

Yes, great, would be happy to share the scripts used. if you are happy with this example, I will add it to the README on the next commit.

} >"$AFL_TOKEN_FILE"

# sort & remove duplicates
sort -u "$AFL_TOKEN_FILE" >"$AFL_DICT_FILE"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a) you still need to delete the tempfile and b) an echo that prints the generated filename would be good, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the additional sorting by the system utility as justified, since we have no guarantee that the user will perform tokenization on the minimized dictionary. Therefore, i think there are two approaches here:
a)modify the libtokencap tool itself - cache values beforehand and then select unique values afterwards ;
b)use a unix-way style aid band for this task

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it makes no sense to keep AFL_TOKEN_FILE. AFL_DICT FILE is the same just deduplicated. so keeping AFL_TOKEN_FILE is pointless.
and you need to inform the user what the filename is that you generated. like echo Done! captured tokens written to ... etc.

{
touch "$AFL_TOKEN_FILE"
for i in $(find "$target_output" -type f -name "id*"); do
LD_PRELOAD="$LD_PRELOAD_PATH" \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should check first if LD_PRELOAD is unset before doing this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, its been added to check flags.

@@ -0,0 +1,49 @@
#default values
timeout_sec=5
LD_PRELOAD_PATH="/home/${USER}/AFLplusplus/utils/libtokencap/libtokencap.so"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such a default is a bad idea IHMO, just remove this line

echo "Error: Missing mandatory opts" >&2
usage
fi

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put a check here that error if both LD_PRELOAD and LD_PRELOAD_PATH are empty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems yes, its a localized point, worth making more universal.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is not what I meant but thinking about it, what you did is better

@vanhauser-thc
Copy link
Member

just a few nits and then it is fine IMHO

sigdevel added 2 commits June 1, 2024 01:18
(-) removed default vars ;
(+) added LD_PRELOAD_PATH check

# initialize vars
AFL_TOKEN_FILE="${PWD}/temp_output.txt"
AFL_DICT_FILE="$(basename "$target_output").dict"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a fan of the AFL_DICT_FILE naming.
either use a fixed name (e.g. tokens.dict), and/or set it e.g. with an -f parameter, or base it on target_bin.
an output directory is usually out, o, corpus or corpus, so the generated filename is not helpful IMHO.

@vanhauser-thc
Copy link
Member

I would recommend to rename the tool. make_dict_v2.sh makes no sense to a user, he/she has never seen v1 :)
maybe generate_libtoken_dict.sh is better, because if people copy the script to /usr/local/bin they will still know later what this script does.

@vanhauser-thc
Copy link
Member

sorry this took 3 loops, I finally had time to really look at the script, I am drowning in work and various tasks.

@sigdevel sigdevel changed the title feature: Added (make_dict.sh) script for simplified work with libtokencap feature: Added (generate_libtoken_dict.sh) script for simplified work with libtokencap Jun 1, 2024
@sigdevel
Copy link
Contributor Author

sigdevel commented Jun 1, 2024

I would recommend to rename the tool. make_dict_v2.sh makes no sense to a user, he/she has never seen v1 :) maybe generate_libtoken_dict.sh is better, because if people copy the script to /usr/local/bin they will still know later what this script does.

Yes, that comment seems logical - I renamed in the dependent files :)

@sigdevel
Copy link
Contributor Author

sigdevel commented Jun 1, 2024

sorry this took 3 loops, I finally had time to really look at the script, I am drowning in work and various tasks.

sorry this took 3 loops, I finally had time to really look at the script, I am drowning in work and various tasks.

That's okay If the tweaks helped the project a little bit - I was pleased.

@vanhauser-thc
Copy link
Member

looks good, but still the temporary file is not deleted and the naming of the result file is not helpful + no final echo that prints the final filename plus that everything went fine.

another thought: sometimes a target needs command line parameters. anything after '--' (getopt handles this) should be given to the target command ($*)

@sigdevel
Copy link
Contributor Author

sigdevel commented Jun 3, 2024

looks good, but still the temporary file is not deleted and the naming of the result file is not helpful + no final echo that prints the final filename plus that everything went fine.

another thought: sometimes a target needs command line parameters. anything after '--' (getopt handles this) should be given to the target command ($*)

Hi, based on your review, I've added changes to the file.

@vanhauser-thc
Copy link
Member

LGTM thank you

@vanhauser-thc vanhauser-thc merged commit 7f02f0d into AFLplusplus:dev Jun 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants