Description
We keep running into link checker errors, and the fix so far has been to add more and more domains to the list that is excluded from checking. As the number of excluded domains grows, the utility of the link checker diminishes.
As of Jan. 9, 2025, these domains are excluded from our link checker:
| Domain | Date added to exclude list | Reason for adding |
|---|---|---|
| scholar.google.com | 05.01.2025 | 403: Network error: Forbidden |
| useast.ensembl.org | 04.10.2025 | 403: Network error: Forbidden |
| academic.oup.com/bioinformatics | 04.10.2025 | 403: Network error: Forbidden |
| doi.org | 01.09.2025 | 403: Network error: Forbidden |
| academic.oup.com/nar | 01.09.2025 | 403: Network error: Forbidden |
| gnu.org | 10.25.2024 | 429: Network error: Too Many Requests |
| anaconda.org | 04.05.2024 | unknown |
| fonts.gstatic.com | 12.06.2023 | unknown |
| www.microsoft.com/en-us/microsoft-365/onedrive/online-cloud-storage | 12.08.2023 | timeout |
I think some of these are justifiably excluded, like fonts.gstatic.com and the Microsoft one, but others are very wide-ranging, like doi.org, which covers every DOI link on the page.
We also run into errors with the cache, which has to be manually deleted here if some links threw errors in previous runs (at least I think that is the reason). These show up as cache errors.
This issue is for discussing possible solutions to these problems. Ultimately I think the link checker is a good thing to have, but it becomes tedious to deal with these same issues each time we build the page, and the growing list of excludes defeats the purpose.
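To make one possible direction concrete, here is a minimal Python sketch, not tied to whichever link checker we actually use. Instead of excluding a whole domain, it treats a 403 or 429 from a known bot-blocking host as "unverified" and still reports every other failure (404, timeouts mapped to a status, and so on) as broken. The names `BOT_BLOCK_CODES`, `TOLERATED_HOSTS`, `classify`, and `summarize` are hypothetical, and the host list is only an illustration drawn from the table above.

```python
from urllib.parse import urlsplit

# Assumption: these codes usually mean "the server refused an automated
# client", not "the page is gone".
BOT_BLOCK_CODES = {403, 429}

# Hypothetical list of hosts where a bot-block response is tolerated.
TOLERATED_HOSTS = {"doi.org", "scholar.google.com", "academic.oup.com", "gnu.org"}


def classify(url: str, status: int) -> str:
    """Return 'ok', 'unverified', or 'broken' for one checked link."""
    host = (urlsplit(url).hostname or "").lower()
    if 200 <= status < 400:
        return "ok"
    if status in BOT_BLOCK_CODES and host in TOLERATED_HOSTS:
        # The link could not be verified, but it is not known to be dead.
        return "unverified"
    return "broken"


def summarize(results):
    """Group (url, status) pairs by their classification."""
    summary = {"ok": [], "unverified": [], "broken": []}
    for url, status in results:
        summary[classify(url, status)].append(url)
    return summary
```

With something like this, a 404 on a doi.org link would still fail the build, while a 403 from the same host would only be listed as unverified. Whether our link checker can be configured to behave this way is something we would need to check in its documentation.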