
Installing lychee on Ubuntu 26.04 LTS and checking a website for broken links

lychee can detect broken links on a website. It is written in Rust and reportedly runs fast. This post is a memo of the steps for installing lychee on Ubuntu 26.04 LTS.

Test environment

Item      Version
Ubuntu    26.04 LTS
lychee    0.24.2

Command summary

apt update && apt install -y build-essential pkg-config libssl-dev && \
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh && \
. "$HOME/.cargo/env" && \
cargo install lychee

Installation

Install the packages that the build depends on.

apt update && apt install -y build-essential pkg-config libssl-dev
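Before moving on, it may be worth confirming that the prerequisites are actually in place. The following is only a sketch and not part of the original procedure: the function name `check_build_deps` is hypothetical, and it assumes that build-essential provides a C compiler as `cc` and that libssl-dev registers an `openssl` module with pkg-config.

```shell
# Hypothetical helper: report build prerequisites that appear to be missing.
check_build_deps() {
    missing=""
    # Assumption: build-essential provides a C compiler reachable as `cc`.
    command -v cc >/dev/null 2>&1 || missing="$missing build-essential"
    if command -v pkg-config >/dev/null 2>&1; then
        # Assumption: libssl-dev ships the `openssl` pkg-config module.
        pkg-config --exists openssl || missing="$missing libssl-dev"
    else
        missing="$missing pkg-config"
    fi
    if [ -n "$missing" ]; then
        echo "possibly missing:$missing"
        return 1
    fi
    echo "build prerequisites look OK"
}
```

Calling `check_build_deps` after the apt step prints either a confirmation or the package names that seem to be absent.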

Install cargo using the rustup installer.

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Load the environment variables into the current shell.

. "$HOME/.cargo/env"

Install lychee with cargo.

cargo install lychee
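If the same version should be reproduced on another machine, the release can be pinned. This is a sketch under assumptions rather than the procedure used in this post: the wrapper name `install_lychee` is hypothetical, and `--version` and `--locked` are standard `cargo install` options (`--locked` asks cargo to use the lock file published with the crate, if there is one).

```shell
# Hypothetical wrapper: install a specific lychee release with cargo.
install_lychee() {
    # Defaults to the version observed in this post.
    version=${1:-0.24.2}
    # --locked: build with the dependency versions recorded by the crate,
    # assuming a Cargo.lock was published with it.
    cargo install lychee --version "$version" --locked
}
```

Running `install_lychee` with no argument requests 0.24.2; pass another version string to override it.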

This time, version 0.24.2 was installed.

# lychee --version
lychee 0.24.2
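If the `lychee` command is not found in a new shell, the cargo environment has probably not been loaded there. A small check along these lines can make that visible; the function name `verify_lychee` is hypothetical and this is only a sketch.

```shell
# Hypothetical helper: print the lychee version, or hint at the PATH setup.
verify_lychee() {
    if command -v lychee >/dev/null 2>&1; then
        lychee --version
    else
        # cargo installs binaries under $HOME/.cargo/bin by default.
        echo "lychee not found on PATH; try: . \"\$HOME/.cargo/env\"" >&2
        return 1
    fi
}
```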

Usage

Just pass a URL and lychee checks it for broken links. To also see details such as redirect behavior, add the "-v" or "-vv" option.

lychee -v https://www.progdence.co.jp/
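For scheduled or CI runs, the check can be wrapped so that the result is saved and the exit status is passed on. The sketch below assumes that lychee exits with a non-zero status when it finds problems; the function name `check_links` and the default report file name are hypothetical, while `--no-progress`, `--format json` and `--output` are taken from the help output listed in the reference section.

```shell
# Hypothetical wrapper: check one URL and keep a JSON report.
check_links() {
    url=$1
    report=${2:-lychee-report.json}   # report file name is an assumption
    if lychee --no-progress --format json --output "$report" "$url"; then
        echo "no broken links found: $url"
    else
        code=$?   # exit status of the lychee command above
        echo "lychee exited with status $code; see $report" >&2
        return "$code"
    fi
}

# Example call:
#   check_links https://www.progdence.co.jp/
```

Returning lychee's own status lets a calling script or CI job fail when the check fails.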

To check links to images, CSS, JS, and similar files as well, specify the "--extensions" option.

lychee -v --extensions html,htm,png,jpg,jpeg,css,js https://www.progdence.co.jp/
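When checking a production site, it may be preferable to limit the load on the server. The following sketch combines rate-limiting options that appear in the help output (`--host-concurrency`, `--host-request-interval`, `--timeout`); the function name `check_links_gently` and the chosen values are assumptions, not recommendations from the original post.

```shell
# Hypothetical wrapper: run lychee with conservative per-host limits.
check_links_gently() {
    # At most 2 simultaneous requests per host, at least 1s between
    # requests to the same host, and a 20-second timeout per request.
    lychee --no-progress \
        --host-concurrency 2 \
        --host-request-interval 1s \
        --timeout 20 \
        "$@"
}

# Example call:
#   check_links_gently -v https://www.progdence.co.jp/
```

Any extra arguments, such as "-v" or "--extensions", are passed through unchanged.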

Reference

lychee help output

# lychee --help
lychee is a fast, asynchronous link checker which detects broken URLs and mail addresses in local files and websites. It supports Markdown and HTML and works with other file formats.

lychee is powered by lychee-lib, the Rust library for link checking.

Usage: lychee [OPTIONS] [inputs]...

Arguments:
  [inputs]...
          Inputs for link checking (where to get links to check from).
          These can be: files (e.g. `README.md`), file globs (e.g. `'~/git/*/README.md'`),
          remote URLs (e.g. `https://example.com/README.md`), or standard input (`-`).
          Alternatively, use `--files-from` to read inputs from a file.

          NOTE: Use `--` to separate inputs from options that allow multiple arguments.

Options:
  -a, --accept <ACCEPT>
          A List of accepted status codes for valid links

          The following accept range syntax is supported: [start]..[[=]end]|code.
          Some valid examples are:

          - 200 (accepts the 200 status code only)
          - ..204 (accepts any status code < 204)
          - ..=204 (accepts any status code <= 204)
          - 200..=204 (accepts any status code from 200 to 204 inclusive)
          - 200..205 (accepts any status code from 200 to 205 excluding 205, same as 200..=204)

          Use "lychee --accept '200..=204, 429, 500' <inputs>..." to provide a comma-
          separated list of accepted status codes. This example will accept 200, 201,
          202, 203, 204, 429, and 500 as valid status codes.

          [default: 100..=103,200..=299]

      --accept-timeouts[=<false|true>]
          Accept timed out requests and return exit code 0 when encountering timeouts but not any other errors

      --archive <ARCHIVE>
          Web archive to use to provide suggestions for `--suggest`.

          [default: wayback]

          [possible values: wayback]

  -b, --base-url <BASE_URL>
          Base URL to use when resolving relative URLs in local files. If specified,
          relative links in local files are interpreted as being relative to the given
          base URL.

          For example, given a base URL of `https://example.com/dir/page`, the link `a`
          would resolve to `https://example.com/dir/a` and the link `/b` would resolve
          to `https://example.com/b`. This behavior is not affected by the filesystem
          path of the file containing these links.

          Note that relative URLs without a leading slash become siblings of the base
          URL. If, instead, the base URL ended in a slash, the link would become a child
          of the base URL. For example, a base URL of `https://example.com/dir/page/` and
          a link of `a` would resolve to `https://example.com/dir/page/a`.

          Basically, the base URL option resolves links as if the local files were hosted
          at the given base URL address.

          The provided base URL value must either be a URL (with scheme) or an absolute path.
          Note that certain URL schemes cannot be used as a base, e.g., `data` and `mailto`.

      --base <BASE>
          Deprecated; use `--base-url` instead

      --basic-auth <BASIC_AUTH>
          Basic authentication support. E.g. `http://example.com username:password`

  -c, --config <FILE_PATH>
          Configuration file to use. Can be specified multiple times.

          If given multiple times, the configs are merged and later
          occurrences take precedence over previous occurrences.

          [default: lychee.toml]

      --cache[=<false|true>]
          Use request cache stored on disk at `.lycheecache`

      --cache-exclude-status <CACHE_EXCLUDE_STATUS>
          A list of status codes that will be ignored from the cache

          The following exclude range syntax is supported: [start]..[[=]end]|code. Some valid
          examples are:

          - 429 (excludes the 429 status code only)
          - 500.. (excludes any status code >= 500)
          - ..100 (excludes any status code < 100)
          - 500..=599 (excludes any status code from 500 to 599 inclusive)
          - 500..600 (excludes any status code from 500 to 600 excluding 600, same as 500..=599)

          Use "lychee --cache-exclude-status '429, 500..502' <inputs>..." to provide a
          comma-separated list of excluded status codes. This example will not cache results
          with a status code of 429, 500 and 501.

      --cookie-jar <COOKIE_JAR>
          Read and write cookies using the given file. Cookies will be stored in the
          cookie jar and sent with requests. New cookies will be stored in the cookie jar
          and existing cookies will be updated.

      --default-extension <EXTENSION>
          This is the default file extension that is applied to files without an extension.

          This is useful for files without extensions or with unknown extensions.
          The extension will be used to determine the file type for processing.

          Examples:
            --default-extension md
            --default-extension html

      --dump[=<false|true>]
          Don't perform any link checking. Instead, dump all the links extracted from inputs that would be checked

      --dump-inputs[=<false|true>]
          Don't perform any link extraction and checking. Instead, dump all input sources from which links would be collected

  -E, --exclude-all-private[=<false|true>]
          Exclude all private IPs from checking.
          Equivalent to `--exclude-private --exclude-link-local --exclude-loopback`

      --exclude <EXCLUDE>
          Exclude URLs and mail addresses from checking. The values are treated as regular expressions

      --exclude-file <EXCLUDE_FILE>
          Deprecated; use `--exclude-path` instead

      --exclude-link-local[=<false|true>]
          Exclude link-local IP address range from checking

      --exclude-loopback[=<false|true>]
          Exclude loopback IP address range and localhost from checking

      --exclude-path <EXCLUDE_PATH>
          Exclude paths from getting checked. The values are treated as regular expressions

      --exclude-private[=<false|true>]
          Exclude private IP address ranges from checking

      --extensions <EXTENSIONS>
          A list of file extensions. Files not matching the specified extensions are skipped.

          Multiple extensions can be separated by commas. Note that if you want to check filetypes,
          which have multiple extensions, e.g. HTML files with both .html and .htm extensions, you need to
          specify both extensions explicitly.
          An example is: `--extensions html,htm,php,asp,aspx,jsp,cgi`.

          This is useful when the default extensions are not enough and you don't
          want to provide a long list of inputs (e.g. file1.html, file2.md, etc.)

          [default: md,mkd,mdx,mdown,mdwn,mkdn,mkdown,markdown,html,htm,css,txt,xml]

  -f, --format <FORMAT>
          Output format of final status report

          [default: compact]

          [possible values: compact, detailed, json, junit, markdown]

      --fallback-extensions <FALLBACK_EXTENSIONS>
          When checking locally, attempts to locate missing files by trying the given
          fallback extensions. Multiple extensions can be separated by commas. Extensions
          will be checked in order of appearance.

          Example: --fallback-extensions html,htm,php,asp,aspx,jsp,cgi

          Note: This option takes effect on `file://` URIs which do not exist and on
                `file://` URIs pointing to directories which resolve to themself (by the
                --index-files logic).

      --files-from <PATH>
          Read input filenames from the given file or stdin (if path is '-').

          This is useful when you have a large number of inputs that would be
          cumbersome to specify on the command line directly.

          Examples:

              lychee --files-from list.txt
              find . -name '*.md' | lychee --files-from -
              echo 'README.md' | lychee --files-from -

          File Format:
          - Each line should contain one input (file path, URL, or glob pattern).
          - Lines starting with '#' are treated as comments and ignored.
          - Empty lines are also ignored.

      --generate <GENERATE>
          Generate special output (e.g. the man page) instead of performing link checking

          [possible values: man, complete-bash, complete-elvish, complete-fish, complete-powershell, complete-zsh]

      --github-token <GITHUB_TOKEN>
          GitHub API token to use when checking github.com links, to avoid rate limiting

          [env: GITHUB_TOKEN]

      --glob-ignore-case[=<false|true>]
          Ignore case when expanding filesystem path glob inputs

  -h, --help
          Print help (see a summary with '-h')

  -H, --header <HEADER:VALUE>
          Set custom header for requests.

          Some websites require custom headers to be passed in order to return valid responses.
          You can specify custom headers in the format 'Name: Value'. For example, 'Accept: text/html'.
          This is the same format that other tools like curl or wget use.
          Multiple headers can be specified by using the flag multiple times.
          The specified headers are used for ALL requests.
          Use the `hosts` option to configure headers on a per-host basis.

      --hidden[=<false|true>]
          Do not skip hidden directories and files

      --host-concurrency <HOST_CONCURRENCY>
          Default maximum concurrent requests per host (default: 10)

          This limits the maximum amount of requests that are sent simultaneously
          to the same host. This helps to prevent overwhelming servers and
          running into rate-limits. Use the `hosts` option to configure this
          on a per-host basis.

          Examples:
            --host-concurrency 2   # Conservative for slow APIs
            --host-concurrency 20  # Aggressive for fast APIs

      --host-request-interval <HOST_REQUEST_INTERVAL>
          Minimum interval between requests to the same host (default: 50ms)

          Sets a baseline delay between consecutive requests to prevent
          overloading servers. The adaptive algorithm may increase this based
          on server responses (rate limits, errors). Use the `hosts` option
          to configure this on a per-host basis.

          Examples:
            --host-request-interval 50ms   # Fast for robust APIs
            --host-request-interval 1s     # Conservative for rate-limited APIs

      --host-stats[=<false|true>]
          Show per-host statistics at the end of the run

  -i, --insecure[=<false|true>]
          Proceed for server connections considered insecure (invalid TLS)

      --include <INCLUDE>
          URLs to check (supports regex). Has preference over all excludes

      --include-fragments[=<none|anchor-only|text-only|full>]
          Enable the checking of fragments in links.

          Use `none` to disable fragment checks, `anchor-only` for anchor fragments
          like `#section`, `text-only` for text fragments like `#:~:text=example`,
          or `full` to check both.

          If provided without a value, defaults to `anchor-only`.

      --include-mail[=<false|true>]
          Also check email addresses

      --include-verbatim[=<false|true>]
          Find links in verbatim sections like `pre`- and `code` blocks

      --include-wikilinks[=<false|true>]
          Check WikiLinks in Markdown files, this requires specifying --base-url

      --index-files <INDEX_FILES>
          When checking locally, resolves directory links to a separate index file.
          The argument is a comma-separated list of index file names to search for. Index
          names are relative to the link's directory and attempted in the order given.

          If `--index-files` is specified, then at least one index file must exist in
          order for a directory link to be considered valid. Additionally, the special
          name `.` can be used in the list to refer to the directory itself.

          If unspecified (the default behavior), index files are disabled and directory
          links are considered valid as long as the directory exists on disk.

          Example 1: `--index-files index.html,readme.md` looks for index.html or readme.md
                     and requires that at least one exists.

          Example 2: `--index-files index.html,.` will use index.html if it exists, but
                     still accept the directory link regardless.

          Example 3: `--index-files ''` will reject all directory links because there are
                     no valid index files. This will require every link to explicitly name
                     a file.

          Note: This option only takes effect on `file://` URIs which exist and point to a directory.

  -m, --max-redirects <MAX_REDIRECTS>
          Maximum number of allowed redirects

          [default: 10]

      --max-cache-age <MAX_CACHE_AGE>
          Discard all cached requests older than this duration

          [default: 1d]

      --max-concurrency <MAX_CONCURRENCY>
          Maximum number of concurrent network requests

          [default: 128]

      --max-retries <MAX_RETRIES>
          Maximum number of retries per request

          [default: 3]

      --min-tls <MIN_TLS>
          Minimum accepted TLS Version

          [possible values: TLSv1_0, TLSv1_1, TLSv1_2, TLSv1_3]

      --mode <MODE>
          Set the output display mode. Determines how results are presented in the terminal

          [default: color]

          [possible values: plain, color, emoji, task]

  -n, --no-progress[=<false|true>]
          Do not show progress bar.
          This is recommended for non-interactive shells (e.g. for continuous integration)

      --no-ignore[=<false|true>]
          Do not skip files that would otherwise be ignored by '.gitignore', '.ignore', or the global ignore file

  -o, --output <OUTPUT>
          Output file of status report

      --offline[=<false|true>]
          Only check local files and block network requests

  -p, --preprocess <COMMAND>
          Preprocess input files with the given command.

          For each file input, this flag causes lychee to execute `COMMAND PATH` and process
          its standard output instead of the original contents of PATH. This allows you to
          convert files that would otherwise not be understood by lychee. The preprocessor
          COMMAND is only run on input files, not on standard input or URLs.

          To invoke programs with custom arguments or to use multiple preprocessors, use a
          wrapper program such as a shell script. An example script looks like this:

          ```
          #!/usr/bin/env bash
          case "$1" in
          *.pdf)
              exec pdftohtml -i -s -stdout "$1"
              ;;
          *.odt|*.docx|*.epub|*.ipynb)
              exec pandoc "$1" --to=html --wrap=none
              ;;
          *)
              # identity function, output input without changes
              exec cat
              ;;
          esac
          ```

  -q, --quiet...
          Less output per occurrence (e.g. `-q` or `-qq`)

  -r, --retry-wait-time <RETRY_WAIT_TIME>
          Minimum wait time in seconds between retries of failed requests

          [default: 1]

      --remap <REMAP>
          Remap URI matching pattern to different URI

      --require-https[=<false|true>]
          When HTTPS is available, treat HTTP links as errors

      --root-dir <ROOT_DIR>
          Root directory to use when checking absolute links in local files. This option is
          required if absolute links appear in local files, otherwise those links will be
          flagged as errors. This must be an absolute path (i.e., one beginning with `/`).

          If specified, absolute links in local files are resolved by prefixing the given
          root directory to the requested absolute link. For example, with a root-dir of
          `/root/dir`, a link to `/page.html` would be resolved to `/root/dir/page.html`.

          This option can be specified alongside `--base-url`. If both are given, an
          absolute link is resolved by constructing a URL from three parts: the domain
          name specified in `--base-url`, followed by the `--root-dir` directory path,
          followed by the absolute link's own path.

  -s, --scheme <SCHEME>
          Only test links with the given schemes (e.g. https).
          Omit to check links with any other scheme.
          At the moment, we support http, https, file, and mailto.

      --skip-missing[=<false|true>]
          Skip missing input files (default is to error if they don't exist)

      --suggest[=<false|true>]
          Suggest link replacements for broken links, using a web archive. The web archive can be specified with `--archive`

  -t, --timeout <TIMEOUT>
          Website timeout in seconds from connect to response finished

          [default: 20]

  -T, --threads <THREADS>
          Number of threads to utilize. Defaults to number of cores available to the system

  -u, --user-agent <USER_AGENT>
          User agent

          [default: lychee/x.y.z]

  -v, --verbose...
          Set verbosity level; more output per occurrence (e.g. `-v` or `-vv`)

  -V, --version
          Print version

  -X, --method <METHOD>
          Request method

          [default: get]