[FFmpeg-devel] [PATCH] ffmpeg-web/robots.txt: attempt to keep spiders out of dynamically generated git content

ffmpegandmahanstreamer at lolcow.email
Wed Jul 14 23:00:53 EEST 2021


On 2021-07-14 14:51, Michael Niedermayer wrote:
> Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
> ---
>  htdocs/robots.txt | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/htdocs/robots.txt b/htdocs/robots.txt
> index eb05362..4bbc395 100644
> --- a/htdocs/robots.txt
> +++ b/htdocs/robots.txt
> @@ -1,2 +1,13 @@
>  User-agent: *
> -Disallow:
> +Crawl-delay: 10
> +Disallow: /gitweb/
> +Disallow: /*a=search*
> +Disallow: /*/search/*
> +Disallow: /*a=blobdiff*
> +Disallow: /*/blobdiff/*
> +Disallow: /*a=commitdiff*
> +Disallow: /*/commitdiff/*
> +Disallow: /*a=snapshot*
> +Disallow: /*/snapshot/*
> +Disallow: /*a=blame*
> +Disallow: /*/blame/*
LGTM based on my own personal experience. But the robots.txt has to be 
applied to git.ffmpeg.org as well, not just ffmpeg.org; otherwise the 
crawlers will just do the same thing on git.ffmpeg.org, since the two 
hosts are treated separately.
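
As a rough sanity check (not part of the patch), Python's stdlib
urllib.robotparser can be fed the proposed rules directly. One caveat:
that parser does plain prefix matching, so it only exercises the
/gitweb/ rule and the Crawl-delay; the "*" wildcard rules in the diff
are a de-facto extension honored by major crawlers (e.g. Googlebot),
not part of the original robots.txt convention, and a given bot may or
may not respect them.

```python
from urllib.robotparser import RobotFileParser

# Subset of the rules from the patch (prefix rule + crawl delay).
RULES = """\
User-agent: *
Crawl-delay: 10
Disallow: /gitweb/
Disallow: /*a=search*
"""

rp = RobotFileParser()
rp.parse(RULES.splitlines())

# The plain prefix rule blocks everything under /gitweb/ ...
print(rp.can_fetch("*", "https://ffmpeg.org/gitweb/ffmpeg.git"))   # False

# ... while ordinary site pages stay crawlable.
print(rp.can_fetch("*", "https://ffmpeg.org/download.html"))       # True

# The crawl delay is exposed per user-agent.
print(rp.crawl_delay("*"))                                         # 10
```

The same check run against git.ffmpeg.org's robots.txt would show
whether that host actually serves these rules, which is the point
raised above.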

