[Rivet] [Fastjet] FastJet "robot" downloads blocked

Frank Siegert frank.siegert at cern.ch
Fri Apr 19 14:01:02 BST 2013


Hi again,

I'm attaching a short script that demonstrates what our bootstrap
script is doing. I have just "successfully" tested it -- i.e. I got a
403 error from the server:

$ ./testfj.py
Traceback (most recent call last):
  File "./testfj.py", line 8, in <module>
    hreq = urllib2.urlopen(url)
  File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 406, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 519, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 444, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 527, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden
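
[Editor's note: the testfj.py attachment is scrubbed in the archive. The sketch below is a hypothetical reconstruction -- only the urllib2.urlopen call on line 8 is certain from the traceback -- and it also shows how a custom User-Agent header (the "rivet-bootstrap/1.0" name is a made-up example) could be set, as suggested later in the thread:]

```python
# Hypothetical reconstruction of testfj.py (attachment scrubbed in the
# archive); only the urllib2.urlopen call is certain from the traceback.
try:
    import urllib2  # Python 2, as in the traceback
except ImportError:
    import urllib.request as urllib2  # equivalent API in Python 3

URL = "http://fastjet.fr/repo/fastjet-3.0.3.tar.gz"

def fetch(url, user_agent=None):
    """Fetch url, optionally with an explicit User-Agent -- the kind of
    whitelisted agent name Andy suggests the server could accept instead
    of blocking the default "Python-urllib/x.y"."""
    headers = {"User-Agent": user_agent} if user_agent else {}
    req = urllib2.Request(url, headers=headers)
    return urllib2.urlopen(req)  # raises HTTPError 403 when blocked

# Build (but do not send) a request to inspect the header the server sees;
# the agent string here is hypothetical:
req = urllib2.Request(URL, headers={"User-Agent": "rivet-bootstrap/1.0"})
print(req.get_header("User-agent"))
```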

My IP address in this case was 131.169.104.214.

Hope that helps,
Frank


On 19 April 2013 14:13, Matteo Cacciari <cacciari at lpthe.jussieu.fr> wrote:
> Follow up: can you please give me the IP from which you are running your
> script, so that I can look into the logs?
>
> Matteo
>
>
> On 19/04/2013 14:03, Matteo Cacciari wrote:
>> Hi Andy.
>>
>> As far as I remember we are not doing any explicit blocking. I don't even
>> think we have a robots.txt file there.
>>
>> We'll look into it asap. Thank you for the mail.
>>
>> Matteo
>>
>>
>> On 19/04/2013 12:42, Andy Buckley wrote:
>>> Hi Gregory, Gavin, et al,
>>>
>>> We've noticed recently that the FastJet website blocks "robot" downloads
>>> of the FastJet tarball, e.g. http://fastjet.fr/repo/fastjet-3.0.3.tar.gz
>>>
>>> Unfortunately this means that the Rivet bootstrap script can fail if it
>>> tries to download and build FastJet, rather than using the LCG installed
>>> copy from AFS. We're using Python's urllib2 to do the fetching... is
>>> there anything we or you can do to not fall foul of this blocking? (I'm
>>> not sure if urllib2 automatically respects robots.txt files, but if you
>>> want to give us a special unblocked User-Agent name to use, I'm sure we
>>> can manage to update our script accordingly)
>>>
>>> Thanks!
>>> Andy & co
>>>
>>
>>
>>
>> _______________________________________________
>> Fastjet mailing list
>> Fastjet at projects.hepforge.org
>> http://www.hepforge.org/lists/listinfo/fastjet
>>
>
>
> _______________________________________________
> Rivet mailing list
> Rivet at projects.hepforge.org
> http://www.hepforge.org/lists/listinfo/rivet
-------------- next part --------------
A non-text attachment was scrubbed...
Name: testfj.py
Type: application/octet-stream
Size: 224 bytes
Desc: not available
URL: <http://www.hepforge.org/lists-archive/rivet/attachments/20130419/287edef4/attachment.obj>