[Rivet] Hepdata numbering

Andy Buckley andy.buckley at cern.ch
Tue Feb 21 20:30:24 GMT 2017


On 16/02/17 12:18, Graeme Watt wrote:
> Dear All,
>
> This was a conscious decision to improve the YODA export of
> multidimensional tables, so that we now write the appropriate YODA
> object for the number of independent variables, rather than always a
> Scatter2D object:
>
> https://github.com/HEPData/hepdata-converter/issues/5#issuecomment-135375309
>
>
> I checked with Andy that he agreed with this decision (in an email sent
> on 27th August 2015).

Aha. Yes, seemed like a good idea... but an option to get the 
backward-compatible format as well would help a lot, in this migration 
phase. I don't know how well set up we are to use 1D and 3D scatters at the 
moment.

> Most (or even all?) existing HepData tables exported as YODA for use in
> Rivet analyses will only have one independent variable and one dependent
> variable (x01-y01).

Definitely not all! And in some places I think they are being used for 
reasons other than encoding 2D histograms... Chris? I think a wider 
discussion is needed.

More generally, I think this flags up that the dataset/axis naming in 
HepData was always a bit of a hack. Maybe the input format could now let 
the experiments specify their own names? I am certainly not welded to 
the d,x,y format that I cooked up one afternoon many years ago...

> I suspect that Rivet analyses containing path names
> with something different were prepared independently from the
> corresponding HepData record and so the path names don't match anyway
> (even on the old HepData site).

This is exactly what we're trying to avoid: we don't want there to be 
*any* such analyses.

> Please let me know if you're aware of
> any existing Rivet analyses with path names containing something
> different than "x01" that correspond to a HepData table with more than
> one independent variable.  These are the only cases that would be
> affected by the change, and I'm not aware of any so far (and neither was
> Andy when I asked him back in 2015).

Ah, 2015: that's why I don't remember! It's actually a bit difficult to 
work it out from the code, but there are a lot of x02 etc. in our ref 
data folder -- 848 of them, to be precise (cf.
yodals *.yoda | grep x0[2345] | wc -l)
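
For anyone who prefers to run the same check from a script, here is a minimal Python sketch of the idea. It assumes you already have the histogram paths as a list of strings (for example, captured from the yodals output). The function name and the example paths are made up for illustration and are not part of Rivet or YODA. Unlike the grep above, it counts any x index greater than 1, not only 02 to 05.

```python
import re

# Matches the dNN-xNN-yNN component of a path and captures the three indices.
_AXIS_CODE = re.compile(r"d(\d+)-x(\d+)-y(\d+)")

def count_nonfirst_x(paths):
    """Count paths whose x-axis index is greater than 1 (hypothetical helper)."""
    n = 0
    for p in paths:
        m = _AXIS_CODE.search(p)
        if m and int(m.group(2)) > 1:
            n += 1
    return n

# Hypothetical example paths, not taken from any real reference data file.
example = [
    "/REF/SOME_ANALYSIS/d01-x01-y01",
    "/REF/SOME_ANALYSIS/d01-x02-y01",
    "/REF/SOME_ANALYSIS/d02-x03-y02",
]
print(count_nonfirst_x(example))
```

Run over the paths from the MC output files instead of the reference data, a non-zero count would indicate that some output histogram really does use a second or later x-axis ID.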

But maybe we are not using those particular histograms... Chris/Holger, 
could you take a look at the Rivet MC output files from the pre-release 
testing using the command above, to see if any of our *output* uses 
second, third, etc. x-axis IDs?

> It's been requested in the past to allow an option in the HepData input
> file to allow some override of the automatic path names, but I think it
> would be better to allow some kind of "mapping" to be coded within the
> Rivet analysis between the HepData histogram names and the Rivet
> histogram names, for cases where they don't match.  Andy made some
> comments on this last week:
> https://www.hepforge.org/lists-archive/rivet/2017-February/007602.html

Well, the "mapping" here *is* the HepData names. We just have functions 
like bookHisto1D(1,2,3) as syntactic sugar for 
bookHisto1D("d01-x02-y03"). The numerical names have the benefit of 
being easy to loop over, too -- but loopable numeric components in more 
"custom" names would also be very workable, IMHO.
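
To make the "syntactic sugar" point concrete, here is a small Python sketch of the numeric-to-name mapping. It only illustrates the naming convention. The helper name axis_code is hypothetical, and this is not the actual Rivet implementation, which is C++.

```python
def axis_code(d, x, y):
    """Build a dNN-xNN-yNN name from dataset, x-axis and y-axis IDs.

    Hypothetical helper: it mirrors the convention that
    bookHisto1D(1, 2, 3) stands for bookHisto1D("d01-x02-y03").
    """
    return f"d{d:02d}-x{x:02d}-y{y:02d}"

print(axis_code(1, 2, 3))

# The numeric components make it easy to loop over a family of histograms,
# e.g. two tables with one independent and two dependent variables each.
names = [axis_code(d, 1, y) for d in (1, 2) for y in (1, 2)]
print(names)
```

The loop reproduces the four names in Graeme's two-table suggestion (d01-x01-y01, d01-x01-y02, d02-x01-y01, d02-x01-y02). A more "custom" naming scheme could stay just as loopable provided it keeps numeric components like these.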

Thanks,
Andy


> On 16/02/17 11:15, David Grellscheid wrote:
>> Hi Graeme,
>>
>> do I understand correctly that the new Hepdata engine has changed the
>> numbering on existing archived datasets and not just the new ones
>> coming in?
>>
>> Thanks,
>>
>>   David
>>
>>
>>
>>
>> On 16/02/2017 10:59, Christian Gutschow wrote:
>>> Dear Graeme,
>>>
>>> thanks for your quick reply. Yes exactly, four 1-D histograms with
>>> varying x and y path fields is what I’m after. I suppose I can work
>>> around that for this analysis, but as ATLAS Rivet contact I can
>>> already see this seemingly minor feature cause an awful lot of
>>> frustration elsewhere.
>>>
>>> There are already many Rivet analyses relying on path names
>>> containing not just the "x01" that would need changing and I’m not
>>> sure this is a path we wanna head down to be honest. On the
>>> experiment side, we have elaborate MC validation frameworks that make
>>> use of Rivet analyses and their existing path names, so I’m worried
>>> that changes to the now well-established naming scheme will cause
>>> havoc all over the place…
>>>
>>> In fact, what will happen when (not if) we sync the Rivet reference
>>> data repository against HEPData? I’m fairly sure this will break
>>> everything and we’ll receive lots of abuse...
>>>
>>> I’m cc’ing the Rivet list as I think the Rivet developers (and users)
>>> need to be aware of this and this needs to be discussed. I wonder
>>> whether a compromise is feasible where we just add an additional
>>> 'qualifier' field to the YAML input, that would allow us to set the
>>> d01-x01-y01 style YODA names?
>>>
>>> Cheers,
>>> Chris
>>>
>>>
>>> On 16 Feb 2017, at 10:17, Graeme Watt
>>> <Graeme.Watt at durham.ac.uk> wrote:
>>>
>>> Dear Chris,
>>>
>>> Good question.  On the old HepData site, a data table could be
>>> defined as "*data: x : x : y : y" (in the oldhepdata format,
>>> corresponding to two independent variables and two dependent
>>> variables), giving the YODA path names you mention, i.e. four 1-D
>>> histograms: y1(x1), y2(x1), y1(x2), y2(x2), written as Scatter2D
>>> objects.  But an alternative interpretation of such a table might be
>>> two 2-D histograms, y1(x1,x2) and y2(x1,x2).  To remove the
>>> ambiguity, we use only the latter interpretation on the new HEPData
>>> site, writing two Scatter3D objects with path names d01-x01-y01 and
>>> d01-x01-y02.
>>>
>>> Since I think you want four 1-D histograms rather than two 2-D
>>> histograms, you need to define two different tables, each with one
>>> independent variable and two dependent variables, giving YODA path
>>> names:
>>> d01-x01-y01
>>> d01-x01-y02
>>> d02-x01-y01
>>> d02-x01-y02
>>> So the new HEPData site will always define YODA path names with "x01"
>>> and it is not possible to get the path names mentioned in your
>>> email.  I hope this is not a problem for you, but you might need to
>>> modify the path names in an existing Rivet analysis.
>>>
>>> Best regards,
>>> Graeme
>>>
>>>
>>> On 15/02/17 23:34, Christian Gutschow wrote:
>>> Hi,
>>>
>>> I’m trying to work out how to write a YAML input file that will be
>>> interpreted as a table with 2 different independent variables, each
>>> with two dependent variables. In YODA language the idea would be
>>> something like:
>>>
>>> d01-x01-y01
>>> d01-x01-y02
>>> d01-x02-y01
>>> d01-x02-y02
>>>
>>> I’d already be happy if perhaps you could point me to an example
>>> entry where this has been achieved, so I can take a look at the
>>> corresponding YAML file.
>>>
>>> Many thanks in advance!
>>>
>>> Cheers,
>>> Chris
>>>
>>>
>>>>>>
>>>   Dr. Christian Gütschow
>>>
>>>   Department of Physics and Astronomy
>>>   University College London
>>>   Gower Street
>>>   London WC1E 6BT
>>>
>>>   > D10 Physics Building
>>>   > +44 (0)20 7679 3775
>>>   > chris.g at cern.ch
>>>
>>> _______________________________________________
>>> Rivet mailing list
>>> Rivet at projects.hepforge.org
>>> https://www.hepforge.org/lists/listinfo/rivet
>>>


-- 
Dr Andy Buckley, Lecturer / Royal Society University Research Fellow
Particle Physics Expt Group, University of Glasgow

