[Rivet] Hepdata numbering

Andy Buckley andy.buckley at cern.ch
Wed Feb 22 00:22:34 GMT 2017
Hi Graeme,

I think the multi-axis table change is exactly what Chris raised as an 
issue, wasn't it?

I agree that the mismatch of Rivet analysis numbering and HepData 
numbering is a big problem. I didn't realise this was widespread until 
recently, because *I* always made my HepData submissions and Rivet data 
with the same script, and made sure they were synchronised. I think to 
some extent it reflects that it was very hard to engineer the desired 
structure with the old HepData input format... hopefully that's much 
improved now.

On how to deal with it... yes, checking that the ref data is consistent 
with (or preferably identical to) the HepData output would be good, but 
it's not us who do that level of validation. At least, not right now, and 
we'd need manpower (already very short) to do it. Hmm.
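
As a rough illustration of the kind of consistency check meant here, the 
following is a minimal Python sketch that compares the dNN-xNN-yNN 
identifiers found in two YODA text dumps. It assumes each data object 
starts with a line of the form "BEGIN YODA_SCATTER2D /REF/<analysis>/dNN-xNN-yNN". 
The analysis name EXAMPLE_2017_I0000000, the function names and the sample 
inputs are all made up for the example. It compares path names only, not 
the data points themselves.

```python
import re

# Matches HepData-style identifiers such as d01-x02-y03.
HD_ID = re.compile(r"d\d+-x\d+-y\d+")


def hepdata_ids(yoda_text):
    """Collect the dNN-xNN-yNN identifiers found on BEGIN lines of YODA text."""
    ids = set()
    for line in yoda_text.splitlines():
        if line.startswith("BEGIN "):
            match = HD_ID.search(line)
            if match:
                ids.add(match.group(0))
    return ids


def compare_ids(ref_text, export_text):
    """Return (ids only in the ref data, ids only in the HepData export)."""
    ref = hepdata_ids(ref_text)
    exp = hepdata_ids(export_text)
    return sorted(ref - exp), sorted(exp - ref)


# Illustrative inputs only: a hypothetical analysis whose ref data uses a
# second x-axis ID, while the export uses a second table instead.
REF = """BEGIN YODA_SCATTER2D /REF/EXAMPLE_2017_I0000000/d01-x01-y01
END YODA_SCATTER2D
BEGIN YODA_SCATTER2D /REF/EXAMPLE_2017_I0000000/d01-x02-y01
END YODA_SCATTER2D
"""
EXPORT = """BEGIN YODA_SCATTER2D /REF/EXAMPLE_2017_I0000000/d01-x01-y01
END YODA_SCATTER2D
BEGIN YODA_SCATTER2D /REF/EXAMPLE_2017_I0000000/d02-x01-y01
END YODA_SCATTER2D
"""

only_ref, only_export = compare_ids(REF, EXPORT)
print("only in ref data:", only_ref)
print("only in export:  ", only_export)
```

Any identifier reported on either side would flag an analysis whose ref 
data needs a closer look before syncing against the export.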

As far as mappings go, I don't think we should provide machinery. If 
users prefer to use C++ constructs that don't exactly line up with the 
HepData numbers, they'll have to do the mapping themselves. Usually the 
data record should be in a fairly sensible, loop-friendly order, though, 
and no "complex" mapping should be needed. And in smaller analyses 
there's no need for loops and arrays, just book a dedicated variable for 
each histogram and call it whatever you like.
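
To make the "loop-friendly" point concrete, here is a small Python sketch 
of the naming logic only. It is not Rivet code: hd_path is a hypothetical 
helper that builds the same string that bookHisto1D(1,2,3) stands for, 
and the histogram labels in the dictionary are invented for the example.

```python
def hd_path(d, x, y):
    """Build a HepData-style path name from dataset, x-axis and y-axis IDs."""
    return "d{:02d}-x{:02d}-y{:02d}".format(d, x, y)


# Numeric IDs are easy to loop over: two tables, each with one independent
# variable and two dependent variables.
paths = [hd_path(d, 1, y) for d in (1, 2) for y in (1, 2)]

# A user-side mapping from descriptive names (hypothetical) to HepData
# names, for analyses that prefer not to use the numbers directly.
HIST_NAMES = {
    "leading_jet_pt": hd_path(1, 1, 1),
    "leading_jet_eta": hd_path(1, 1, 2),
}

print(paths)
print(HIST_NAMES["leading_jet_pt"])
```

With a mapping like this kept inside the analysis, the names written to 
the output still line up with the HepData record.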

Cheers,
Andy



On 22/02/17 00:12, WATT, GRAEME wrote:
> Dear Andy,
>
> The change to the YODA export of multidimensional tables from the new
> HEPData site is a secondary issue, which has no practical consequences
> for existing Rivet analyses as far as I know.  The wider issue is the
> inconsistency of path names in YODA files included in Rivet analysis
> (where people often choose their own ‘x’ and ‘y’ values) and YODA files
> exported from the old HepData site (where the ‘x’ and ‘y’ values are the
> number of the axes within a table, usually 01).  I see from the Rivet
> mailing list that Holger made a study showing these inconsistencies last
> October:
>  https://www.hepforge.org/lists-archive/rivet/2016-October/007318.html
>
> I think it should be part of the validation procedure for a new Rivet
> analysis that the YODA file matches the HepData/HEPData export.  I would
> like to keep the automatic path names on the HEPData side, so that
> custom path names would need to be handled on the Rivet side.  Allowing
> an override in the HEPData input file for new records would not fix the
> inconsistencies with existing records.  These are not new issues and
> I’ve raised them with you (and Holger) at various points in the last few
> years, e.g. from an email to you in 2014:
>
> ---
>
> On 9 Jun 2014, at 11:09, Graeme Watt <Graeme.Watt at durham.ac.uk> wrote:
>
> However, I'm a bit uncomfortable about writing a path that doesn't
> correspond to the internal HepData IDs.  Could this be better handled on
> the Rivet side?  For example, if a mapping between the HepData histogram
> names and the Rivet histogram names was specified in the Rivet analysis,
> then calling _hist1 = bookHisto1D(toHepDataIndices("d03-x01-y01")) would
> be equivalent to _hist1 = bookHisto1D("d02-x01-y01"), where
> "toHepDataIndices" is a function giving the mapping.  Here, I'm looking
> at slide 35 of your Rivet tutorial given at CERN on 21st November 2013.
> Of course, such a mapping could just be left to the user and that might
> be the best solution.
>
> —
>
> Best regards,
> Graeme
>
>
>> On 21 Feb 2017, at 20:30, Andy Buckley <andy.buckley at cern.ch> wrote:
>>
>> On 16/02/17 12:18, Graeme Watt wrote:
>>> Dear All,
>>>
>>> This was a conscious decision to improve the YODA export of
>>> multidimensional tables, so that we now write the appropriate YODA
>>> object for the number of independent variables, rather than always a
>>> Scatter2D object:
>>>
>>> https://github.com/HEPData/hepdata-converter/issues/5#issuecomment-135375309
>>>
>>>
>>> I checked with Andy that he agreed with this decision (in an email sent
>>> on 27th August 2015).
>>
>> Aha. Yes, seemed like a good idea... but an option to get the
>> backward-compatible format as well would help a lot, in this migration
>> phase. I don't know how set-up we are to use 1D and 3D scatters at the
>> moment.
>>
>>> Most (or even all?) existing HepData tables exported as YODA for use in
>>> Rivet analyses will only have one independent variable and one dependent
>>> variable (x01-y01).
>>
>> Definitely not all! And in some places I think they are being used for
>> reasons other than encoding 2D histograms... Chris? I think a wider
>> discussion is needed.
>>
>> More generally, I think this flags up that the dataset/axis naming in
>> HepData was always a bit of a hack. Maybe the input format could now
>> let the experiments specify their own names? I am certainly not welded
>> to the d,x,y format that I cooked up one afternoon many years ago...
>>
>>> I suspect that Rivet analyses containing path names
>>> with something different were prepared independently from the
>>> corresponding HepData record and so the path names don't match anyway
>>> (even on the old HepData site).
>>
>> This is exactly what we're trying to avoid: we don't want there to be
>> *any* such analyses.
>>
>>> Please let me know if you're aware of
>>> any existing Rivet analyses with path names containing something
>>> different than "x01" that correspond to a HepData table with more than
>>> one independent variable.  These are the only cases that would be
>>> affected by the change, and I'm not aware of any so far (and neither was
>>> Andy when I asked him back in 2015).
>>
>> Ah, 2015: that's why I don't remember! It's actually a bit difficult
>> to work it out from the code, but there are a lot of x02 etc. in our
>> ref data folder -- 848 of them, to be precise (cf. yodals *.yoda |
>> grep 'x0[2345]' | wc -l)
>>
>> But maybe we are not using those particular histograms...
>> Chris/Holger, could you take a look at the Rivet MC output files from
>> the pre-release testing using the command above, to see if any of our
>> *output* uses second, third, etc. x-axis IDs?
>>
>>> It's been requested in the past to allow an option in the HepData input
>>> file to allow some override of the automatic path names, but I think it
>>> would be better to allow some kind of "mapping" to be coded within the
>>> Rivet analysis between the HepData histogram names and the Rivet
>>> histogram names, for cases where they don't match.  Andy made some
>>> comments on this last week:
>>> https://www.hepforge.org/lists-archive/rivet/2017-February/007602.html
>>
>> Well the "mapping" here *is* the HepData names. We just have functions
>> like bookHisto1D(1,2,3) as syntactic sugar for
>> bookHisto1D("d01-x02-y03"). The numerical names have the benefit of
>> being easy to loop over, too -- but loopable numeric components in
>> more "custom" names would also be very workable, IMHO.
>>
>> Thanks,
>> Andy
>>
>>
>>> On 16/02/17 11:15, David Grellscheid wrote:
>>>> Hi Graeme,
>>>>
>>>> do I understand correctly that the new Hepdata engine has changed the
>>>> numbering on existing archived datasets and not just the new ones
>>>> coming in?
>>>>
>>>> Thanks,
>>>>
>>>>  David
>>>>
>>>>
>>>>
>>>>
>>>> On 16/02/2017 10:59, Christian Gutschow wrote:
>>>>> Dear Graeme,
>>>>>
>>>>> thanks for your quick reply. Yes exactly, four 1-D histograms with
>>>>> varying x and y path fields is what I’m after. I suppose I can work
>>>>> around that for this analysis, but as ATLAS Rivet contact I can
>>>>> already see this seemingly minor feature cause an awful lot of
>>>>> frustration elsewhere.
>>>>>
>>>>> There are already many Rivet analyses relying on path names
>>>>> containing not just the "x01" that would need changing and I’m not
>>>>> sure this is a path we wanna head down to be honest. On the
>>>>> experiment side, we have elaborate MC validation frameworks that make
>>>>> use of Rivet analyses and their existing path names, so I’m worried
>>>>> that changes to the now well-established naming scheme will cause
>>>>> havoc all over the place…
>>>>>
>>>>> In fact, what will happen when (not if) we sync the Rivet reference
>>>>> data repository against HEPData? I’m fairly sure this will break
>>>>> everything and we’ll receive lots of abuse...
>>>>>
>>>>> I’m cc’ing the Rivet list as I think the Rivet developers (and users)
>>>>> need to be aware of this and this needs to be discussed. I wonder
>>>>> whether a compromise is feasible where we just add an additional
>>>>> 'qualifier' field to the YAML input, that would allow us to set the
>>>>> d01-x01-y01 style YODA names?
>>>>>
>>>>> Cheers,
>>>>> Chris
>>>>>
>>>>>
>>>>> On 16 Feb 2017, at 10:17, Graeme Watt
>>>>> <Graeme.Watt at durham.ac.uk> wrote:
>>>>>
>>>>> Dear Chris,
>>>>>
>>>>> Good question.  On the old HepData site, a data table could be
>>>>> defined as "*data: x : x : y : y" (in the oldhepdata format,
>>>>> corresponding to two independent variables and two dependent
>>>>> variables), giving the YODA path names you mention, i.e. four 1-D
>>>>> histograms: y1(x1), y2(x1), y1(x2), y2(x2), written as Scatter2D
>>>>> objects.  But an alternative interpretation of such a table might be
>>>>> two 2-D histograms, y1(x1,x2) and y2(x1,x2).  To remove the
>>>>> ambiguity, we use only the latter interpretation on the new HEPData
>>>>> site, writing two Scatter3D objects with path names d01-x01-y01 and
>>>>> d01-x01-y02.
>>>>>
>>>>> Since I think you want four 1-D histograms rather than two 2-D
>>>>> histograms, you need to define two different tables, each with one
>>>>> independent variable and two dependent variables, giving YODA path
>>>>> names:
>>>>> d01-x01-y01
>>>>> d01-x01-y02
>>>>> d02-x01-y01
>>>>> d02-x01-y02
>>>>> So the new HEPData site will always define YODA path names with "x01"
>>>>> and it is not possible to get the path names mentioned in your
>>>>> email.  I hope this is not a problem for you, but you might need to
>>>>> modify the path names in an existing Rivet analysis.
>>>>>
>>>>> Best regards,
>>>>> Graeme
>>>>>
>>>>>
>>>>> On 15/02/17 23:34, Christian Gutschow wrote:
>>>>> Hi,
>>>>>
>>>>> I’m trying to work out how to write a YAML input file that will be
>>>>> interpreted as a table with two different independent variables, each
>>>>> with two dependent variables. In YODA language the idea would be
>>>>> something like:
>>>>>
>>>>> d01-x01-y01
>>>>> d01-x01-y02
>>>>> d01-x02-y01
>>>>> d01-x02-y02
>>>>>
>>>>> I’d already be happy if perhaps you could point me to an example
>>>>> entry where this has been achieved, so I can take a look at the
>>>>> corresponding YAML file.
>>>>>
>>>>> Many thanks in advance!
>>>>>
>>>>> Cheers,
>>>>> Chris
>>>>>
>>>>>
>>>>>  —
>>>>>
>>>>>  Dr. Christian Gütschow
>>>>>
>>>>>  Department of Physics and Astronomy
>>>>>  University College London
>>>>>  Gower Street
>>>>>  London WC1E 6BT
>>>>>
>>>>>  > D10 Physics Building
>>>>>  > +44 (0)20 7679 3775
>>>>>  > chris.g at cern.ch
>>>>>
>>>>> _______________________________________________
>>>>> Rivet mailing list
>>>>> Rivet at projects.hepforge.org
>>>>> https://www.hepforge.org/lists/listinfo/rivet
>>>>>
>>>
>>
>>
>> --
>> Dr Andy Buckley, Lecturer / Royal Society University Research Fellow
>> Particle Physics Expt Group, University of Glasgow
>


-- 
Dr Andy Buckley, Lecturer / Royal Society University Research Fellow
Particle Physics Expt Group, University of Glasgow