[Rivet] aidamerge

Hannes Jung hannes.jung at cern.ch
Sun Aug 21 13:48:30 BST 2011


Hi Andy

sure it comes from my code.

I divide 2 histograms in the code, and there could be of course a zero in the denominator.

But the histo package should take care of this... or do you really mean I should check every bin to make sure that there is an entry before calling the histo divide option via:
histogramFactory().divide(histoPath("jetptl_ratio"), *_hist_jetptl_central_ratio, *_hist_jetptl_central_only_ratio);
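A per-bin check of the kind asked about above could look roughly like the following. This is only a sketch in Python on plain lists of bin contents, not the Rivet or AIDA API; the function name `divide_bins` and the choice of `None` as the marker for an undefined ratio are assumptions made for illustration.

```python
def divide_bins(numerator, denominator):
    """Bin-by-bin ratio; bins with an empty denominator become None.

    Hypothetical helper working on plain lists of bin contents.
    """
    ratio = []
    for num, den in zip(numerator, denominator):
        # Guard against 0/0 (nan) and x/0 (inf) before dividing.
        ratio.append(num / den if den != 0.0 else None)
    return ratio


ratio = divide_bins([4.0, 0.0, 3.0], [2.0, 0.0, 0.0])
print(ratio)
```

Bins marked `None` can then be skipped or written out with whatever placeholder the downstream tools accept.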

I think it is not a merging problem... the merging routine gave an error because there were "nan" and "inf" values.
I realised this just now; maybe the discussion on merging was misleading, as it seems to be a problem of the histo package...

Hm... I am really a bit confused now about what to do...

cheers
hannes




On 21.08.2011, at 14:28, Andy Buckley wrote:

Hi Hannes,

If you are getting nans and infs out of the analysis code, then there is a problem in that code or in the way it's being run. I'm not really sure how a merging script *should* behave if the input has this problem!

Do these happen in all your files or just one? And does it affect all the bins that you're plotting or just a couple? If most bins are ok in most of your merging files then you could explicitly average over the entries from the valid files only, but that will require some coding. If it's much more widespread, then I think it needs to be fixed in the analysis code or by increasing the size of the runs. Alternatively, you could write the analysis to output the numerator and denominator histograms, and modify the merging script to do the merging and then calculate the ratio.
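The last alternative (merge numerators and denominators first, then take the ratio) might be sketched as below. It assumes each run has already been read into plain Python lists of bin contents; the name `merge_then_divide` and the use of `None` for bins that stay empty after merging are hypothetical.

```python
def merge_then_divide(numerators, denominators):
    """Sum numerator and denominator bins over all runs, then divide.

    numerators, denominators: one list of bin contents per run.
    Bins whose merged denominator is still zero are returned as None.
    """
    n_bins = len(numerators[0])
    total_num = [sum(run[b] for run in numerators) for b in range(n_bins)]
    total_den = [sum(run[b] for run in denominators) for b in range(n_bins)]
    return [n / d if d != 0.0 else None for n, d in zip(total_num, total_den)]


# Two runs, two bins. In the first run the second bin would be 0/0,
# but after merging the denominator is populated.
runs_num = [[1.0, 0.0], [3.0, 0.0]]
runs_den = [[2.0, 0.0], [6.0, 4.0]]
merged_ratio = merge_then_divide(runs_num, runs_den)
print(merged_ratio)
```

The point of the ordering is that a bin which is empty in one short run no longer produces a nan, as long as some other run has filled its denominator.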

Unfortunately it's easy to do that for a single analysis that you write yourself, but a lot harder to do the equivalent thing for all analyses, with completely general finalize functions and without the typical user having to know all the analysis details...

Andy


On 21/08/11 12:08, Hannes Jung wrote:
Dear Andy

thanks for your mail.
I understand that this all takes time for development and testing.
I am happy to help in debugging as much as I can (although this might be
limited). I am just lost how to correct it .... this is why I am asking
for help...

The problem seems to appear in lines like:
<dataPoint>
<measurement errorPlus="5.000000e+00" value="5.000000e+00" errorMinus="5.000000e+00"/>
<measurement errorPlus="nan" value="nan" errorMinus="nan"/>
</dataPoint>
<dataPoint>
<measurement errorPlus="5.000000e+00" value="1.500000e+01" errorMinus="5.000000e+00"/>
<measurement errorPlus="inf" value="inf" errorMinus="inf"/>
</dataPoint>

The errorPlus and errorMinus values I can correct with the method which
Daniel proposed, but I cannot get it working for value="inf", nor
for those with "nan"... I am just not used to Python...
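One way to at least locate such points in Python is sketched below. It uses only the standard library; the enclosing `dataPointSet` element and the function name `find_bad_points` are assumptions for illustration, and the sample simply repeats the points quoted above.

```python
import math
import xml.etree.ElementTree as ET

# Sample data: the two points quoted above, wrapped in an assumed root element.
SAMPLE = """<dataPointSet>
  <dataPoint>
    <measurement errorPlus="5.000000e+00" value="5.000000e+00" errorMinus="5.000000e+00"/>
    <measurement errorPlus="nan" value="nan" errorMinus="nan"/>
  </dataPoint>
  <dataPoint>
    <measurement errorPlus="5.000000e+00" value="1.500000e+01" errorMinus="5.000000e+00"/>
    <measurement errorPlus="inf" value="inf" errorMinus="inf"/>
  </dataPoint>
</dataPointSet>"""


def find_bad_points(xml_text):
    """Return indices of dataPoints with any non-finite measurement attribute."""
    root = ET.fromstring(xml_text)
    bad = []
    for i, point in enumerate(root.iter("dataPoint")):
        for m in point.iter("measurement"):
            numbers = [float(m.get(k)) for k in ("value", "errorPlus", "errorMinus")]
            # float("nan") and float("inf") parse fine; isfinite rejects both.
            if not all(math.isfinite(x) for x in numbers):
                bad.append(i)
                break
    return bad


bad_points = find_bad_points(SAMPLE)
print(bad_points)
```

The same `math.isfinite` test applies to `value` just as it does to the errors, so values and errors can be handled in one place.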

thanks again
cheers
hannes


On 21.08.2011, at 12:04, Andy Buckley wrote:

Hi Hannes,

We understand that being able to merge histograms is an important
feature. At the moment, however, you can't do it properly and
generally, because our data storage does not support it. This is an
historical accident rather than a design intent, but we only have a
small developer team, and despite serious intentions to fix this for
several *years* there has been much more demand for new analyses and
new analysis functionality than for updated histogram persistency. We
can only do so much at once, and until recently all use-cases could
make do with simply using long runs.

I'm pleased to say that the replacement which will support all this
and more (there are *many* things that we don't like about AIDA) is
very well underway, but for now you have to make sure that the example
approximate-merging script is doing what you need it to do. It's not
ideal and we're working to improve on that situation, but it takes
time. This is a physics package, not Microsoft Word -- physics users
need to be prepared to open the code a little bit and do some
debugging or hacking... especially when that code was only intended as
an example.

In your problem case I suspect that the merging algorithm *is*
appropriate, but that the numbers being entered are extremely large...
so large that they are overflowing the Python (double precision) float
type! So this anyway needs more debugging from your side than just
sending us the crash traceback... a *minimal* set of AIDA files that
reproduce the problem would help, or alternatively just inserting a
print statement or two before the offending line.
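As an illustration of the kind of print-statement debugging meant here, the sketch below flags bin errors that would overflow a double when squared, or that are already nan or inf. It works on a plain list of errors; the function name `report_suspicious_errors` is hypothetical and this is not code from aidamerge.

```python
import math
import sys

# Any finite error larger than roughly sqrt(largest double) overflows when squared.
LIMIT = math.sqrt(sys.float_info.max)


def report_suspicious_errors(errors):
    """Print and return (bin index, error) pairs that would break err**2."""
    # "not (abs(e) <= LIMIT)" is also true for nan, which fails every comparison.
    bad = [(i, e) for i, e in enumerate(errors) if not (abs(e) <= LIMIT)]
    for i, e in bad:
        print("bin %d has suspicious error %r" % (i, e))
    return bad


bad = report_suspicious_errors([0.5, 1e308, float("nan"), 3.0])
```

Running something like this on each input file just before the summing line should show which files and bins carry the problematic numbers.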

In short, we need more information to help in this case, and for the
general solution we're working on it. This will still take a while
because a lot of migration and testing needs to happen. Telling us
that it's not good enough won't make that happen faster, I'm afraid --
some development help in the last two years would have done, but it's
even too late for that now. We'll at some point have a beta release
with this feature, so I hope you'll be able to give us some feedback.

Best wishes,
Andy

PS. You mentioned being able to make ratios in the plotting phase...
well, you can, but again it involves writing code. And I don't see
that changing: we can't make a script with a "magically do what I
want" option flag! Allowing merging *before* the finalise step is
something we would like to do, but which requires a *lot* of design
and planning. If you have any bright ideas about how this can be done
nicely (i.e. uniformly for all analyses, and in the
user-rather-than-developer mode) then please get in touch :)


On 21/08/11 08:15, Hannes Jung wrote:
Dear Andy et al.

thanks a lot for your mail and your explanations.

I understand that aidamerge is not an official script.
However, in a usable histogramming package one must be able to add
histograms at the end, and possible errors must be treated; otherwise the
package is not really usable for users... it might be fine for
developers.
In Root as well as in the old hbook/paw package we had options which
treated adding histograms properly, of course one has to be careful to
add the proper things, and adding ratios might be tricky.... but then
there must be an option to do the ratio while plotting.

There is of course an issue, whether whatever is added does make sense.
But the package should work, or give an error message and treat that
somehow, but not just crashing....

The Aida package is nice and I like very much the rivet-mkhtml script
which make life much easier.... but I do not want to develop the
histogramming package, I just want to use it and to be sure that it does
what it is supposed to do....

Don't get me wrong, I appreciate very much the help and support I
get and got in the past solving problems with the histogramming
package... but... I just want to use it.....

Thanks a lot for your support

Cheers

Hannes

On 20.08.2011, at 19:30, Andy Buckley wrote:

Hi Hannes and all,

Please note that the aidamerge script is not an official Rivet
script: it's an example of how you might write a script to do some
*approximate* statistical merging of independent runs.

Because we don't currently store enough information to do the merging
exactly, this script has to make some guesses, and that's why we don't
officially support it. So you should make sure that it's doing merging
appropriate for your data -- using it blindly *will* lead to errors.

So have a look in the code. From my own glances inside it, the
approximate merging algorithm used assumes that the samples you are
merging are of the same size (you could add scale factors to a local copy
if you need them), and that you are either merging normalised
histograms or profile histograms -- if your data is of a different
type, most notably un-normalised histograms, then the assumed error
scaling will be incorrect. You mentioned ratios, Hannes: I *think* the
scaling is probably correct, i.e. more data makes the values converge
to the (weighted) mean of the runs and the errors get smaller as
1/sqrt(N)... but it depends on exactly what you're doing.
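The equal-size assumption described above can be written down as a small sketch: the merged value is the plain mean over runs and the merged error is sqrt(sum of err^2) / N, which gives the 1/sqrt(N) scaling for runs with similar errors. This is an illustration on plain lists under that assumption, not the actual aidamerge code, and the name `merge_equal_runs` is hypothetical.

```python
import math


def merge_equal_runs(values_per_run, errors_per_run):
    """Approximate merge of N equally sized runs, bin by bin.

    values_per_run, errors_per_run: one list of bin values/errors per run.
    """
    n_runs = len(values_per_run)
    n_bins = len(values_per_run[0])
    merged_vals, merged_errs = [], []
    for b in range(n_bins):
        mean = sum(run[b] for run in values_per_run) / n_runs
        sum_err2 = sum(run[b] ** 2 for run in errors_per_run)
        merged_vals.append(mean)
        # Error on the mean of N independent runs.
        merged_errs.append(math.sqrt(sum_err2) / n_runs)
    return merged_vals, merged_errs


# Four runs with identical errors: the merged error is halved (1/sqrt(4)).
vals, errs = merge_equal_runs([[1.0], [2.0], [3.0], [4.0]],
                              [[0.2], [0.2], [0.2], [0.2]])
print(vals, errs)
```

For un-normalised histograms the values would need to be summed rather than averaged, which is why the error scaling above would then be wrong.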

Andy


On 20/08/11 13:50, Hannes Jung wrote:
Hi Daniel

hm... it seems the messages below come from somewhere else....
it didn't change even when setting the error to 1 instead of 1E308....

Does anyone know how to fix this?

Cheers
Hannes

On 20.08.2011, at 14:32, Daniel Weyh wrote:

OK... I don't know at all what the plotting scripts themselves do.
Probably using another float, 1e+20 or something, instead of 1e308 will
not cause an overflow...

But, I'm afk at the moment... Sry

On 20.08.2011, at 14:20, Hannes Jung <hannes.jung at cern.ch> wrote:

Hi Daniel again

maybe I was too fast... aidamerge did work, but when plotting
the result I get the following errors:

Plotting
cascade-uPDFs/FWDCENTPHENO/Delta_phi_Delta_eta_eq_10_Et_gt_10_GeV.dat
(33 remaining)
Plotting
cascade-uPDFs/FWDCENTPHENO/Delta_phi_Delta_eta_eq_10_Et_gt_30_GeV.dat
(32 remaining)
Error: cannot convert float NaN to integer
Error: cannot convert float NaN to integer
Plotting
cascade-uPDFs/FWDCENTPHENO/Delta_phi_Delta_eta_eq_2_Et_gt_10_GeV.dat
(29 remaining)
Error: cannot convert float infinity to integer

and then rivet-mkhtml gets stuck...
Hm....

thanks a lot
Cheers
Hannes

On 20.08.2011, at 13:53, Daniel Weyh wrote:

Sry, I didn't know where you got your copy from.
It is uploaded to SVN (r3300).
Please check this out or look at
<http://projects.hepforge.org/rivet/trac/browser/contrib/aidamerge>

Hope it helps,
Daniel


On 20.08.2011, at 13:35, Hannes Jung <hannes.jung at cern.ch> wrote:

Hi Daniel

thanks a lot..... I guess this should work.... I just don't know
where to change what...
could you perhaps tell me a bit more what to change in which line,
or perhaps upload the patched version somewhere ?

Thanks very much
cheers
hannes

On 20.08.2011, at 13:27, Daniel Weyh wrote:

Dear Hannes,

Dear Riveties

adding several aida files works fine, only in some cases I get
the error:

Traceback (most recent call last):
File "./aidamerge", line 65, in <module>
sum_err2 += h.getBin(i).getErr()**2
OverflowError: (34, 'Numerical result out of range')

I guess it comes when a histo is not properly filled (for example
when a ratio is taken).
Is there a way to prevent this error message, and to continue
with the program?

I added a patch to catch the exception, use float('inf') during
summing up and in the write out step this 'inf' is converted to a
vee..eery large float.
Does this work for you?
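In outline, the approach described above might look like the following sketch. It is not the patch committed to SVN; the helper names `safe_square` and `to_writable` and the cap `LARGE` are assumptions chosen for illustration.

```python
import math

LARGE = 1e308  # assumed stand-in for the "very large float" used on write-out


def safe_square(x):
    """Square x, falling back to inf if the result overflows a double."""
    try:
        return x ** 2
    except OverflowError:
        return float("inf")


def to_writable(x, large=LARGE):
    """Replace +/-inf by a large finite float before writing the file."""
    if math.isinf(x):
        return math.copysign(large, x)
    return x


sum_err2 = 0.0
for err in [0.5, 1e200, 2.0]:
    sum_err2 += safe_square(err)  # the 1e200 entry contributes inf
merged_err = to_writable(math.sqrt(sum_err2))
print(merged_err)
```

Note that this only keeps the merge running: the affected bins still carry a meaningless error, and nan values are not touched at all.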

@others: Is this the way it should work - or should we somehow
exclude such bins?!?

Cheers,
Daniel



***********************************************************************
Hannes Jung
Email: Hannes.Jung at cern.ch
mobile: +49 40 8998 93741
http://www.desy.de/~jung

Tel: +49 (0) 40 8998 3741 (DESY)
Tel: +41 22 76 62602 (CERN)
CERN - PH
42-2-033
CH-1211 Genève 23
Switzerland
***********************************************************************












_______________________________________________
Rivet mailing list
Rivet at projects.hepforge.org
http://www.hepforge.org/lists/listinfo/rivet


--
Dr Andy Buckley
SUPA Advanced Research Fellow
Particle Physics Experiment Group, University of Edinburgh

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.






