[Rivet] Histogram normalisation
Andy Buckley andy.buckley at ed.ac.uk
Tue Oct 20 16:51:20 BST 2009

Frank Siegert wrote:
> Hi Jon,
>
> Thanks for the comments.
>
> Jonathan Butterworth, Tuesday 20 October 2009:
>> - I'd be very wary of the KFactor. I am particularly worried that
>> people don't start applying multiple scale factors and losing track of
>> what has been applied. I suggest that the default output is ALWAYS just
>> the "truth" (either xsec proportional or normalised by your Norm
>> factor if applicable) and any other scaling is done later, transiently,
>> with plotting tools. The KFactor could be stored and written out so it
>> can be applied by the plotting tools if desired (?)
>
> My proposal was meant to have truth output plus the KFactor=x.xx written
> out. But thinking about the complications and dangers of this approach, I
> agree: Let's drop the LO mode I proposed, and ...
>
>> - We were also discussing having a plotting tool which steps over
>> various scale factors for a combined run and works out the optimal
>> scale factor based on the Chi2 between data and MC. This could also
>> (optionally) apply the KFactors. Is that still in the plan?
>
> ... replace it with this more general and easier to implement solution of
> automatic KFactor finding.

Good: this was exactly what I was about to say. Sorry, the length of your
mail made me delay reading it, Frank!

One further thing, which I'm not sure counts in your "dropped" KFactor
proposal: I don't see how we can automate the finalize steps without
always getting ~half of the observables very wrong, so I think the
finalize methods always need to be written as part of the analysis, just
without any hard-coded cross-sections. This also makes sense for users
like Herwig++, who are accessing Rivet as a library and presumably want
whatever histograms are written out to be meaningful *before*
post-processing scripts are applied (with any post-processing of their own
presumably done via YODA's C++ interface).

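As a rough illustration of the "optimal scale factor based on the Chi2
between data and MC" idea quoted above, here is a minimal sketch. It
assumes that only the data errors enter the chi2 and that bins are
uncorrelated, so the minimum can be found analytically instead of by
stepping over trial factors. The function names are hypothetical and are
not part of Rivet or YODA.

```python
def optimal_kfactor(data, data_err, mc):
    """Return the k that minimises chi2(k) = sum(((d - k*m) / e)**2).

    Setting d(chi2)/dk = 0 gives k = sum(d*m/e^2) / sum(m^2/e^2).
    Assumption: MC statistical errors are neglected.
    """
    num = 0.0
    den = 0.0
    for d, e, m in zip(data, data_err, mc):
        if e <= 0.0:
            continue  # skip bins without a usable data error
        num += d * m / e**2
        den += m * m / e**2
    if den == 0.0:
        raise ValueError("no usable bins: cannot determine a scale factor")
    return num / den


def chi2(data, data_err, mc, k=1.0):
    """Chi2 between data and MC scaled by k, using data errors only."""
    return sum(((d - k * m) / e) ** 2
               for d, e, m in zip(data, data_err, mc) if e > 0.0)
```

In line with Jon's concern, the fitted value would be stored next to the
unscaled "truth" histogram and applied only transiently by the plotting
tool, never written into the histogram contents.
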
In the last week, we added HepMC cross-section filling to Pythia 6 via
AGILe, and to Pythia 8: as far as I'm concerned, the remaining generator
to cover is HERWIG via AGILe --- anyone know which common block variable
to dig in for HERWIG cross-sections? (and whether it's still safe when
JIMMY or AlpGen are used)

> In any case we just have to make sure, that histograms which already have
> a Norm=x.xx or Scale=x.xx are ignored (or does anybody have an analysis
> use case where a histogram is scaled with anything else than
> crossSection()/sumOfWeights() and _still_ should have a kfactor?).

None that spring to mind, but never say never...

> And for all others the automatically determined kfactor should somehow be
> plotted for each MC run in the histogram, maybe together with the legend,
> or above the top edge?

Sure. Well, that's a plotting detail, but we can make sure in the YODA
implementation that KFactor can be written as an annotation and used by
anyone who wants to.

In terms of this Norm and Scale stuff, the motivation is presumably the
run combination requirement? I.e. if we only ever wanted single runs to be
used, then we'd continue to do the scaling (and hence conversion to a
scatter-type data object) inside finalize, with the output containing no
moments, just points and errors. So the Norm and Scale are really just
details of how histogramming has to work if we want to be able to combine
multiple runs... of course, someone will eventually try to combine two
runs with different scaling targets, so we need to be careful about
failure modes!

>> - How do plotting tools know whether a histogram has a cross-section
>> (i.e. semi-floating) normalisation or is fixed? Is Norm=0 or some other
>> special value for the xsec type histograms? Or is Norm just not written
>> out?
>
> I would suggest Norm to not be written out in such a case.

If I understand the question, yes.
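
To make the run-combination point concrete, here is a minimal sketch of
what "keep the unscaled sums, scale late" could look like. Everything in
it is an assumption for illustration: the `RunHisto` class and its fields
are hypothetical (not the YODA data model), an absent Norm is taken to
mean cross-section normalisation, and the combined cross-section is
assumed to be the sum-of-weights-weighted average of statistically
equivalent runs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunHisto:
    """Unscaled per-run histogram (hypothetical structure)."""
    bin_sumw: List[float]         # sum of weights in each bin
    sum_of_weights: float         # total sum of event weights in the run
    cross_section: float          # generator cross-section for the run
    norm: Optional[float] = None  # fixed area target; None = xsec-normalised


def combine(a: RunHisto, b: RunHisto) -> RunHisto:
    """Merge two runs of the same process before any scaling is applied."""
    if a.norm != b.norm:
        # Failure mode from the thread: different scaling targets.
        raise ValueError("runs have different scaling targets")
    if len(a.bin_sumw) != len(b.bin_sumw):
        raise ValueError("runs have different binnings")
    total = a.sum_of_weights + b.sum_of_weights
    # Assumption: weighted average of the per-run cross-sections.
    xsec = (a.cross_section * a.sum_of_weights
            + b.cross_section * b.sum_of_weights) / total
    merged = [x + y for x, y in zip(a.bin_sumw, b.bin_sumw)]
    return RunHisto(merged, total, xsec, a.norm)


def finalise(h: RunHisto, bin_widths: List[float]) -> List[float]:
    """Convert unscaled sums into bin heights, scaling only at the end."""
    if h.norm is not None:
        integral = sum(h.bin_sumw)
        if integral == 0.0:
            raise ValueError("cannot normalise an empty histogram")
        scale = h.norm / integral
    else:
        scale = h.cross_section / h.sum_of_weights
    return [scale * w / bw for w, bw in zip(h.bin_sumw, bin_widths)]
```

Raising on mismatched targets is only one possible policy; the point is
that the mismatch is detected explicitly and not silently averaged away.
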
We could also think about marking such histograms explicitly (or
vice-versa, mark those subject to K-factor rescaling) rather than using
covert channels or magic values.

Eike started work last week on centralising some of our histogramming
mess, and has implemented a script for cutting out bin ranges to avoid
normalisation biases in e.g. Nch plots where the generator is never going
to reproduce the diffractive contribution at low-Nch but should be able to
be fitted to the data for e.g. Nch > 10.

We'll also be working on YODA and providing the floating norm optimisation
script that Jon mentioned. I'll keep you posted.

Andy

-- 
Dr Andy Buckley
Particle Physics Experiment Group, University of Edinburgh