[Rivet] Question about YODA Profile1D: how to show RMS instead of errors of the mean

Adrian Buzatu Adrian.Buzatu at glasgow.ac.uk
Sun Jun 14 22:15:24 BST 2015


Hi Andy,

I was aware of only NumPy for the Python-based tools you mentioned. When I have some time, I will try to look at them, to be going with the times. It was very enlighting to have this overview thanks to this email exchange. Thank you very much, and I will let you know if I encounter new features that might improve YODA.

Cheers,
Adrian
________________________________________
From: Andy Buckley [andy.buckley at cern.ch]
Sent: 14 June 2015 21:24
To: Adrian Buzatu; rivet at projects.hepforge.org
Cc: Chris Pollard
Subject: Re: Question about YODA Profile1D: how to show RMS instead of errors of the mean

We've been building YODA up gradually -- it's starting to get more use,
and the more obscure bits of the interface are now stabilising (booking
and filling having been unchanged for a long time). I tend not to push
it hard since we don't have a lot of development time & resources, and
because we still have some big ideas of how to make improvements... but
do feel free to advertise it if you find it working well! (And tell us
if not :-) )

I've noticed SciPy, NumPy, matplotlib, and advanced Python statistical
tools like scikit-learn, and performance boosters like Numba are
starting to make inroads into HEP now. In addition to rootpy, which
tries to sanitise ROOT's Python interface, there are root_numpy and
rootplot packages to connect ROOT to numpy and mpl. I recently saw that
even one of the TMVA & RooStats authors is now using Python and
scikit-learn!

I think over the next few years we will see an increasing HEP shift to
use these external tools, coming from the younger, tech-aware intake who
are open to new things, or maybe have undergrad background in using
them. These external tools, unlike our in-house ROOT monoculture, have
had to be well-documented and continually improving to survive -- when I
first used mpl ~7 years ago I thought it was junk, but it's improved a
great deal and just keeps getting better. Because of our ROOT-based big
data format, we're going to have to retain a ROOT connection for the
forseeable future, but I'm not the only person I know who prefers to
load ROOT files into numpy objects :-)

Enough preaching... keep letting us know what we can improve!
Andy


On 14/06/15 14:57, Adrian Buzatu wrote:
> Hi Andy, that is very interesting. I learned a lot from this exposure to RIVET/YODA/matplotlib. I had no idea before that there is an alternative to ROOT. Maybe YODA/SciPy should be promoted more within the community.
>
> Cheers,
> Adrian
> ________________________________________
> From: Andy Buckley [andy.buckley at cern.ch]
> Sent: 14 June 2015 14:54
> To: Adrian Buzatu; rivet at projects.hepforge.org
> Cc: Chris Pollard
> Subject: Re: Question about YODA Profile1D: how to show RMS instead of errors of the mean
>
> On 14/06/15 14:35, Adrian Buzatu wrote:
>> Dear Andy,
>>
>> Thank you for the prompt and detailed answer. I understand things better now. So the plotting including fitting will be done using Python's SciPy, and keep the other scripts for RIVET specific tasks.
>
> That's right: YODA is a binning & histogramming stat object library with
> no knowledge of physics. It will increasingly contain helper functions
> for making out-of-the-box nice plots with matplotlib, but we leave all
> the rendering machinery to mpl itself (sometimes making requests /
> supplying patched to them).
>
> And then Rivet will put some convenience wrappers on top of all that, to
> make use of the physics context and some Rivet-specific conventions
> (e.g. we'll shortly be storing the run cross-section as a YODA data
> object, but YODA itself will only see it as a data object without having
> any idea of what a cross-section is).
>
>> We'll talk offline more.
>
> Look forward to it!
>
> Cheers,
> Andy
>
>
>> From: Andy Buckley [andy.buckley at cern.ch]
>> Sent: 14 June 2015 14:05
>> To: Adrian Buzatu; rivet at projects.hepforge.org
>> Cc: Chris Pollard
>> Subject: Re: Question about YODA Profile1D: how to show RMS instead of errors of the mean
>>
>> On 14/06/15 12:43, Adrian Buzatu wrote:
>>> Hi Andy,
>>>
>>> Thank you for creating the files for me. Would you be able to produce them from this file as well? They contain different physics, with "w" replaced by "n".
>>
>> Sure, I've done these and attached both sets, now with the .yoda
>> filename as the first part of the plot filename. Do you want x and y
>> labels, too? (I'll send them off-list.)
>>
>>> My Rivet version:
>>> [abuzatu at ppepc137 yoda]$ rivet --version
>>> rivet v2.2.1
>>
>> That is an up to date version. It's the YODA version that really matters
>> here, but I guess this is also up to date. Strange that it didn't work
>> -- another reason to get a new release out.
>>
>>> How can I install the developmental one, or something more recent?
>>
>> Yes, our development repositories are public. I've not tested it
>> personally, but the rivet-bootstrap script should install the
>> development versions of YODA and Rivet if you run it like
>>
>> INSTALL_RIVETDEV=1 ./rivet-bootstrap
>>
>>> I do have a few suggestions that we can discuss next week.
>>>
>>> o One I mentioned about one week ago relating the fact that the -m and -M options do not work on rivet-mkhtml to select based on regular expressions. When I produce a lot of histograms and profiles, on the order of hundreds, it takes about 10 minutes or more to run them all. But if I want to adjust just one plot, or a few of them, it would be helpful to selection some like "RivetBjet/p1_*". The options suggests it does that, but it did not work for me. Again, maybe I do not have the most recent version?
>>
>> This should now be working with the trunk versions of Rivet and YODA --
>> I added pattern-matching features to the YODA read() function, which
>> makes it easy to pass them from
>>
>>> o make-plots has different options for the output, but two or more of them together do not work, for example having "--pdf --eps --pdf". You do have some options of two of them combined "-pdfpng", but I think it would be useful to have the first option. For example, one needs .png for a webpage, and .pdf for some latex and .eps for some latex. It is good to produce them all at once. As you never know when you need the plot later.
>>
>> Hmm, they are meant to be combinable. make-plots is a bit of a dead-end
>> script, as we work toward the long-term matplotlib version, but things
>> like this should be fixed...
>>
>>> o If the user decides to replace the rivet-mkhtlm by splitting it in it's individual commands, there is no individual command for the make-html, and I think it would be useful. For example.
>>>
>>> RUN_HEPMC_TO_YODA="rivet --pwd -a ${ANALYSIS} ${FILE_HEPMC} -H ${FILE_YODA} --histo-interval=10000 " # -n 20000" #-n 20000" # -l Rivet.Analysis=DEBUG" # -v
>>> RUN_YODA_TO_HTML="rivet-mkhtml ${FILE_YODA} -c ${ANALYSIS}.plot --mc-errs -n 8" #-M \"./plots/RivetHbbBJets/counter_lepton.dat\"' " # -n 8 -v
>>> RUN_YODA_TO_DAT="rivet-cmphistos ${FILE_YODA} -o ${FOLDER_PLOTS} --hier-out --mc-errs"
>>> RUN_DAT_TO_PLOTS="make-plots ${FOLDER_PLOTS}/${ANALYSIS}/p1_*.dat --pdfpng" # -n 4"
>>> RUN_PLOTS_TO_HTML="./make_html.sh ${FOLDER_PLOTS}/${ANALYSIS}"
>>>
>>> if [ ${doPlot} -eq 1 ]; then
>>>     if [ ${doPlotNew} -eq 1 ]; then
>>>         echo "RUN_YODA_TO_DAT"
>>>         `echo ${RUN_YODA_TO_DAT}`
>>>         echo "RUN_DAT_TO_PLOTS"
>>>         `echo ${RUN_DAT_TO_PLOTS}`
>>>         echo "RUN_PLOTS_TO_HTML"
>>>         `echo ${RUN_PLOTS_TO_HTML}`
>>>     else
>>>         `echo ${RUN_YODA_TO_HTML}`
>>>     fi
>>>
>>> I created a small .sh script, but it does not have the nice features of the <div> that  rivet-mkhtml has. In other works, I would suggest decoupling the code that makes the html from rivet-mkhtml from the one that make the plots and offer it as a separate tool, as the regular expression works on make-plots, but breaks on rivet-mkhtml.
>>
>> I'm not sure about this -- the point of mkhtml is that it knows what
>> analyses have been plotted in a particular directory structure, and it
>> wouldn't make sense to run it on a general set of dirs that it didn't
>> make. And we are trying not to proliferate many, many rivet-* scripts. I
>> hope that the pattern-matching improvement makes this less necessary,
>> but let's keep an eye on it because we do want to solve user problems.
>>
>>> o I have not used it yet, but it would be useful to be able to fit the histograms with functions that the user defines, for example the Bukin function, which is a generalization of the Gauss function to allow for assymetric tails. Can the plotting part do that?
>>
>> Again, that's not really plotting functionality -- there are so many
>> possible functional forms, and so many possible approaches to the fit.
>> I've found SciPy's 1D curve fitting (and cleverer 2D features like
>> ungridded spline interpolation) to work nicely:
>>
>> http://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.optimize.curve_fit.html
>>
>> Fitting, and producing a set of data points to plot, should take just a
>> few lines of Python. I think it'd be nice if the function itself could
>> be passed as a mpl plot() argument, though... one of several things I'll
>> suggest to them when I get a spare moment.
>>
>> Andy
>>
>>
>>> From: Andy Buckley [andy.buckley at cern.ch]
>>> Sent: 13 June 2015 20:56
>>> To: Adrian Buzatu; rivet at projects.hepforge.org
>>> Cc: Chris Pollard
>>> Subject: Re: Question about YODA Profile1D: how to show RMS instead of errors of the mean
>>>
>>> Hi Adrian,
>>>
>>> I'm able to read this one ok -- at least with the development version of
>>> YODA that I have installed. What version are you using? I attached the
>>> output of a slightly modified script, now without the red errorbar
>>> overlay for the std err on the mean.
>>>
>>> This has been a useful discussion for me -- gives me a few definite
>>> to-do's for the next YODA release that I hope to make before the end of
>>> the month. If you've got more questions just ask... or come find me in
>>> Glasgow this week!
>>>
>>> Andy
>>>
>>>
>>> On 13/06/15 17:41, Adrian Buzatu wrote:
>>>> Sorry, I had attached the wrong .yoda file before.
>>>>
>>>> Adrian
>>>> ________________________________________
>>>> From: Adrian Buzatu
>>>> Sent: 13 June 2015 17:38
>>>> To: ab374y-redirect; rivet at projects.hepforge.org
>>>> Cc: Chris Pollard
>>>> Subject: RE: Question about YODA Profile1D: how to show RMS instead of errors of the mean
>>>>
>>>> Dear Andy,
>>>>
>>>> Thank you for the clarification that YODA is just the histogramming format and then the plotting is done with Python libraries outside. It seems promising to use this script to adapt the plots. Unfortunately, I get an error. When I print the bins, I get the same erroneous value in all bins and later on the code segfaults when trying to do operations on the bin. I attach my .yoda file with the Profile1D. Maybe you could use it in your script to see if there is something wrong with the .yoda file?
>>>>
>>>> Thank you,
>>>> Adrian
>>>>
>>>>  [abuzatu at ppepc137 RivetHbbBJets]$ ./convert.py
>>>> <Profile1D '/RivetHbbBJets/p1_PTm_PTwm' 15 bins, sumw=1.4e+05>
>>>> <ProfileBin1D x=[1.34425e-312, 6.94398e-310)>
>>>> Segmentation fault
>>>>
>>>> and with "continue"
>>>> [abuzatu at ppepc137 RivetHbbBJets]$ ./convert.py
>>>> <Profile1D '/RivetHbbBJets/p1_PTm_PTwm' 15 bins, sumw=1.5e+05>
>>>> <ProfileBin1D x=[1.34425e-312, 6.91224e-310)>
>>>> <ProfileBin1D x=[1.34425e-312, 6.91224e-310)>
>>>> <ProfileBin1D x=[1.34425e-312, 6.91224e-310)>
>>>> <ProfileBin1D x=[1.34425e-312, 6.91224e-310)>
>>>> <ProfileBin1D x=[1.34425e-312, 6.91224e-310)>
>>>> <ProfileBin1D x=[1.34425e-312, 6.91224e-310)>
>>>> <ProfileBin1D x=[1.34425e-312, 6.91224e-310)>
>>>> <ProfileBin1D x=[1.34425e-312, 6.91224e-310)>
>>>> <ProfileBin1D x=[1.34425e-312, 6.91224e-310)>
>>>> <ProfileBin1D x=[1.34425e-312, 6.91224e-310)>
>>>> <ProfileBin1D x=[1.34425e-312, 6.91224e-310)>
>>>> <ProfileBin1D x=[1.34425e-312, 6.91224e-310)>
>>>> <ProfileBin1D x=[1.34425e-312, 6.91224e-310)>
>>>> <ProfileBin1D x=[1.34425e-312, 6.91224e-310)>
>>>> <ProfileBin1D x=[1.34425e-312, 6.91224e-310)>
>>>>
>>>>
>>>> Thanks,
>>>> Adrian
>>>> ________________________________________
>>>> From: Andy Buckley [andy.buckley at cern.ch]
>>>> Sent: 13 June 2015 16:49
>>>> To: Adrian Buzatu; rivet at projects.hepforge.org
>>>> Cc: Chris Pollard
>>>> Subject: Re: Question about YODA Profile1D: how to show RMS instead of errors of the mean
>>>>
>>>> On 13/06/15 11:37, Adrian Buzatu wrote:
>>>>> Hi RIVET/YODA authors,
>>>>>
>>>>> For the studies I am doing on the jet energy scale it was suggested to
>>>>> me to reproduce some YODA Profile1D where instead of the error bars on
>>>>> the mean to show the RMS of the distribution, as we already have the
>>>>> mean on the Y axis. Do you know how to do that?
>>>>>
>>>>> On a different, but maybe related note, do you know if I can interface
>>>>> ROOT with RIVET to use ROOT insted of YODA? I am familiar with how to do
>>>>> this in ROOT, and then use PyROOT, maybe that can help. But I hope plan
>>>>> A with YODA has a feature for this.
>>>>
>>>> Hi Adrian,
>>>>
>>>> There's some confusion here about what YODA does: it mostly just handles
>>>> the statistical definition of histograms, but the plotting is external.
>>>>
>>>> rivet-cmphistos converts to the simple make-histos format, and the
>>>> decision about what size to give to error bars is built into the
>>>> converter code. It doesn't have a simple flag to change the profile
>>>> error bar plotting, though.
>>>>
>>>> Our future direction for Rivet and YODA plotting is to use the Python
>>>> matplotlib (mpl) library. There's the start of some helper functions in
>>>> the yoda.plotting Python module, but even without touching those it's
>>>> pretty easy to do mpl plotting of YODA objects:
>>>>
>>>>   ## Load a profile histo
>>>>   import yoda
>>>>   aos = yoda.read("/home/andy/proj/hep/rivet/Rivet.yoda")
>>>>   p = aos["/MC_GENERIC/EtaSumEt"]
>>>>   assert type(p) is yoda.Profile1D
>>>>
>>>>   ## Get data points and errors
>>>>   xs, ys, exs, eys, sigys = [], [], [], [], []
>>>>   for b in p.bins:
>>>>       if b.sumW:
>>>>           xs.append(b.xMid)
>>>>           exs.append(b.xWidth/2)
>>>>           ys.append(b.mean)
>>>>           eys.append(b.stdErr)
>>>>           sigys.append(b.stdDev)
>>>>
>>>>   ## Plot
>>>>   import pylab as pl
>>>>   pl.hold(True)
>>>>   pl.errorbar(xs, ys, xerr=exs, yerr=sigys, marker="None", ls="None",
>>>> ecolor="black")
>>>>   pl.errorbar(xs, ys, xerr=exs, yerr=eys, marker="None", ls="None",
>>>> ecolor="red")
>>>>   pl.plot(xs, ys, "o", color="black")
>>>>   pl.savefig("rmsprofile.pdf")
>>>>
>>>> With the demo Rivet.yoda file that I had lying around, this produced the
>>>> attached PDF (note that I used the standard deviation rather than RMS
>>>> for this plot, since b.rms() wouldn't subtract the mean value in the
>>>> calculation of the distribution width). It should be pretty easy to
>>>> extend that script, and I can help you to use the yoda.plotting tools if
>>>> you like. I'm also going to add an optional argument to the Profile1D
>>>> mkScatter, function, which will make it possible to do this sort of
>>>> plotting in just a few lines.
>>>>
>>>> There's also a yoda2root script, which makes use of converter functions
>>>> in the YODA/ROOTCnv.h header... I'm not sure if it works completely for
>>>> profiles, though, since ROOT's interface for populating those is very
>>>> odd. If you go that route, let us know how you get on!
>>>>
>>>> Andy
>>>>
>>>> --
>>>> Dr Andy Buckley, Lecturer / Royal Society University Research Fellow
>>>> Particle Physics Expt Group, University of Glasgow
>>>>
>>>
>>>
>>> --
>>> Dr Andy Buckley, Lecturer / Royal Society University Research Fellow
>>> Particle Physics Expt Group, University of Glasgow
>>>
>>
>>
>> --
>> Dr Andy Buckley, Lecturer / Royal Society University Research Fellow
>> Particle Physics Expt Group, University of Glasgow
>>
>
>
> --
> Dr Andy Buckley, Lecturer / Royal Society University Research Fellow
> Particle Physics Expt Group, University of Glasgow
>


--
Dr Andy Buckley, Lecturer / Royal Society University Research Fellow
Particle Physics Expt Group, University of Glasgow


More information about the Rivet mailing list