Reconstruction Package Design Guide

Author: Octavia Beasley

1 Reconstruction Package Design Guide: Designing data structures and algorithms. Jim Kowalkowski, Marc Paterno

2 Goals: Summarize the rules for the production of “legal” reconstruction packages. Present some suggestions for the production of well-crafted reconstruction packages. Go quickly over the requirements (available from other documents). Stress good practice: many people are coding by copying, and may be copying bad practice; point out the more egregious cases? Alert people to the fact that they need more information on various topics; we aren’t in a position to present that information during the talk. People can refer to the handout after the talk; perhaps present examples in the talk. Aim for the intermediate designer, not the complete novice.

3 The Big Picture: An overview

4 The Model: Deleterious effects of physical coupling. The standard solution: separation of data and algorithms; data in one library (“chunks”), algorithms in another library (“packages”); the chunk library must not depend on the package library.

5 Algorithm Inputs and Outputs: First design task: decide what inputs are needed; decide what is produced. Second design task: is this algorithm part of a package? One package? Several packages?

6 Algorithms and the Framework: Inheritance from Package, implementation of one or more interface(s). What interfaces are available, and which should I use? Building an executable: parts needed (objects and libraries); GNUmakefile (how to tell SRT what is needed)? What is the party line? How does this work in CTBUILD?

7 Classes and Instances: Instance: a unit that combines a specific state (data) and the functions used to manipulate it (methods). Class: a type that defines related instances; a description of what the instances have in common (types of data, method definitions); the body of code that manipulates the data in the instances. A program can have multiple instances of the same class, each with different values.

8 Parameterized Classes: Class template: a description of how to write a class; describes a family of classes that share common characteristics. Instantiating a class template causes the compiler to write a class; one can then make instances of the class. std::vector<T>: class template; std::vector<float>: instantiated class; std::vector<float> vf: class instance.
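The three levels named on this slide (class template, instantiated class, class instance) can be sketched as follows. `Accumulator` is a hypothetical template invented for illustration; it is not part of any D0 library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical class template: a description from which the compiler
// writes one class per type argument.
template <class T>
class Accumulator {
public:
  Accumulator() : sum_(T()) {}
  void add(const T& x) { sum_ = sum_ + x; }
  T sum() const { return sum_; }
private:
  T sum_;
};

inline std::size_t demoVectorSize() {
  // std::vector is the class template, std::vector<float> is an
  // instantiated class, and vf is an instance of that class.
  std::vector<float> vf;
  vf.push_back(1.5f);
  vf.push_back(2.5f);
  return vf.size();
}
```

`Accumulator<int>` and `Accumulator<double>` are two distinct classes written by the compiler from the same description.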

9 The Framework Interfaces: More commonly used: Process, Filter, Analyze, Tag, JobSummary, RunInit, RunEnd. Less commonly used: Generate, Decide, Dump, Output, Build.

10 Parts Needed: Main program object: framework.o. Framework related libraries: libframework.a, libio_packages.a, libevpack.a, libd0om.a, ...; the Analyze package has a full list (it might have more than required). Package libraries: libemreco.a, libcpsreco.a; one required for each package you use. Package registration objects: RegEMReco.o, RegCPSReco.o.

11 Package Configuration: Parameters are introduced by RCP objects: flexibility, automatic bookkeeping. The Framework is described by an RCP object. One can have multiple instances of the same class with different parameters, in the same program.

12 Contents of Framework RCP
    string InterfaceName = "process" // use this for reco development
    // Put here all the interfaces which you might want to implement
    string Interfaces = "generator decide filter process analyze dump tag output jobSummary runEnd runInit"
    // Put here the interfaces in the order you want the Event to travel
    string Flow = "generator decide filter process analyze dump tag output"
    // Package instance names, in the order in which you want them called
    string Packages = "read_event prereq conejet_5 conejet_7"
    // Define what these package instances mean
    RCP read_event =
    RCP prereq = // put the real prereqs here
    RCP conejet_5 =
    RCP conejet_7 =

13 Basic chunk design Good chunk design

14 Basic Chunk Design: What must I do to make my chunk work?

15 Persistence: Requirements due to D0OM. Allowed types: basic types; classes which inherit from d0_Object; STL containers. No pointers. No template classes unless a D0OM adapter exists (e.g., STL containers). Preprocessing with d0cint is required, to write the x_lnk.cpp and x_ref.hpp files.

16 Accounting Information: Record information specific to the creation of your chunk: values of algorithm parameters, through RCPID; commonly used parameters can be stored redundantly, e.g. the KT algorithm ycut threshold. Record parents. Can always keep more than the minimum. This information will be used by others during chunk lookup.

17 EDM Rules: No direct pointers to another chunk, or to things within another chunk; instead, use Link (under review). A chunk has exclusive ownership of all its data. Provide const access; users get access only to const chunks. Avoid mutable data members. EDM tutorial online.

18 Good Chunk Design: What should I do to make my chunk work well?

19 Chunks as Collections: Most chunks are collections. They should: be coherent; contain and present an STL container (consider std::vector first); present standard typedefs, and use them (value_type, index_type, container_type); present standard functions: contents( ), at(const index_type& i ); provide predicates in the chunk library, appropriate for your users.
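A collection-style chunk along these lines might look roughly like the sketch below. `Track` and `TrackChunk` are hypothetical stand-ins, and the sketch deliberately omits the EDM base class and the D0OM persistence machinery that a real chunk needs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type, for illustration only.
class Track {
public:
  explicit Track(float pt) : pt_(pt) {}
  float pt() const { return pt_; }
private:
  float pt_;
};

// Hypothetical chunk: contains one STL container and presents it
// through the standard typedefs and functions named on the slide.
class TrackChunk {
public:
  typedef Track value_type;
  typedef std::vector<Track> container_type;
  typedef container_type::size_type index_type;

  void add(const value_type& t) { tracks_.push_back(t); }

  // Read-only access to the whole collection.
  const container_type& contents() const { return tracks_; }

  // Bounds-checked, read-only access to one element.
  const value_type& at(const index_type& i) const { return tracks_.at(i); }

private:
  container_type tracks_;
};
```

Because users only see `contents()` and `at()`, both returning const references, the chunk keeps exclusive ownership of its data.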

20 Predicates: A class (or struct) which encapsulates a unary test or binary comparison
    struct less_pt : public binary_function<Track, Track, bool> {
      bool operator()(const Track& a, const Track& b) const { return a.pt() < b.pt(); }
    };
    class high_pt : public unary_function<Track, bool> {
    public:
      high_pt(float t) : _threshold(t) { }
      bool operator()(const Track& a) const { return a.pt() > _threshold; }
    private:
      float _threshold;
    };
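To show how such predicates are consumed, here is a small sketch that passes them to standard algorithms. `Track` is a hypothetical stand-in, and the predicates are written without the `unary_function`/`binary_function` base classes, which newer C++ standards have removed; the algorithms used here do not need them.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical element type, for illustration only.
class Track {
public:
  explicit Track(float pt) : pt_(pt) {}
  float pt() const { return pt_; }
private:
  float pt_;
};

// Binary comparison: orders tracks by increasing pt.
struct less_pt {
  bool operator()(const Track& a, const Track& b) const { return a.pt() < b.pt(); }
};

// Unary test: true for tracks above a threshold fixed at construction.
class high_pt {
public:
  explicit high_pt(float t) : threshold_(t) {}
  bool operator()(const Track& a) const { return a.pt() > threshold_; }
private:
  float threshold_;
};

// Count the tracks that pass the unary test.
inline long countHighPt(const std::vector<Track>& tracks, float threshold) {
  return static_cast<long>(
      std::count_if(tracks.begin(), tracks.end(), high_pt(threshold)));
}

// Sort in place using the binary comparison.
inline void sortByPt(std::vector<Track>& tracks) {
  std::sort(tracks.begin(), tracks.end(), less_pt());
}
```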

21 Designing for Selection: Merely meeting requirements is insufficient; useful chunks provide more. Chunk designers should write the selectors which users need to locate the chunk instances they want, e.g. JetCollection: algo = “KT”, ycut = x. Selectors will generally consider “accounting” info, not “event” data. Record the name and version of the algorithm library.
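A selector of the kind described here could be sketched as below. Everything in it is an assumption made for illustration: the `JetCollection` struct holds only accounting fields, `KtSelector` and `findChunk` are invented names, and the real selector interface used by TKey may look quite different.

```cpp
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical accounting record; a real JetCollection chunk also holds jets.
struct JetCollection {
  std::string algo;  // name of the algorithm that made the chunk
  double ycut;       // KT ycut threshold, stored redundantly for lookup
};

// Hypothetical selector: matches on accounting information only.
class KtSelector {
public:
  explicit KtSelector(double ycut) : ycut_(ycut) {}
  bool operator()(const JetCollection& c) const {
    return c.algo == "KT" && std::fabs(c.ycut - ycut_) < 1e-9;
  }
private:
  double ycut_;
};

// Returns the index of the first matching chunk, or candidates.size() if none.
template <class SELECTOR>
std::size_t findChunk(const std::vector<JetCollection>& candidates,
                      const SELECTOR& s) {
  for (std::size_t i = 0; i < candidates.size(); ++i)
    if (s(candidates[i])) return i;
  return candidates.size();
}
```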

22 Do I Need a New Chunk Class? Use instances of the same chunk, when sensible: different instances of JetCollection, made by algorithms with different parameters. Maybe make a different chunk class when the contained data is really different. Consider a new chunk class if your design contains multiple collections of different types.

23 Using Activate and Deactivate: Transient data format can be more efficient and easier to use: provide a more convenient form of data for users; save storage by making only non-redundant data persistent. Only persistent data needs to meet the D0OM restrictions. How activate( ) works: called after D0OM fills the persistent data members. Note there is no Event pointer available.
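The split between persistent and transient data can be illustrated with a minimal sketch. The class name `TrackData` and the parameterless `activate()` signature are assumptions; the real signature is set by the EDM and D0OM, and in production the persistence layer, not user code, calls it.

```cpp
#include <cmath>

// Hypothetical class, for illustration only.
class TrackData {
public:
  TrackData(double px, double py) : px_(px), py_(py), pt_(0.0) {}

  // Assumed hook: called once the persistent members have been filled.
  // Rebuilds the redundant, transient form from the persistent data.
  void activate() { pt_ = std::sqrt(px_ * px_ + py_ * py_); }

  double pt() const { return pt_; }

private:
  double px_;  // persistent: stored, must meet the D0OM restrictions
  double py_;  // persistent
  double pt_;  // transient: derived, never stored
};
```

Only `px_` and `py_` would occupy storage; `pt_` is recomputed on each read-back.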

24 Avoiding if/else and case/switch: This is generally bad design: the object has an internal state which has to be queried from outside; this state has to be checked in many places in the code, and it is very difficult to keep them all correct. Instead, use polymorphism: maybe contain a base class; maybe design a different chunk class to hold each different flavor of contents.
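As a sketch of the polymorphic alternative, the hypothetical classes below replace a "type code plus switch" with a virtual function. The names (`EnergyCorrection` and its subclasses) are invented for illustration, and `std::unique_ptr` requires C++11.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical base class: callers never ask which flavor they hold.
class EnergyCorrection {
public:
  virtual ~EnergyCorrection() {}
  virtual double apply(double e) const = 0;
};

class ScaleCorrection : public EnergyCorrection {
public:
  explicit ScaleCorrection(double f) : factor_(f) {}
  double apply(double e) const { return e * factor_; }
private:
  double factor_;
};

class OffsetCorrection : public EnergyCorrection {
public:
  explicit OffsetCorrection(double o) : offset_(o) {}
  double apply(double e) const { return e + offset_; }
private:
  double offset_;
};

// No if/else or switch on a flavor code: each object knows what to do.
inline double applyAll(
    const std::vector<std::unique_ptr<EnergyCorrection> >& corrections,
    double e) {
  for (std::size_t i = 0; i < corrections.size(); ++i)
    e = corrections[i]->apply(e);
  return e;
}
```

Adding a new flavor means adding one class; no existing call site has to change.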

25 Basic package design Good package design

26 Basic Package Design: What must I do to make my package work?

27 Package Initialization: The constructor takes a Context*; merely pass it to the base class c’tors. Get at parameters via the method Package::packageRCP( ). All parameters for initialization should come from RCP, to keep a record of what was done. Write setParameters(RCP r): call it from the c’tor, passing packageRCP( ); it can also be called by the Framework, to reinitialize.
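The initialization pattern described here can be sketched with stand-in types. `RCP`, `Context` and `Package` below are simplified mock-ups written for this example, not the framework's real classes, and `ConeJetReco` with its `coneSize` parameter is hypothetical.

```cpp
#include <map>
#include <string>

// Stand-in for the real RCP class: a bag of named parameters.
class RCP {
public:
  void set(const std::string& key, double value) { values_[key] = value; }
  double getDouble(const std::string& key) const { return values_.at(key); }
private:
  std::map<std::string, double> values_;
};

// Stand-in for the real Context: here it just carries the package's RCP.
struct Context {
  RCP rcp;
};

// Stand-in for the framework's Package base class.
class Package {
public:
  explicit Package(Context* c) : rcp_(c->rcp) {}
  virtual ~Package() {}
  const RCP& packageRCP() const { return rcp_; }
private:
  RCP rcp_;
};

// Hypothetical package: every parameter comes from RCP, through one function.
class ConeJetReco : public Package {
public:
  explicit ConeJetReco(Context* c) : Package(c), coneSize_(0.0) {
    setParameters(packageRCP());
  }
  void setParameters(const RCP& r) { coneSize_ = r.getDouble("coneSize"); }
  void reinitialize(const RCP& r) { setParameters(r); }
  double coneSize() const { return coneSize_; }
private:
  double coneSize_;
};
```

Routing both construction and reinitialization through `setParameters` keeps a single place where parameters are read.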

28 Keys, Handles, and Selectors: Use a selector (provided by the chunk designer) to choose what to extract. Chunk class determined by TKey. Chunk instance determined by selector. THandle returned; use as a pointer.
    THandle<XChunk> hXch = key.find(e);
    if (hXch.isValid()) hXch->doXChunkFunction();

29 Putting Chunks into the Event: The Event takes ownership of the chunk; it uses auto_ptr to indicate this; do not delete the chunk. The chunk owns its contents; do not delete the contents.
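The ownership hand-off can be sketched as follows. The slide refers to `auto_ptr`; this example substitutes `std::unique_ptr` (C++11), since `auto_ptr` was later removed from the standard, and the `Chunk`, `JetChunk` and `Event` types here are simplified stand-ins rather than the real EDM classes.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Stand-in for the chunk base class.
struct Chunk {
  virtual ~Chunk() {}
};

// Hypothetical chunk that owns its contents by value.
struct JetChunk : Chunk {
  std::vector<double> energies;
};

// Stand-in for the Event: the signature of put() states that it takes ownership.
class Event {
public:
  void put(std::unique_ptr<Chunk> c) { chunks_.push_back(std::move(c)); }
  std::size_t size() const { return chunks_.size(); }
private:
  std::vector<std::unique_ptr<Chunk> > chunks_;
};

// Builds a chunk and hands it over; returns true if the caller no longer owns it.
inline bool insertJets(Event& e) {
  std::unique_ptr<JetChunk> jets(new JetChunk);
  jets->energies.push_back(42.0);
  e.put(std::move(jets));
  return jets.get() == 0;  // the local pointer is empty after the move
}
```

No `delete` appears anywhere: the Event destroys the chunk, and the chunk destroys its contents.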

30 Using the Framework: Implement the processEvent() interface. From Package, implement: statusReport( ): report a summary of state (statistics) to cout; reinitialize(RCP r): this should call setParameters(r), and probably should clear histograms, etc.; flush( ): flush any buffers your package contains (histograms, log streams, etc.).

31 Rules: No caching event data between events. No global variables. No removing the pointer from a handle; use the handle as a pointer. Do not include headers from another package’s private directory. Do reconstruction in the processEvent( ) interface. Do not cast away const-ness.

32 Good Package Design: What should I do to make my package work well?

33 Error Handling: Catch exceptions thrown by your own algorithms; the Framework catches all exceptions that exit your package, logs the message, and the program dies. Use return codes; the Framework group needs to decide what the return codes mean. Currently any package failure aborts further processing of that event (from that group).

34 Views: proposal: Collection of pointers to items in a chunk; template class. Used by algorithms in preference to chunks; decouples algorithms from chunk details. Filled using the standard chunk method contents( ); the View remains modifiable: can use predicates for selection; can be sorted upon or after filling. Usable with STL algorithms. EDM will provide a default View template class.

35
    template <class SRC, class DEST, class PRED>
    void fillView(const SRC& s, DEST& d, const PRED& p);

    template <class T>
    class View : public std::vector< ... > {...};

    template <class ELEMENT>
    class AlwaysTrue {
    public:
      bool operator()(const ELEMENT&) const { return true; }
    };

36 Example Use of View
    Result processEvent(Event& e) {
      TKey<XChunk> key;
      THandle<XChunk> hX = key.find(e);
      if (!hX.isValid()) return Result::failure;
      // Create an empty View
      View<X> xview;
      // Fill the View, using a predicate
      fillView(hX->contents(), xview, BigX(5));
      // Sort in place, using another predicate
      sort(xview.begin(), xview.end(), SortAscX());
      ... use xview like an STL container ...
    }

37 Example Chunk Predicates
    class SortAscX {
    public:
      bool operator()(const X& a, const X& b) const { return a.value() < b.value(); }
    };
    class BigX {
    public:
      BigX(int t) : _thresh(t) { }
      bool operator()(const X& a) const { return a.value() >= _thresh; }
    private:
      int _thresh;
    };
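One possible reading of the View proposal is sketched below, purely as an illustration. It assumes that a View is a vector of pointers to const elements and that `fillView` copies the address of every element passing the predicate; the proposal itself was still under discussion, so the real template may differ. `X`, `BigX` and `AscendingX` are stand-ins, and the sort comparator takes pointers because that is what this View holds.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical element type, for illustration only.
class X {
public:
  explicit X(int v) : v_(v) {}
  int value() const { return v_; }
private:
  int v_;
};

// Assumed shape of a View: pointers into a container owned by a chunk.
template <class T>
class View : public std::vector<const T*> {};

// Assumed behaviour: keep a pointer to each source element that passes p.
template <class SRC, class DEST, class PRED>
void fillView(const SRC& s, DEST& d, const PRED& p) {
  for (typename SRC::const_iterator it = s.begin(); it != s.end(); ++it)
    if (p(*it)) d.push_back(&*it);
}

// Selection predicate on elements.
class BigX {
public:
  explicit BigX(int t) : thresh_(t) {}
  bool operator()(const X& a) const { return a.value() >= thresh_; }
private:
  int thresh_;
};

// Ordering predicate on the pointers stored in the View.
struct AscendingX {
  bool operator()(const X* a, const X* b) const { return a->value() < b->value(); }
};

// Select, sort, and read back values without copying or modifying the source.
inline std::vector<int> selectedValues(const std::vector<X>& contents, int threshold) {
  View<X> view;
  fillView(contents, view, BigX(threshold));
  std::sort(view.begin(), view.end(), AscendingX());
  std::vector<int> out;
  for (View<X>::const_iterator it = view.begin(); it != view.end(); ++it)
    out.push_back((*it)->value());
  return out;
}
```

The View is only valid while the source container is alive and unmodified, which matches the rule that event data is not cached between events.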

38 Reconstructor Package Model: Reconstructor = Package that implements processEvent() to do reconstruction. The Reconstructor gets chunks from the Event and gets access to the data (the STL-like interface for chunks makes this easy); maybe it makes a View. Algorithm(s) use data and create new data; use a View. The Reconstructor wraps the new data in a new chunk and inserts it into the Event.

39 Using the Strategy Pattern (1): Strategy pattern used to provide flexibility at run-time: vary algorithms, not just numerical parameters. Example: KT jet reconstruction. Selection algorithm: selects which objects to combine (or decides that the algorithm terminates). Combination algorithm: combines two objects.

40

41 Using the Strategy Pattern (2): Many possible selection algorithms; all have a common interface → base class. Many possible combination algorithms; all have a common interface → another base class. The Reconstructor contains pointers to the base classes, filled during construction as directed by an RCP object, and calls their methods in processEvent( ).
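A stripped-down sketch of this arrangement follows. It shows only a combination strategy, uses a plain string where the real code would consult an RCP object, and all class names (`Combiner`, `SumCombiner`, `MaxCombiner`, `Reconstructor`) are hypothetical. `std::unique_ptr` requires C++11.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Common interface for all combination algorithms.
class Combiner {
public:
  virtual ~Combiner() {}
  virtual double combine(double a, double b) const = 0;
};

class SumCombiner : public Combiner {
public:
  double combine(double a, double b) const { return a + b; }
};

class MaxCombiner : public Combiner {
public:
  double combine(double a, double b) const { return std::max(a, b); }
};

// The choice of algorithm is made once, at construction time.
// The name stands in for a parameter that would be read from RCP.
inline std::unique_ptr<Combiner> makeCombiner(const std::string& name) {
  if (name == "max") return std::unique_ptr<Combiner>(new MaxCombiner);
  return std::unique_ptr<Combiner>(new SumCombiner);
}

// Holds a pointer to the base class and calls it while processing.
class Reconstructor {
public:
  explicit Reconstructor(const std::string& combinerName)
      : combiner_(makeCombiner(combinerName)) {}

  double process(const std::vector<double>& energies) const {
    double result = 0.0;
    for (std::size_t i = 0; i < energies.size(); ++i)
      result = combiner_->combine(result, energies[i]);
    return result;
  }

private:
  std::unique_ptr<Combiner> combiner_;
};
```

The same `Reconstructor` class yields different behaviour at run time without any change to `process`.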

42 Do I Need a New Package Class? If you are using the same inputs and making the same outputs, probably not; consider Strategy. Or just use another instance, with different parameters (fed in by RCP). If you are using different inputs, but making the same output, probably yes. If Strategy makes the task clearer, use it. If you are making different outputs, almost certainly yes.

43 Effective use of STL: Use iterators to indicate parts of collections, rather than partial copying. Use the right collection: consider vector first; list for mid-insertion; set for automatic ordering; map for arbitrary association; deque as a vector/list compromise. Use available chunk predicates to perform filtering and sorting. Learn the available algorithms.
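The first point, using an iterator pair to denote part of a collection instead of copying that part out, can be sketched like this. The function name `sumOfLeading` is hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <algorithm>
#include <vector>

// Sums the n largest values. The leading part of the collection is
// identified by an iterator pair; nothing is copied into a second container.
// Note that the input vector is reordered in place.
inline double sumOfLeading(std::vector<double>& pts, std::size_t n) {
  if (n > pts.size()) n = pts.size();
  typedef std::vector<double>::iterator Iter;
  Iter middle = pts.begin() + static_cast<std::ptrdiff_t>(n);
  // Move the n largest values, in decreasing order, to [begin, middle).
  std::partial_sort(pts.begin(), middle, pts.end(), std::greater<double>());
  return std::accumulate(pts.begin(), middle, 0.0);
}
```

`std::partial_sort` and `std::accumulate` are examples of the "available algorithms" worth learning: both accept any iterator range, so the same call works on a whole container or a slice of it.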

44 STL Example
    #include <deque>      // for deque
    #include <algorithm>  // for copy_if, sort, for_each
    #include <functional> // for mem_fun_ref
    #include <iterator>   // for back_inserter
    using namespace std;

    void printInterestingStuff(View<Track>& tracks) {
      // Select tracks with transverse momentum > 5.
      deque<Track> good;
      copy_if(tracks.begin(), tracks.end(), back_inserter(good), high_pt(5.));
      // Sort on pt, in decreasing order
      sort(good.rbegin(), good.rend(), less_pt());
      // Print summary
      for_each(good.begin(), good.end(), mem_fun_ref(&Track::print));
    }

45 Maintaining a System: Techniques to help your code be maintainable

46 Best Practice: Do not put using namespace in any header. Use forward declarations wherever possible. Prefer ++it to it++. Do not expose details unnecessarily, e.g.: do hide bit encodings from users; use meaningful member names, rather than slot numbers in an array.
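Hiding a bit encoding behind named accessors might look like the sketch below. `HitStatus` and its two flags are hypothetical; the point is only that users call `isSaturated()` and never see masks or bit positions.

```cpp
// Hypothetical status word, for illustration only.
class HitStatus {
public:
  HitStatus() : bits_(0u) {}

  bool isSaturated() const { return (bits_ & kSaturated) != 0u; }
  bool isNoisy() const { return (bits_ & kNoisy) != 0u; }

  void setSaturated(bool on) { set(kSaturated, on); }
  void setNoisy(bool on) { set(kNoisy, on); }

private:
  // The encoding is private, so it can change without touching user code.
  enum { kSaturated = 1u << 0, kNoisy = 1u << 1 };

  void set(unsigned mask, bool on) {
    if (on) bits_ |= mask;
    else bits_ &= ~mask;
  }

  unsigned bits_;
};
```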

47 Writing Class Tests: Write small tests that (as fully as possible) exercise your class; use minimal extraneous stuff. Great aid to regression testing. Great way to ensure that your code is not the source of a failure. Use Purify on your tests before requesting a release.