Control Sheet No. 22
- Proven methods for estimating the cost of a control system
- First Beam at iBNCT
- All’s FAIR in the City of Science
- On bugs
- The Picture Board
By: Bob Dalesio (Brookhaven National Laboratory)
Project planning for science facilities involves planning the schedule, material, and manpower required to successfully execute a project. In this article, we discuss the details of estimating the material and manpower required and the management of that risk.
Before coming to the science community, I worked as a project engineer building control systems for oil refineries, water distribution, chemical plants, and various Supervisory Control And Data Acquisition (SCADA) and Distributed Control Systems (DCS). The control system and I/O were developed in our company, so all of the software and hardware was proprietary. As a project engineer, I worked with the salesmen to prepare estimates for control projects. As a rule of thumb, the bids were between 7% and 10% of the overall cost of the facility. If the bid was successful, my team would build the control system through commissioning at the facility. The facilities we were building were not prototype facilities. They were well-understood plants whose proposals included Piping and Instrumentation Diagrams. The types, counts, and locations of all I/O were well defined. All bids were largely considered on a “low bidder” basis. Every proposal that left our facility was about 10% below cost. After the contract was awarded, all contract changes were priced at a hefty profit margin with the goal of moving the value of the contract to a 10% profit. Even these well-defined, well-understood projects invariably issued change requests. The major profit driver in this business was the maintenance contract, a 10% per year contract to provide 24-hour response to the facility for any control system problems.
In the science community, our control engineers build one-off control systems that are only fully understood two years after commissioning. The range of costs for a control system is between 5% and 11% of the cost of the facility (not including the conventional construction cost). The largest variable in comparing these facilities is understanding what is included in the control system. When global systems like timing, network, control computers, and operator workstations are the full scope, the estimate comes closer to 5% of the facility. If the instrumentation for all industrial and high-frequency analog electronics is included, it moves to 11%. This “rule of thumb” has held pretty well in my experience in project management in the accelerator community. If an estimate falls outside of these bounds, it typically has missing scope or contains extraordinary costs.
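The rule of thumb above can be written as a quick sanity check. The following is a minimal sketch: the function name and the example figures are hypothetical, and only the 5% and 11% bounds come from the text.

```python
def check_controls_estimate(controls_cost, facility_cost):
    """Compare a control system estimate with the 5%-11% rule of thumb.

    facility_cost is assumed to exclude conventional construction,
    as stated in the article.
    """
    low, high = 0.05, 0.11  # bounds quoted in the article
    fraction = controls_cost / facility_cost
    if low <= fraction <= high:
        verdict = "within range"
    else:
        verdict = "outside range: look for missing scope or extraordinary costs"
    return fraction, verdict


# Hypothetical figures in arbitrary currency units.
fraction, verdict = check_controls_estimate(controls_cost=30.0, facility_cost=400.0)
print(f"{fraction:.1%} of facility cost, {verdict}")
```

Where an estimate lands inside the range depends on scope: global systems only sits near the lower bound, while including all instrumentation electronics moves it towards the upper one.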
Bottom-up estimates are also a reasonable way to verify the working estimates. From a bottom-up perspective, we look at the signals that are being integrated. There are typically several hundred device types, each with several dozen (e.g. power supplies) to a couple of hundred signals (e.g. LLRF). There is some understanding of the number of device instances, so an estimate of the number of signals is obtainable. The formula that was used in industrial control was 15 minutes for each digital signal and 1 hour for each analog signal. This leaves out all of the interesting instrumentation in our community. Numbers I have seen for more complex instruments include 1 day for each motor and 1 week for each diagnostics device that has an existing driver. All new developments must be added to this, so it is important to have some understanding of the interfaces that are being specified. This understanding is unlikely to be complete. Considering 6 weeks per driver and 10 new drivers per project gives some room for implementing new instrumentation interfaces. Once this exercise is done, multiply the number by pi (for its scientific value, and for the habitual underestimation of device count in these one-off facilities). The numbers here only cover the implementation, integration, and testing of all of the hardware. You have to add network and computing estimates to this number.
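The bottom-up arithmetic can be sketched in a few lines. The per-item effort figures and the factor of pi are taken from the text. The 8-hour day, the 5-day week and all item counts except the 10 new drivers are assumptions made purely for illustration.

```python
import math

HOURS_PER_DAY = 8   # assumption: not stated in the article
DAYS_PER_WEEK = 5   # assumption: not stated in the article
HOURS_PER_WEEK = HOURS_PER_DAY * DAYS_PER_WEEK

# Effort per item in hours, using the figures quoted in the text.
HOURS_PER_ITEM = {
    "digital_signal": 0.25,                   # 15 minutes
    "analog_signal": 1.0,                     # 1 hour
    "motor": 1 * HOURS_PER_DAY,               # 1 day
    "diagnostic_device": 1 * HOURS_PER_WEEK,  # 1 week, existing driver
    "new_driver": 6 * HOURS_PER_WEEK,         # 6 weeks
}


def bottom_up_hours(counts):
    """Return (raw_hours, padded_hours) for a dict of item counts."""
    raw = sum(HOURS_PER_ITEM[item] * n for item, n in counts.items())
    return raw, raw * math.pi  # pi covers the habitual undercount of devices


# Hypothetical counts; only the 10 new drivers comes from the text.
counts = {
    "digital_signal": 20000,
    "analog_signal": 10000,
    "motor": 300,
    "diagnostic_device": 100,
    "new_driver": 10,
}
raw, padded = bottom_up_hours(counts)
print(f"raw: {raw:.0f} h, after multiplying by pi: {padded:.0f} h")
```

The result covers only implementation, integration and testing of the hardware, so network and computing estimates still have to be added on top.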
A third approach to estimating the labor needed on these projects is to consider that the primary job of the control engineers is to provide system implementation and integration. When each of the subsystems achieves its preliminary design, the control system engineers need to be fully engaged in that subsystem. In this approach, count the number of bodies that are needed to cover each subsystem. For complex systems (RF, Diagnostics, Physics), two full-time engineers who are embedded in the subsystem groups through the life of the project are required to understand what is being built and to help guide how best to integrate it into the control system. One engineer is needed in each of the mechanical groups and in facility control. In addition to the project engineering, teams are required for global systems (network and timing) and for tool development and support. Having an overlap of interest and responsibility helps to ensure that changing requirements are captured. More importantly, the embedded control engineer brings knowledge of how this subsystem fits into the overall system. Most facilities have their difficulties in two areas: proper integration of the subsystems into the overall facility, and improper specification of requirements to cover the entire life of the project, especially in identifying the machine modes for commissioning. As an aside, it is my experience that the quality of the relationship between the control system engineer and the subsystem team is a good indicator of the quality of the subsystem. If the control engineer is included as an integral member of the subsystem team, the results will be significantly better than when working with a team that does not see the control engineer as part of the team. As the control budget is heavily weighted to labor, this gives the largest portion of the estimate. In this approach, the cost of the equipment must be added.
Using these three approaches to cost estimates typically yields estimates in the same general area. The next challenge is to meet the estimates. Anyone who has worked in our field is painfully aware that productivity is a huge variable. The other significant variable is the scope of what is delivered. A control system can be interpreted as simply connecting all of the instrumentation to talk to the equipment. Many approach the control system as an engineering activity that is required to deliver what is specified by the physicists and scientists. In my experience, construction projects have no overall engineer who is tasked with system integration, and the groups that could do this are the physics group, the control group, and the operations group. It is best if all three of these groups work from the beginning to provide this functionality. Identifying the risks and challenges and making a plan to mitigate these risks is required early in the project to execute the project within budget. It is also important to identify changes in scope that occur during the project in a timely fashion. As engineers, it is our nature to solve problems. With a reasonable budget and a good engineering staff, the control system can ensure a smooth commissioning period.
ABOUT THE AUTHOR
Bob Dalesio joined the NSLS-II project at BNL in January 2008 as the Control Systems Group Leader and is responsible for the implementation of all NSLS-II subsystems data acquisition, control, automation and integration into a global control system. Bob has worked in process control systems since 1981 and joined the accelerator control community at Los Alamos National Laboratory in 1985, where his first project was the Ground Test Accelerator. This project launched a tool-kit based control system for accelerators that later became known as EPICS. Bob has been involved in the application of this technology to a wide range of accelerators and other experimental facilities. Most recently, he led the SNS LINAC control system and the LCLS control system. Bob graduated from the University of Maryland with a bachelor’s in computer science.
By: Tilen Žagar and Takashi Nakamoto (Cosylab Japan)
In the last issue of Control Sheet, we introduced Cosylab China, but as most of you might know, Cosylab has another branch on that side of the world, Cosylab Japan. If you ever meet anyone from Cosylab Japan, you will find that we generally talk about iBNCT without really bothering to explain what it is. We will try to remedy that situation here.
iBNCT [2,3] (not the official name of the project) is an abbreviation for the accelerator-based Boron Neutron Cancer Therapy (BNCT) facility, located in the Ibaraki prefecture (hence the “i”) in Japan. iBNCT is being constructed around 120 km northeast of Tokyo in Tōkai-Mura, a small village (at least by Japanese standards, where 40 000 people are barely worth mentioning) on the coast of the Pacific Ocean, and is based at the Ibaraki Neutron Medical Research Center. It is also within a stone’s throw of J-PARC, one of Japan’s largest accelerators.
Boron Neutron Cancer Therapy
The basic idea behind BNCT is to inject a cancer patient with a boron-containing drug. Special care is taken so that the drug mostly accumulates in the cancer cells. Boron has a high neutron cross section, a property that is important for the next step, when the patient is irradiated with epithermal neutrons. The neutrons are absorbed by the boron in the cancer cells, which then decays into alpha and lithium particles (Figure 1). Both have a short range (4-9 µm, approximately the size of a cell) and are therefore lethal only to the boron-containing cell. If the drug is distributed mostly to the cancer cells, damage to healthy tissue is therefore minimal.
Nuclear reactors are a good source of epithermal neutrons, but they are not the best candidates for routine treatment. Nuclear reactors are burdened by regulations and inspections, are hard to handle and complicated to maintain, and suffer from negative public opinion. Accelerator-based neutron sources avoid most of these problems. They are easy to operate, can be quite compact and therefore installed in or near hospitals, and are less restricted by regulations.
iBNCT is such an accelerator-based neutron source. The team behind the project is quite diverse, containing both academic and government organizations (Tsukuba University, KEK, JAEA, the Ibaraki Prefectural Government, Hokkaido University) as well as industry partners (Mitsubishi Heavy Industries, NAT, ATOX, Cosylab and others).
Together they are building an 8 MeV, 80 kW linear accelerator that will accelerate protons into a beryllium target to create neutrons. Protons will be accelerated in 1 ms long pulses with a 200 Hz repetition rate, so the duty cycle will be as high as 20%! Combine this with a 50 mA peak beam current and the fact that the entire accelerator fits into an area of less than 200 m², and the result is an impressive machine.
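The quoted beam parameters are mutually consistent, as a short calculation shows. This sketch assumes that the 80 kW figure refers to the average beam power. The variable names are only illustrative.

```python
pulse_length_s = 1e-3      # 1 ms long pulses
repetition_rate_hz = 200   # 200 Hz repetition rate
peak_current_a = 50e-3     # 50 mA peak beam current
beam_energy_ev = 8e6       # 8 MeV protons

# Fraction of time during which beam is present.
duty_cycle = pulse_length_s * repetition_rate_hz

# Average current is the peak current scaled by the duty cycle.
average_current_a = peak_current_a * duty_cycle

# For singly charged particles such as protons,
# power in watts = energy in eV x current in amperes.
average_power_w = beam_energy_ev * average_current_a

print(f"duty cycle: {duty_cycle:.0%}")
print(f"average current: {average_current_a * 1e3:.0f} mA")
print(f"average beam power: {average_power_w / 1e3:.0f} kW")
```

The calculation gives a 20% duty cycle, a 10 mA average current and 80 kW of average beam power, matching the figures in the text.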
So, where does Cosylab fit into the picture? For starters, we are in charge of the control system: from the RF to the ion source, vacuum, magnets, beam profiling, timing system, machine protection system and data archiving. The technology of choice for this project is EPICS. Most of the IOCs run on Yokogawa PLCs with F3RP61 PowerPC modules. We are also coding Ladder logic for the sequence CPU part of the PLC and we developed the software for the MRF-based timing system, using Nominal Device Support (NDS). Finally, all the operator screens are created in Control System Studio (CSS), where Python scripts help us to extend its functionality.
Furthermore, with our expert knowledge of physics we are also involved, e.g., in beam commissioning, design of the ion source with LEBT (Low Energy Beam Transport), and beam profile monitors.
The first milestone
The first significant milestone was reached last year when we got the first beam from the accelerator, which, amongst other things, showed that the control system works as intended. Currently the accelerator is being upgraded in preparation for the next phase, high-current beam commissioning.
Furthermore, iBNCT is an important milestone in itself for the establishment of the Cosylab Japan branch. It is a large project with an important contribution from our side, and we are eagerly looking forward to building on this reference and taking on many more projects in Japan and neighboring countries.
Figure 1: The basics of Boron Neutron Cancer Therapy (BNCT)
Figure 2: The Radio-Frequency Quadrupole (RFQ) and Drift Tube Linac (DTL).
Figure 3: Operator screens in all their glory.
- Cosylab Japan (http://www.cosylab.jp/)
ABOUT THE AUTHORS
Tilen Žagar, Slovenian, joined Cosylab in Ljubljana, Slovenia in 2012 and moved to Japan in 2014. Tilen is a software developer by training and has a soft spot for programming PLCs. In his free time he enjoys surfing, snowboarding and playing video games.
Takashi Nakamoto, Japanese, worked at Cosylab for 8 months as a Vulcanus in Europe student in 2009 - 2010 and later joined Cosylab Japan as the first full-time engineer. His professional skills focus on software, device integration and beam diagnostics. He likes to spend his free-time hitting balls in a batting cage.
By: Gregor Čuk (Cosylab)
Darmstadt, in the State of Hesse in Germany, holds the title of City of Science in recognition of the contributions and importance of the academic and research institutes that are based there. The Facility for Antiproton and Ion Research (FAIR), currently under construction on the GSI Helmholtzzentrum für Schwerionenforschung GmbH site, will add further weight to the title.
FAIR will extend GSI’s current capabilities in heavy-ion research by providing anti-proton, ion, and rare isotope beams. Ultimately, the full GSI/FAIR accelerator complex will consist of eight rings (accelerators, storage rings), up to a circumference of 1 100 m, two linear accelerators and about 3.5 km of beam lines including two high-intensity secondary beam production sections. The existing GSI accelerators will serve as the injectors for the new FAIR machines. [2, 3, 4]
Research at FAIR falls into four categories:
- Atomic, Plasma Physics and Applications (APPA)
- Compressed Baryonic Matter (CBM)
- Nuclear Structure, Astrophysics and Reactions (NUSTAR)
- antiProton ANnihilation at DArmstadt (PANDA)
FAIR will be built with the cooperation of an international partnership consisting of 10 members, under the FAIR Convention which formally came into force in March 2014. The bulk of the construction cost will be covered by Germany together with the State of Hesse, while the remaining partners (Finland, France, India, Poland, Romania, Russia, Slovenia, Sweden and the United Kingdom) will cover about 30% of the cost of the construction. Contributions by the members will be either in-kind or in-cash. In-kind contributors are fully responsible for implementation, delivery and commissioning. [1, 2, 5]
About the Control System
The FAIR control system consists of all hardware and software required to control, commission and operate the FAIR facility. It draws on collaborations with CERN and uses proven framework solutions like Front-End Software Architecture (FESA), LHC Software Architecture (LSA) and White Rabbit in a standard 3-layer architecture. 
The bulk of the design and development of FAIR’s accelerator control system (ACS) will be done by the GSI Controls group, as the FAIR host institute, but about 20% of the ACS cost will be provided as an in-kind contribution by the member state of Slovenia. This requires a high level of communication and familiarity with the general ACS architecture and the GSI development team. As a result, clearly defined ACS subprojects have been mutually agreed, and the permanent presence of a Slovenian project manager/developer on-site at GSI has contributed to the success of the development model.
Cosylab is a partner of the Slovenian consortium, Tehnodrom, which is providing services for Controls and Diagnostics. Cosylab is the consortium lead for the Controls aspect and, together with 5 other partners, will contribute to various aspects of the control system development. Progress has been made on many of the main work packages.
Equipment FESA Device Classes
Work has been completed on some FESA device classes. For the mass flow controller, magnetron and impedance adapter device classes, we are waiting for the proton linac ion source to be ready at CEA Saclay in France so that these can be tested. Work will continue with integrating beam position monitors for different machines, LLRFs, the closed-orbit feedback system and spectrometers (RGAs), amongst others.
Development work on the Alarm System is completed and site acceptance tests are about to start.
Industrial Type FEC Systems
This work package consists of the delivery of customised front-end controllers for controlling serial devices and a motion controller solution. For serial device control, RS-485, RS-422, RS-232 and IEEE-488.2 GPIB interfaces are supported. For motion control, we developed a customised solution made up of a control unit and motor drive unit and have completed acceptance testing for the FESA class for this device.
FAIR will use a White Rabbit-based timing system and Cosylab will develop and produce 3 form factors (µTCA, VME and PMC) and part of the FPGA gateware. The first priority is the µTCA board because it is needed for the BPMs in the SIS 100, one of the 2 accelerator rings that make up the double synchrotron.
Vacuum Control System
The FAIR vacuum system will have several thousand signals that will be connected to distributed PLC systems. Therefore, an automated way of PLC software generation is desired and for this purpose the UNICOS framework was selected. The proof of concept has already been tested on a vacuum test stand.
The Interlock System will be a PLC-based system with a required response time of 100 ms. The system will be developed in two phases and Cosylab is providing the first phase, which will be installed and commissioned at CRYRING. Currently, factory acceptance tests are underway.
Cosylab will also deliver central control system services such as the Archiving System, the Beam Transmission Control System and the Post Mortem System. Work for these systems is in the requirements gathering stage.
FAIR is characterised by various technological innovations, which justify the high expectations for beam intensities, beam quality, beam energies, beam power and parallel operation. These will allow us to further mankind’s understanding of the fundamental structure of matter and to find more answers to how the universe has evolved.
A model of FAIR (right) overlaid on a photograph of the GSI site. (Photograph: ion42 for FAIR)
Construction of FAIR has started. (Photograph: Jan Schaefer for FAIR)
- FAIR (http://www.fair-center.eu/)
- FAIR on Wikipedia (http://en.wikipedia.org/wiki/Facility_for_Antiproton_and_Ion_Research)
- GSI (https://www.gsi.de/en/research/fair.htm)
- Managing the FAIR Control System Development, Ralph C. Baer, Frederic Ameil, PCaPAC 2014, Karlsruhe, Germany (http://jacow.web.psi.ch/conf/pcapac2014/prepress/TCO201.PDF)
- The Control System for the FAIR Facility – Project Status and Design Overview, Ralph C. Bär, ICALEPCS 2011, Grenoble, France. (https://accelconf.web.cern.ch/accelconf/icalepcs2011/talks/frcaust01_talk.pdf)
- The FAIR Control System – System Architecture and First Implementations, Ralf Huhmann, Ralph C. Bär, Dietrich Hans Beck, Jutta Fitzek, Günther Fröhlich, Ludwig Hechler, Udo Krause, Matthias Thieme, ICALEPCS2013, San Francisco, USA. (http://accelconf.web.cern.ch/AccelConf/ICALEPCS2013/papers/moppc097.pdf)
ABOUT THE AUTHOR
Gregor Čuk joined Cosylab in 2010 as a senior software developer, and then took on the roles of software and systems architect and is now project leader for the FAIR project. Before joining Cosylab, Gregor worked for more than 10 years at a telecommunications company, where he started as a software developer, continued as project manager and later as a product manager for broadband access products. In his free time, Gregor enjoys traveling, hiking, cycling and photography.
By: Luka Šepetavc (Cosylab)
Nothing in life is certain but death and taxes …and bugs!
In any development, both hardware and software, bugs are inevitable. According to Wikipedia, a bug is an error, flaw, failure, or fault in the software or hardware that yields an incorrect or unexpected result or behavior. Bugs generally arise because of the human element, which unintentionally introduces mistakes or errors during the implementation phase of the project. Typically, the complexity of the project affects two things: the number of bugs and the effort required to resolve them.
Bugs and debugging are dreaded by software and hardware engineers and more so by the users who have to deal with their consequences. Developers accept bugs as an inherent part of the development cycle, while users understandably don’t!
Hardware and software engineering are mature engineering disciplines that accept the presence of bugs in projects as a fact of life.
Some of the most demanding industries, where the tolerance to bugs is lowest, are the avionic and aerospace industries. Their top priorities are safety and availability. To achieve this, they apply some of the most stringent development life-cycles and development standards out there (e.g. DO-178C, DO-254, NASA-STD and MIL-STD families of standards), but this does not reduce the bug count to zero.
It is more economical and effective to plan for their appearance and removal than to attempt a perfect project with zero bugs. In project planning, this step is often omitted or significantly underestimated, both in terms of the consequences of bugs and in terms of the time and effort required for their resolution.
When it’s not a bug
Before we see how to get bugs under control, it is important to understand when certain system misbehaviors are not actually bugs.
When a system does not perform as expected, often the first thought through the user’s mind is “It’s a bug”, and the software or hardware engineer is blamed for the problem. As users, we never consider whether the expected behaviour was even specified in the requirements.
Just like the end-product, the written requirements of a large project are not perfect. They contain omissions, contradictions and mistakes. As a result, the system exhibits certain “undesirable” behavior, but analysis reveals that it was “as specified”, or that the product owner has changed their mind since the requirements were written. If these shortcomings of the software are correctly recognized as “change requests” and not as bugs, the complex change management of the project can stay under control. If this crucial distinction is not understood by all stakeholders, it can put a serious strain on the cooperation and on the success of the project!
An example of an “omission” is something that might appear obvious. Or at least it seems obvious near the end of the project. As a rule of thumb, if something is important, it makes sense to write it down.
Studies show that the majority of safety-related bugs in safety-critical systems are caused by “bad” requirements. The seeds of system misbehaviour are planted when requirements are not gathered properly or when they are not understood and carefully reviewed by all stakeholders.
Debugging & The Project Life-cycle
Therefore, the first thing that should be “debugged” is the requirements. By applying careful verification (reviews, testing) during the earlier stages of the project, much can be done to lower the number of bugs that pop up later in the life-cycle and are then much more costly to resolve. Applying a higher level of rigour may at first seem more costly and may seem to take more time but in the end it produces a higher quality system that contains fewer bugs, is easier to maintain and the overall “cost of ownership” is actually lower when considering its complete lifetime.
So as a product-owner you should expect to be involved in the change management process. Somewhere in the middle of the project you will find yourself sitting down with the development team going through a (sometimes long) list of recorded “issues”, which needs to be sifted and sorted.
Some will be recognized as coding mistakes, others will be labelled as “change requests”. All will be sorted by importance and addressed by priority. This is a painstaking process, but its value in keeping a large software project on track is immeasurable.
Plan for Debugging
To keep bugs under control, i.e. to reduce their number and resolve them efficiently, it is imperative to carefully plan the system life-cycle: plan for proper system definition, plan for testing, plan for bugs and plan for debugging. The effort, time and processes required to develop a high quality system are often severely underestimated. Software and hardware development are not just coding and soldering. These are only one step of many: requirements, architecture, design, implementation, review, testing, validation. If you skip, or “skimp” on, any of the steps, you lose control of your system and start opening a can of worms. The consequences can be severe.
There are also some specific areas of attention regarding bugs: find them early, resolve them effectively and prevent them from re-appearing.
Finding Bugs Early
Since bugs are inevitable, despite the level of rigour applied in a project, what can be addressed is how many are produced, how many are discovered and in what stages of the project they are discovered.
The fact is that the cost of bugs rises exponentially through the project lifetime. So ideally the number of bugs as a function of time should be constantly dropping. In practice this is very difficult to achieve because the further in the project life-cycle we get:
- the more components are integrated together,
- the more complex the system gets,
- the more use cases are exercised,
- some deficiencies are discovered only with the use of the final system.
These make it very easy to discover new deficiencies as the project nears its end.
When it comes to debugging, the main steps are to identify the cause of the bug, carefully devise a solution, implement the changes, verify that the bug has been fixed and that no new bugs have been introduced.
Identify the cause
- Start with the basics: read the manuals, check the cables.
- Make use of available diagnostics: measure and check the voltage of signals using test points, go through the log files, check the console output. If diagnostics are unavailable, make sure they are included in your next project!
- Make the system fail repeatedly. Find the exact conditions under which the bug is consistently reproduced. There is no such thing as a glitch (a one-time event) or a random fault; there is always a reason why the system doesn’t behave as expected.
- Change one thing at a time! Don’t fall into the trap of changing multiple things at the same time and expecting useful results.
- After every change, write down the results. It is possible to try out tens or hundreds of different scenarios per day chasing a bug. In a week or a month you won’t remember all the steps you have tried and what their results were.
- Versioning. For each test, establish and write down the exact version of software and hardware used, versions of the libraries, operating system etc. What exact versions are used in a system which exhibits a failure? Use exactly the same versions to reproduce the bug.
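The last two points, writing down the result of every change and recording exact versions, are easy to automate. Below is a minimal sketch using only the Python standard library. The function names, the log file name and the record layout are hypothetical choices for illustration and are not part of any particular tool.

```python
import json
import platform
import sys
from datetime import datetime, timezone
from importlib import metadata


def snapshot_versions(packages):
    """Collect versions of the interpreter, the OS and selected packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "python": sys.version.split()[0],
        "os": platform.platform(),
        "packages": versions,
    }


def log_test_run(change, result, packages, logfile="debug_log.jsonl"):
    """Append one record per test: what was changed, what happened, which versions."""
    record = snapshot_versions(packages)
    record.update({"change": change, "result": result})
    with open(logfile, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

A call such as `log_test_run("swapped cable on channel 3", "fault still present", ["numpy"])` appends one line to the log, so that a week later the exact sequence of changes and the versions in use can still be reconstructed. Hardware and firmware versions would have to be added by hand or read from the device.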
Devise a solution
- Once the cause of the bug is found, think before fixing it.
- Understand the system and take care with how the fix will fit into its architecture & design.
Implement the changes
- Apart from “simply” changing the code or re-spinning a PCB, make sure to document the change(s). If not elsewhere, at least in the versioning system and release notes.
Verify the fix
- Repeat all tests or a set of tests that will verify that the fix has resolved the bug and that no new bugs have been introduced.
- The quality of this step greatly depends on the quality of regression tests. If the tests are vague and don’t put the system through all its paces, the results will be inconclusive.
- Top tip: After a new bug has been fixed, add a test case to the list of your regression tests which is able to identify this bug in case it reappears in the future.
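As an illustration of the top tip, here is what such a regression test might look like. The function and the bug are invented for this sketch: imagine that an earlier version of `counts_to_volts` used `<` instead of `<=` and therefore rejected a full-scale ADC reading.

```python
def counts_to_volts(counts, full_scale_counts=65535, full_scale_volts=10.0):
    """Convert a raw ADC reading to volts (hypothetical example function)."""
    if not 0 <= counts <= full_scale_counts:
        raise ValueError(f"counts out of range: {counts}")
    return counts * full_scale_volts / full_scale_counts


def test_full_scale_reading_is_accepted():
    # Regression test for the hypothetical bug: the maximum reading
    # must be converted, not rejected. Fails if the bug reappears.
    assert abs(counts_to_volts(65535) - 10.0) < 1e-9


def test_out_of_range_reading_is_rejected():
    # The fix must not have loosened the range check.
    try:
        counts_to_volts(65536)
    except ValueError:
        return
    raise AssertionError("out-of-range reading was not rejected")
```

Test functions written this way can be collected by a test runner such as pytest or simply called directly. The second test is there because a fix that makes the first test pass could easily break the range check it touched.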
One more step...
- Learn from your mistakes.
When the battle is over, take some time to analyze the bugs and think about how to prevent them in the future. Make a plan and take actions, e.g. update your coding guidelines or actually start using them.
While dealing with bugs in a disciplined way is not the most fun part of the development cycle, it is an indispensable part. However, great is the joy and satisfaction when a particularly nasty specimen of the order Hemiptera has been permanently eradicated from your system!
* Editor’s Note: This article went through a disciplined verification process as even writing is not immune to bugs :).
The time to fix bugs and the related cost grow exponentially as the development progresses.
REFERENCES & RECOMMENDED READING
- N. Leveson, Safeware: System Safety and Computers
- Philip Koopman, Better Embedded System Software
- Leanna Rierson, Developing Safety-Critical Software: A Practical Guide for Aviation Software and DO-178C Compliance
- David J Agans, Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems
- Jack Ganssle, Articles About Embedded Systems, http://www.ganssle.com/articles-subj.htm
ABOUT THE AUTHOR
Luka Šepetavc joined Cosylab in 2009 as an FPGA and hardware developer and soon continued as a software developer on a medical real-time operating system. Later he took over the system architecture, design and project management of a real-time power supply control system for a medical accelerator. Presently he’s designing and managing development of safety-critical medical systems. He likes to read, learn about new ideas and taste new food.
What happens when Cosylab lets its hair down... These are some of the scenes from our end of 2014 party :)
Live long and prosper
The challenges of controlling a party game
80’s Space flash-back!
Do you have a Cosylab T-shirt? We want to see you, your husband or wife, children, cats, dogs, goldfish ;) in the T-shirt. We will not only send you another T-shirt (there are many designs!), but we will also publish your picture in the next issue of Control Sheet.
Send pictures to firstname.lastname@example.org