Control Sheet No. 6
In this issue:
- On manpower for new projects
- What is really important about control systems
- Power-grid synchronization @ SNS; or how to keep power electronics and chopper rotors happy
We probably all agree that the nicest projects to develop and manage are green-field projects: you get a new budget, a new team, and can choose the newest technologies. But most new accelerator projects, alas, happen in existing laboratories. I have recently come across a few interesting problems that I would like to share with you, where control groups were asked to take on new projects in addition to running an existing machine.
First and foremost – the control group must make sure that the availability of the existing facility is excellent. Some labs target 95%, some even 99% for the whole system, which is very, very good for an accelerator. Such high availability is by itself proof that the CS group is doing something right, either through the technologies it applies, through its work processes, or through a combination of both. This good performance most likely comes at a higher price in manpower and hardware, i.e. enough people and reliable (and expensive) components.
Now management wants to save money, in particular on the new project, which has been under-budgeted from the start, just to get the approval from the funding agency. There are two simple approaches – at least they look simple and are therefore the favorites of the management:
- build the new project with the existing team,
- build the new project with cheaper hardware than currently used.
The arguments for the two approaches go something like this:
- the new project has essentially the same software and functionality, so the group can just replicate whatever is running now,
- surely new technology is cheaper now, so let’s switch to a new technology, just as we switch to a new PC.
As you might expect, both of the approaches are wrong. Wrong, because the goal should not be to save money, but to get the required performance at the best possible price. By the way, this reminds me of the old aphorism: “we have to find savings, no matter the cost”.
So, what am I talking about? Management should be able to determine a lower limit on the needed accelerator and CS availability and on service/support response times. Then, and only then, can it decide what cost it is prepared to bear for such performance.
OK, let’s assume that the management has done this exercise. Can it then use the two approaches mentioned above? Yes, but if it does the exercise correctly, it will find out the following:
As the CS group usually already has more responsibilities than other groups, because it covers a wide field of technologies and services and has to support all the other equipment groups, it appears obvious that the group cannot be stretched much more. Internal economies are surely possible, as in all labs, but definitely not enough to build another machine. Furthermore, no matter how outdated the technology might be, the control group is expert in exactly that technology. And more often than not, the cost of introducing a new technology lies in learning all the bugs and workarounds for what marketing promised but the product did not properly deliver. Consequently:
- if the project is to be finished with the existing personnel only, management has to be aware that it has to balance between a delay of the project and a reduction in existing services;
- if the price of the control system is to be reduced by changing the existing hardware platform, additional strain on human resources will be a consequence, amplifying the dilemma described in the previous bullet.
This can be considered a formal proof that additional hiring is needed in any case to finish the project. This will have to be accounted for somewhere, unless it can be absorbed by the lab staff budget. If not, there is still the possibility to call Cosylab, but this is already for the advertising section of this e-zine.
All the “sexy” technology and the dilemmas between EPICS/TANGO/ACS/TINE and VME/PCIe/PXI often let us forget that control systems are an engineering discipline like all the others, but with an even more complicated development cycle:
- Write specifications
- Prototyping – probably the only fun part
- Define test procedures
- Implementation (coding) – the only software part
- Writing documentation
- Testing (follow ISO procedures)
- Acceptance at customer
Don’t forget that even in-house control groups have customers: physicists and operators, who must be involved in the specification, testing and acceptance phases.
Think of it like this: in vacuum, a specific tube or chamber is just the result of much designing before and much testing after manufacturing. Likewise, programming and running programs is just a small step in the whole process, or so it should be. Often, programming is considered the key and only aspect apart from buying some hardware. The simple reason for this is that anybody can design and write at least simple programs, but not everybody can work with tools. Without becoming too philosophical, we conclude that control systems are the closest thing to pure procedures, for which our mind seems perfectly adapted and which allow, at least in principle, any mistake to be reversed at no apparent cost. The true cost, however, is very high: lost time, which in our modern society is increasingly the most precious commodity of all.
What a project must look for in the CS
The nearly religious discussions about all the nice features of individual control system packages more often than not obscure the items which a control system project really must take care of. Although not strictly part of the control system, they fall into the domain of the control system group and in reality form the largest part of its work:
Signal list: Some call it the golden or master list. Although this is such common sense that you may laugh at me now, I have yet to see a project where the signal list was not completed at the last minute, or actually after some of the development had already been done. The signal list really is a contract between the equipment specialist, the control expert and the operator. A contract that should be honored to the maximum extent. Changing something only because it is easy to change can have serious consequences later.
Signal names and general name conventions: a part of the signal list, but too obvious to be taken seriously and therefore often neglected. But the moment more than one person is involved, names must be unique and a good naming convention helps to keep it that way.
Alarm levels and operation limits: often left empty, because even the device expert does not know reasonably acceptable operation limits. This is later forgotten and only rediscovered many years into operation.
Configuration management: having procedures in place that deal with changes to the signal list, hardware, and software in such a way that all interdependencies are taken care of and that no number has to be changed in several places (otherwise it will be changed only partially and the whole CS becomes inconsistent).
Logistics of installations: equipment can’t be tested without the CS, and the CS can’t be tested without equipment. People often forget this, although all it takes is careful planning that involves both sides: the CS people and the device experts.
Bugs: To err is human, but for real crap, you need a computer. Seriously: it is normal that bugs happen, because the complexity of the software is just too great. One has to plan a lot of time for testing and fixing bugs. And one has to live with workarounds. Having said that, the number of bugs and the cost they have can and must be minimized with strict control of the development process.
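Several of the items above (unique names, a naming convention, alarm limits that are actually filled in) can be checked mechanically as long as the signal list lives in one place. The following is only a minimal sketch of such a check: the `Signal` fields, the `SYSTEM:DEVICE:Signal` naming pattern and the example entries are all invented for illustration and are not taken from any particular lab or control system package.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Hypothetical naming convention: SYSTEM:DEVICE:Signal
NAME_PATTERN = re.compile(r"^[A-Z0-9]+:[A-Z0-9]+:[A-Za-z0-9]+$")


@dataclass(frozen=True)
class Signal:
    """One row of the (hypothetical) master signal list."""
    name: str
    unit: str
    alarm_low: Optional[float] = None
    alarm_high: Optional[float] = None


def check_signal_list(signals):
    """Return a list of human-readable problems found in the signal list."""
    problems = []
    # Names must be unique across the whole list.
    counts = Counter(s.name for s in signals)
    for name, n in counts.items():
        if n > 1:
            problems.append(f"duplicate name: {name}")
    for s in signals:
        # Names must follow the agreed convention.
        if not NAME_PATTERN.match(s.name):
            problems.append(f"bad name: {s.name}")
        # Alarm limits must not be left empty or inverted.
        if s.alarm_low is None or s.alarm_high is None:
            problems.append(f"missing alarm limits: {s.name}")
        elif s.alarm_low >= s.alarm_high:
            problems.append(f"inverted alarm limits: {s.name}")
    return problems


# Invented example entries, including a duplicate and an incomplete row.
signals = [
    Signal("VAC:PUMP01:Pressure", "mbar", 0.0, 1e-6),
    Signal("VAC:PUMP01:Pressure", "mbar", 0.0, 1e-6),
    Signal("rf:cav1:power", "kW"),
]
problems = check_signal_list(signals)
for p in problems:
    print(p)
```

Run as part of every change to the list, a check like this is one cheap way to keep the "contract" honest; it does not replace the discussion with the device expert about what the limits should be.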
Problem description: in order to keep the high-power accelerator equipment and the power grid happy, the accelerator has to be operated synchronously with the power grid. One thing is to track the power-grid frequency, which drifts and jitters over time. To this we can say: peanuts, even with the changing energies of the beam. However, when rotating mechanical equipment gets inserted into the path of the beam, such as the neutron choppers at SNS @ ORNL, mechanical inertia makes the tracking mechanism more complex.
The SNS timing master controller is presently based on outdated technology and consists of multiple VME counters. Cosylab and SNS are cooperating on the upgrade of the SNS timing master controller, which will be implemented in a single Xilinx Virtex-5 FPGA. Development of the FPGA module for standalone mains tracking is only one portion of the whole project.
Production of the neutrons at SNS is cyclic, guided by the frequency of the power grid. It is the responsibility of the timing system controller to schedule beam actions correlated with the power grid. If the power-grid frequency drifts, the operation of the entire accelerator drifts with it. However, the tracking process has to be slow enough for the rotors to follow (inertia). Put in numbers, the power grid swings roughly between 59.7 and 60.3 Hz, but the rotors only allow the accelerator cycle frequency to change by at most 1 mHz/s.
Architecture: the approach to the problem can be decomposed into the following steps: a) synthesize an “artificial” mains frequency, b) measure the error between the two frequency signals, and c) use a regulator that acts on the error to modify the artificially synthesized frequency. Simple as a-b-c: the regulator tries to minimize the error (frequency and phase), hence keeping the operation of the accelerator synchronous to the power grid.
Frequency synthesis is achieved by means of direct digital synthesis (DDS); running this module at 100 MHz, a resolution finer than 50 µHz is achieved. The error measurement module takes two parameters into account: frequency and phase. Tests showed that both are needed for fast and stable tracking. When the frequency difference is large (>2 mHz), only a P-regulator acting on the frequency difference is used in the regulation loop. Once the frequencies have been brought closely enough together, a PD-regulator acting on the phase difference takes over, eliminating the remaining phase error.
The functionality of the regulation loop is fully parameterized and operates in tune with the changes of beam energy, thus “breathing” with the accumulator ring. Regulation-loop parameters were tuned in a simulated environment (a mains simulator developed with adjustable drift and jitter), and the system was tuned to stay stable and synchronized in worst-case scenarios.
Besides implementing the required functionality, we learned a couple of valuable lessons. There are times when you have to go back to theory and study it thoroughly, even in practical development. And if an FPGA FSM is misbehaving and you want to blame cosmic rays, think twice: for most practical purposes you can bet it is the developer’s fault.
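To get a feeling for these numbers, the short sketch below works out how long the accelerator cycle needs to follow the grid across its full range at the rotor limit, and shows what a plain rate limiter looks like. The function and constant names are made up for this illustration; only the 59.7 Hz, 60.3 Hz and 1 mHz/s figures come from the text.

```python
GRID_MIN_HZ = 59.7         # lower end of the grid swing (from the text)
GRID_MAX_HZ = 60.3         # upper end of the grid swing (from the text)
MAX_RATE_HZ_PER_S = 1e-3   # what the chopper rotors tolerate (from the text)


def limit_slew(current_hz, target_hz, max_rate_hz_per_s, dt_s):
    """Move current_hz toward target_hz without exceeding the allowed rate."""
    max_step = max_rate_hz_per_s * dt_s
    step = target_hz - current_hz
    step = max(-max_step, min(max_step, step))
    return current_hz + step


# Time needed to cross the whole grid range at the maximum allowed rate.
full_swing_s = (GRID_MAX_HZ - GRID_MIN_HZ) / MAX_RATE_HZ_PER_S
print(f"worst-case time to cross the full range: {full_swing_s:.0f} s")

# Ten one-second steps from 60.0 Hz toward the upper end of the range.
f = 60.0
for _ in range(10):
    f = limit_slew(f, GRID_MAX_HZ, MAX_RATE_HZ_PER_S, dt_s=1.0)
print(f"after 10 s: {f:.3f} Hz")
```

In other words, crossing the full 0.6 Hz range takes about 600 s, i.e. ten minutes, which is why the tracking cannot simply copy the instantaneous grid frequency.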
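The resolution figure can be checked with the textbook DDS relation: a phase accumulator of N bits clocked at f_clk produces f_out = M · f_clk / 2^N for a tuning word M, so the smallest frequency step is f_clk / 2^N. The sketch below applies that relation; the accumulator width of 41 bits is purely an assumption chosen to reproduce the quoted resolution, as the actual width used in the FPGA is not stated here.

```python
F_CLK_HZ = 100e6   # DDS clock rate (from the text)
ACC_BITS = 41      # ASSUMED accumulator width, not given in the text


def dds_resolution_hz(f_clk_hz, acc_bits):
    """Smallest frequency step of an ideal DDS phase accumulator."""
    return f_clk_hz / (1 << acc_bits)


def tuning_word(f_out_hz, f_clk_hz, acc_bits):
    """Integer phase increment per clock tick closest to the wanted output."""
    return round(f_out_hz * (1 << acc_bits) / f_clk_hz)


def synthesized_hz(word, f_clk_hz, acc_bits):
    """Output frequency actually produced by a given tuning word."""
    return word * f_clk_hz / (1 << acc_bits)


res = dds_resolution_hz(F_CLK_HZ, ACC_BITS)
word = tuning_word(60.0, F_CLK_HZ, ACC_BITS)
f_actual = synthesized_hz(word, F_CLK_HZ, ACC_BITS)

print(f"resolution: {res * 1e6:.2f} uHz")
print(f"tuning word for 60 Hz: {word}")
print(f"synthesis error at 60 Hz: {(f_actual - 60.0) * 1e6:.2f} uHz")
```

Under this assumption, 41 bits give a step of about 45.5 µHz, while 40 bits would give about 90.9 µHz, so at 100 MHz at least 41 bits are needed to stay below 50 µHz.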
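A toy version of such a simulated environment fits in a few lines. The sketch below is not the SNS implementation: the gains, the time step, the choice to let the regulator command the rate of change of the synthesized frequency, and the simple clipping at the rotor limit are all assumptions made for illustration. The simulated mains is a plain 5 mHz step without drift or jitter, and the phase is counted in cycles without wrapping. Only the 2 mHz switch-over and the 1 mHz/s limit come from the text.

```python
MAX_RATE_HZ_PER_S = 1e-3     # rotor limit (from the text)
COARSE_THRESHOLD_HZ = 2e-3   # P / PD switch-over point (from the text)
KP_FREQ = 0.5                # ASSUMED gain of the P-regulator, 1/s
KP_PHASE = 0.01              # ASSUMED P gain on phase, Hz/s per cycle
KD_PHASE = 0.2               # ASSUMED D gain on phase, 1/s


def regulator(freq_err_hz, phase_err_cycles):
    """Return (mode, commanded rate of change of the synthesized frequency)."""
    if abs(freq_err_hz) > COARSE_THRESHOLD_HZ:
        mode = "coarse"                      # P-regulator on frequency only
        rate = KP_FREQ * freq_err_hz
    else:
        mode = "fine"                        # PD-regulator on phase
        # d(phase)/dt equals the frequency error, so it serves as the D term.
        rate = KP_PHASE * phase_err_cycles + KD_PHASE * freq_err_hz
    # Never ask the rotors for more than they can follow.
    rate = max(-MAX_RATE_HZ_PER_S, min(MAX_RATE_HZ_PER_S, rate))
    return mode, rate


def simulate(mains_hz, synth_hz, duration_s, dt_s=0.1):
    """Track a constant mains frequency; return final state and a mode log."""
    phase = 0.0        # accumulated phase error in cycles (not wrapped)
    max_step = 0.0     # largest frequency change applied in one time step
    modes = []
    for _ in range(round(duration_s / dt_s)):
        freq_err = mains_hz - synth_hz
        mode, rate = regulator(freq_err, phase)
        modes.append(mode)
        step = rate * dt_s
        max_step = max(max_step, abs(step))
        synth_hz += step
        phase += freq_err * dt_s
    return synth_hz, phase, max_step, modes


final_hz, final_phase, max_step, modes = simulate(60.005, 60.000, duration_s=300.0)
print(f"final frequency error: {60.005 - final_hz:.2e} Hz")
print(f"final phase error:     {final_phase:.2e} cycles")
print(f"time spent in coarse mode: {modes.count('coarse') * 0.1:.1f} s")
```

With these assumed gains the loop first closes the frequency gap at the rotor limit and then hands over to the phase regulator, which removes the phase error accumulated in the meantime. Adding drift and jitter to `mains_hz` inside the loop would be the natural next step for exploring worst-case behaviour.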