Coupling & Crosstalk: Quality, Meet Safety & Security!

Coupling & Crosstalk is my column in the MEPTEC Report. This column appears in the Winter 2016 edition on pages 8-9.

Electronic coupling is the transfer of energy from one circuit or medium to another. Sometimes it is intentional and sometimes not (crosstalk). I hope that this column, by mixing technology and general observations, is thought-provoking and “couples” with your thinking. Most of the time I will stick to technology, but occasional crosstalk diversions may deliver a message closer to home.

Quality, Meet Safety & Security!

What can be simpler to specify or install than a light bulb controlled by a wall switch? Over-engineered versions, especially when developed without engineers, can really cause you to lose sleep. However, the real nightmare is the danger of simple products being hijacked by the Internet.

What keeps me up at night? Many things… When I’m focused on a client’s product requirements, it is making sure they match true market needs. Plus, dozens of other non-work-related things like my children, the recent presidential election, the economy, … Thankfully, I easily fall asleep shortly after my mind starts to run down this very long list.

What actually kept me awake on my recent business trip? Poor product quality, improper system design, and a lack of planning for failure. At a brand-new hotel, the fancy multi-button bedside controls for the room lights and curtains failed in two different rooms. In the first room, the lights would intermittently not switch off. And in the second room, the multiple ceiling-mounted spotlights over the bed would not switch off – at all – when I attempted to go to sleep. There was a considerable delay at midnight to get a maintenance person to disconnect the wiring in the hidden electrical control panel. Unlike a desktop or floor lamp, these lights were too difficult to reach to unplug or unscrew without a ladder. I admit I was not thinking kind thoughts at that moment about the engineer who designed the system, the supplier of the system, or the company that installed it.

This may have been a quality control failure, or “escape”, at the supplier, at the time of installation, or both. Quality is essential for any product or service. It is critical to determine how to assess and control the quality of any manufacturing or delivery process. Without proper quality systems, one does not know if the process is out of control, which may result in big surprises later on. In the case of the hotel lights, it is clear that there was a quality failure – either in the physical hardware or during installation – since the system was clearly not operating as designed or, at least, as most “reasonable” users would expect. Perhaps there was a local tradition of punishing those who go to bed after 11 PM?

Luckily, enlightened product managers and designers do place quality on their list of product requirements. But do they always add safety and security? It depends on the perceived risk and complexity of the product in addition to regulatory requirements (if any). While most will agree that quality control is an essential cost – especially when in the long run it may save money – some trade off or fail to consider safety and security in order to reduce cost.

In the case of the hotel lighting system failure, there was clearly a lack of quality in the design process at either the control system or hotel design level, since the designers failed to consider the system failure modes. (Who knows if there were actual engineers on the design team?) [And yes, this is a recursive thought since there is both quality of the manufacturing and delivery process in addition to the quality of the design process itself.] Done properly, the system design would have included a Failure Mode and Effects Analysis (FMEA) to identify most, if not all, possible failure modes along with countermeasures to eliminate or reduce the impact of the failures. Clearly the system was not designed to fail in a “safe mode”.
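To make the FMEA idea concrete, here is a minimal sketch of how a design team might rank failure modes for the hotel room. It uses the Risk Priority Number (severity × occurrence × detection) commonly used in FMEA worksheets. The class and function names are hypothetical, and the 1-10 ratings are illustrative assumptions, not data from the actual system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    component: str
    effect: str
    severity: int    # 1 = negligible, 10 = hazardous
    occurrence: int  # 1 = very rare, 10 = almost certain
    detection: int   # 1 = always detected, 10 = cannot be detected
    countermeasure: str

    @property
    def rpn(self) -> int:
        """Risk Priority Number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


def rank_by_risk(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Illustrative ratings only; they are assumptions, not measurements.
HOTEL_ROOM_MODES = [
    FailureMode("wall switch", "one light cannot be switched",
                3, 4, 2, "second switch for the same function"),
    FailureMode("curtain motor", "curtain stuck in place",
                2, 3, 2, "allow the curtain to be moved by hand"),
    FailureMode("logic controller", "lights stuck on",
                7, 3, 8, "user-accessible power cut-off"),
]

if __name__ == "__main__":
    for mode in rank_by_risk(HOTEL_ROOM_MODES):
        print(f"{mode.rpn:4d}  {mode.component}: {mode.effect}"
              f" -> {mode.countermeasure}")
```

With these assumed ratings, the hung logic controller rises to the top of the list, which is the failure the hotel's design left the guest unable to handle.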

Failure of a switch? Use another switch for the same function. Failure of a curtain motor? Manually close the curtain. Failure of the logic control system? If the system could detect this, it should have turned itself off. But not every device can be “self-aware” enough to detect proper operation. This is where the designers failed to provide a way for the user – not the maintenance technician – to override the system. Yes, there was a bedside “Master” switch, but it was a master on/off “request” switch rather than one that actually removed power from the entire system. Requiring a screwdriver to remove the back panel of the closet to access the logic control system to unscrew wires is neither user-friendly nor the best design. This approach may have enabled the product or the system installation to hit the desired aesthetic and cost targets, but clearly it wasn’t designed for failure.
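The difference between a “request” switch and a true power cut-off can be sketched in a few lines. This is a toy model under stated assumptions, not the hotel's actual control logic: the class `RoomLighting` and its attributes are hypothetical names chosen for illustration.

```python
class RoomLighting:
    """Toy model of lights driven by a logic controller."""

    def __init__(self) -> None:
        self.mains_connected = True        # physical power feed to the system
        self.controller_responsive = True  # False models a hung controller
        self._output_latched_on = False    # last state the controller set

    @property
    def lights_lit(self) -> bool:
        # Lights need both mains power and a latched "on" output.
        return self.mains_connected and self._output_latched_on

    def request(self, on: bool) -> bool:
        """Soft 'Master' button: only asks the controller to act.

        Returns False when the request is ignored, which is what
        happens if the controller has hung.
        """
        if not (self.mains_connected and self.controller_responsive):
            return False
        self._output_latched_on = on
        return True

    def hard_cutoff(self) -> None:
        """True mains switch: removes power regardless of controller state."""
        self.mains_connected = False


if __name__ == "__main__":
    room = RoomLighting()
    room.request(True)
    room.controller_responsive = False   # controller hangs with lights on
    print("request off honored:", room.request(False))
    print("lights lit:", room.lights_lit)
    room.hard_cutoff()
    print("lights lit after cut-off:", room.lights_lit)
```

The point of the sketch is that `hard_cutoff` does not depend on the controller being healthy, so it still works in exactly the situation where `request` cannot.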

Many organizations “design” or plan for success but they don’t consider designing for failures in quality, safety, or security. We have come a long way in product design over the last twenty years. However, it is not clear that progress has been made in a user’s ability to remove the power source from many electronic devices. In the “real old days” of vacuum tubes, almost all devices had a true “Mains” switch that cut the input power to the power transformer – i.e., the same as unplugging it. Because we are impatient and desire “instant television”, sets moved from truly turning off to going into standby mode. The television was really on (consuming power keeping the tubes powered up) but simply not displaying a picture on the cathode ray tube (CRT), which eliminated the need to “warm up”. Today, LCD screens use very little energy so there is little concern about them being powered up all the time. Since most new TVs are connected to the Internet, perhaps we should instead worry about their capability of watching the viewers! (Read again: George Orwell’s 1984.)

History has repeated itself with smartphones. In the quest for making them thinner and smaller, most leading brands have removed the ability to remove the battery. Smartphones are never really off, and a “hard reset” is a request to reset rather than a true power on/off cycle. As Samsung just discovered with the Note 7 battery issues, changing to a non-removable battery led to the phones being banned from airplanes, as they could not be safely carried in a true “power off” state.

As products and devices become more complicated, the digital circuitry and software become significantly more difficult to test thoroughly. As such, the probability of security holes, test escapes, or latent defects (possibly introduced after test) increases with complexity. Even presuming robust quality and security processes, defects will occur. Therefore, it is essential for products to “fail safe” with unambiguous methods so that a user can shut down or reset a compromised or malfunctioning device. Without these capabilities, beyond mere inconveniences, we are putting ourselves at grave risk as the number of autonomous devices increases by the day – everything from self-driving cars to drones. This risk is compounded by the sheer number of devices that comprise the Internet of Things (IoT). The recent distributed denial of service (DDoS) attacks using approximately 100,000 compromised web and security cameras demonstrated the power of a small number (relative to the projected billions of the IoT) of simple devices. It is estimated that there are over half a million of these infected devices due to the manufacturer(s) shipping them with the same default password and the users choosing not to change them.
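One possible countermeasure to the shared-default-password problem described above is for a device to keep remote access disabled until the owner replaces the factory password. The sketch below is an assumption about how that could look, not a description of any real camera's firmware: `Camera`, `FACTORY_DEFAULT_PASSWORD`, and the 12-character minimum are all hypothetical.

```python
FACTORY_DEFAULT_PASSWORD = "admin"  # hypothetical default shared by every unit
MIN_PASSWORD_LENGTH = 12            # assumed policy, chosen for illustration


class Camera:
    """Toy device that stays off the network until its password is changed."""

    def __init__(self) -> None:
        # A real device would store a salted hash rather than the plain
        # text; plain text is used here only to keep the sketch short.
        self._password = FACTORY_DEFAULT_PASSWORD

    @property
    def remote_access_enabled(self) -> bool:
        # Fail safe: no remote access while the default is still in place.
        return self._password != FACTORY_DEFAULT_PASSWORD

    def set_password(self, new_password: str) -> None:
        if new_password == FACTORY_DEFAULT_PASSWORD:
            raise ValueError("new password must differ from the factory default")
        if len(new_password) < MIN_PASSWORD_LENGTH:
            raise ValueError("new password is too short")
        self._password = new_password


if __name__ == "__main__":
    cam = Camera()
    print("remote access out of the box:", cam.remote_access_enabled)
    cam.set_password("correct horse battery staple")
    print("remote access after change:", cam.remote_access_enabled)
```

The design choice is that doing nothing leaves the device in its safest state, so a user who never changes the password ends up with a camera that cannot be recruited remotely.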

Researchers at the Weizmann Institute of Science recently demonstrated the ability to spread a worm among Internet-connected light bulbs via their ZigBee mesh radio network. The worm allowed them to take control of the light bulbs and to spread to neighboring bulbs that were not configured on the same network. In this case, the researchers were able to bypass the manufacturer’s security using sophisticated techniques. They speculated on the ability to use a similar security breach to cause mayhem ranging from power grid failures to epileptic seizures. In theory, these bulbs are simpler “things” than web and security cameras, and they have no user-controlled security features.

Combine these issues and add paranoia to keep us all up at night? How about a worm/virus that takes control of your smartphone and changes the battery charging parameters, causing your battery to explode or ignite?

In the development and design process, product designers, engineers, and product managers need to think about failure modes and ways to make their products safe and secure. We need to add security and safety to the design for manufacturing, quality, support, and similar disciplines known collectively as “DFx”. Beyond making a product elegant and intuitive to use, the design team needs to worry about how their product could be misused and what to do in case things go wrong. As the saying goes, “Murphy was an optimist”…

In industries and product categories that have neither governmental nor voluntary safety and security standards, another role should be added to the cross-functional design team to focus on safety and security. This calls for either an internal resource or, better yet, an external resource who can provide the required perspective and breadth of experience to take an independent view. Yes, quality has fought hard for a seat at the product development table, and now it is time for safety and security to join too!

As always, I look forward to hearing your comments directly. Please contact me to discuss your thoughts or if I can be of any assistance.
