Coupling & Crosstalk is my column in the MEPTEC Report. This column appears in the Winter 2015 edition on pages 11-12.
Electronic coupling is the transfer of energy from one circuit or medium to another. Sometimes it is intentional and sometimes not (crosstalk). I hope that this column, by mixing technology and general observations, is thought-provoking and “couples” with your thinking. Most of the time I will stick to technology, but occasional crosstalk diversions may deliver a message closer to home.
Engineering: The Solution to Software Quality!
Who is an engineer? In a recent Atlantic article, “Programmers: Stop Calling Yourselves Engineers”, Ian Bogost argues strongly that software developers should not be called engineers based upon several factors including quality, professional licensure, and liability. Mr. Bogost includes examples of where software has failed even as it has become critical infrastructure. The article struck a nerve: it has drawn several notable rebuttal articles and thousands of comments.
But instead of arguing over who should be called an Engineer or which honorific should be used for Mr. Bogost based on his doctorate in Comparative Literature, we should be asking the bigger question: How can engineering solve software quality problems?
Engineering is now practiced in far more areas than the classical dictionary list of endeavors that could be characterized as the engines of the Industrial Revolution. Wikipedia does a better job by describing what is built in terms of function rather than specific devices. As such, it incorporates the essence of our current information technology revolution by including tools and processes.
What is common to both definitions of engineering is practicality, i.e. making something useful. The key to achieving practicality is to make sure that the true requirements are fully understood and the end result is tested against these requirements. A good engineer takes the time to understand the true requirements including transforming implicit assumptions into documented requirements.
At the same time, quality is often stated as an expectation rather than a measurable goal. For example, Ford’s now discontinued slogan of “Quality is Job 1” is meaningless without specific, measurable, achievable, realistic, and time-based (“SMART”) goals to implement it. Having meaningful goals allowed Ford to focus on the required transformations that positioned them well compared to their competitors.
Clearly there is a difference between a software developer who attacks a problem in a structured manner and one who cobbles together something “quick and dirty” that “works”. Using a structured process to ensure that all the requirements are addressed is the best engineering approach. Even though a “code sprint”, often used by Agile methodology practitioners, allows for focused effort to make forward progress in code development, it is essential to know the destination and the requirements. There is no sense in “running” around and either missing the destination or solving the wrong problem, right?
Beyond requirements issues, software has three unique aspects that contribute to quality issues and are not present in tangible products: feature creep, the perception of “simple” fixes, and scale.
Off-the-shelf software is often selected on the basis of what it promises to do and not on reliability. So, product marketers continually identify new features to attract new users while the average user may use only a very small percentage of existing features. Even if you do know most of the features of a common program such as Microsoft Excel or Word, it is highly unlikely you regularly use more than a small fraction of these features. These extra features therefore “bloat” the software and may contribute to lower overall reliability due to the added complexity. And, of course, increased product complexity makes it more difficult to comprehensively test each new release of the software.
Unlike in hardware, the first-order “penalty” for fixing a bug in software is minimal. A mistake in a semiconductor photolithography mask set could easily cost tens of millions of dollars to correct with a new mask set. This cost is incurred before the production run that replaces the defective parts and before any costs to repair or replace the parts in end products. The actual cost of fixing a software bug? Simply the development and test teams’ time to fix the error and retest the software release. And today’s digital delivery requires no physical media. (This ignores the often-significant business costs of the software error, e.g. customer support costs, lost revenue due to time-to-market delay, etc.) Clearly the “stakes” are often much lower due to the low cost to repair software. Additionally, many web-based companies release new versions of their software daily, eliminating the issue of infrequent or delayed update cycles.
Lastly, where hardware technology aims for very low – typically single-digit parts per million (ppm) – defect rates, these rates may be too “high” for Internet-scale software. Facebook currently has over 1.4 billion monthly users, so a 1 ppm defect could easily be seen by 1,400 users in a month. Internet-scale applications like Facebook need extremely low defect rates, on the order of tens to hundreds of parts per billion (ppb), for critical functionality due to the multiplier of the number of users and transactions. Regardless of the actual defect rate, software defects are quickly exposed.
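The arithmetic behind these numbers is simple enough to sketch. The snippet below is only an illustration: the function name `expected_affected_users` is hypothetical, and it assumes that every monthly user has the same, independent chance of encountering a given defect, which is a simplification rather than a description of how any company actually measures defects.

```python
def expected_affected_users(monthly_users: int, defects_per_billion: int) -> float:
    """Expected number of users who hit a defect in one month.

    Assumption (for illustration only): each user independently
    encounters the defect at the stated rate.
    """
    return monthly_users * defects_per_billion / 1_000_000_000


# Monthly user count quoted in the column.
MONTHLY_USERS = 1_400_000_000

# 1 ppm equals 1,000 ppb; 100 ppb and 10 ppb span the range mentioned above.
for rate_ppb in (1_000, 100, 10):
    affected = expected_affected_users(MONTHLY_USERS, rate_ppb)
    print(f"{rate_ppb:>5} ppb -> about {affected:,.0f} users per month")
```

Even at 10 ppb, a defect under these assumptions would still reach more than a dozen users every month, which is why Internet-scale software cannot rely on hardware-style ppm targets.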
But are the software failures, especially those for “infrastructure” that Mr. Bogost finds the most alarming, a product of feature creep, simple fixes, or scale challenges? Clearly the defects are more visible when the scale of usage is very large. However, there is more to the situation than just feature creep and a “patch it later” mentality on the part of the software team responsible for these failures. It is corporate culture.
Facebook’s previous motto of “Move Fast And Break Things” was appropriate for the race to scale as quickly as possible to attract additional users and advertisers. This set the tone for their corporate hacker culture. To shift the culture to focus on how to do things more reliably at scale, especially in their core platform and services, Facebook changed their motto last year to “Move Fast With Stable Infra[structure]”.
To do their job correctly – finding practical solutions that meet the business needs – software developers need to fully comprehend the requirements and test the solution thoroughly against these requirements. And there must be a corporate culture that supports innovation and quality. Arguing about whether programmers are true engineers will not solve the issues of quality. What is important is that the developers think like engineers and are treated like engineers, and that management assigns them clearly defined projects that don’t easily change scope. (This implies a healthy relationship between marketing and engineering, which, on its own, is a worthy discussion.) Let’s focus on the corporate culture and not on job titles to improve the quality of software.
As always, I look forward to hearing your comments directly. Please contact me to discuss your thoughts or if I can be of any assistance.