Highlights
- Economists have a different explanation for adoption rates of new technology. They typically describe it as the contrast between alignable and nonalignable differences. Alignable differences are those for which the consumer has a reference point. (Location 405)
- Nonalignable differences are characteristics that are wholly unique and innovative; there are no reference points with which to compare. You might assume that nonalignable differences are more appealing to potential consumers. (Location 408)
- The goal of full rewrites is to restore ambiguity and, therefore, enthusiasm. They fail because they rest on a faulty assumption: that the old system can be used as a spec and can be trusted to have accurately diagnosed and eliminated every risk. (Location 926)
- Artificial consistency means restricting design patterns and solutions to a small pool that can be standardized and repeated throughout the entire architecture in a way that does not provide technical value. (Location 934)
- Be careful not to be seduced by the assumption that things that look the same, or that we use the same words to describe, actually integrate better. (Location 942)
- The following situations might warrant modernization:
  - The code is difficult to understand. It references decisions or architectural choices that are no longer relevant, and institutional memory has been lost.
  - Qualified engineering candidates are rare.
  - Hardware replacement parts are difficult to find.
  - The technology can no longer perform its function efficiently. (Location 1020)
- The terms legacy and technical debt are frequently conflated. They are different concepts, although a system can show signs of both problems. (Location 1024)
- Legacy refers to an old system. Its design patterns are relatively consistent, but they are out-of-date. (Location 1026)
- Technical debt, by contrast, can (and does) happen at any age. It’s a product of subpar trade-offs: partial migrations, quick patches, and out-of-date or unnecessary dependencies. (Location 1030)
- A good way to approach the challenge is to run a product discovery exercise as if you were going to build a completely new system. (Location 1037)
- Use product discovery to redefine what your MVP is, and then find where that MVP is in the existing code. How are these sets of functions and features organized? How would you organize them today? (Location 1041)
- Another useful exercise to run when dealing with technical debt is to compare the technology available when the system was originally built to the technology we would use for those same requirements today. (Location 1042)
- Performance issues are actually one of the nicer problems to have with legacy systems. Few organizations are motivated to do anything about legacy systems until they start affecting the business side and work starts to slow down. Sometimes this is because the system itself has slowed down, but more likely, the system’s performance has remained pretty static and literally everything around it has gotten faster. (Location 1079)
- They are tightly coupled. When two separate components are dependent on each other, they are said to be coupled. In tightly coupled situations, there’s a high probability that changes with one component will affect the other. (Location 1147)
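  A minimal sketch of the distinction, using hypothetical `Billing` and report classes of my own: the tightly coupled report depends on another component's internals, so changes there ripple into it, while the loosely coupled version depends only on a narrow interface.

  ```python
  from typing import Iterable, Protocol

  # Tightly coupled: the report reaches into Billing's internal data structure,
  # so changing how Billing stores invoices breaks the report.
  class Billing:
      def __init__(self) -> None:
          self.invoices = [{"amount": 40.0}, {"amount": 60.0}]  # internal detail

  class TightReport:
      def total(self, billing: Billing) -> float:
          return sum(item["amount"] for item in billing.invoices)

  # Loosely coupled: the report depends only on a small interface, so Billing
  # can change its internals freely as long as the contract still holds.
  class InvoiceSource(Protocol):
      def invoice_amounts(self) -> Iterable[float]: ...

  class LooseReport:
      def total(self, source: InvoiceSource) -> float:
          return sum(source.invoice_amounts())

  class BillingAdapter:
      def __init__(self, billing: Billing) -> None:
          self._billing = billing

      def invoice_amounts(self) -> Iterable[float]:
          return (item["amount"] for item in self._billing.invoices)

  if __name__ == "__main__":
      billing = Billing()
      print(TightReport().total(billing), LooseReport().total(BillingAdapter(billing)))
  ```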
- They are complex. Big systems are often complex, but not all complex systems are big. Signs of complexity in software include the number of direct dependencies and the depth of the dependency tree, the number of integrations, the hierarchy of users and ability to delegate, the number of edge cases the system must control for, the amount of input from untrusted sources, the amount of legal variety in that input, and so on, and so forth. (Location 1155)
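  One of those signals, the depth of the dependency tree, is easy to turn into a number. A toy sketch, assuming a hypothetical `deps` map from each component to its direct dependencies:

  ```python
  # Toy dependency graph: component -> direct dependencies (invented names).
  deps = {
      "web": ["auth", "orders"],
      "orders": ["billing", "inventory"],
      "billing": ["payments"],
      "auth": [],
      "inventory": [],
      "payments": [],
  }

  def depth(node: str, seen: frozenset = frozenset()) -> int:
      """Depth of the dependency tree rooted at `node`; cycles are cut off."""
      if node in seen or not deps.get(node):
          return 1
      return 1 + max(depth(child, seen | {node}) for child in deps[node])

  print({name: len(direct) for name, direct in deps.items()})  # direct dependencies
  print({name: depth(name) for name in deps})                  # dependency-tree depth
  ```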
- A helpful way to think about this is to classify the types of failures you’ve seen so far. Problems that are caused by human beings failing to read something, understand something, or check something are usually improved by minimizing complexity. (Location 1183)
- Problems that are caused by failures in monitoring or testing are usually improved by loosening the coupling (and thereby creating places for automated testing). (Location 1185)
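  A small sketch of that classification over a hypothetical incident log; the causes and counts are invented, but the tally shows which lever (reducing complexity or loosening coupling) the failure history points toward.

  ```python
  from collections import Counter

  # Hypothetical incident log; "cause" is whatever your postmortems record.
  incidents = [
      {"id": 101, "cause": "human-error"},     # someone missed a config flag
      {"id": 102, "cause": "monitoring-gap"},  # alert never fired
      {"id": 103, "cause": "human-error"},     # runbook step skipped
      {"id": 104, "cause": "testing-gap"},     # regression slipped through
  ]

  tally = Counter(incident["cause"] for incident in incidents)
  reduce_complexity = tally["human-error"]
  loosen_coupling = tally["monitoring-gap"] + tally["testing-gap"]
  print(f"points toward reducing complexity: {reduce_complexity}")
  print(f"points toward loosening coupling:  {loosen_coupling}")
  ```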
- There are a couple different ways to restrict scope when an existing system looms in the background. The most straightforward approach is to define an MVP from the existing system’s array of features. (Location 1594)
- Instead, I prefer to restrict the scope by defining one measurable problem we are trying to solve. (Location 1599)
- The more decisions need to go up to a senior group—be that VPs, enterprise architects, or a CEO—the more delays and bottlenecks appear. (Location 1609)
- Good modernization work needs to suppress that impulse to create elegant comprehensive architectures up front. You can have your neat and orderly system, but you won’t get it from designing it that way in the beginning. Instead, you’ll build it through iteration. (Location 1621)
- If you’re thinking about rearchitecting a system and cannot tie the effort back to some kind of business goal, you probably shouldn’t be doing it at all. (Location 1640)
- The best way to handle dysfunctional decision-making meetings is to prevent them from happening in the first place by defining and enforcing a scope. I usually start meetings by listing the desired outcomes, the outcomes I would be satisfied with, and what’s out of scope for this decision. I may even write this information on a whiteboard or put it in a PowerPoint slide for reference. What do we want to accomplish in this meeting? (Location 1671)
- With my engineers, I set the expectation that to have a productive, free-flowing debate, we need to be able to sort comments and issues into in-scope and out-of-scope quickly and easily as a team. I call this technique “true but irrelevant,” because I can typically sort meeting information into three buckets: things that are true, things that are false, and things that are true but irrelevant. Irrelevant is just a punchier way of saying out of scope. (Location 1687)
- Remember, technology has a number of trade-offs where optimizing for one characteristic diminishes another important characteristic. (Location 1702)
- Examples include security versus usability, coupling versus complexity, fault tolerance versus consistency, and so on, and so forth. (Location 1703)
- If the disagreement is in scope and isn’t a matter of conflicting optimization strategies, the best way to settle it is by creating time-boxed experiments. Find a way to try each approach on a small sample size with a clear evaluation date and specific success criteria defined in advance. Becoming good at experiments is valuable for practically any organization. (Location 1709)
- Find responsibility gaps. There will always be a disconnect between formally delegated responsibilities and actual responsibilities or functionality. Conway’s law tells us that the technical architecture and the organization’s structure are generally equivalent, but no system is a one-to-one mapping. (Location 1920)
- Organizations tend to have responsibility gaps in the following areas:
  - So-called 20 percent projects, or tools and services built (usually by a single engineer) as a side project.
  - Interfaces. Not so much visual design but common components that were built to standardize experience or style before the organization was large enough to run a team to maintain them.
  - New specializations. Is the role of a data engineer closer to a database administrator or a data scientist?
  - Product engineering versus whatever the product runs on. (Location 1926)
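  One way to make the hunt for gaps concrete (my sketch, not a method from the book): diff the components that exist against the components any team formally owns. The catalog and ownership map below are hypothetical stand-ins for a service registry and team charters.

  ```python
  # Hypothetical service catalog and ownership map.
  components = {"billing", "auth", "data-pipeline", "style-guide", "pdf-export"}
  ownership = {
      "payments-team": {"billing"},
      "identity-team": {"auth"},
      "analytics-team": {"data-pipeline"},
  }

  owned = set().union(*ownership.values())
  gaps = sorted(components - owned)
  print("components with no formal owner:", gaps)  # ['pdf-export', 'style-guide']
  ```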
- So if you want to know what parts of the project are suffering the most, pay attention to what the team is having meetings about, how often meetings are held, and who is being dragged into those meetings. (Location 1935)
- A monolith in the context of software engineering is a tightly coupled application that configures a variety of functions and features so that they run on a single discrete computing resource. (Location 1965)
- The opposite of a monolith is service-oriented architecture. Instead of designing the application to host all its functionality on a single machine, functionality is broken up into services. Ideally, each service has a single goal, and typically each has its own set of computing resources. (Location 1972)
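  A sketch of the contrast, using a made-up inventory example: in the monolith, the check is an in-process function call that deploys with everything else; in the service-oriented version, the same check is a network call to a service with its own resources and its own contract. The endpoint URL is hypothetical.

  ```python
  import json
  from urllib import request

  # Monolith: inventory logic lives in the same process and ships as one unit.
  _STOCK = {"widget-1": 10}

  def reserve_stock_local(sku: str, qty: int) -> bool:
      return _STOCK.get(sku, 0) >= qty

  # Service-oriented: inventory is its own service; callers know only its API.
  def reserve_stock_remote(sku: str, qty: int) -> bool:
      payload = json.dumps({"sku": sku, "qty": qty}).encode()
      req = request.Request(
          "http://inventory.internal/reserve",  # hypothetical internal endpoint
          data=payload,
          headers={"Content-Type": "application/json"},
      )
      with request.urlopen(req, timeout=2) as resp:
          return json.load(resp).get("reserved", False)
  ```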
- Formal methods are techniques for applying mathematical checks to software designs to prove their correctness. In attempting to prove correctness, formal methods can highlight bugs that would otherwise be impossible to find just by studying the code. The most accessible form of formal methods is called formal specification. It consists of writing out the design as a specification with a markup language that a model checker can parse and run analysis on. These model checkers take the valid inputs defined by the spec and map out every possible combination of output based on the design. Then they compare all those possible outputs to the rules the spec has defined for valid outputs, looking for a result that violates the assertions of the spec. (Location 2099)
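  The real tools here are specification languages and model checkers (TLA+ with TLC, Alloy, and the like). As a toy illustration of the idea only, this hand-rolled sketch enumerates every legal interleaving of two workers doing a read-then-write increment and checks each outcome against the spec's assertion, surfacing the classic lost-update bug:

  ```python
  from itertools import permutations

  def run(schedule):
      """Two workers each read a shared counter, then write back read + 1."""
      counter = 0
      local = {}
      for actor, action in schedule:
          if action == "read":
              local[actor] = counter
          else:  # "write"
              counter = local[actor] + 1
      return counter

  # Each worker must read before it writes; enumerate every legal interleaving.
  steps = [("a", "read"), ("a", "write"), ("b", "read"), ("b", "write")]
  violations = []
  for schedule in permutations(steps):
      legal = (schedule.index(("a", "read")) < schedule.index(("a", "write"))
               and schedule.index(("b", "read")) < schedule.index(("b", "write")))
      if legal and run(schedule) != 2:  # assertion: both increments must land
          violations.append(schedule)

  print(f"{len(violations)} interleavings violate the spec (lost update)")
  ```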
- But how do you find what is unowned and forgotten? One potential approach is to trace the activities of the engineers who were around when things were small. (Location 2129)
- Resilience in engineering is all about recovering stronger from failure. That means better monitoring, better documentation, and better processes for restoring services, but you can’t improve any of that if you don’t occasionally fail. (Location 2171)
- Broadly, these techniques are part of a methodology called Code Yellow, in which a cross-functional team is created to tackle an issue critical to operational excellence. (Location 2218)
- Code Yellows have the following critical features that ensure their success over other project management approaches: The Code Yellow leader has escalated privileges. (Location 2228)
- The way a murder board works is you put together a panel of experts who will ask questions, challenge assumptions, and attempt to poke holes in a plan or proposal put in front of them by the person or group the murder board exercise is intended to benefit. (Location 2368)
- Design exercises come in various shapes and sizes, but they share these four distinct phases: (Location 2474)
- Exercise: Critical Factors. This is a brainstorming exercise to do with a team to help prioritize conversations around the early stages of a modernization activity. What must happen for the project goals to be successful? What must not happen? (Location 2509)
- Exercise: The Saboteur. A similar but inverse brainstorming exercise to the critical factors exercise is asking your team to play saboteur. If you wanted to guarantee that the project fails, what would you do? (Location 2517)
- Exercise: Shared Uncertainties. This exercise also starts by asking team members to identify potential risks and challenges to a project’s success, but this time, you’re looking for differences in how such risks are perceived. Give each team member a four-quadrant map with the following axes: (Location 2528)
- Exercise: The 15 Percent. In Chapter 3, I talked about the value of making something 5 percent, 10 percent, or 20 percent better. This exercise asks team members to map out how much they can do on their own to move the project toward achieving its goals. What are they empowered to do? What blockers do they foresee, and when do they think they become relevant? (Location 2548)
- Probabilistic outcome-based decision-making is better known as betting. It’s a great technique for decisions that are hard to undo, have potentially serious impacts, and are vulnerable to confirmation bias. (Location 2566)
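  To make the betting framing concrete, here is a tiny expected-value comparison with entirely invented probabilities and payoffs; the point of the technique is forcing those estimates into the open before committing to a hard-to-undo decision.

  ```python
  # (probability, estimated value) pairs for each possible outcome; all invented.
  options = {
      "incremental refactor": [(0.7, 120_000), (0.3, -20_000)],
      "full rewrite":         [(0.3, 400_000), (0.7, -250_000)],
  }

  for name, outcomes in options.items():
      expected = sum(p * value for p, value in outcomes)
      print(f"{name}: expected value ~ {expected:,.0f}")
  ```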
- Exercise: Affinity Mapping. Affinity mapping is a common design exercise involving clustering ideas and statements from individuals together visually. (Location 2585)
- Individual incentives have a role in design choices. People will make design decisions based on how a specific choice—using a shiny new tool or process—will shape their future. Minor adjustments and rework are unflattering. They make the organization and its future look uncertain and highlight mistakes. To save face, people prefer reorgs and full rewrites, even though they are more expensive and often less effective. An organization’s size affects the flexibility and tolerance of its communication structure. When a manager’s prestige is determined by the number of people reporting up to her and the size of her budget, the manager will be incentivized to subdivide design tasks, and that subdivision in turn will be reflected in the efficiency of the technical design—or as Conway put it: “The greatest single common factor behind many poorly designed systems now in existence has been the availability of a design organization in need of work.” (Location 2609)
- When an organization has no clear career pathway for software engineers, they grow their careers by building their reputations externally. (Location 2626)
- Organizations end up with patchwork solutions because the tech community rewards explorers. (Location 2639)
- Typically, this manifests itself in one of three different patterns:
  - Creating frameworks, tooling, and other abstraction layers to make code that is unlikely to have more than one use case theoretically “reusable”
  - Breaking off functions into new services, particularly middleware
  - Introducing new languages or tools to optimize performance for the sake of optimizing performance (in other words, without any need to improve an SLO or existing benchmark) (Location 2647)
- One of the benefits of microservices, for example, is that they allow many teams to contribute to the same system independently of one another. Whereas a monolith would require coordination in the form of code reviews—a personal, direct interaction between colleagues—service-oriented architecture scales the same guarantees with process. (Location 2714)
- Candidates who are good at adapting have experience with organizations of different sizes and industries on their résumés. (Location 2792)
- You might be familiar with the expression yak shaving. It’s when every problem has another problem that must be solved before it can be addressed. (Location 2809)
- The three effective structures for modernization are as follows: Teams that mirror existing components. (Location 2831)
- Lead team and subgroups. With this model, a lead team shapes the high-level view of the modernization effort and then dispatches tasks to the subgroups, who are empowered to make any and all decisions on the details of how they implement them. (Location 2837)
- The members of the embedded team must have strong bonds of camaraderie with each other. They must feel like one team. They should treat their host teams with compassion and empathy, but they also should consider the host teams more like clients or customers rather than as peers. (Location 2848)
- I give everyone a piece of paper with a circle drawn on it. The instructions are to write down the names of the people whose work they are dependent on inside the circle (in other words, “If this person fell behind schedule, would you be blocked?”) and the names of people who give them advice outside the circle. If there’s no one specific person, they can write a group or team name or a specific role, like frontend engineer, instead. (Location 2885)
- Another expression that was popular among my colleagues in government was “air cover.” To have air cover was to have confidence that the organization would help your team survive such inevitable breakages. It was to have someone who trusted and understood the value of change and could protect the team. As a team lead, my job was to secure that air cover. (Location 2950)
- Italian researchers Cristiano Castelfranchi and Rino Falcone have been advancing a general model of trust in which trust degrades over time, regardless of whether any action has been taken to violate that trust.9 People take systems that are too reliable for granted. Under Castelfranchi and Falcone’s model, maintaining trust doesn’t mean establishing a perfect record; it means continuing to rack up observations of resilience. (Location 3068)
- The simplest and least threatening of failure drills is to restore from backup. (Location 3125)
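  A minimal sketch of such a drill (my own, assuming the backup is a single SQLite file): restore into a scratch location, never the live system, and verify that the copy is intact and can answer a known query. The path and the smoke-test table are placeholders.

  ```python
  import os
  import shutil
  import sqlite3
  import tempfile

  BACKUP_PATH = "backups/app-latest.db"  # hypothetical backup artifact

  def restore_drill(backup_path: str) -> None:
      # "Restore" the backup into a throwaway scratch directory.
      scratch = os.path.join(tempfile.mkdtemp(prefix="restore-drill-"), "restored.db")
      shutil.copy(backup_path, scratch)

      conn = sqlite3.connect(scratch)
      try:
          # Verify the restored copy is intact and can answer a known query.
          assert conn.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
          (count,) = conn.execute("SELECT COUNT(*) FROM users").fetchone()
          print(f"restore drill passed: {count} rows in users")
      finally:
          conn.close()

  if __name__ == "__main__":
      restore_drill(BACKUP_PATH)
  ```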
- Logical view maps out how end users experience a system. This might take the form of a state diagram, where system state changes (such as updates to the database) are tracked along user behavior. (Location 3139)
- Process view looks at what the system is doing and in what order. Process views are similar to logical views except the orientation is flipped. Instead of focusing on what the user is doing, the focus is on what processes the machine is initiating and why. (Location 3142)
- Development view is the system as software engineers see it. The architecture is broken out by components reflecting the application code structure. (Location 3144)
- Physical view shows us our systems as represented across physical hardware. (Location 3145)