Chapter 4: Failure

Assessing reward, risk, reduction, and mitigation

Nov 20, 2022

Responsible risk-taking

Just because you’re taking a large risk doesn’t mean you’re in for a huge reward. And just because you’re ignoring a risk doesn’t mean you’re not taking one. Still, I believe the biggest risk is not taking any, so here are some thoughts on managing risk:

Agenda

Responsible risk-taking
Crazy idea
The next big thing
Tales from the soccer pitch
Risk management

Failure is not an option

My first NASA project was PICTURE, the Planet Imaging Coronagraph Telescope Using a Rocket Experiment. Credit for that elaborate acronym goes to Ben Lane, now at Draper, who also figured out how to use knowledge of the optical wavefront aberrations to adjust the image captured through a coronagraph, which would allow us to take a direct picture of an extrasolar planet. That was a big deal back in 2005.

Photos of electronics, nuller, and telescope before assembly: the PICTURE instrument. I was the lead engineer for the coronagraph, deformable mirror, and a lot of the instrument avionics. Read more at UMass-Lowell.

The catch was, the experiment went in a sounding rocket, so we’d have about 5 minutes of freefall above the atmosphere in which to do the entire mission, which included targeting the star and correcting the optical aberrations to a wavefront error of 1 Angstrom. When the NASA panel was deciding whether to fund the mission, someone in the room said, “Failure isn’t an option — it’s inevitable!” But the hope — of capturing a direct image of an extrasolar planet — was enough to approve it, for Lockheed and Northrop to contribute expertise in-kind, for JPL and Goddard to collaborate with goodwill, and for it to eventually fly. The secret to happiness is low expectations, after all.

The warning was prescient. There were many failures:

First few deformable mirrors didn’t work
Ran out of money
Flex cables couldn’t be fabricated
Key staff changed jobs
Shattered glass in vibe test
Ran out of money
Flash drive wasn’t plugged in
Telemetry radio failed in flight
Hit a rock on landing, shattering the telescope mirror

On its own, the PICTURE mission was a failure. While we had low expectations from the instrument, most of the data was lost for preventable reasons. We did not capture an image of the planet orbiting the star Epsilon Eridani during the few weeks that decade that it would have been possible.

Musical interlude: Tubthumping, aka “I get knocked down”

After the late and somewhat disastrous first flight, Supriya Chakrabarti, Tim Cook, and their team brushed themselves off, leading to PICTURE-B and PICTURE-C. If at first you don’t succeed, try, try again.

Meanwhile, back at JPL, there was a lot of soul-searching. We told ourselves stories, like how it was a success in demonstrating something like 20 new technologies, no small feat. Still, “it could have been worse” isn’t the story you want to put on a resume. Should we have taken on the project in the first place? Was the reputational damage to the people involved worth the small gains? Should JPL take on risky missions like that again?

If there’s one thing I’ve learned from 30 years of taking on ambitious projects, it’s that they all appear doomed from the start. If the sponsors could have gotten results from anyone else, they would never have talked to me in the first place. And, as my UW Physics undergraduate department chair, Steve Ellis, used to say, “Science is like a swamp. You never know how much it stinks until you get your feet wet.” That means, if you give up just because the first challenge seems impossible, you miss the chance to find another 5 problems that nobody knows about.

Which returns us to the topic: there’s a difference between discovering new and interesting problems, and ignoring problems that you should have been able to predict.

PICTURE was a calculated risk, with a small chance of success. NASA and JPL management knew that going in, and I found myself regularly briefing Jakob Van Zyl, the associate lab director, on what we had learned from the latest catastrophe. My division manager said afterward that if you considered how PICTURE had paved the way for the coronagraph for the Nancy Grace Roman Space Telescope, it was well worth the trouble. PICTURE was just one of the eggs in NASA’s astrophysics basket.

What about at a personal level? How do I rationalize the return/risk decision in retrospect? I don’t. It seemed like a good idea at the time, and I got to meet and work with some really great people, and that’s the long and short of it.

Your personal space agency

Ten years ago, there were just a handful of people investing in newspace companies. One of them, Dylan Taylor, now CEO of Voyager Space Holdings, was a generous mentor as I was getting started.

Dylan noted that the return/risk for any given newspace company was essentially unknowable. Even though we both had conviction that proliferated LEO satellite constellations were coming, we agreed that predicting how and when it would happen, and who would benefit, was impossible. Here’s how he summed up his portfolio strategy at the time:

“If you don’t have any failures, you’re not taking enough risk”
— Dylan Taylor, on space investing

You can check out his portfolio on Crunchbase. Today he is invested across the entire space ecosystem. While some of his investments were duds, some of those early bets were him putting his metaphorical feet in the metaphorical swamp. I think his deals get better and better over time, and Voyager is his vehicle for building an engineering conglomerate.

Now, Dylan is more optimistic than I am. He believes that humans will reside, long-term, off this planet. He’s even willing to strap himself to the pointy end of a rocket. Good for him. Still, these are calculated risks, taken step-by-step, building on experience and earned knowledge.

What I’m doing is a little different: I’m developing a playbook for running your own space agency. Why would you want to do that? Maybe you want prestige. Maybe you want to create jobs. Maybe you just want to spend money to reward certain constituencies (buy The Dictator’s Handbook through my Amazon affiliate link). In rare circumstances, maybe you want results. What you’ll get in any case is a seat at the table — access to information about what the other space actors are doing.

There are scores of new national space agencies starting up, and even more personal space agencies, though they tend not to think of themselves that way. The investment thesis is simple: there’s more of the universe outside Earth than on it, and someday it’ll be exploitable, which means that it makes sense to keep your finger on the pulse of space. Deploying capital or deploying expertise will earn you a seat at the table.

So far, we’ve learned that failure is inevitable. Still, just because a project is high-risk doesn’t mean it’s high-return. Besides the crackpots (ultraviolet laser death rays, free energy, Machian relativity), there’s plenty of money going into startups that don’t want to spend the time to learn the physics and do the math. A knowable risk that you ignore isn’t the same as an unknowable risk.

Is it cheaper to learn through trial and error than it is to do the structural analysis in the first place? Maybe? SpaceX thinks so, but those are calculated risks. Each Falcon 9 is covered with strain gauges, with data streamed to a recoverable flash drive, which is used to validate and refine the structural and aerodynamic models.

Many seem to believe that “it worked once in space” is a shortcut to “space qualified.” NASA’s Technology Readiness Level is intended to develop a design practice, a validation strategy, and a supply chain that quantifies what could possibly go wrong, and the consequences thereof.

So, as you start your own space agency, you can choose how you accept risk. You can make a lot of random bets, you can invest in charismatic extroverts, you can watch for someone else to signal approval (“market validation”), or you can try to understand the risk on your own.

Still, it hurts when your spacecraft deploys to the bottom of the ocean, or tumbles uncontrollably, or fritzes out unexpectedly. Success happens when you just keep getting back up again.

The next big thing

Aquaai is a startup making robotic fish. Betcha didn’t know you needed a school of those!

What are you trying to do? We want persistent, in-situ surveillance of inland and coastal waters. Temperature, pH, salinity, chemicals, …

How is it done today, and what are the limits of current practice? Water is either sampled in a fixed location, or a technician travels around to sample different locations. A fleet of autonomous undersea vehicles costs too much to continuously measure the water over an extended volume of a river or bay.

What's new in your approach and why do you think it will be successful? Aquaai’s robo-fish is the cheapest to build, the cheapest to operate, and the cheapest to service of all the autonomous underwater vehicles. Inspired by the original Land Rover, it’s built and maintained in the field. If a country wants a marine surveillance program, Aquaai trains a local team to build the fleet, using only commonly available components, plus a few 3D printed parts. The platform can be made in a range of sizes, with payload capacity up to tens of kg.

What difference will it make? We measure shockingly little about the oceans. Several countries are starting to allocate funding for persistent water monitoring, and Aquaai is poised to provide the best bang for the buck. Measuring the results is how we, as a species, will develop better plans to improve water quality.

What are the risks and the payoffs? Let’s say you have a lake, or a river, or a coastline. What could possibly go wrong? Do intentional polluters know when and where samples are collected daily, and spill their waste somewhere else? Does water quality vary throughout the day? Does changing weather affect currents, which affects mixing of silt, runoff, and oxygen? Is the seafloor full of unexploded mines from WWII, just waiting to randomly pop up and damage an oil tanker? Most governments pay for cleanup, not risk assessment or mitigation, but there are a few around the world that want schools of robo-fish to measure aquatic risks. While it always takes longer than you want to sell to a government, I invested in Aquaai because I think it’s better to be too early than too late.

Soccer league

I just finished coaching my oldest’s soccer team. Lemme tell you, keeping 18 teenage boys engaged for training that starts at 8 pm on a school night is a whole ‘nother kettle of fish. Much of the old playbook still applies, but it will need some new chapters.

As Dale Carnegie wrote, “arouse in the other person an eager want.” Just explaining what to do and why we do it isn’t enough. The boys were always in a hurry, so they didn’t really want to know why. It wasn’t clear that they wanted to win, either, so much as they wanted to feel powerful. Everyone likes power fantasies, right? That’s why we have superhero comics.

If you ever find yourself in this position, don’t try to play soccer with teenage boys. At least not if you’re a 47-year-old like me. You’ll only get hurt.

Besides coming up with a bunch of games, complex choreographed plays, and persisting in those boring ball-control foot drills, what eventually worked was giving the boys more agency. “Coach, I think we should be moving around more to receive passes. How about we move the cones to give a larger target area?”

Finally, this happened:

“It seems like every time we lose, we talk about what we should do differently next time. And it’s always the same — we should close the distance, we should look up to make a smart pass instead of just kicking it really hard, and we should run behind to cover and give options.”

“I feel like that’s what Coach has been trying to get us to do for the last 6 weeks.”

“OK, then, let’s do that next time!”

See? Soccer is all about risk management. While running.

It did work. Team Chaos totally dominated Temple City in the final game, winning with a penalty kick in the 89th minute. Overall, the team finished in 7th place out of 13 teams. Average works for me.

Speaking of risk, 1 second with 1 child is a 1 in 1,000,000* chance of something weird happening. Stray dogs, falling and breaking an arm, scaling a chain-link fence without realizing it, … I thought I’d seen everything, but today, a girl was engulfed in a swarm of bees. I don’t know how they do it, kids just attract risk. The bees were just passing through and, after swarming another referee’s chair, were content to be relocated without harming anybody, but, sheesh.

*rough extrapolation based on personal experience.

Risk management

I got myself on the steering committee of JPL’s project risk management system to ensure that it would implement some social engineering. Now, when an engineer goes to describe a new risk, she is forced to write a simple statement that connects cause, effect, and consequence. It looks something like this:

If cause is true, then effect will happen, which will result in consequence.

Now, nearly everyone comes up with this risk statement:

If adequate staffing is not available in time, then design will fall behind, which will result in the widget being delivered late.

You’re not actually supposed to submit that risk. Management already knows that about the project, and management also doesn’t like hearing about it all the time. What you’re actually supposed to write is something like this:

If the copper wires in the umbilical have latent defects, then repeated bending could cause them to break, and power or communication could be lost to the robot arm end-effector.

That’s something we can work with. The taxonomy of options is simple:

Avoid
Reduce
Mitigate

Avoid is when you try to prevent the cause from happening in the first place. For cable harnesses, you can get thermal wire strippers that don’t nick the copper. I recently crawled under a neighbor’s house to replace some wiring that had failed for exactly this reason. It doesn’t take all that much bending, sometimes. Since I didn’t bring a thermal wire stripper with me, I was just very careful with a knife, and then inspected the wire all around before sealing up the junction box.

Reduce is when you make the problem happen less often. One way to do this is redundancy. If the probability of one failure is 5%, and the failures are independent, then the probability of two failures is 0.05 × 0.05 = 0.0025. Just running twice as many wires for each signal gets you to about 99.8% reliability without having to be particularly skillful.
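Here is that arithmetic as a minimal Python sketch. It assumes the redundant wires fail independently, which is a big assumption: a shared connector or a shared bend can take out both at once. The function name is mine, purely for illustration.

```python
def redundant_reliability(p_fail: float, copies: int) -> float:
    """Probability that at least one of `copies` independent paths survives."""
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be a probability between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The signal is lost only if every copy fails, so multiply the failure odds.
    return 1.0 - p_fail ** copies


if __name__ == "__main__":
    # Compare a single wire against doubled and tripled wiring at 5% failure each.
    for n in (1, 2, 3):
        print(f"{n} wire(s): {redundant_reliability(0.05, n):.4%}")
```

Each extra wire buys less than the one before it, so at some point the weight and the connector count become the bigger risk.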

Mitigate is when you allow the problem to happen, and you have a backup plan. We live at the edge of LA County, where wind storms occasionally knock out power for a few days, so we got together with our neighbors to buy a generator. It runs on gas or propane, starts with a battery or a cord, and can be wheeled around to where we need it. It also makes a racket, so spend a little extra for the “quiet” model.

Back to the space project management … risks are generally scored according to their likelihood and severity. Likelihood is the probability that the cause will happen, multiplied by the probability that the effect follows from the cause. Severity is how bad the consequence is, ranging from minor annoyance to mission-ender. There’s an entire pseudoscience about assigning numbers, and then managing those numbers. The point is to think about each risk, develop options for how to deal with each, and then choose what to do first.
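For the curious, here is a rough sketch of what a tiny risk register could look like in Python. This is not JPL’s system. The class, the 1-to-5 severity scale, the choice to rank by likelihood times severity, and every number in the example are my own assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    cause: str
    effect: str
    consequence: str
    p_cause: float               # probability that the cause happens
    p_effect_given_cause: float  # probability that the effect follows from the cause
    severity: int                # assumed scale: 1 = minor annoyance, 5 = mission-ender

    @property
    def likelihood(self) -> float:
        # Likelihood as described above: P(cause) times P(effect given cause).
        return self.p_cause * self.p_effect_given_cause

    @property
    def score(self) -> float:
        # Assumption: rank by likelihood x severity. Real projects often use a matrix.
        return self.likelihood * self.severity

    def statement(self) -> str:
        # The cause / effect / consequence sentence the risk system asks for.
        return (f"If {self.cause}, then {self.effect}, "
                f"which will result in {self.consequence}.")


def prioritize(risks):
    """Return the risks sorted from highest score to lowest."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


if __name__ == "__main__":
    # Made-up numbers, only to show the mechanics.
    register = [
        Risk("the copper wires in the umbilical have latent defects",
             "repeated bending will break them",
             "loss of power or communication to the robot arm end-effector",
             p_cause=0.10, p_effect_given_cause=0.50, severity=5),
        Risk("adequate staffing is not available in time",
             "design will fall behind",
             "the widget being delivered late",
             p_cause=0.60, p_effect_given_cause=0.30, severity=2),
    ]
    for risk in prioritize(register):
        print(f"{risk.score:.2f}  {risk.statement()}")
```

The numbers matter less than the sort: it forces you to decide which risk to work on first.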

Startup scorecard

Recession? What recession? Aerospace is going gangbusters.

Space: 14 active, 3 exits
Robots: 4 active
Earth environment information: 3 active
Other: 3 active, 1 exit

If you want in on my dealflow, just ask. I can hype my book all day!

Thank you

Thank you for reading! If you don’t want more of these, just hit the unsubscribe button. But I really do hope you’ll stay around. If you’re new, here’s some more to read, and you can subscribe at shantirao.substack.com

Some older issues:

Chapter 0: Recap
Chapter 1: Rethink Everything
Chapter 2: Power Laws and Disruption
Chapter 3: Incentives
