It occurred to me as soon as I started writing that people absolutely love saying “I’ve always been fascinated (!) by this topic.” I don’t know how many times I’ve done it on this blog or other pieces of writing, but I’ll try to refrain from it at most times. In this case though, the subject is to some extent a personal one. Without getting into too much detail, I’ve always struggled with attention. By this, I don’t mean merely sitting through boring shit, but an actual lifelong difficulty with executing tasks that interest me. I’ve found ways to ameliorate it, but my resources still feel frustratingly underutilized on a regular basis. Just like a lifelong struggle with obesity can cause a person to look beyond a lack of “self-control” as the cause of their condition, noticing the odd discrepancies between my intentions and my actions led me to realize that “self-control” is an explanation with no information or utility of any kind. Even if such a mechanism independently exists, perhaps commandeered by an ethereal homunculus or some kind of genetic encoding, it doesn’t give you any actionable knowledge that you didn’t have before; in fact, I see little difference between unqualified ideas of “personal responsibility” and new age books like The Secret (but let’s not have that argument today.)*
My own struggles with what’s commonly called Attention Deficit Disorder certainly don’t make me unique or special in any way. I have yet to meet the person who feels satisfied with what they perceive as their self-control. Whether it’s not going to the gym enough, not getting enough sleep, or struggling to be more outgoing in social situations, we all have problems that manifest themselves as patterns of behavior. We’re told that it’s a choice, but we’re not told who exactly is choosing. Most would say that we can’t choose to crave junk foods but we can choose not to give in, but that sounds to me like saying that we can’t choose whether we get cancer but we can choose not to be sedentary. The more recent explanation put forward by behavioral psychologists is that we have a finite reserve of “willpower” that acts as a brake on our impulses and takes up physical energy, and that we make bad decisions under the condition of “ego depletion”, when we no longer have the energy to pass up instant gratification. This reserve relies on energy in the same way that the rest of our body does: sufficient blood sugar, rest, and available bandwidth. Numerous experiments have shown that memorizing digits and refusing temptation share the same resources, and that those resources become very strained if the person has low blood sugar.
The limits of this model become apparent to me, however, when considering the idea of willpower applied to dieting. The common explanation goes that if you’re fatigued, you won’t have as much willpower left over to resist junk food. Fair enough, but what if this fatigue is coming from low blood sugar? In this case, it can’t just be a matter of willpower, because the very craving for junk food comes from the fact that when blood sugar is low, the brain sends signals that demand that glucose be delivered to the body as quickly as possible; something that junk food tends to do. Worse, even if you resist the cravings using what willpower is left, your metabolism may slow down anyway in order to preserve energy. So you’d need to apply the willpower to exercise, but even assuming that your body won’t cut energy elsewhere to compensate, that would just drain more energy, eventually leaving you with no “willpower”. Taking all this into consideration, Occam’s razor would suggest that it’s metabolism, not a lack of willpower, that turns us into lazy overeaters.
I’ve already talked at length about the details of this metabolic process in a previous post, so I won’t go into further detail, but the purpose of this example was to show that whether or not our folk-concept of “willpower” as a manual override exists and influences our decisions about eating, it does not exist in a vacuum. Similarly, I’ve come to the conclusion that what we call “attention” is not a simple matter of “self discipline” but a complex nonlinear process that is fundamentally grounded in feedback.
Information Sensitivity and the Habit Loop
For those who read my post on allostasis, you might recall that the health of a complex system can in fact be defined by the ability to process feedback, which looks like high uncertainty and volatility from the outside. An inability to process feedback manifests as insensitivity, which can take many forms: an insensitivity to insulin among the obese and diabetic, out of control inflation from excessive money printing, tolerance to a drug, hearing loss due to hearing the same noise over and over, not listening to the boy who cried “wolf!” because he was full of shit every other time. More recently, I came across an insightful but disappointingly flawed book called Knowledge and Power: The Information Theory of Capitalism by George Gilder. Although there were many parts of the book that were so illogical, unrigorous, and ideologically driven that I almost devoted an entry on this blog to ripping the book apart, there were a few gems, including a very elegant metaphor to describe how cell phones and wireless internet expanded so much despite what seemed to be hard limits on the amount of bandwidth offered in the electromagnetic spectrum, and by extension, how economies can grow at an exponential rate relative to the resources they consume.
Imagine a cocktail party in which everyone is talking to one another. You might have trouble hearing your friend because of all the noise, which causes you to speak up louder. Unfortunately, everyone else might do the same, which means that everyone has to raise their voice even more to be heard in the conversation. Eventually, it would get so loud that your voice gets hoarse and you can still barely hear your friend. Imagine, by contrast, if everyone decides that instead of raising their voice, they agree to keep their voice down and compensate for the background noise by having conversations in different languages. This way, the noise in the background is much less distracting since the noises in the background cannot be mistaken in any way for the words in your own conversation. In the field of wireless communications, the standard approach for most companies was to try to speak louder, which would allow their calls to overpower any data that slipped in from outside the communication channel. In addition, much bandwidth was dedicated to creating buffers between channels to minimize the amount of noise that could go into a certain channel. This approach is known as Time Division Multiple Access (TDMA). The alternative, pioneered by a company called Qualcomm, was Code Division Multiple Access (CDMA), in which very little bandwidth is used, but each channel relies on a unique language, so that interfering bits of data do not register as anything intelligible. To see the difference: imagine that someone says “my train leaves at five”, but you hear “my train leaves at nine.” You wouldn’t know that you misheard. If, on the other hand, you heard “my train leaves at hobgoblin”, it wouldn’t make any sense, so you’d know that you didn’t get the right information. With this in mind, all that was needed was to create powerful decoders on each side of the conversation, which had far fewer constraints than the limits on bandwidth that limit the potential of using TDMA.
What gives CDMA its incredible advantage over TDMA is that rather than turning up the volume, it relies on a more sensitive device. The “sensitivity” of this device is not a matter of picking up more distant signals, but a matter of being able to detect patterns that other devices can’t. A murmur in the background often doesn’t register, but if you hear someone say your name, you’ll often find yourself turning around to see if someone was addressing you. In the same manner, an experienced firefighter might not have physically sensitive ears (quite the contrary if you consider all the extremely loud noises that happen over the course of fighting a single fire), but their understanding of relevant information is complex enough that even the slightest whiff of smoke or creak of the floorboards can cause them to immediately order everyone out of the building before they even know why they gave the order. The decoders used in CDMA phones do the same thing: they are not more sensitive to bits of data, they just have a better understanding of the relevant data. This doesn’t mean that the quiet is unnecessary: there’s still only so much bandwidth, and just like you need to speak louder than the background noise no matter what, the same goes for wireless communications.
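For the programmers in the audience, the “different languages” idea can be sketched in a few lines of Python. This is only a toy model, not how any real phone works: the two four-chip codes, the bit strings, and the function names `spread` and `despread` are all made up for illustration. Two conversations are added together on one shared channel, and each listener recovers their own conversation by checking the channel against their own code, so the other conversation simply cancels out.

```python
# Toy code-division sketch: two conversations share one channel at once.
# CODE_A and CODE_B are illustrative codes chosen so that they cancel
# each other out when multiplied chip by chip and summed.
CODE_A = [1, 1, 1, 1]
CODE_B = [1, -1, 1, -1]


def spread(bits, code):
    """Encode each bit as the code itself (for 1) or its negation (for 0)."""
    signal = []
    for bit in bits:
        sign = 1 if bit else -1
        signal.extend(sign * chip for chip in code)
    return signal


def despread(channel, code):
    """Recover one speaker's bits by correlating the channel with their code."""
    n = len(code)
    bits = []
    for i in range(0, len(channel), n):
        chunk = channel[i:i + n]
        score = sum(c * k for c, k in zip(chunk, code))
        bits.append(1 if score > 0 else 0)
    return bits


bits_a = [1, 0, 1, 1]
bits_b = [0, 0, 1, 0]

# Both speakers talk at the same time: the channel carries the sum.
channel = [a + b for a, b in zip(spread(bits_a, CODE_A), spread(bits_b, CODE_B))]

recovered_a = despread(channel, CODE_A)
recovered_b = despread(channel, CODE_B)
print(recovered_a, recovered_b)
```

Neither speaker got louder here. All of the work is done by the decoder knowing which pattern to look for.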
A similar case can be made for how our attention works. Despite all the wonders of our inner logical faculties, we’re still just like every other animal in that we rely on feedback to learn how to navigate our environment. Although we often make deliberate choices and suppress our impulses, emotions, bodily sensations, and experience remain the primary determinants of our behavior. Our conscious minds, although undoubtedly impressive, are strongly limited in both their knowledge and control of the information we process and the choices that we make. The backbone of our decision making through the day is comprised of habits, each of which may be accessible to our consciousness to some degree. For a long time, I assumed that habits were just a matter of repetition, that if you repeat something enough, you start doing it automatically. This is definitely true to a degree, but there was another mechanism I wasn’t considering until I came across The Power of Habit by Charles Duhigg.
Duhigg’s fundamental insight is that habits operate as a loop, and must complete a circuit in order to be effective. The loop consists of three parts: a cue, an action, and a reward. A good, albeit very basic example, is the loop of hunger -> eating -> satiety. Forming a habit is a matter of feedback: a specific action in a specific context (cue) leads to a specific outcome (reward). Without such feedback, there is no basis for connecting a cue and a reward. This process, although very elementary, is the primary building block of learning and mastering skills. We instinctively engage in trial and error by creating behaviors to seek out a very specific reward: control over our surroundings. If I go outside and start shooting basketballs, every shot gives me some information about what to do the next time, and I get feedback in the form of making more baskets per shot. If I were to not improve even after an entire afternoon, however, I would get frustrated and not find the task very engrossing. This is very useful since it tells us not to waste time on things that we either can’t control or can’t make sense of. These habits of trial and error gradually layer on top of one another, and can become such sophisticated patterns that a chess grandmaster can feel excitement or anxiety at a configuration of pieces that means nothing to a novice.
When feedback becomes reliable and informative to the point of complete engrossment, the person is said to be in a state of flow, a term coined by the psychologist Mihaly Csikszentmihalyi. Although widely celebrated in many circles, a substantial case against it was made by the computer scientist and self-help author Cal Newport, who asserts that “flow” is a nice way to enjoy the skills one acquires through hard work, but that real practice is ultimately about delayed gratification. For a very long time, I almost entirely agreed with him. It certainly is the case, for example, that however much joy I take in programming or writing, it requires some mental effort, and while sometimes engrossing, does not usually feel like I’m gently coasting downstream. What I found odd, however, is how much of a struggle it has been at varying points in my life to stay focused on something that I’m genuinely passionate about. Even if I would rather be doing it than watching TV, there were times where I felt incapable of it. At some point in time, something else struck me as extremely odd: I never encountered these problems when playing video games. While mathematics was often a struggle due to the fact that I would make careless mistakes and lose track of what was on the page, a video game can so effortlessly take me away from reality that it’s not uncommon for me to not realize that several hours have passed. Nor did improving seem to necessarily require any kind of delayed gratification (though it is true that if you’re serious about, say, StarCraft [I'm not], then you need to resist the temptation to get easy wins through amateur moves and learn the techniques that will help you truly improve.) What was it about video games that could make my attention this laser sharp?
Two Faces of Flow: Learning vs. Addiction
I finally realized a probable answer a few weeks after I re-read a post on Ribbonfarm called The Calculus of Grit, in which the author argues that, contrary to popular belief, those who become masters of a craft do not have superhuman willpower. Rather, they’ve become adept enough at their own field that they can leverage their past experience as a means of exponentially accelerating feedback. What looks like “willpower” is actually very specific to the particular pursuit; an amazing writer may spend 8 hours a day completely focused on his writing, but may otherwise have horrible self-control when it comes to cleaning his house or sitting down to pay his taxes. You’d also be unlikely to find him sitting through a course on differential equations or spending 12 hours a week bodybuilding at the gym. All of this makes sense if you think about it in terms of aptitude: if you have no aptitude for programming, you won’t understand your own mistakes and the lack of feedback will make it a painful and most likely fruitless process. If you have some skill on the other hand, then you’ll progress so long as you’re interested and given the right set of challenges; too easy, and you’ll get bored, too difficult, and you’ll get frustrated.
All this suggests that self-discipline is a local phenomenon; it occurs almost entirely in places where there’s feedback. With this feedback comes improvement, and with improvement comes even greater returns on feedback. Why? Because the better you understand what you’re doing, the more sensitive you are to the feedback you’re getting. If someone like myself, who has never once been serious about playing chess (though I was naive enough to think that I was good at it when I was 12), were to study the record of a game between two grandmasters, I’d have trouble gleaning anything useful from it. For a serious student of chess, however, there exist all kinds of patterns that they could see from years of study and experience. It also goes without saying that a more experienced chess student would find the literature more engrossing than even the most enthusiastic novice. Putting two and two together, it finally hit me that attention can be specifically defined as sensitivity to feedback.
So what does all this have to do with video games and their uncanny ability to fully engross even the most scatterbrained individuals? The answer lies in the specific type of feedback provided by video games. Most video games, in comparison to many other challenges, provide feedback that is extremely loud, extremely frequent, and extremely simple. While a game like chess resides in a world of abstractions where a move in and of itself provides little in the way of stimulation, a game like Doom or Halo accessorizes every decision, however minor or inconsequential, with a pattering of footsteps or a loud explosion. Immersed in a dazzling audiovisual spectacle, the player is constantly saturated with shiny objects that draw them into the world created by the game. Even a simple strategy game from the 90s offers an immediate sensory thrill that can’t be rivaled by a book, a chessboard, or a board full of equations.
But the spectacle itself is not the central mechanism by which games hook themselves into the player, but rather a supplement. It’s the very structure of the feedback: most video games are designed so that the simplest tasks are rewarded by some tinge of satisfaction. A few more points for every monster slain, various badges for different accomplishments, and all done at a pacing that gives the player a cookie right before they get bored. The role of the game’s audio and visual elements is to create a sensory anchor for this feedback loop, providing visceral cues and rewards for the habit loop constructed by the game. The player literally sees and hears the satisfaction that will come from obeying the cue that has come up.
All of this comes at a price. Complex systems learn by adjusting to feedback, and feedback that is sufficiently loud and frequent will oversaturate the system’s inputs, leading it to reduce its overall sensitivity in order to register changes. When instant and immediate gratification becomes the norm, more subtle forms of feedback become harder to register. Getting engrossed in a book becomes increasingly difficult. The same goes for different kinds of stories: it’s easier to sit through an action movie than a drama because the story is simple and the movie is mostly comprised of satisfying bits of conflict resolution in the simple form of karate chops and shootouts. We might force ourselves to sit through a few chapters of Tolstoy, but the real issue is that we ultimately have to re-calibrate our receptivity to feedback in order to gain interest in more subtle flavors of experience.**
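The desensitization described above can be sketched as a toy model. Everything in it is an assumption made for illustration: the function name `perceived`, the `adapt_rate` parameter, and the idea that how strongly something registers is just the stimulus divided by a baseline that slowly drifts toward recent stimulation. Under those assumptions, the same subtle signal registers far less after a stretch of loud feedback than after a stretch of quiet.

```python
def perceived(stimuli, adapt_rate=0.2):
    """Return how strongly each stimulus registers against a drifting baseline.

    The baseline starts at 1.0 and moves a fraction `adapt_rate` of the way
    toward each new stimulus, so constant loud input raises it over time.
    """
    baseline = 1.0
    responses = []
    for s in stimuli:
        responses.append(s / baseline)
        baseline += adapt_rate * (s - baseline)
    return responses


# Twenty quiet moments followed by one subtle signal of strength 2...
after_quiet = perceived([1.0] * 20 + [2.0])
# ...versus twenty loud moments followed by the same subtle signal.
after_loud = perceived([10.0] * 20 + [2.0])

print(after_quiet[-1], after_loud[-1])
```

The final stimulus is identical in both runs. Only the history that precedes it differs, which is the sense in which sensitivity has to be re-calibrated rather than forced.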
At this point, I may understandably sound like a puritanical naysayer conjuring the cultural paranoia of generations past, and I wouldn’t blame you for thinking so. So I should clarify that video games are not categorically bad. Attention is a local phenomenon, and reduced sensitivity to a stimulus is a valid adaptation. We stop listening to the boy who cried “wolf!” because it’s a waste of time and energy to get into a frenzy over consistently false information. Similarly, becoming wired for more frequent and intense feedback might prove beneficial in some scenarios: while the internet might have made us a bit less singular in our focus, it can be slightly painful to watch people of the baby boomer generation work with a computer as if it were a complex and dangerous welding machine at a manufacturing facility. Nor would it be fair to say that video games never amount to anything more than a digital cocaine-pellet dispenser. While I myself don’t understand the appeal of being a professional StarCraft player, I’ve made a hobby of watching some professional games and have noted the degree to which these players have exhaustively studied the possibilities and developed a rigorous set of techniques that occasionally branch out into subtle novelties that throw the other player off guard. Unlike someone playing just for fun, they’ll watch replays of previous games and deliberately practice both fundamental maneuvers and techniques that they’re not used to in order to improve at the game.
Taking this into account, it’s apparent that while video games are just one of many sets of stimuli that we adapt to, there’s still a fine line between the sort of addictive behavior that constitutes sitting in front of the TV all day with nothing to show for it and becoming a professional gamer through the kind of consistent deliberate practice that most players wouldn’t feel compelled to engage in. To Cal Newport, this is the distinction between flow and deliberate practice: one involves the joyful feeling of getting lost that can only happen through reckless instant gratification while the other involves the hard work of resisting that temptation and practicing what is difficult and frustrating until you get it right. I don’t think he’s entirely wrong in practice, but I’m convinced that he has set up the wrong dichotomy; it hearkens back too easily to the folk concept of willpower, which involves hanging in there in the absence of meaningful feedback (in his defense, that’s not exactly what he’s saying, but he very clearly states his opposition to the idea of flow as being conducive to improvement.) By contrast, I think that the dichotomy is between two different kinds of flow, one that promotes growth, and another that promotes atrophy.
To get a rough idea of the difference between the two, imagine a very large linoleum board with many different interconnecting grooves etched into it. It has all kinds of rivers, hills, valleys, basins, and mountains. Now take a pinball and place it onto the board, watching it roll downhill traversing various nooks and crannies. Things stay interesting as long as the ball continues to roll onto a new path, not stopping for good at any one location. But imagine that it reaches a wide basin where it starts circling around but never gathers enough momentum to get out. From here, the path of least resistance is to stay in one place, slowly losing momentum as it never travels anywhere else, eventually coming to a full stop.
As long as the ball is moving, it’s learning: the path of least resistance offers territory that hasn’t been previously explored. By contrast, once it enters a basin, the ball’s tendency is to stay there; following the path of least resistance has caused it to stay in a single place. This analogy is admittedly a very flawed one, but I chose it because not everyone who reads this blog is necessarily familiar with chaos theory, which provides a much more faithful version of the same idea. For the rest of you nerds, you can imagine that we are learning so long as we haven’t fallen into a basin of attraction, after which we cascade towards a point of total repetition. Even more technically, you can imagine that learning is a strange attractor, and that addiction is a state in which we get dragged closer to a stable attractor; the stable attractor itself being some kind of literal or figurative death.
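For a rough feel of the difference, here is a minimal sketch using the logistic map, a standard toy system from chaos theory that I am swapping in purely as an illustration (it is not a model of learning or addiction). The parameter values 2.8 and 3.9, the starting point, and the helper names are all choices made for this example. At the lower setting the system falls into a single resting value; at the higher one it keeps visiting new values without ever escaping its bounds.

```python
def logistic_orbit(r, x0=0.2, steps=200):
    """Iterate x -> r * x * (1 - x) and return every value visited."""
    x = x0
    orbit = []
    for _ in range(steps):
        x = r * x * (1 - x)
        orbit.append(x)
    return orbit


def distinct_tail_values(orbit, tail=50, digits=6):
    """Count how many different values (after rounding) appear near the end."""
    return len({round(x, digits) for x in orbit[-tail:]})


# r = 2.8: the orbit is pulled into one fixed point and stays there.
settled = distinct_tail_values(logistic_orbit(2.8))
# r = 3.9: the orbit stays bounded but keeps wandering.
wandering = distinct_tail_values(logistic_orbit(3.9))

print(settled, wandering)
```

Both runs follow exactly the same rule. The only thing that changes is one parameter, which decides whether the rule leads somewhere new or settles into a single point.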
Whatever analogy you prefer, the key difference is that when we’re learning, the feedback guides us to new information and new possibilities, and when we’re addicted, we remain in a zone of comfort because the feedback encourages us to engage in repetitive behaviors. While this is easy to see in the case of learning how to play the guitar versus spending an entire weekend watching Netflix, it can also take more subtle forms. Many people who are allegedly “workaholics” find comfort in the validation of staying within zones where they feel strong. This behavior is not necessarily addiction: after all, we avoid excessive failure because there’s little point in spending time on something that we have no aptitude for. If taken too far on the other hand, workaholism can not only become a means to avoid other uncomfortable areas of life such as socializing and personal development, but can even arrest the person’s development at that particular task by driving them to consistently punch below their weight class in order to avoid the possibility of failure. The epitome of such a person might be Brian from Family Guy or Oscar from The Office, both of whom circle a drain of mediocrity as they validate themselves by chiming into conversations with fragments of mainstream quasi-intellectual trivia that nominally qualifies them as “the smart guy” of the group. This kind of identity-wearing is in fact a very strong sign of someone who engages in strength-based addictions. At a greater extreme, such addictions can take the form of alcoholism or drug addiction, which provides the means for instant pleasure and/or pain relief and is Brian’s eventual crutch as he goes from Brown dropout to depressed house pet.
Although alcoholism is not the same thing as avoiding failure, let alone doing some task with no purpose other than repetitive instant gratification, they are for our purposes the same systematic behavior, albeit at much different magnitudes. When feedback fails to foster growth, the inevitable outcome is atrophy, as the subject not only fails to expand their knowledge, but in fact becomes further trapped in a habit loop with diminishing returns as the subject’s sensitivity to feedback gets dulled by repetitive stimuli. It’s also a relative phenomenon: Bobby Fischer may have been addicted to chess as a way of (self-admittedly) avoiding the world outside the board (Boris Spassky also saw chess as an escape, having found it as a sanctuary while growing up in poverty in the USSR), but he was nonetheless constantly pushing his limits at the game, never letting himself become complacent with his own abilities. The distinction between learning and addiction is useful here insofar as it explains when flow is conducive to improving at something, and when it facilitates the exact opposite. All this leaves the question of how we can control this process and ultimately engage in a state of flow while avoiding addiction.
Willpower Revisited: Stress Responses and Signal Amplification
While I fundamentally disagree with Cal Newport in his belief that flow is inherently opposed to deliberate practice, that doesn’t mean that his ideas are entirely wrong. Most of the things he says are compatible with the dichotomy that I outlined: that deep procrastination is the result of not having a viable plan (in other words, that procrastinating comes from a lack of intelligible feedback), that passion is the result of mastering something you have some aptitude for and not some pre-determined magic bullet, and that developing a sustainable and effective road to expertise requires taking an approach as a “craftsman”, gradually tinkering with what you do via small bets.
All three of these ideas fit in with the notion that expertise is based on a process of feedback, and certainly don’t contradict any of what I’ve said about flow. On a macro level, all of his ideas about how to become successful focus on working where there’s feedback and resisting the temptation to attempt go-for-broke efforts, which he identifies as the courage fallacy. On a micro level, however, the difference between our dichotomies becomes significant. Newport’s distrust of “flow” comes from the fact that, as the name suggests, it’s an act of following the path of least resistance. In terms of my dichotomy of learning vs. addiction, Newport would likely see flow as being inherently addiction-based. To avoid such addictive behaviors requires “deliberate practice”, in which one applies their willpower to work outside of their zone of comfort.
The difficulty of deliberate practice, as Newport himself notes, lies in the fact that there’s a vacuum of novelty that we are constantly tempted to fill. To Newport, resisting this temptation is a matter of willpower, and requires that we cultivate the metacognitive skill of hard focus. Cultivating this skill requires that we build it up through training, in the same way that one might train for a marathon. From this point of view, “hard focus” is a habit that we reinforce through practice. The issue I have with this point of view is that habits do not exist in a vacuum: they rely on cues and rewards, which may vary wildly depending on the task being engaged in. We may be able to improve our general ability to diligently push ourselves through tasks by developing better metacognition about our habits, but just like playing chess only improves one’s memory for chess positions, our raw “focus” is likely task-specific. The belief in a more universal notion of willpower or focus instead seems to come from a general analogy we’ve drawn that willpower is a muscle, which ironically betrays the flaws behind our folk concept of not just mental performance, but also physical performance.
Although exercise often leads to an overall increase of muscle mass, it is hardly the only factor in our ability to perform physical tasks. A person’s ability to lift a certain amount of weight in a certain way depends not only on the muscles used, but also their neural coordination, the ability of their metabolism to apply energy in the right places, and the distribution of muscle fibers (fast twitch vs. slow twitch). In fact, if we use Newport’s chosen analogy of running a marathon (which comes from Haruki Murakami’s excellent book, What I Talk About When I Talk About Running), the analogy falls apart even further. Due to the nature of the Krebs Cycle, aerobic exercise works primarily by optimizing the body’s metabolic pathways for a specific task: a person who is a champion at the stair-master may be completely unable to run a decent mile outdoors or on a treadmill.
None of this is to invalidate Newport’s legitimate concern that the path of least resistance, at least in a world offering constant novelty, will likely lead us down a path of addictive behaviors that bombard us with the most frequent, loud, and easily acquired forms of novelty. Just as our temptations to gorge on sugar, starch, and fat are the legacy of a world where calories were relatively scarce, our appetite for novelty is the legacy of a world where information was scarce: data only became abundant with the invention of the printing press, and such abundance has now been dwarfed by the arrival of the internet. The issue is that the folk concept of “willpower” does not seem to offer much of a solution. Just like applying “willpower” to the issue of dieting separates our decisions from our metabolic signals with a causal hatchet, looking at focus through this same lens ignores the habitual, emotional, and physiological factors that play a role in our decision making about work. Luckily, the analogy of “willpower is a muscle” can also lead us to another view that doesn’t force us to create a mythical homunculus that polices some vaguely defined set of our actions.
In order to get rid of this causal separation of willpower from the rest of our decision making apparatus, we need to get rid of the notion that our willpower works completely independently of feedback from other systems. In recent decades, cognitive scientists have begun this process by studying phenomena such as ego depletion, in which one loses their ability to resist temptation due to either physical fatigue or mental fatigue due to applying too much willpower without recovering. Studies have also shown that willpower is lessened when the subject is busy concentrating on some other task, due to a phenomenon called cognitive load. Understanding the phenomenon of impaired judgement under cognitive load is easy if you realize that the connection between our consciousness and our actions has a very low bandwidth and we do not have the resources to make very many conscious decisions at once. Imagine, by analogy, the president of the United States: he makes a lot of key decisions, but he certainly can’t micromanage every piece of legislation being passed in every county, municipality, and state. He can only make so many decisions, and must work from a bird’s eye view, leaving much of what happens to the logic of an intractably complex system.
This “bandwidth limit” is in fact one of the things that makes the concept of willpower so troublesome: it assumes that the best decisions are made consciously, which ignores that the vast majority of the information needed to make coherent decisions resides in complex systems that are invisible to our conscious selves. The idea that willpower is some separate mechanism that acts completely independently, albeit constrained by some “energy budget” (as ego depletion suggests), implies that there is some strict binary between a decision that’s consciously willed and one that’s not, something that doesn’t make sense if every decision requires a significant amount of unconscious information, and makes even less sense when we consider the emotional and physiological factors that continually shape our conscious experience. On a more practical level, this is important because sometimes our instincts are good for telling us what we “should” do, leaving the bigger question of how much “willpower” we ought to have in the first place.
With the help of some simple mathematical concepts, however, it’s possible to escape this superstitious separation of subject and object by re-framing “willpower” as an issue of information rather than some raw force of conscious will. Consider again the concept of ego depletion: while we may only be able to concentrate on a difficult math problem for so many hours in a day and feel fatigued afterwards, it doesn’t compromise our more basic mechanisms of self-control. We may be more tactless after a 12 hour shift at work, but barring severe intoxication, we do not enter some state of absolute zero inhibition. It’s not that there are no limits to our self control, it’s that our capacity for self control decreases not linearly, but geometrically. This realization is actually a much more faithful interpretation of the idea that willpower is a muscle when one considers the difference between fast twitch and slow twitch fibers: the former take days to recover and get used up when lifting a heavy load, whereas the latter can fully recover in mere minutes, which is why no amount of weightlifting, barring severe injury, will prevent us from walking out of the gym. One might ask whether this is just a matter of habit in which our willpower is not required because we’ve created a routine that gets rid of the need for willpower. While there is some truth to this, it’s hard to untangle habit from self-control because all of our decisions act on feedback of some kind. Luckily, we can account for this non-linearity without forcing a dichotomy between “willpower” and habits by modeling the folk concept of “willpower” as a signal whose efficacy is based on sensitivity, rather than as an independent mechanism that draws energy from a finite gas tank.
In particular, I’d like to suggest that “willpower” is a stress-response that happens in the absence of feedback. To get an idea of how a stress response works, consider the hormone cortisol. Cortisol is one of several hormones that are secreted in the human stress response, and it helps us avoid danger by shifting our body’s resources so that getting rid of the threat is our top priority. Pain, normally a signal that tells us to avoid harm, is dulled, since surviving is more important than avoiding injury. Our digestive system also shuts down, and we are able to use up more of our body’s energy than usual, since there’s no point conserving energy if we’ll get killed doing so. Once the threat is gone, our body will go into rest mode, and we’ll become more sedentary than usual in order to recover the energy that was lost.
In a scenario where threats are sufficiently spaced out, there is no problem. Unfortunately, the modern world has brought on the phenomenon of chronic stress. Even though the stress is rarely as acute as a life-threatening situation, the stress-response is turned on for an abnormal amount of time and can easily enough desensitize us, leading to conditions such as adrenal fatigue. In more extreme scenarios, such as war, it can lead to conditions such as Post Traumatic Stress Disorder (PTSD). While the layman’s idea of PTSD is that it’s caused by acutely threatening situations, the evidence suggests a more nuanced view. Soldiers in the front lines actually have lower PTSD rates than logistics soldiers, due to the fact that they are in more apparent control of their situation than the people whose job is not to fight back but to keep the supply lines running. Even then, modern warfare involves being on high alert for hours, and sometimes days, on end, and often engaging with threats in a way that’s contrary to one’s individual self-interest. Like many disorders, PTSD likely comes not from the experience of acute stress, but from a mismatch between the signal of stress and the person’s response to said signal.
Another important fact about this stress-response is that it benefits us up to a certain point. Police officers, firefighters, and soldiers are put through a certain amount of training to blunt their response to dangerous situations, so that when a real emergency happens, their response hits the “sweet spot” at which focus and energy improve, rather than going too far and causing a total loss of control; something that can leave people too paralyzed to even dial 911. This upside-down-U curve of benefits, common to all such signals, is best visualized with a graph that I’ve already borrowed for so many posts:
Like all other responses to stressors, the stress-response to this lack of information follows an upside-down U shaped curve that benefits us up to a point before it starts to harm us. For the third time over the course of five posts, I’m going to post the exact same graph by Nassim Nicholas Taleb:
Oddly, we seem to have a similar stress response regarding attention. Just as a perceived loss of control can raise cortisol levels and ultimately cause anxiety and depression, the same thing seems to happen when we lack good feedback. When not sufficiently engaged in meaningful tasks, boredom, and eventually anxiety, slips in. An episode of Breaking Bad is fun to watch after you’ve called it a day, but spending an entire day watching TV is often a desperate bid to alleviate a sense of boredom and dread (I say this from experience.) Given the higher incidence of stress among those who feel less in control, and the tendency of cortisol raising drugs such as caffeine, speed, and various ADHD medications to significantly raise people’s focus, I’m actually convinced that cortisol is one of the hormones involved in this stress response, and that the stress response I speak of now is simply a variation on the basic biochemical response that moves us out of harm’s way. But I digress, as I have no intention of speculating on the biochemistry behind the ideas beyond certain basic patterns.
There is, however, one more aspect of stress that I haven’t covered: in order to be in control, one has to be able to sufficiently predict the cause and effect of their actions. Prediction is in fact so vital to our sense of security that the most acute (and ultimately damaging) stress responses happen when harmful and threatening stimuli cannot be predicted at all. Shocks administered without any rhyme or reason are far more stress-inducing than those given with regularity. From an evolutionary standpoint, this also makes sense: if we are in an environment that we cannot predict, we’re in serious danger. If you know when the predators come out and where, you can use that information to make safer decisions. In the absence of such information, it’s imperative that you figure out what needs to be done or move to an environment you’re more familiar with. In other words, an absence of feedback will create a stress response, causing us to either double down and search for more information, or get out of harm’s way as quickly as we can. In the modern world, this stress response makes us decide whether to “buckle down” and filter out peripheral information (thus increasing our sensitivity to feedback), or walk away.
The region of the graph that rests above the horizontal line is our beloved “sweet spot”. Here, there is some absence of feedback that we respond to by learning and adapting, whereas to the right, we become increasingly frustrated and restless as we fail to make sense of our surroundings and become increasingly likely to quit (the same logic can also apply to being bored by something that is too abstruse or irrelevant). To the left lies the zone in which feedback is abundant and there is little ambiguity in what we are doing, leading to the kind of addictive behavior in which we grab the low-hanging fruit at the expense of development. Since such behaviors can de-sensitize our stress-response, this addictive behavior is harmful in that it actually causes atrophy by causing us to regress by default to even less nuanced feedback. Most importantly, the stress response does not get “trained” in any way, but is rather a means to helping us become more focused at specific tasks. When we are able to hit the “sweet spot” on a regular basis, we can engage in the kind of deliberate practice that Cal Newport advocates.
So what implications does this have for improving our efficacy at tasks? Is this just a long-winded way of restating the idea that we need to be diligent and not just take the path of least resistance? In part, yes, but not without some details that can give us more useful information than the folk concept of “self discipline.” By talking about attention as sensitivity to feedback and willpower as a stress response in the absence of feedback, we can revisit Gilder’s wireless communications story as a way of understanding how to approach the issue of focus on a more practical level. If you recall the metaphor of the cocktail party, the approach of everyone trying to talk over the noise of the crowd will result in little gain and hoarse voices. In wireless communications, the equivalent practice is to use additional energy to boost the clarity of a signal over a channel. Unfortunately, just like in the case of the cocktail party, the returns diminish quickly, as it takes a quadratically increasing amount of power to boost a channel’s signal. Meanwhile, the loud noise from everywhere else has to be compensated for by using spare bandwidth to create thick buffers that block out interference, diminishing the efficiency of the wireless network.
The stress-response that dictates deliberate self-control works in the same way. Although a certain amount of it is necessary and even beneficial, we quickly get diminishing returns on the signal. Add to this that the stress-response gets blunted over time as we become decreasingly sensitive, and returns will diminish even more quickly. We can also block out noisy stimuli by filtering out irrelevant stimuli, but given our limited bandwidth, this too is a drain on resources.
Just like our stress-response peaks in effectiveness before hitting diminishing returns, a similar factor is at play in Gilder’s explanation of wireless communications. Going back to the example of the cocktail party, if everyone tries to talk louder, it will become no easier to hear the other person, and worse, everyone’s voice will go hoarse and everyone’s hearing will be shot as they constantly try to talk over one another. The same thing happens even less ambiguously in wireless communications: boosting the signal (turning up the volume) of a channel requires a quadratically increasing amount of power, which means that you won’t be able to cost-effectively boost the signal past a certain point. Our stress response, when utilized, undergoes the same dynamics, becoming increasingly ineffective with overstimulation.
More speculatively, I suspect the stress response has two separate mechanisms that mirror the channel-boosting/insulating dynamics of TDMA. By amplifying the signal with additional power, we increase the neurological “volume” of feedback by creating a bigger rush of neurotransmitters (or something similar) for every stimulus. In addition, the stress response uses up bandwidth in order to insulate us from interfering stimuli from elsewhere, thus leaving us limited in our ability to make other decisions during demanding times. Yet another way of looking at it is that our stress response, by creating such tunnel-vision, narrows the possibilities we have: perhaps this is what’s behind the experience many have with stimulants, in which their creativity is reduced. With this comes an additional cost: the more over-saturated we are with feedback elsewhere, the more resources we have to devote to boosting the channel and insulating it from interference. Our energy and bandwidth are finite, and bandwidth that is spent blocking distractions is bandwidth that can’t do other things. Meanwhile, the more energy we spend getting a signal through a single channel, the more bandwidth is wasted that could have been used for more meaningful pursuits. Personally, I’m sick of it: I have too many days where I get home from work feeling too exhausted and unfocused to be productive, but still feel a nagging restlessness. Luckily, I believe that there’s an equivalent to CDMA that maximizes our available bandwidth, and greatly reduces the amount of energy needed to create a clear channel of feedback.
Attention Gardening: Via Negativa and Craftsmanship in Extremistan
Unsurprisingly, there’s a lot we can do if we think about all this in less linear terms. Going to the gym, even if one goes several days a week, only takes up a small slice of our waking hours, but the intensity of the effort significantly shapes our metabolism by initiating cascades of signals. Many, myself included, have found success in intermittent fasting by creating a large systemic impact using only a brief period of stress. In both cases, applying a relatively small amount of energy resulted in chains of feedback with convex effects. The upside-down-U curve that I showed earlier actually mimics this, as (up to a point) the value of f(x) not only increases faster than x, but in fact accelerates. The reason behind this is that the right chain of feedback will have compounding returns; in other words, it will be a positive feedback loop. More importantly, you don’t need to always get it right: as Aaron Brown in Red Blooded Risk explains about investing, “Successful risk taking is not about winning a big bet, or even a long series of bets. Success comes from winning a sufficient fraction of a series of bets, where your gains and losses are multiplicative. That pattern of gains and losses leads to exponential growth. This appears to observers as overnight success.” Nor am I the first to consider this approach to productivity: both Venkatesh Rao and Gregory Rader have talked about such an idea in terms of achieving “thrust” in order to make accelerating progress in a pursuit. Unsurprisingly, their parabolic thrust/drag model once again mimics the curve that I’ve repeatedly talked about in this entry (and this blog in general.)
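To make Brown's point about multiplicative bets a little more concrete, here is a minimal sketch. The 20% gain, 10% loss, and 6-in-10 win pattern are numbers I made up purely for illustration, and the function name compound is my own; nothing here comes from Red Blooded Risk itself.

```python
import math

def compound(outcomes, gain=0.2, loss=0.1, start=1.0):
    """Apply a series of multiplicative bets to a starting stake.

    outcomes: iterable of booleans, True for a win and False for a loss.
    gain, loss: hypothetical fractional gain on a win and loss on a defeat.
    """
    stake = start
    for won in outcomes:
        stake *= (1.0 + gain) if won else (1.0 - loss)
    return stake

# Win 6 bets out of every 10, in a fixed repeating pattern (an assumption).
pattern = [True, False, True, True, False, True, False, True, True, False]

for rounds in (1, 5, 10):
    n_bets = rounds * len(pattern)
    print(n_bets, "bets ->", round(compound(pattern * rounds), 2))
```

No single bet matters much, and four out of every ten are lost, yet the stake after 100 bets is far more than ten times the stake after 10 bets, because each round of ten multiplies whatever came before it.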
The thrust-and-drag analogy has a lot in common with my own analogy to Gilder’s talk on TDMA vs CDMA in wireless communications. What they call “drag” can be identified as interference that makes the channel noisy. Just as it costs a quadratic amount of power to boost the signal over a linear addition of noise, drag has a quadratic effect on the trajectory of a projectile. Therefore, reduction of drag is crucial to cultivating convex returns on your efforts. All this suggests that Jensen’s Inequality is a crucial element of any sound productivity strategy: the dose must be concentrated enough for a positive feedback loop to occur, so up to a point, returns are convex.
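For anyone who wants the formal statement behind the appeal to Jensen's Inequality, here it is with a toy example. The response function f(x) = x^2 is an assumption chosen only because it is convex; it is not a measured dose-response curve, and on the concave (right-hand) side of the upside-down U the inequality flips.

```latex
% Jensen's inequality for a convex function f and a random dose X:
\[
  \mathbb{E}[f(X)] \;\ge\; f(\mathbb{E}[X]) .
\]
% Toy example with the assumed response f(x) = x^2.
% Both schedules have the same average dose of 1 unit per day:
% "pulsed" alternates between 0 and 2 units, "steady" is 1 unit every day.
\[
  \underbrace{\tfrac{1}{2} f(0) + \tfrac{1}{2} f(2)}_{\text{pulsed}}
  = \tfrac{1}{2}\cdot 0 + \tfrac{1}{2}\cdot 4 = 2
  \;>\;
  \underbrace{f(1)}_{\text{steady}} = 1 .
\]
```

Same total effort, twice the payoff, but only for as long as the response stays convex.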
This still leaves two things left to consider: first, we only have so much “rocket fuel” to expend, and creating sufficient “thrust” will require a significant up-front investment. Is there any way that we could cut our investments into small pieces in the way Aaron Brown recommends? Second, a positive feedback loop can often be (and may inherently be) an addiction in the way that I described earlier; all learning ultimately relies on a degree of negative feedback, which is behavior that corrects by filling informational gaps. It’s worth noting that Venkat himself, in the article that I posted, uses the word “addiction” to describe the flow of a creative task, and in fact inspired my own slightly more technical definition of addiction with his own similar idea known as Gollumization. These two points considered, how can we maximize such “thrust” without falling into addictive behaviors or falling back on naive ideas about self-discipline?
While Rao and Rader are both right on the money regarding the necessity of removing drag, I think that moving beyond a thermodynamic analogy can provide us with a better outlook on how to create momentum in our pursuits at minimal cost. In informational terms, drag is the absence of feedback that will be acted on via the stress response I talked about earlier. While too many distractions can definitely cause this, so can a lack of overall sensitivity, which can occur if we’re used to receiving intense novelty on a frequent basis at little cost. It’s also worth noting that this “drag” is reduced with increased adeptness, as we gain a greater sensibility for the actual subject at hand; like I said before, we cannot register feedback if we’re working with something we simply don’t understand. For this reason, it’s not only important to set the appropriate difficulty level (too easy is addictive, too hard is unintelligible), but to make sure that the pursuit automatically calibrates the difficulty level as we advance so that we don’t find ourselves struggling to stay focused after a series of promising initial gains.
To get an idea of how this “sensibility” matters, consider the use of unique codes in CDMA. These unique codes are usable because the decoders on the devices are extremely powerful and intelligent. By analogy, we can work according to feedback with minimal excess energy by having the sensibility and experience to see the nuances in the feedback that we’re receiving. I don’t think this is just a matter of using skills that we’ve already learned, but in fact, the reason why we can see such “accelerating returns” in creative pursuits, and even gain the metacognition to become better at focusing and learning at broad arrays of tasks. Although it’s known that becoming good at chess doesn’t improve memory in any area except for chess positions, it still seems to be the case that broad erudition makes us more suited for the uncertain future. Although I do not know of any empirical evidence, I strongly suspect that the ability to break domain dependence by reasoning through analogy allows us to draw more general lessons from earlier pursuits to accelerate the necessary learning curve of later ones. In a broad sense, becoming more sensitive to certain kinds of feedback not only frees up room to listen to new kinds of feedback, it also provides information of its own that we can apply to increase our sensitivity to those novel forms of feedback (thus freeing up even more room in a kind of virtuous cycle.)
The other issue with talking about chess is that as an activity, it’s a well-defined closed system. Closed systems can be mastered, albeit with difficulty, through a relatively straightforward kind of practice in which feedback comes at a reliable pace and common lessons can be easily passed down through more experienced practitioners. For less definite fields, feedback is not so straightforward, there is not nearly as much of an externally defined set of rules, and worst of all, even when the former and the latter both seem to be true, the logic of the system may be much more wild and unpredictable than it looks, even if it looks super calm.*** In the case of these open systems, metacognition has more of an impact, because there is no definite set of rules from which you can derive the logic of the system, and because the possibility of extreme outcomes means that insights can make you either fragile or antifragile. But once again I digress; this is getting into a whole other topic that I’d like to one day discuss, but simply can’t right now.
What’s worth noting about the analogy to CDMA is that it indeed does seem possible to quiet all of the individual channels down and maximize our productivity by learning how to become more sensitive to feedback. In fact, if one considers Claude Shannon’s Noisy-Channel Coding Theorem, we can maximize the effectiveness (transmission rate) of our time (bandwidth/channel-capacity) by reducing interference (error) to virtually zero by creating the right transmission code. In this case, I think that there’s strong reason to believe that our transmission code is the sensibility we develop. Such a theorem suggests that the approach of creating more complex codes while avoiding power-boosting really is the optimal approach to wireless communications. This is admittedly speculation, as I still have much to learn about information theory and how to apply it, but I think that this possible insight can help us understand, by analogy, how to gain productivity through finesse rather than brute strength.
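As a rough illustration of why brute-force power boosting pays off so poorly, here is a minimal sketch using the Shannon-Hartley formula for an idealized channel with Gaussian noise, C = B * log2(1 + S/N). The function name, the 1 Hz bandwidth, and the list of signal-to-noise ratios are my own choices for the example, and mapping any of this onto attention remains pure analogy.

```python
import math

def capacity_bits_per_sec(bandwidth_hz, snr_linear):
    """Shannon-Hartley capacity of an idealized Gaussian-noise channel.

    bandwidth_hz: channel bandwidth in hertz.
    snr_linear: signal power divided by noise power (a ratio, not decibels).
    """
    return bandwidth_hz * math.log2(1.0 + snr_linear)

bandwidth = 1.0  # hypothetical 1 Hz channel, so results read as bits/s per Hz

# Each step roughly doubles the signal power relative to the noise.
for snr in (1, 3, 7, 15, 31):
    cap = capacity_bits_per_sec(bandwidth, snr)
    print(f"SNR {snr:>2} -> {cap:.1f} bits/s")
```

In this model each additional bit per second per hertz costs roughly a doubling of signal power, whereas doubling the bandwidth doubles capacity outright. The exact scaling differs from the quadratic figure quoted from Gilder, but the lesson points the same way: past a certain point, turning up the volume is the expensive route.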
I think that the general idea can be achieved by using the logic of nonlinear cascades used above. By not overusing our “stress response”, increasing our sensitivity to feedback, and properly structuring/planning our tasks such that feedback comes to us in the right intervals, we can spread our energy widely without tiring ourselves out or obviating the possibility of compounding returns. To do this, I advocate for an approach in which we apply our energy not to running some kind of a marathon, but instead to “planting seeds”. Rather than trying to beat a challenge by force of will, or merely do the opposite and take the path of least resistance (which could just land us right into addictive patterns), we should make it our task to create the right conditions for productivity and allow the actual work to develop based on feedback. In other words, our job is not to make it happen, but to make sure that external conditions allow it to happen.
This is not just a matter of “rationing willpower”, but is actually key to an intrinsically better learning process. Our stress response, which comes from the absence of feedback, is a signal, and as such, is meant to tell us when to double down and when to walk away. This can be very helpful, since if you don’t have much aptitude for something, or if you don’t have a pressing reason to do something, then it’s probably a waste of time and energy to do it. If there were no limits to our “willpower”, we would be able to easily override our instincts, gut feelings, and tacit knowledge in such a way that our conscious awareness, armed with much less knowledge than we think, would endanger us by overwriting the deeper logic that exists beyond our awareness. The reason why I am making an analogy to gardening is that plants grow according to their own fractal logic; something that we cannot and should not have any control over. Our job, as gardeners, is to facilitate this logic, such that the plant remains in a state of growth rather than a state of atrophy. With this in mind, how can we create such conditions?
First, we need the channels to be quiet. We only have so much “power”, as our stress response will lose its effectiveness with too much coffee, forced focus, and anxiety. The more that we can cut down on means of artificially raising the stress response, and the more that we can get away from noisy stimuli such as excessive TV, video games, social media, and news, the more sensitive we’ll be to feedback on the whole. Once there are few enough clouds in the area, we can decide to actually plant the seed. Planting the seed still requires some non-negligible amount of investment, but since this works as a stress-response and not as a finite gas tank of “willpower”, we can get more for our money by increasing our sensitivity to the stress response, and can preserve sensitivity to the stress response by having it work in moderate pulses rather than chronically activating it (yet another instance of Jensen’s Inequality). Once planted, our tree will grow according to the path of least resistance, dictated by an informationally rich logic that does most of the work for us. We may, however, have to prune the branches every so often if there’s something going seriously awry; if we enter into a cycle of addictive behavior, then it is time to intervene. Knowing when to do this is not an algorithm but a matter of metacognition, but it relates back to another one of Taleb’s heuristics involving Jensen’s Inequality: let minor problems take care of themselves, but do not hesitate to intervene in serious rare threats. Nor does it have to be perfect: since feedback is part of even our most high-level conscious experiences, we should be content to cut our trees in a wabi-sabi kind of way.
The tree will eventually bear fruit, and that will give us the means to plant yet more seeds. What’s important to note is that our conscious role in all this is the metacognition necessary to make the system run. Beyond that, too much tampering will replace the nuanced information of the feedback cycles with sloppy pseudo-approximations made within the limited scope of our awareness. So on a final note, don’t ever make “goals” or “plans” or “schedules” in the traditional sense. Such management is good if and only if it’s about setting up the conditions for an information-rich learning process. Although all of this seems like I’ve gone way too far to explain a few simple concepts, I think that this appreciative model allows us to move beyond platitudes and actually come up with real reasons why expectations and beating ourselves up over failure are not parts of a good strategy, nor does it make sense to treat focus like a raw force of will. Focus comes from being sufficiently sensitized to feedback, a product of a well-calibrated stress-response, fine-tuned sensibilities, and the proper alignment of skill and complexity. In other words, it’s a matter of preserving the integrity and clarity of new information. Learning this has led me to a much more hands-off approach where my primary concern is looking at what major events will trigger or inhibit compound intellectual and creative growth, and has made me wonder if we can see substantial changes in how we think about learning disabilities in the same way that we’ve gained a more nuanced and effective understanding of obesity. Best of all, I might just get over the fact that I’m far from satisfied with this essay.
*Some of you might be wondering if this is a sort of moral nihilism. Hardly: I believe that morality is a matter of accountability. Whatever the reason was that we did something wrong, it’s imperative that we be held accountable so that people aren’t encouraged to go do it. Justice is about the necessary task of honest accounting, doing only what’s necessary for the sake of holding society together, not vengeance. Going beyond that is immoral. And yes, this is just my opinion.
**A similar argument was made by Nicholas Carr in The Shallows, but I had not connected his idea with video games, since I figured “I’m not multitasking.” I also don’t think that his argument was as fundamentally about feedback, though perhaps I’m not giving him enough credit.
***This is based on a relatively technical probabilistic/statistical point that the variance of a statistical sample does not necessarily reveal the variance of a true generator. In other words, the fact that something looks tame doesn’t mean that it actually is tame, as there may be black swans waiting. See The Black Swan or Antifragile for further reading.
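The point in this last footnote can be sketched with a toy generator. Everything about it is made up for illustration: the 1-in-1000 chance of a jump, the jump size of -100, the 100-point samples, and all of the function names.

```python
import random

JUMP_PROB = 0.001   # hypothetical chance of a rare extreme event
JUMP_SIZE = -100.0  # hypothetical size of that event

def draw(rng):
    """One observation: usually mild noise, very rarely a huge loss."""
    if rng.random() < JUMP_PROB:
        return JUMP_SIZE
    return rng.gauss(0.0, 1.0)

def sample_variance(xs):
    """Unbiased sample variance of a list of numbers."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

def true_variance():
    """Variance of the generator, computed from its definition."""
    mean = JUMP_PROB * JUMP_SIZE
    second_moment = (1 - JUMP_PROB) * 1.0 + JUMP_PROB * JUMP_SIZE ** 2
    return second_moment - mean ** 2

def quiet_fraction(trials=1000, n=100, seed=0):
    """Share of n-point samples whose variance is under half the true one."""
    rng = random.Random(seed)
    quiet = 0
    for _ in range(trials):
        xs = [draw(rng) for _ in range(n)]
        if sample_variance(xs) < 0.5 * true_variance():
            quiet += 1
    return quiet / trials

print("true variance:", round(true_variance(), 3))
print("share of samples that look tame:", quiet_fraction())
```

Under these assumptions the large majority of 100-point samples contain no jump at all, so their measured variance sits near 1 while the generator's true variance is roughly eleven times that. An observer looking at any one of those samples would have no statistical hint of what is waiting.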