Coffee: Visualizing the Response Function

I have been calling my 3D response function a “surface”, which is linguistically feckless at best. Of course, I have some company: the immortal Box, Hunter, and Hunter call the experiment design a response surface method. So note carefully: the response function is not a surface; it completely fills a volume.

Naturally, this is hard to visualize. The contour plots are slices through the response space, but I really have a hard time inferring what my coffee quality function is telling me. Better visualization to the rescue.

First, put the contour plots together into slices, using just the shaded parts and not the lines. Note that these cuts run through the center point of the design, and not the stationary point.

This is still quite difficult to interpret. The blue and green corners of the areas are clearly to be avoided, while the red-tinted areas near the top are to be embraced. Red, in all these plots, is a higher value of q than blue. The weird little balls are the points on the steepest ascent experiment path. Their radius corresponds to their predicted q values. Tufte, of course, would have a fit; clearly the volume should be proportional to the indicated variable.

Those planar cuts are nice to look at, but we can do even better. Add “isosurfaces”—these are surfaces that lie on constant values of q. In two dimensions a topo map shows lines of constant elevation. For 3D data, the lines become surfaces. In the following figure I’ve kept the horizontal cut, and added isosurfaces. The big black line shows the steepest ascent.

This is getting useful. To make a good cup of coffee, you want to be somewhere in the red cup on the upper-left-front corner. The zone is not enormous. To more clearly visualize this, here is the same concept from a few different angles.

So, here we are, and I still don’t have a good idea what the response function looks like outside the design space. Clearly there is a region in the middle which is relatively high quality, and it appears that higher C/W ratios are favored.

Finally, to get a solid sense of the response function I thought it would be helpful to zoom out. The next graph shows the response function extrapolated over a much wider range.

The little box in the middle is the design space. In this case there are four isosurfaces drawn, for q=1,2,3,4. With this visualization you can clearly see the shapes represented by the response function. I’m sure that the function is a poor representation of truth in the regions far from the design space. Nevertheless, this diagram is more helpful to me than any other, as I can see there is clearly a cup-shaped region in which my coffee should be really awesome. The region is surprisingly narrow over C/W ratio, time, and temperature. A poor-man’s sensitivity analysis.

All the previous analysis has been on the response surfaces for the RSO data set only, that is, I left out the original data from the screening experiment. If I include the alternate data the fit remains qualitatively similar in that there is a tasty region in the upper left hand part of the original design space. However, the rest seems very different. There is now a saddle point, and more of the design space is better than before. Now, too, the steepest ascent path points toward increasing C/W ratio and decreasing temperature. Unlike the previous fit, though, there is substantial decrease in time inferred. Furthermore, this surface admits quality values in excess of 5. The upper isosurface is q = 5.

Resources

On the off chance you’re wondering how I made these really cool plots…

Analysis of the experiment and fitting of the response model to the data was accomplished in the wonderful language R. I couldn’t find any way to do cut-planes in R, and in any case am not terribly good with it. So I used Python(x,y) with the Enthought Tool Suite, in particular the mlab and Mayavi interfaces. Many, many thanks to the good people of the Pythonic world, and especially Enthought, for the quality product they have created. Who needs Matlab anyway?

My source code is available upon request.

The Results

This bread is very different from the artisanal style with white flour. It is more flavorful, the crust has a similar chewiness and resilience, and the crumb is nice. It is, however, denser than I like. On the whole it is quite pleasant to eat, but there is room for improvement. The color is dark, almost like rye bread, but the flavor is distinctly “whole wheat”. There was some caramelization of the crust edges and bottom, which makes it taste a little toasty—a flavor I rather like.

The loaf did not loft as high as I wanted. To try to enhance the loft, I would let the oven preheat longer, raising the temperature of the casserole. That might give more steam-powered spring to the loaf. I would also add a little bit more water; the loaf was drier than the white flour loaves made from the same recipe. Being too dry makes the dough stiffer, and perhaps this keeps the steam-spring from lofting the bread. Finally, I would let the loaf’s second rise last longer than an hour, so that it really doubles in size.

Why Make This?

Whole wheat grain, but not flour, stores very well. Using it, however, is not so easy. Certainly the foods we are used to having from white flour, pasta and breads, require adaptation both of our palates and of our recipes. Bread baking is one of the principal ways to use and consume flour. How do we cook it without an oven, which typically requires electricity?

The best method I have is the Dutch oven, heated with charcoal. Solar cookers could work too, but I don’t have one. For experimentation I use a regular home oven for the heat, and a Pyrex casserole as a substitute for the Dutch oven. I also used store-ground wheat flour, since I don’t yet have a wheat grinder.

In short, this experiment was done to determine if 100% whole wheat Dutch oven bread would be palatable. And the answer is yes, quite good actually. I would miss white flour if I had to forgo it. On the other hand, I would be smart to quit it anyway. The whole wheat stuff is supposed to be much more healthful.

Recipe

• 3 C Whole wheat flour
• ½ tsp Active dry yeast
• 1 ½ tsp salt
• 1 ½ C warm water (plus some more)

Mix all the dry ingredients together in a non-metallic bowl. Add the water and stir it with your hand just until it comes together in a sticky gooey mass. If it seems like you could turn it out and knead it, then you didn’t add enough water. It should be messy.

Cover it, and let it rise 12 to 18 hours. Really.

Dump the dough out and use just enough flour to keep it from sticking while you form it into a ball. Put it to rest on something that you can clean. I used parchment paper last time; next time I’ll use parchment paper covered with oat bran or corn meal, because the dough stuck like hell to the bare paper. Let it rise for 1 to 2 hours, until it has increased to almost double in size.

Put your Dutch oven (or casserole) in the oven and preheat to 450 F. I think, actually, that you should go as low as 420 if you are using a glass casserole.

Pop the doubled dough into the casserole, cover, and bake 30 minutes. Uncover and bake 15 to 20 more minutes.

Note that “pop the doubled dough into the casserole” is code for “make a huge mess and possibly burn yourself”. Turn the TV up before this step so the children aren’t scarred by your cursing.

Finally, I got this recipe from the Mother Earth News (which I love)—their recipe, Easy, No Knead Crusty Bread, emphasizes one of the main benefits, not kneading.

The New York Times also covers this recipe and includes a link to a video that is well worth watching.

Coffee: Analysis & Results of the Response Surface

I have executed the experiment design discussed in Coffee: Design of the Experiments (Part 2: RSO). This “response surface method” experiment design is carefully crafted to provide the data necessary to fit a formula of the form

q = b_0 T^2 + b_1 t^2 + b_2 r^2 + b_3 T t + b_4 t r + b_5 T r + b_6 T + b_7 t + b_8 r + b_9

This model describes quadratic terms (T^2, t^2, r^2), main effects (T, t, r), and first-order interactions (Tt, tr, Tr). The model basically assumes that quality is locally curved or locally linear, but not an undulating surface. This assumption of local smoothness is a good one in this case. Were it not, buying a cup of coffee would be a dangerous gamble in which small changes made by the brewer resulted in big changes in our cup.

Some possible findings in our analysis

• There is a maximum value of q at some point in the design space, whose approximate location we can find with the preceding method.
• There is a maximum value of q at the edge of the design space, that is, we might find that the best q occurs at some maximum value of concentration (a face of the design cube), or even, for example, the maximum value of concentration and temperature (an edge of the design cube), or a maximum of all the values (a corner of the design cube).
• Some variables don’t matter, and the entire experience is governed by only one variable—indeed the screening experiment hinted at this outcome.

If the optimum is inside the design space, then I know immediately how to brew the best cup of coffee—within some amount of error. If the optimum is outside the design space, at least I know which experimental trend to pursue. Namely, new experiments are positioned along the path of most rapid ascent, or a new design cube is positioned outside the current design cube, with the center moved along the path of most rapid ascent.

Results

The following three graphs show slices through the response surface fit to the experiments. The slices are positioned so that they intersect the “stationary point”. If there is a local maximum, then it is the stationary point; the plots then display the local sensitivity around that stationary point. The quadratic fit can also produce saddle-shaped responses, in which case the center of the saddle is a stationary point.
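For a quadratic model of this form, the stationary point is where the gradient vanishes, and the signs of the curvature eigenvalues tell you whether it is a maximum, a minimum, or a saddle. Here is a minimal sketch in Python with NumPy. The function name and argument layout are my own choices for illustration, not part of the original R analysis, and the variables are ordered (T, t, r) in coded units.

```python
import numpy as np

def stationary_point(b_quad, b_inter, b_lin):
    """Stationary point of q = x'Bx + b'x + c, with x = (T, t, r) in coded units.

    b_quad  : coefficients of (T^2, t^2, r^2)
    b_inter : coefficients of (T*t, t*r, T*r)
    b_lin   : coefficients of (T, t, r)
    """
    b0, b1, b2 = b_quad
    b3, b4, b5 = b_inter
    # Symmetric curvature matrix: each interaction coefficient is split
    # evenly between the two off-diagonal entries it belongs to.
    B = np.array([[b0,     b3 / 2, b5 / 2],
                  [b3 / 2, b1,     b4 / 2],
                  [b5 / 2, b4 / 2, b2]])
    # The gradient is 2*B*x + b, so the stationary point solves 2*B*x = -b.
    x_s = np.linalg.solve(2 * B, -np.asarray(b_lin, dtype=float))
    eig = np.linalg.eigvalsh(B)
    if np.all(eig < 0):
        kind = "maximum"
    elif np.all(eig > 0):
        kind = "minimum"
    else:
        kind = "saddle"
    return x_s, kind

# Hypothetical coefficients, chosen only to exercise the function.
x_s, kind = stationary_point((-1, -1, -1), (0, 0, 0), (1, -0.5, 0))
```

If any eigenvalue is positive while another is negative, the fit is saddle-shaped and the stationary point is the center of the saddle rather than an optimum.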

Note that these plots are all based on the fit to just the response surface design. A different fit is obtained by including the data from the screening results.

The graphs all work in “coded variables”, that is, they are all adjusted so that the values lie between -1 and 1. It is necessary to decipher these values into measurable numbers, which the following equations do. The coded value is indicated with a c subscript.

Time: t_c = 1.35 log10(t) - 2.35

Temperature: T_c = 0.1 (T - 195)

C/W Ratio: r_c = 50 (r - 0.055)

These equations are easily inverted, though below I provide uncoded (normal) values for the points in the steepest ascent.
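For anyone who wants to move between coded and measurable values, here is a small Python sketch of the three coding equations and their inverses. The function names are mine, chosen for illustration. Time is in seconds, temperature in degrees F, and C/W ratio in g/ml.

```python
import math

def code(time_s, temp_f, ratio):
    """Convert measurable values to coded values (t_c, T_c, r_c)."""
    t_c = 1.35 * math.log10(time_s) - 2.35
    T_c = 0.1 * (temp_f - 195)
    r_c = 50 * (ratio - 0.055)
    return t_c, T_c, r_c

def decode(t_c, T_c, r_c):
    """Invert the coding equations back to (time_s, temp_f, ratio)."""
    time_s = 10 ** ((t_c + 2.35) / 1.35)
    temp_f = 195 + 10 * T_c
    ratio = 0.055 + r_c / 50
    return time_s, temp_f, ratio

# The center of the design (55 s, 195 F, 0.055 g/ml) codes to roughly (0, 0, 0).
center_coded = code(55, 195, 0.055)
```

Because the time coefficients are rounded, the coded time for 300 seconds comes out slightly under 1 rather than exactly 1.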

The fitted space suggests that water temperature is important—certainly more important than the screening experiment suggested. However, my cup quality actually improves as the temperature declines. Almost all the brewing words I’ve read, including the latest Consumer Reports, harp on the importance of getting water hot enough. My experiment suggests that modulating extraction duration and C/W ratio are at least as important.

Steepest Ascent

The response surface methodology encourages further experiment design. Starting from the stationary point, you can follow the path of steepest ascent to move toward ever better quality, at least in theory. To my surprise, this worked for a one-trial qualitative test; more on that later. The steepest ascent predicted using the fit to just the RSO data is shown in the next table. Note that the concentration of coffee quickly leaves the design space, which was limited to concentrations less than 0.075.

Steepest (RSO only)

Time [sec]  Coffee [g]  Water [fl oz]  Temp [F]  C/W Ratio [g/ml]  Predicted Q
55          22.3        13.7           195       0.055             3.7
69          22.3        12.1           192       0.062             3.9
90          22.3        10.7           190       0.071             4.1
116         22.3        9.5            187       0.079             4.3
148         22.3        8.6            185       0.088             4.4
188         22.3        7.8            183       0.096             4.6
238         22.3        7.2            181       0.105             4.7
301         22.3        6.6            179       0.114             4.8
380         22.3        6.2            177       0.123             4.9
479         22.3        5.7            175       0.131             5.0
601         22.3        5.4            173       0.140             5.0

A similar fit using all the available data produced the steepest ascent trials in the next table. It is, perhaps, less believable since the quality values rise in excess of 10.

Steepest (RSO + Screening)

Time [sec]  Coffee [g]  Water [fl oz]  Temp [F]  C/W Ratio [g/ml]  Predicted Q
55          22.3        13.7           195       0.055             3.3
78          22.3        12.0           193       0.063             3.7
105         22.3        10.4           192       0.072             4.0
132         22.3        9.2            191       0.082             4.5
161         22.3        8.2            190       0.092             5.1
192         22.3        7.4            190       0.102             5.7
226         22.3        6.8            190       0.112             6.4
264         22.3        6.2            189       0.121             7.2
307         22.3        5.7            189       0.131             8.1
355         22.3        5.3            189       0.141             9.1
409         22.3        5.0            188       0.151             10.2

Despite the implausibility, I conducted a trial of the 3rd experiment in the combined table, that is, 105 seconds extraction at 192 F for a 0.072 concentration coffee. My results:

“Quality = 4.2. Toasted marshmallow flavor! Aroma is rich and without char. Acid and bitter are nicely balanced in the first sip but the bitter dominates the aftertaste.”

The quality agreement (and I really didn’t peek) is unexpectedly close: 4.2 instead of 4. Furthermore, I was genuinely surprised by entirely new flavors. It really was a great cup of coffee. Damned strong though.

For comparison, the contour plots produced by the fit to the RSO data and the screening data are shown next.

A future post will discuss the goodness of fit, and whether the models are meaningful. For the present, rest assured that predicting a quality of 4 and measuring the same convinces me that the results are useful.

Data

The actual data, along with my notes at the time, is shown in the following table.

Coffee [g]  Water [fl oz]  C/W [g/ml]  Time [sec]  Temp [F]  Trial Order  Q    Date       Comment
22.8        14.0           0.055       300         205       1            1.5  2/20/2009  Not good. Bitter flavor with simple aroma. Too bitter.
22.8        22.0           0.035       55          205       2            2.5  2/23/2009  Simple flavor with muted complexity. Aroma and flavor dominated by char. Neither too bitter nor too acid. Not balanced though.
22.8        14.0           0.055       10          205       3            2.5  2/24/2009  Too bitter. Aroma fairly simple. Mouth feel too watery. Little acidity, poorly balanced between acid and bitter.
22.8        10.3           0.075       55          185       4            4    2/25/2009  Aroma is excellent w/ complex interesting smells. There is some crema. Nicely acidic and reasonably balanced w/ bitter.
22.8        14.0           0.055       55          195       5            3.5  3/2/2009   Acid-bitter balance too heavy on bitter side. Nice crema and nice aroma. Rich mouth feel.
22.8        22.0           0.035       10          195       6            3    3/3/2009   Moderately well balanced. Aroma is simple. Mouth feel watery.
22.8        14.0           0.055       55          195       7            4    3/4/2009   Well balanced w/ good aroma. A little too bitter, but not unpleasant.
22.8        10.3           0.075       10          195       8            3.5  3/5/2009   Aroma somewhat simple. Flavor strong or even rich. Too bitter overall and acid/bitter balance poor.
22.8        14.0           0.055       10          185       9            2    3/6/2009   Bitter simple flavor without much acid. Aroma is fairly good but not as rich as I like.
22.8        10.3           0.075       300         195       10           3    3/9/2009   Smelled bitter. Taste moderately well balanced with a little too much bitter. Rich feel.
22.8        22.0           0.035       300         195       11           2.8  3/10/2009  Watery. Yet imbalanced toward bitter. Flavor is well-bodied and aroma good.
22.8        10.3           0.075       55          205       12           2.5  3/11/2009  Burned and charred taste. Too bitter while also being nicely acidic. Not well balanced. Aroma dominated by char.
22.8        22.0           0.035       55          185       13           2.5  3/12/2009  Aroma dominated by char and relatively uninteresting. Taste is poorly balanced, too bitter. Strong flavor.
22.8        14.0           0.055       300         185       14           4    3/16/2009  Well balanced and nicely aromatic. A little bitter on the finish.
22.8        14.0           0.055       55          195       15           3.5  3/17/2009  Bitter flavor with weak mouth feel and low complexity. Aroma nicely balanced though.
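For readers who want to reproduce the fit, here is a rough Python sketch that fits the ten-coefficient quadratic model from this post to the fifteen RSO trials by ordinary least squares. It is not my original R code. It assumes NumPy, uses the coding equations given earlier, and the helper names (coded, design_row, predict) are invented for this illustration.

```python
import numpy as np

# (C/W ratio [g/ml], time [sec], temp [F], Q) for trials 1 to 15, from the data table.
runs = [
    (0.055, 300, 205, 1.5), (0.035, 55, 205, 2.5), (0.055, 10, 205, 2.5),
    (0.075, 55, 185, 4.0),  (0.055, 55, 195, 3.5), (0.035, 10, 195, 3.0),
    (0.055, 55, 195, 4.0),  (0.075, 10, 195, 3.5), (0.055, 10, 185, 2.0),
    (0.075, 300, 195, 3.0), (0.035, 300, 195, 2.8), (0.075, 55, 205, 2.5),
    (0.035, 55, 185, 2.5),  (0.055, 300, 185, 4.0), (0.055, 55, 195, 3.5),
]

def coded(ratio, time_s, temp_f):
    """Return coded (T, t, r) using the coding equations from this post."""
    return (0.1 * (temp_f - 195),
            1.35 * np.log10(time_s) - 2.35,
            50 * (ratio - 0.055))

def design_row(T, t, r):
    """Model terms in the order b_0 .. b_9 of the quadratic formula."""
    return [T * T, t * t, r * r, T * t, t * r, T * r, T, t, r, 1.0]

X = np.array([design_row(*coded(ratio, time_s, temp_f))
              for ratio, time_s, temp_f, _ in runs])
y = np.array([q for *_, q in runs])

# Ordinary least squares: b minimizes the sum of squared residuals of X @ b - y.
b, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict(ratio, time_s, temp_f):
    """Predicted quality at an uncoded brewing condition."""
    return float(np.dot(design_row(*coded(ratio, time_s, temp_f)), b))

center_q = predict(0.055, 55, 195)
```

With this design the prediction at the center should land very close to the average of the three center-point trials (about 3.67), which is consistent with the 3.7 in the first row of the RSO-only steepest ascent table.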

My oven is gas+electric. The thermostat and ignition are electric, but the heat is from gas. In a power failure my oven would serve well as a cabinet. In the back yard, though, I can cook in a Dutch oven, just like I used to in Boy Scouts.

It turns out, though, that you can make a really good loaf of bread in a Dutch oven, indoors, inside your conventional oven. The good people at the Mother Earth News show how to make an Easy, No Knead Crusty Bread with a Dutch oven, inside a conventional kitchen oven. And it really works well. I’ve made fantastic loaves in the Dutch oven indoors in the oven, or even in a Pyrex casserole dish in the oven. Actually, the Pyrex works better than my cast iron.

It works in the backyard using charcoal with the cast iron Dutch oven too. Temperature control is harder in general, and getting the high heat (450F) has been difficult for me, but results are acceptable. In the recipe the last 10 or 15 minutes of baking are with the bread uncovered, to darken the crust. Can’t do that with a charcoal-heated Dutch oven in the back yard, but you can gap the lid a little to let the steam escape and enhance browning. Gapping lets the heat out too, so cook time gets longer.

The following picture shows a finished product, with the Dutch oven lid just lifted. You can see the steam rising; the loaf is pale, not quite the golden brown I wanted.

There isn’t much equipment involved. My list is a Dutch oven, a place to cook, a charcoal starter chimney, some charcoal, pliers, and tongs. An infrared thermometer has proved interesting, but certainly not critical.

In one experiment I gapped the lid with three pieces of wire like the one shown below. This was terribly ineffective—too much steam remained trapped inside and the bread crust remained pasty. In later experiments I found just setting the lid ajar, about a half-centimeter gap, was adequate.

Details for a Food Storage Plan

This recipe uses about 0.66 pounds of flour, a little salt, and a little active dried yeast, which provides about 1500 calories for the whole loaf. It is not a huge loaf; a family of four could devour this in one meal without trouble. This is really unfortunate, since bread would easily keep a day or two without refrigeration, and cooking for two or three days in advance would be really nice.

I used about 1 pound of charcoal to cook the loaf. A pound is really quite marginal: it produces enough coals, but about the time the loaf is done the coals are completely ash. On two occasions I really wanted additional hot coals to either enhance the heat (after gapping the lid) or to extend the cook time. By the time I needed the fresh coals, my original ones had burned down so far that I could not start new ones just from contact with the old.

I justified my Dutch oven purchase because I’ll use it camping—so emergency or not I get value from the oven.

Minimizing Toil

It is a pain to make bread dough (though this “no knead” stuff is easier than muffins), and it is a pain to start a charcoal fire and monitor it. If I were planning to bake bread regularly in an emergency, I would want two or three more Dutch ovens, and then I could cook with them in a stack. This would allow baking two or three loaves at once, along with a pot of beans or some other tasty bread companion. There would be less charcoal per loaf when cooking in a stack; charcoal use probably scales at about (n+1)/2 times the single-oven amount, because neighboring Dutch ovens in the stack share a layer of coals. Four Dutch ovens cooking simultaneously would consume about 2.5 pounds of charcoal, instead of the 4 pounds you might guess. A big savings if you’re feeding teenagers.
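The stacking guess is easy to tabulate. This Python sketch just encodes my (n+1)/2 rule of thumb, taking the roughly 1 pound I used for a single oven as the baseline. It is a guess written as code, not a measurement, and the function name is made up for the example.

```python
def charcoal_pounds(n_ovens, pounds_single=1.0):
    """Estimated charcoal for a stack of n ovens using the (n+1)/2 guess."""
    return pounds_single * (n_ovens + 1) / 2

# Compare the stacked estimate with cooking each oven separately.
for n in range(1, 5):
    print(n, "ovens:", charcoal_pounds(n), "lb stacked vs", float(n), "lb separately")
```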

Future Experiments

I would like to try the experiment with 100% whole wheat dough. In my experience 100% whole wheat produces really dense, loathsome loaves. If the result is palatable then storing whole grain (and a grinder) might be sensible.

I would also like to try to increase the loaf size, or split the loaf into two abutting sections, so that the value from the cooking effort would go farther.

Food Storage Requirements

Storage of food and affiliated supplies could be a real hassle. It could also be expensive. However, if your plan satisfies some basic requirements it will be both serviceable to your family and economical. Possibly it will enhance your daily health…at least if you eat like me.

My requirements for a food storage plan are listed below. The fresh, steaming bread in the picture below was made in a charcoal-heated Dutch oven on my back patio.

Requirements

Requirement 1: Everything in this menu plan shall be eatable and enjoyable without refrigeration.

Requirement 2: Total cycle time shall be one year or less (that is, many foods should be rotated out every year).

Requirement 3: All of the foods stored shall be eaten before spoiling once cycled from storage.

Desirement 1: Food should be fun to cook, so that using your storage is practiced.

The total cycle time is derived from the nominal shelf life of typical products. Imagine that you want to store one year of flour for emergency use. You assume 3 pounds of flour per day (I’m making up the numbers) and end up with about one thousand pounds of flour in the storage facility. Day to day your family eats about 5 pounds of flour per week, since you are a recreational baker. If you stopped storing, and just started using, it would take you about 4 years to eat it all. Of course, you’d throw out a bunch, since the shelf life is only 1 year…
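The arithmetic behind that example, as a small Python sketch. The numbers are the same made-up ones as in the paragraph above, and the function names are only for illustration.

```python
WEEKS_PER_YEAR = 52

def rotation_years(stored_lb, weekly_use_lb):
    """Years needed to eat through a stockpile at the normal household rate."""
    return stored_lb / weekly_use_lb / WEEKS_PER_YEAR

def max_rotatable_lb(weekly_use_lb, shelf_life_years):
    """Largest stockpile that gets eaten before it passes its shelf life."""
    return weekly_use_lb * WEEKS_PER_YEAR * shelf_life_years

stored = 3 * 365                      # 3 lb/day for a year: about a thousand pounds
years = rotation_years(stored, 5)     # a bit over 4 years at 5 lb/week
usable = max_rotatable_lb(5, 1)       # 260 lb can be rotated within a 1-year shelf life
wasted = stored - usable              # the rest would be thrown out
```

At 5 pounds per week, only about a quarter of the thousand-pound stockpile can be rotated within a one-year shelf life.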

For these kinds of foods to work there must be a “dilution factor” built into the plan. There are two ways this dilution can be realized: store for a shorter emergency, or store longer-lived products.

My plan, at least at present, is for shorter emergencies. For one thing, I’m almost guaranteed to use what I store, since it is part of daily life and won’t require extremely long storage lives. Secondly, a short emergency is more likely to occur than a long emergency.

There is a hybrid approach too, where you find select items that aren’t part of your regular diet but could be substituted in a pinch, that have long shelf lives, and that are cheap, for example, whole-grain wheat (and a grinder). These foods have a long shelf life, and are fairly inexpensive. If an emergency never materializes you throw most of it away just before the kids come to drag you to the nursing home. If an emergency does occur, you may be willing to break out the grinder and render that stuff into flour to eat.

In fact, I think of stored foods in three tiers:

1. Foods you eat every day, where storage just means there is more in the queue. Vegetable oil, flour, salt, and the like.
2. Food substitutes for food you eat every day, but which are fairly expensive. Tomato powder to make spaghetti and dried eggs come to mind.
3. Food you never eat but that you would in a pinch, such as whole wheat when you normally buy white flour at the store.

Guiding Assumptions

During an emergency it is easier to cook a relatively complex and big lunch than it is to cook breakfast and dinner. Midday offers the most light, so solar ovens work best, and you may benefit most from a fire in the morning.

Most leftovers are impractical, because there is no refrigeration. Breads could be leftover, but soups and stews probably could not. Cooking enough bread, for example, to have some left is quite difficult while also meeting caloric needs for a day (as I intend to discuss in a future post). What this means is that every day, multiple times a day, you’re cooking everything from scratch—toil. The only mitigation I can imagine is to eat a huge breakfast and a huge late lunch. Maybe food can be safely left out for a few hours, so that if lunch is big enough it can also be dinner. In which case “boring” has consumed “toil”. Presumably an emergency has enough other excitement anyway.

Coffee: Considering Cold Extraction

I won’t start the cold extraction experiments until I finish the response surface set, and possibly until I perform some local optimization using the RSO outcome. To warm up, paradoxically, here are some thoughts on cold extraction.

The basic process, everywhere I’ve read about it, is to put grounds in room temperature water for a long time—overnight for example—and then filter or decant the extract into storage. The extract is generally much stronger than regular coffee, and so to serve it you dilute it with hot water.

Naturally, you can buy expensive equipment to help this process, or you can use a French press. I imagine that decanting through your drip pot’s filter cage would work well too.

Economics

The premier advocate of the method would appear to be Toddy Products. They recommend extracting one pound of beans in 9 cups water, then reconstituting the coffee with 3:1 hot water to coffee extract. I’m guessing that you get about 8 cups of extract, and in turn 32 cups of coffee. I believe that Toddy is talking about 8 fl oz cups, though in the world of coffee you can never be certain. In any case, 1 lb of coffee to about 32 cups of water corresponds to a C/W ratio of 0.060 g/ml. This compares quite reasonably with the C/W ratio used in hot brewing. Recall that in my experimental set-up I use ratios between 0.035 and 0.075 g/ml.
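The unit conversion behind that 0.060 g/ml figure, sketched in Python. The conversion constants are standard (US fluid ounce and avoirdupois pound), and the 8 fl oz cup is my assumption about what Toddy means, as noted above.

```python
GRAMS_PER_POUND = 453.592
ML_PER_FL_OZ = 29.5735

def cw_ratio(coffee_g, water_fl_oz):
    """Coffee-to-water ratio in g/ml."""
    return coffee_g / (water_fl_oz * ML_PER_FL_OZ)

# 1 lb of coffee ending up in about 32 cups (8 fl oz each) of finished coffee.
toddy = cw_ratio(1 * GRAMS_PER_POUND, 32 * 8)

# For comparison, the center of my hot-brew design: 22.8 g in 14.0 fl oz.
center = cw_ratio(22.8, 14.0)
```

The same function reproduces the C/W columns in my hot-brew tables to three decimal places.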

Another site recommends using 1 gallon (16 cups) of water with 1 lb of coffee. All the other sites I examined suggest similar ratios of approximately 0.060 g/ml.

Grind Size

Recommended grind size is coarse, though it appears the reason is to allow easy filtering. Since extraction time is 6 to 12 hours, depending on reference, I suspect that what dominates the extraction process is the solubility of various compounds at room temperature, rather than the diffusion of those chemicals from the beans. Any old grinder is probably fine, and variation in the particle size is probably irrelevant.

Predictions

I believe this process will make an insipid cup of coffee. It is recommended by some because it produces lower acidity and less bitterness. Translation: less flavor. Terroir Coffee asserts that the “resulting cup is light bodied and bland since all acidity and many aromatics, which require hot water for the right chemical reactions, are never formed.” Not inspiring. Of course, I didn’t think brewing coffee had as much to do with chemical reactions as with dissolution…

The process offers great convenience in that you can brew just once for two weeks’ coffee. Furthermore, it should be palatable to people who like weak coffee. The strength of the cup can be easily changed to suit the members of your party.

Food Storage: An Unconsidered Idea

A friend of mine intimated today that his food storage plan was principally MRE (meals ready-to-eat), like they use in the military. He noted that the nutritional shelf life was essentially eternal, though I wonder if they would outlast a Twinkie. He commented that the meat loaf meal was particularly yummy.

I hadn’t considered these as an option. While I think they would be too expensive to be the foundation of a year-long storage plan, they might be a beneficial part (several weeks’ worth, perhaps). They take little effort to prepare and provide wide variety, which addresses two shortcomings of every plan I’ve considered. Another item to consider.