Extending Mobile Computer Battery Life through Energy-Aware Adaptation
Jason Flinn
CMU-CS-01-171

December 2001

School of Computer Science
Computer Science Department
Carnegie Mellon University
Pittsburgh, PA

Thesis Committee:
M. Satyanarayanan, Chair
Todd Mowry
Dan Siewiorek
Keith Farkas, Compaq Western Research Laboratory

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy.

Copyright © 2001 Jason Flinn

This research was supported by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Materiel Command (AFMC) under contracts F19628-93-C-0193 and F19628-96-C-0061, DARPA, the Space and Naval Warfare Systems Center (SPAWAR) / U.S. Navy (USN) under contract N660019928918, the National Science Foundation (NSF) under contracts CCR-9901696 and ANI-0081396, IBM Corporation, Intel Corporation, AT&T, Compaq, Hughes, and Nokia. The views and conclusions contained in this document are those of the author and should not be interpreted as representing the official policies or endorsements, either express or implied, of DARPA, AFMC, SPAWAR, USN, the NSF, IBM, Intel, AT&T, Compaq, Hughes, Nokia, Carnegie Mellon University, the U.S. Government, or any other entity.

Keywords: Energy-aware adaptation, application-aware adaptation, power management, mobile computing, ubiquitous computing, remote execution

Abstract
Energy management has been a critical problem since the earliest days of mobile computing. The amount of work one can perform while mobile is fundamentally constrained by the limited energy supplied by one’s battery. Although a large research investment in low-power circuit design and hardware power management has led to more energy-efficient systems, there is a growing realization that more is needed: the higher levels of the system, the operating system and applications, must also contribute to energy conservation.

This dissertation puts forth the claim that energy-aware adaptation, the dynamic balancing of application quality and energy conservation, is an essential part of a complete energy management strategy. Energy-aware applications identify possible tradeoffs between energy use and application quality, but defer decisions about which tradeoffs to make until runtime. The operating system uses additional information available during execution, such as resource supply and demand, to advise applications which tradeoffs are best.

This dissertation first shows how one can measure the energy impact of the higher levels of the system. It describes the design and implementation of PowerScope, an energy profiling tool that maps energy consumption to specific code components. PowerScope helps developers increase the energy-efficiency of their software by focusing attention on those processes and procedures that are responsible for the bulk of energy use.

PowerScope is used to perform a detailed study of energy-aware adaptation, focusing on two dimensions: reduction of data and computation quality, and relocation of execution to remote machines. The results of the study show that applications can significantly extend the battery lifetimes of the systems on which they execute by modifying their behavior. On some platforms, quality reduction and remote execution can decrease application energy usage by up to 94%. Further, the study results show that energy-aware adaptation is complementary to existing hardware energy-management techniques.

The operating system can best support energy-aware applications by using goal-directed adaptation, a feedback technique in which the system monitors energy supply and demand to select the best tradeoffs between quality and energy conservation. Users specify a desired battery lifetime, and the system triggers applications to modify their behavior in order to ensure that the specified goal is met. Results show that goal-directed adaptation can effectively meet battery lifetime goals that vary by as much as 30%.

Acknowledgments
I may perhaps be unusual in that I found the writing of my dissertation to be a very enjoyable task. I think that this is in large part due to the tremendous people with whom I have worked during this project.

My advisor, Satya, made invaluable contributions not just to this dissertation, but also to my development as a research professional. He exhibited a constant optimism in the ultimate usefulness and success of my research that helped to sustain me as I worked through the hard problems. He was also remarkably successful in keeping an eye on the “big picture” and preventing me from getting mired in the details. Perhaps his most valuable contribution, though, lies in teaching me how to express my ideas through both the spoken and written word.

I am indebted to the other members of my thesis committee, Keith Farkas, Todd Mowry, and Dan Siewiorek, for helping me select interesting research directions to explore. Their input led to the development of Spectra, which turned out to be one of the more fun projects in the dissertation.

I have built upon the work of the many members of the Odyssey group. Brian Noble not only laid the foundation for this work by developing Odyssey; he also served as a role model during my first years at CMU. Dushyanth Narayanan, Eric Tilton, and Kip Walker were my brothers-in-arms in the coding trenches; we spent many late nights running experiments, getting demos to run, and building a working system. The newer students in our group, Rajesh Balan and SoYoung Park, have helped me refine my ideas and evaluate my work. It is always a pleasure when one can work with such a talented group of people.

My former 8208 officemates, Hugo Patterson, David Petrou, and Sanjay Rao, served as sounding-boards for many crazy ideas. Bob Baron and Jan Harkes provided a tremendous amount of technical advice about kernel internals and the Coda file system. Eyal de Lara provided a great deal of help with the Puppeteer system.

I’d like to especially thank my father for encouraging me to go back to graduate school, and my mother for reminding me of the importance of having fun. My friends from Penn, Ben Matelson, John Mayne, Patrick O’Donnell, Ted Restelli, and Geoff Taubman, have provided a support network that has withstood the test of time. I’d also like to thank the many new friends I have made here at CMU, including, but not limited to: Rajesh Balan, Angela and Joe Brown, Chris Colohan, Charlie Garrod, Chris Palmer, Carrie Sparks, Greg Steffan, Kip Walker, and Ted Wong.

Contents

1 Introduction
  1.1 Energy management in mobile computing
  1.2 Energy-aware adaptation
  1.3 The thesis
  1.4 Road map for the dissertation

2 Background
  2.1 Energy metrics
  2.2 Hardware platform characteristics
    2.2.1 The IBM 560X laptop computer
    2.2.2 The Itsy pocket computer
    2.2.3 Comparison of platform characteristics
  2.3 The Odyssey platform for mobile computing
  2.4 Summary

3 PowerScope: Profiling application energy usage
  3.1 Design considerations
  3.2 Implementation
    3.2.1 Overview
    3.2.2 The System Monitor
    3.2.3 The Energy Monitor
    3.2.4 The Energy Analyzer
  3.3 Validation
    3.3.1 Accuracy
    3.3.2 Overhead
  3.4 Summary

4 Energy-aware adaptation
  4.1 Goals of the study
  4.2 Methodology
  4.3 Experimental setup
  4.4 Video player
    4.4.1 Description
    4.4.2 Results
  4.5 Speech recognizer
    4.5.1 Description
    4.5.2 Results
    4.5.3 Results for Itsy v1.5
  4.6 Map viewer
    4.6.1 Description
    4.6.2 Results
  4.7 Web browser
    4.7.1 Description
    4.7.2 Results
  4.8 Effect of concurrency
  4.9 Summary

5 A proxy approach for closed-source environments
  5.1 Overview
  5.2 Puppeteer
  5.3 Measurement methodology
  5.4 Benefits of PowerPoint adaptation
    5.4.1 Loading presentations
    5.4.2 Editing presentations
    5.4.3 Background activities
    5.4.4 Autosave
  5.5 Summary

6 System support for energy-aware adaptation
  6.1 Goal-directed adaptation
    6.1.1 Design considerations
    6.1.2 Implementation
    6.1.3 Basic validation
    6.1.4 Sensitivity to half-life
    6.1.5 Validation with longer duration experiments
    6.1.6 Overhead
  6.2 Use of application resource history
    6.2.1 Benefits of application resource history
    6.2.2 Recording application resource history
    6.2.3 Learning from application resource history
    6.2.4 Using application resource history to evaluate utility
    6.2.5 Using application resource history to improve agility
    6.2.6 Validation
  6.3 Summary

7 Remote execution
  7.1 Target environment
  7.2 Design considerations
    7.2.1 Competing goals for functionality placement
    7.2.2 Variation in resource availability
    7.2.3 Self-tuning operation
    7.2.4 Modification to application source code
    7.2.5 Granularity of remote execution
    7.2.6 Support for remote file access
  7.3 Implementation
    7.3.1 Overview
    7.3.2 Application interface
    7.3.3 Architecture
    7.3.4 Resource monitors
    7.3.5 Predicting resource demand
    7.3.6 Ensuring data consistency
    7.3.7 Selecting the best option
    7.3.8 Applications
  7.4 Validation
    7.4.1 Speech recognition
    7.4.2 Document preparation
    7.4.3 Natural language translation
    7.4.4 Overhead
  7.5 Summary

8 Related work
  8.1 Energy measurement
  8.2 Energy management
    8.2.1 Higher-level energy management
    8.2.2 Processor energy management
    8.2.3 Storage power management
    8.2.4 Network power management
    8.2.5 Comprehensive power management strategies
  8.3 Adaptive resource management
  8.4 Remote execution

9 Conclusion
  9.1 Contributions
    9.1.1 Conceptual contributions
    9.1.2 Artifacts
    9.1.3 Evaluation results
  9.2 Future work
    9.2.1 Hybrid energy measurement
    9.2.2 Application-aware power management
    9.2.3 Support for adaptation in closed-source environments
    9.2.4 Extensions to Spectra
    9.2.5 Proactive service management
  9.3 Closing remarks

List of Figures

2.1 Power consumption of the IBM ThinkPad 560X
2.2 Power consumption of the Itsy v1.5
2.3 Models of adaptation
2.4 Odyssey architecture

3.1 PowerScope architecture
3.2 PowerScope API
3.3 Sample energy profile
3.4 PowerScope accuracy
3.5 Effect of variation in the sample frequency
3.6 PowerScope CPU overhead
3.7 PowerScope energy overhead

4.1 Odyssey video player
4.2 Energy impact of fidelity for video playing
4.3 Predicting video player energy use
4.4 Odyssey speech recognizer
4.5 Energy impact of fidelity for speech recognition
4.6 Predicting speech recognition energy use
4.7 Energy impact of fidelity for speech recognition on the Itsy v1.5
4.8 Comparison of per-platform speech recognition energy use
4.9 Odyssey map viewer
4.10 Energy impact of fidelity for map viewing
4.11 Effect of user think time for map viewing
4.12 Predicting map viewer energy use
4.13 Predicting map viewer energy use by number of features
4.14 Odyssey Web browser
4.15 Energy impact of fidelity for Web browsing
4.16 Effect of user think time for Web browsing
4.17 Predicting Web browser energy use
4.18 Effect of concurrent applications
4.19 Background and dynamic energy use for concurrent applications
4.20 Summary of the energy impact of fidelity

5.1 Puppeteer architecture
5.2 Sizes of sample presentations
5.3 Energy used to load presentations
5.4 Normalized energy used to load presentations
5.5 Energy used to page through presentations
5.6 Energy used to re-page through presentations
5.7 Energy used by background activities during text entry
5.8 Effect of autosave options on application power usage

6.1 User interface for goal-directed adaptation
6.2 Example of goal-directed adaptation: supply and demand
6.3 Example of goal-directed adaptation: application fidelity
6.4 Summary of goal-directed adaptation
6.5 Sensitivity to half-life
6.6 Longer duration goal-directed adaptation
6.7 Odyssey multi-fidelity API
6.8 Sample configuration file for a Web browser
6.9 Utility function for the incremental policy
6.10 Web energy use as a function of fidelity and image size
6.11 Utility function for history-based policy
6.12 Example of operation history replay for = 0.1
6.13 Example of operation history replay for = 0.2
6.14 Energy use as a function of fidelity for the Web browser
6.15 Change in energy supply for the incremental policy
6.16 Change in fidelity for the incremental policy
6.17 Change in energy supply for the history-based policy
6.18 Change in fidelity for the history-based policy
6.19 Summary of the effectiveness of application resource history

7.1 Spectra architecture
7.2 Sample Spectra server configuration file
7.3 Sample service implementation
7.4 Resource monitor functions
7.5 Speech recognition execution time
7.6 Speech recognition energy usage
7.7 Latex execution time for the small document
7.8 Latex execution time for the large document
7.9 Latex energy usage for the small document
7.10 Latex energy usage for the large document
7.11 Accuracy of Spectra choices for Pangloss-Lite
7.12 Relative utility of Spectra choices for Pangloss-Lite
7.13 Spectra overhead

Chapter 1 Introduction
Energy is a vital resource for mobile computing. The amount of work one can perform while mobile is fundamentally constrained by the limited energy supplied by one’s battery. Unfortunately, despite considerable effort to prolong the battery lifetimes of mobile computers, no silver bullet for energy management has yet been found. Instead, there is growing consensus that a comprehensive approach is needed, one that addresses all levels of the system: circuit design, hardware devices, the operating system, and applications.

This dissertation puts forth the thesis that energy-aware adaptation, the dynamic balancing of energy conservation and application quality, is an essential part of a comprehensive energy management solution. Occasionally, energy usage can be reduced without affecting the perceived quality of the system. More often, however, significant energy reduction perceptibly impacts system behavior. The effective design of mobile software thus requires striking the appropriate balance between application quality and energy conservation.

It is incorrect to make static decisions that arbitrate between these two competing goals. Dynamic variation in time operating on battery power, hardware power requirements, application mix, and user specifications all affect the optimum balance between quality and energy conservation. Energy-aware adaptation surmounts these difficulties by making decisions dynamically. Applications statically specify possible tradeoffs, but defer decisions about which tradeoffs to make until execution. The system uses additional information available during execution, such as resource supply and demand, to advise applications which tradeoffs are best.

This chapter begins with an overview of previous approaches to energy management. It then provides a more detailed vision of energy-aware adaptation and presents the thesis statement. It concludes by presenting a road map for the rest of the dissertation.

1.1 Energy management in mobile computing
Energy management can be viewed as a resource constraint problem. When a computing device is mobile, the supply of energy in its battery must be sufficient to meet the energy demands of the work it will perform before being reconnected to an external power source. Thus, if one wishes to accomplish more work while mobile, one must increase energy supply or decrease demand.

Attacking the supply side of the problem has proven difficult. Historically, battery technology has improved very slowly over time [62]. Further, the need for mobility requires computing systems to be as small and light as possible. Since batteries represent a significant portion of the size and weight of mobile devices, one cannot increase battery size without also increasing these undesirable properties.

Attacking the demand side of the problem has historically proven more fruitful. Advances in low-power circuit design have led to the development of energy-efficient hardware components. For example, the Transmeta Crusoe processor [41] and Bluetooth network technology [30] are both designed to reduce the energy needs of mobile devices. Research in hardware power management has led to further energy reductions. Ideally, power-managed components expend energy only when they are performing useful work. When not being used, they enter power-saving states which greatly lower power dissipation. Examples of hardware power management are voltage-scaling processors [71, 72, 97], wireless network protocols [43, 44], and disk spin-down algorithms [17, 16, 57].

Unfortunately, advances in low-power circuit design and hardware power management have not been enough to meet the growing energy demands of mobile computers. Partly, this is because lower-level strategies cannot capitalize on opportunities for energy management presented by applications and the operating system. Without knowledge of application intent, it is impossible to prioritize activities and save energy by performing only the most important ones. Further, hardware power management strategies must be conservative. Since hardware drivers cannot assess the impact of performance degradation on applications, they reduce energy usage only when the performance impact is almost certain to be negligible.

In recent years, there has been a growing realization that the higher levels of the system, the operating system and applications, must be involved in energy management [18, 66, 89]. This dissertation focuses on these levels and proposes energy-aware adaptation as the key mechanism for implementing higher-level energy management.

1.2 Energy-aware adaptation
Simply stated, energy-aware adaptation is the dynamic balancing of quality and energy conservation. One aspect of quality is data fidelity, the degree to which data presented at a client matches the ideal reference copy at a server. Fidelity is a type-specific notion since different kinds of data can be degraded using a variety of type-specific algorithms. For example, a client playing video data could switch to a lower frame rate to save energy when battery life is critical. Another aspect of quality is computational fidelity, the degree to which the output of a computation matches the highest-quality output that could be produced.

Performance is also an aspect of quality. For example, consider an application which has the ability to execute a portion of its functionality on a remote server. Remote execution can often reduce the energy usage of the mobile client by decreasing the utilization of the CPU and other hardware components. However, remote execution can also lead to increased execution time if a large amount of communication is needed. In such a scenario, energy-aware adaptation is needed to balance the competing goals of performance and energy conservation.

Energy-aware applications statically determine the possible tradeoffs between quality and energy conservation, but defer decisions about which of these tradeoffs to make. During their execution, the system provides support for making these decisions by monitoring energy supply and demand, providing a history of past energy usage, and soliciting user preferences. The system uses this information to provide dynamic advice to applications about which tradeoffs they should make.
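
The division of labor described in this section, in which applications declare tradeoffs statically and the system chooses among them at run time, can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `Tradeoff` type, the `advise` function, and the power figures are illustrative assumptions rather than the interface or measurements of the system presented in later chapters, and the selection rule (pick the highest quality that fits the remaining energy) is only one simple policy among many.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tradeoff:
    """One operating point that an application declares statically (hypothetical type)."""
    name: str
    quality: float      # 1.0 = full fidelity; smaller values are more degraded
    power_watts: float  # assumed average whole-system power at this setting


def advise(tradeoffs, energy_joules, goal_seconds):
    """Pick the highest-quality tradeoff whose projected energy use fits the supply.

    Projected demand is power * time. If no option fits, fall back to the
    lowest-power option so the shortfall is as small as possible.
    """
    feasible = [t for t in tradeoffs
                if t.power_watts * goal_seconds <= energy_joules]
    if feasible:
        return max(feasible, key=lambda t: t.quality)
    return min(tradeoffs, key=lambda t: t.power_watts)


# Illustrative numbers only; they are not measurements from this dissertation.
VIDEO_TRADEOFFS = [
    Tradeoff("full frame rate", 1.0, 10.0),
    Tradeoff("half frame rate", 0.6, 8.0),
    Tradeoff("quarter frame rate", 0.3, 7.0),
]

if __name__ == "__main__":
    one_hour = 3_600  # desired remaining battery lifetime in seconds
    for supply in (40_000, 30_000, 10_000):  # remaining energy in joules
        choice = advise(VIDEO_TRADEOFFS, supply, one_hour)
        print(f"{supply} J remaining -> {choice.name}")
```

In this sketch the application never decides for itself when to degrade; it only enumerates what it could do. The same declaration yields different advice as the energy supply shrinks, which is the reason for deferring the decision until execution.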

1.3 The thesis
Energy-aware adaptation is the focus of this dissertation’s thesis:

A collaborative relationship between the operating system and applications can effectively reduce the energy usage of mobile computers. Energy-aware adaptation allows this collaboration to dynamically balance application quality and energy conservation. It is feasible to construct such a system with only modest modification to existing application source code.

1.4 Road map for the dissertation
The rest of this document validates the thesis. The next chapter begins by setting the context for this work. It proposes metrics for evaluating the effectiveness of energy management and discusses the energy-use characteristics of mobile systems. It also describes the Odyssey platform for mobile computing, a framework that will be used to provide operating system support for energy-aware applications.

Chapter 3 describes PowerScope, a tool for measuring software energy usage. PowerScope is an energy profiler: it attributes energy consumption to specific code components of applications and the operating system. By focusing attention on those code components most responsible for energy usage, PowerScope helps developers make their software more energy-efficient. In the context of this dissertation, PowerScope provides the measurement infrastructure necessary to study the effectiveness of energy-aware adaptation.

Chapters 4 and 5 evaluate the feasibility of energy-aware adaptation. They show that applications can modify their behavior to significantly extend the battery lifetimes of the systems on which they execute. Further, they reveal that the benefits of energy-aware adaptation are often very predictable, and that energy-aware adaptation is complementary to existing hardware energy-management techniques. Chapter 4 studies four applications running on the Linux operating system: a video player, a speech recognizer, a map viewer, and a Web browser. Chapter 5 extends these results to shrink-wrapped applications running on closed-source operating systems. It shows how a middleware-based proxy approach can add energy-awareness to Microsoft’s PowerPoint 2000 application.

Chapter 6 describes the operating system support needed to effectively support energy-aware applications. It introduces goal-directed adaptation, a feedback technique that allows the system to adjust for the current importance of energy conservation. Users specify a goal for battery lifetime, and the system attempts to ensure that the goal is met by guiding applications to adapt their behavior. Then, the chapter shows how the system can improve the effectiveness of goal-directed adaptation by maintaining a history of application energy usage. It describes how the history of energy usage allows the system to support a wider range of adaptation policies and react more agilely to changes in energy supply and demand.

Chapter 7 shows how remote execution represents an additional dimension of energy-aware adaptation. It describes Spectra, a system which enables applications to save energy by partly executing on remote computers. Spectra balances the competing goals of performance, energy conservation, and application quality in deciding where applications can best locate functionality. It reflects both application resource demand and current resource availability by monitoring CPU, network, energy, and file cache state on local and remote machines, and by using goal-directed adaptation to determine the relative importance of energy conservation.

Related work is discussed in Chapter 8. Chapter 9 concludes the dissertation with a summary of the key contributions. It also discusses future research directions generated by this dissertation.

Chapter 2 Background
This chapter describes the background context of the dissertation. The next section provides an overview of the metrics that will be used to evaluate the effectiveness of energy management strategies. Section 2.2 first explores the diversity of form factors and the energy usage characteristics of mobile computers. It then provides specific details about the two primary platforms that will be used for evaluation: the IBM 560X laptop computer and Compaq’s Itsy pocket computer. Section 2.3 describes the Odyssey platform for mobile computing. Odyssey provides the basic building blocks necessary to implement system support for energy-aware adaptation.

2.1 Energy metrics
An ideal battery can be modeled as a finite store of energy. If a battery-powered device expends some amount of energy to perform an activity, the energy supply available for other activities is reduced by that amount. The power usage of a device is its instantaneous rate of energy usage. Power is expressed in units of Watts, while energy is expressed in Joules (Watt-seconds). For discrete activities such as performing a fixed amount of computation or browsing a Web page, energy usage is the best metric for evaluating the impact on battery lifetime. For continuous activities such as displaying streamed video data or backlighting a display, average power usage is a more appropriate metric.

When measuring impact on battery lifetime, it is important to capture the energy usage of an entire mobile computing system rather than the isolated energy usage of individual components such as the processor or network interface. A strategy which decreases one component’s energy usage may increase the energy usage of other components. For example, network power management can increase the total energy used to transfer a file; although network energy use decreases, other components use more energy because the data takes longer to transfer [21]. Because all hardware components are typically powered by the same battery, strategies that decrease one component’s energy needs but increase total system energy usage are misguided. Unless otherwise noted, the measurements in this


dissertation report energy and power usage for the entire mobile computing system under study. At the next level of detail, it is often useful to characterize the background power usage of a device. This is the amount of power dissipated by a mobile computer when no activity of interest to the user is being performed, i.e. while it executes the kernel idle procedure. Most modern processors, including Intel’s Pentium and StrongArm chips, provide halt instructions which are called during the idle procedure to minimize power demand. Further, on some mobile laptops, the operating system may use Advanced Power Management (APM) support [35] to place other components in power-savings states. Nevertheless, background power usage can be considerable for devices such as laptop computers. Although components such as the processor and disk enter low power states, they still must be partially powered so that they can be quickly restarted when needed. Dynamic power usage is the amount of power consumed by an activity above and beyond the background power usage of the device on which it executes. Thus, total power usage is the sum of background and dynamic power usage. Dynamic power usage is a useful metric for estimating the power demand of concurrent activities: the total power usage of two concurrent activities should be roughly equivalent to the sum of the background power usage of the device and the dynamic power usage of the two activities (Section 4.8 explores this issue in more detail). For discrete activities, one can calculate dynamic energy usage by multiplying average dynamic power usage by execution time. The above metrics assume that batteries behave ideally. However, this is rarely true in practice. The most important deviation from ideal behavior is nonlinearity—as power draw increases, the total energy that can be extracted from a battery decreases [61]. 
In addition, batteries may exhibit recovery: a reduction in load for a period of time may result in increased capacity. Finally, research has shown that peak power usage can sometimes be a more important factor than average power usage in determining battery capacity [62].

Unless otherwise noted, this dissertation assumes the ideal model for battery behavior. One important reason is simplicity: the impact of nonlinearity, recovery, and peak power usage depends upon the specific characteristics of the mobile system under study, as well as the type of battery technology being employed. Since this dissertation will assess the impact of energy-aware adaptation on a variety of mobile systems, no single model for nonideal battery behavior will apply. In addition, it is important to note that most of the energy management techniques studied in this dissertation decrease average power use. Thus, the gains reported will be slightly understated due to nonlinear battery behavior.
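The relationships among these metrics can be illustrated with a short calculation. The sketch below is only an illustration under the ideal battery model assumed in this section. The function names and all numeric inputs are hypothetical and are not measurements from this dissertation.

```python
def total_power(background_w, dynamic_ws):
    """Estimate total power (Watts) as background power plus the
    dynamic power of each concurrently executing activity."""
    return background_w + sum(dynamic_ws)


def dynamic_energy(avg_dynamic_w, seconds):
    """Dynamic energy (Joules) of a discrete activity:
    average dynamic power multiplied by execution time."""
    return avg_dynamic_w * seconds


def ideal_lifetime_hours(capacity_j, avg_power_w):
    """Lifetime of an ideal battery, modeled as a finite energy store
    drained at a constant average power."""
    return capacity_j / avg_power_w / 3600.0


# Hypothetical device: 5.0 W background, two activities of 2.0 W and 1.5 W.
power_w = total_power(5.0, [2.0, 1.5])
energy_j = dynamic_energy(2.0, 30.0)
hours = ideal_lifetime_hours(36000.0, 10.0)
```

Because the additive estimate ignores nonideal battery behavior and any interaction between activities, it should be read as a first approximation rather than a prediction.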

2.2 Hardware platform characteristics
Mobile computers come in widely varying form factors. High-end laptop computers can weigh over seven pounds with a volume of over 225 cubic inches [33]. In contrast, a typical handheld computer weighs only five ounces with a volume of 7 cubic inches [70]. Current research efforts are reducing mobile computer form factors even further; for example, IBM Research has created a wristwatch computer capable of running Linux [65].


Form factor diversity is generated by a fundamental tradeoff between mobility and functionality. The need for mobility drives manufacturers to create smaller and smaller computing platforms. Size and weight restrictions limit resource availability on these platforms: they have less powerful processors, less storage capacity, and smaller batteries. They therefore cannot provide the same level of functionality as their larger counterparts. Since the optimal tradeoff between mobility and functionality is task-dependent, it is reasonable to expect that the current variety of form factors will persist.

Form factor diversity leads to diversity in the energy-use characteristics of mobile devices. Since the battery capacity of small, handheld devices is extremely limited by size constraints, energy-efficiency is typically a primary concern in their design. On the other hand, battery capacity is usually much greater in large devices such as laptop computers; consequently, laptops typically have much higher power consumption than handheld devices.

In this dissertation, I will account for diversity in form factors and energy-use characteristics by validating proposed energy management techniques on two different hardware platforms. These platforms represent two of the most common form factors: laptops and handheld computers. The next two sections describe these platforms: the IBM 560X laptop computer and Compaq’s Itsy pocket computer. Section 2.2.3 compares the characteristics of the two platforms.

2.2.1 The IBM 560X laptop computer
The IBM 560X laptop used for evaluation has a 233 MHz Pentium processor and 64 MB of memory. Additionally, either a Lucent 900 MHz or 2.4 GHz WaveLAN PCMCIA card provides 2 Mb/s wireless network access. The use of different network cards reflects changes in my experimental environment over time: the original 900 MHz network was replaced by the 2.4 GHz network. In the dissertation, I will note which network was used for each experiment.

Figure 2.1 shows the power usage of several hardware components of the laptop. The measurements were obtained by executing benchmarks that varied the power state of individual hardware components and measuring steady-state power dissipation with a digital multimeter. As Figure 2.1 shows, background power usage is quite significant: with the CPU idle, the display off, and the network and disk in power-saving states, the laptop draws 5.6 Watts. The processor and display are the most significant power consumers: the processor uses 5.10 Watts to execute a busy-wait loop in which all accesses hit in the L1 cache, and the display consumes from 1.95 to 4.54 Watts, depending upon screen brightness. The network interface and disk consume less power: 1.46 Watts and 0.88 Watts in their respective idle states.

Component    State           Power (W)
CPU / MMU    CPU Halted        0.00
             Busy Wait         5.10
             Memory Read       3.54
             Memory Write      4.10
Display      Bright            4.54
             Dim               1.95
WaveLAN      Idle              1.46
             Standby           0.18
Disk         Idle              0.88
             Standby           0.24
Other        Idle              3.20

Background power (CPU halted, display dim, WaveLAN & disk standby) = 5.6 Watts.

This figure shows the measured power consumption of components of the IBM 560X laptop. Power usage is slightly but consistently super-linear; for example, the laptop uses 10.28 Watts when the screen is brightest and the disk and network are idle, 0.21 Watts more than the sum of the individual power usage of each component. The WaveLAN measurements are for the 900 MHz network card. The last row shows the power used when the display, network, and disk are all powered off. Each value is the mean of five trials; in all cases, the sample standard deviation is less than 0.01 Watts.

Figure 2.1: Power consumption of the IBM ThinkPad 560X
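As a quick consistency check, the background figure and the super-linearity noted in the caption can be re-derived from the table. The sketch below uses only values printed in Figure 2.1; the dictionary layout and the names are illustrative choices, not part of the original measurement tooling.

```python
# Power values in Watts, copied from Figure 2.1.
POWER_560X = {
    ("display", "bright"): 4.54,
    ("display", "dim"): 1.95,
    ("wavelan", "idle"): 1.46,
    ("wavelan", "standby"): 0.18,
    ("disk", "idle"): 0.88,
    ("disk", "standby"): 0.24,
    ("other", "idle"): 3.20,
}


def component_sum(states):
    """Sum per-component power for a {component: state} configuration."""
    return sum(POWER_560X[(comp, state)] for comp, state in states.items())


# Background configuration named in the caption (CPU halted contributes 0.00 W).
background = component_sum(
    {"display": "dim", "wavelan": "standby", "disk": "standby", "other": "idle"}
)

# Brightest screen with disk and network idle, as in the caption's example.
bright_idle = component_sum(
    {"display": "bright", "wavelan": "idle", "disk": "idle", "other": "idle"}
)
measured_bright_idle = 10.28  # whole-system measurement quoted in the caption
excess = measured_bright_idle - bright_idle
```

The tabulated values sum to 5.57 Watts for the background configuration, which rounds to the quoted 5.6 Watts. They give an excess of about 0.20 Watts for the bright-screen case, close to the 0.21 Watts quoted in the caption; the small difference is presumably due to rounding of the table entries.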

2.2.2 The Itsy pocket computer
The Itsy pocket computer [31] is a high-performance handheld developed by Compaq’s Palo Alto Research Labs. Two different Itsy units are used for evaluation: an Itsy v1.5 and an Itsy v2.2. Both models have a StrongArm 1100 processor that can operate at 11 different clock frequencies, ranging from 54.0 MHz to 206.4 MHz, to reduce power demand. Unless otherwise noted, all Itsy measurements in this dissertation use the maximum 206.4 MHz clock frequency. The Itsy v1.5 has 48 MB of DRAM and 32 MB of flash memory; the Itsy v2.2 has 32 MB of DRAM and 32 MB of flash.

The Itsy v1.5 is powered by two AAA batteries and contains precision resistors that allow measurement of total power usage as well as the power used by various subsystems. The Itsy v2.2 is powered by a Lithium-Ion rechargeable battery. In addition to precision resistors, it also contains a DS2437 smart battery chip [12] which reports detailed information about battery status and power drain. Both Itsy models lack a wireless network interface; a serial link is used for communication.

Figure 2.2 shows the measured power consumption of several hardware components of the Itsy v1.5. More detailed measurements of the energy characteristics of this platform can be found in [19] and [22]. Viredaz and Wallach have performed detailed power measurements of the Itsy version 2 [93]. Their results show the version 2 power usage is roughly

similar to that of the Itsy v1.5. The background power usage of the Itsy v1.5 is only 0.25 Watts. The CPU is clearly an important power consumer: executing a busy-wait loop consumes an additional 0.43 Watts. The memory subsystem also represents an important source of power demand. The dynamic power used to read data from DRAM memory is 0.62 Watts and the dynamic power needed to write data is 1.41 Watts. The UART (serial network interface) consumes an additional 0.05 Watts when enabled; the UART power drain increases to 0.12 Watts when data is transmitted. The LCD display consumes only 0.04 Watts; the low power consumption can be attributed to the lack of a backlight.

Component    State           Power (W)
CPU / MMU    CPU Halted        0.00
             Busy Wait         0.43
             Memory Read       0.62
             Memory Write      1.41
Display      Enabled           0.04
UART         Enabled           0.05
             Transmitting      0.12
Other        Idle              0.16

Background power (CPU halted, display and UART enabled) = 0.25 Watts.

This figure shows the measured power consumption of components of the Itsy v1.5. The last row shows the power used when the display and UART are powered off. Each value is the mean of five trials; in all cases, the sample standard deviation is less than 0.01 Watts.

Figure 2.2: Power consumption of the Itsy v1.5

2.2.3 Comparison of platform characteristics
Comparing Figures 2.1 and 2.2, the most striking difference between the two platforms is the order-of-magnitude differential in power demand. The background power usage of the Itsy v1.5 is approximately 22 times less than the background power usage of the IBM 560X. Similarly, the dynamic power needed to execute a busy-wait loop is approximately 12 times less on the Itsy. It is also clear that the relative range of power demand is much greater for the Itsy v1.5. For example, the ratio of dynamic to background power usage is 5.6 when the write benchmark is executed. For the laptop, the maximum ratio of dynamic to background power is 0.9 (occurring when a busy-wait is executed). Thus, the Itsy is more efficient in its use of energy resources: it expends relatively less power when hardware components are idle.

[Diagram not reproduced. It shows three models of adaptation: laissez-faire (Eudora), application-aware (Odyssey), and application-transparent (Coda).]

Figure 2.3: Models of adaptation

The relative power expenditure of hardware components varies by platform. For example, the memory subsystem is a large power consumer for the Itsy (as shown by the difference between memory write and busy wait power consumption). However, the memory subsystem is a relatively insignificant portion of the laptop’s power budget. Similarly, the display represents a relatively more significant portion of the laptop’s power budget. An important consequence of this observation is that power tradeoffs between hardware components are platform-specific. One such tradeoff is remote processing, which reduces CPU power demand but increases network power usage. Since the ratio of network to processor power usage differs between the Itsy and the IBM 560X, remote execution will sometimes reduce power usage on one platform but not the other.
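The ratios quoted in this section can be re-derived from Figures 2.1 and 2.2. The following sketch does so; the values are those printed in the two figures, while the variable names are illustrative only.

```python
# Power values in Watts, taken from Figures 2.1 (laptop) and 2.2 (Itsy v1.5).
LAPTOP = {"background": 5.6, "busy_wait": 5.10}
ITSY = {"background": 0.25, "busy_wait": 0.43, "memory_write": 1.41}

# How many times lower the Itsy's demand is than the laptop's.
background_ratio = LAPTOP["background"] / ITSY["background"]
busy_wait_ratio = LAPTOP["busy_wait"] / ITSY["busy_wait"]

# Largest dynamic-to-background ratio on each platform:
# memory write on the Itsy, busy wait on the laptop.
itsy_dynamic_range = ITSY["memory_write"] / ITSY["background"]
laptop_dynamic_range = LAPTOP["busy_wait"] / LAPTOP["background"]
```

These evaluate to roughly 22, 12, 5.6, and 0.9 respectively, matching the figures given in the text.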

2.3 The Odyssey platform for mobile computing
In this dissertation, the Odyssey platform for mobile computing provides the basis for implementing system support for energy-aware adaptation. This section provides a brief overview of the relevant details of Odyssey; a more complete discussion of the design rationale and architecture can be found in [68].

Odyssey provides support for mobile information access through application-aware adaptation, a collaborative partnership between the operating system and applications. The system monitors resource levels, notifies applications of relevant changes, and makes resource allocation decisions. The original Odyssey prototype only supported network bandwidth adaptation. This dissertation describes how the infrastructure has been expanded to also support energy-aware adaptation.

Adaptation in Odyssey involves the trading of data or computational quality for resource consumption. For example, a client playing full-color video data from a server could switch to black and white video when bandwidth drops, rather than suffering lost frames. Similarly, a map application might fetch maps with less detail rather than suffering long transfer delays for full-quality maps. Odyssey captures this notion of data degradation through an attribute called data fidelity, which defines the degree to which data presented at a client matches the reference copy at a server.


Odyssey also supports applications which can vary the quality of their computations to adjust for variations in resource availability. For example, a speech recognition engine running on a handheld device with little processing power might use a smaller, task-specific vocabulary to provide speech-to-text translations with reasonable latency. Odyssey captures this notion through an attribute called computational fidelity, which defines the degree to which the output of the computation matches the highest-quality output that could be produced.

Fidelity is a type-specific notion since different kinds of data and computation can be degraded differently. Fidelity may often be multi-dimensional; for example, a video player may choose to degrade quality by using a greater amount of lossy compression, reducing the size of the video display, or decreasing the video frame rate. Since the minimal level of fidelity acceptable to the user can be both context and application dependent, Odyssey allows each application to specify the fidelity levels it currently supports.

Odyssey is designed to support multiple applications concurrently executing on a mobile client. The need to coordinate resource management across applications mutes the effectiveness of many previous approaches to mobile computing. For example, commercial applications such as Eudora [74] provide vertically integrated support for mobility, in which each application assumes that it has full use of available network bandwidth. Eudora implicitly adapts to network bandwidth by transmitting messages in order of importance. Even a more sophisticated toolkit approach such as Rover [39] only pays minimal attention to resource coordination. Odyssey provides centralized monitoring and coordinated resource management that controls the use of limited resources by applications.

Figure 2.3 places application-aware adaptation in context, spanning the range between two extremes.
At one extreme, adaptation is entirely the responsibility of individual applications. This laissez-faire approach, used by commercial software packages such as Eudora, avoids the need for system support. However, it fails to address the issue of application concurrency. At the other extreme, application-transparent adaptation, the system bears full responsibility for both adaptation and resource management. This approach, exemplified by the Coda file system [40], is especially attractive for legacy applications because they can run unmodified. Application concurrency is well supported, but application diversity is not, since control of fidelity is entirely in the hands of the system.

Odyssey’s client architecture is shown in Figure 2.4. Odyssey is conceptually part of the operating system, even though it is implemented in user space for simplicity. The viceroy is the Odyssey component responsible for monitoring the availability of resources and managing their use. Code components called wardens encapsulate type-specific functionality. There is one warden for each data type in the system. Several applications have been modified to use Odyssey, including a video player, a speech recognizer, a map viewer, a Web browser, and a virtual reality application.

Odyssey provides applications with two separate interfaces. The first interface allows an application to express its resource expectations. If resource levels stray beyond the specified expectations, Odyssey notifies the application through an upcall. The application then adjusts its fidelity to match the new resource level and communicates a new set of

expectations to Odyssey. This interface is most appropriate for applications which perform continuous operations, can change fidelity levels dynamically, and understand their own resource requirements.

[Diagram not reproduced. Labeled components: Application; Odyssey, with the Viceroy and Warden1, Warden2, and Warden3; Interceptor; Kernel.]

Figure 2.4: Odyssey architecture

The second interface allows applications to periodically query Odyssey to determine the fidelity level at which they should operate. This interface is more appropriate for applications which perform discrete operations or do not know their own resource requirements. An application first describes the operation it is about to perform. Odyssey then estimates the resource demand of the application, matches demand to current resource availability, and returns the fidelity level most appropriate for the operation.

Some applications, such as the Odyssey Web browser and map viewer, use a proxy to avoid modifications to application source code. Other applications, such as our video player and speech recognizer, are modified to interact directly with Odyssey. In all cases, the total amount of code that needs to be modified is very small, i.e., fewer than 1000 lines of code.

In its current instantiation, Odyssey assumes that applications are cooperative. Thus, Odyssey expects that applications will execute at the fidelity it specifies. However, by adding appropriate operating system support, Odyssey could potentially enforce its resource allocation decisions by detecting and penalizing misbehaving applications.
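The two interfaces described in this section can be pictured with a toy model. The sketch below is not the Odyssey API: the class `ToyViceroy` and all of its method names are hypothetical, a single scalar stands in for a resource level, and fidelity is assumed to be a comparable number. For simplicity the caller passes in per-fidelity demand estimates, whereas the text states that Odyssey estimates demand itself.

```python
class ToyViceroy:
    """Hypothetical stand-in for Odyssey's viceroy; not the real interface."""

    def __init__(self, level):
        self.level = level        # current availability of one resource
        self.registrations = []   # (low, high, upcall) expectation windows

    def register_expectation(self, low, high, upcall):
        """First interface: an application states the range it expects."""
        self.registrations.append((low, high, upcall))

    def set_level(self, level):
        """Record a new resource level and issue upcalls for every
        registration whose expectation window no longer holds."""
        self.level = level
        violated = [r for r in self.registrations
                    if not (r[0] <= level <= r[1])]
        for reg in violated:
            # The application is expected to re-register after adapting.
            self.registrations.remove(reg)
            reg[2](level)

    def choose_fidelity(self, demand_by_fidelity):
        """Second interface: return the highest fidelity whose estimated
        demand fits the current level, or the lowest if none fits."""
        feasible = [f for f, demand in demand_by_fidelity.items()
                    if demand <= self.level]
        return max(feasible) if feasible else min(demand_by_fidelity)


# Example: an application expecting 80..120 units is notified at 50 units.
events = []
viceroy = ToyViceroy(level=100)
viceroy.register_expectation(80, 120, events.append)
viceroy.set_level(50)
chosen = viceroy.choose_fidelity({1: 10, 2: 40, 3: 90})
```

Removing a registration once its upcall fires mirrors the description above, in which the application responds to an upcall by communicating a new set of expectations.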


2.4 Summary
This chapter began by discussing the metrics that will be used to evaluate energy management strategies in this thesis. Total energy usage will be used for discrete activities, while average power usage will be used for continuous activities. The chapter then described the two primary hardware platforms for evaluation: the IBM 560X laptop and the Itsy pocket computer. The choice of these platforms reflects the diversity in form factors and energy efficiency in mobile computing.

Finally, this chapter described the Odyssey platform for mobile computing, which will provide the basis for implementing system support for energy-aware adaptation. Odyssey provides support for application-aware adaptation, a collaborative partnership between the operating system and applications. Odyssey monitors resource levels, notifies applications of relevant changes, and makes resource allocation decisions. Applications modify data or computational fidelity to adjust their resource demands to meet changing resource availability. The previous version of Odyssey only supported network bandwidth adaptation; this dissertation extends Odyssey to support energy-aware adaptation.


Chapter 3 PowerScope: Profiling application energy usage
One of the keys to progress in energy-efficient software design is the ability to attribute energy consumption to specific software components. Unfortunately, there is currently a dearth of tools which have the ability to measure the energy impact of software. This chapter describes how I have constructed one such tool, called PowerScope, which fills this need by profiling application energy usage.

CPU profilers such as prof and gprof have proven useful for software performance optimization because they expose code components wasteful of CPU cycles. In a similar fashion, PowerScope helps developers design energy-efficient software by using statistical profiling to map energy consumption to program structure. Using PowerScope, one can determine what fraction of the total energy consumed during a certain time period is due to specific processes in the system. Further, one can drill down and determine the energy consumption of different procedures within a process.

By providing such feedback, PowerScope allows attention to be focused on those system components responsible for the bulk of energy consumption. As improvements are made to these components, PowerScope quantifies the benefits and helps expose the next target for optimization. Through successive refinement, a system can be improved to the point where its energy consumption meets design goals. PowerScope also helps developers expose energy-related bugs in their code which are not revealed through traditional testing methodology. For example, a busy-wait loop may have no perceptible performance impact, but PowerScope would reveal its wasteful energy usage.

Section 3.1 discusses the important considerations in the design of an energy profiler. The implementation of PowerScope is detailed in Section 3.2. Section 3.3 evaluates the tool, focusing on two key issues: the accuracy with which PowerScope attributes energy costs to specific processes and procedures, and the overhead of its operation.


3.1 Design considerations
The design of PowerScope follows from its primary purpose: enabling application developers to build energy-efficient software. PowerScope’s design scales to complex applications, which may consist of several concurrently executing threads of control, and which may run on a variety of mobile platforms. For both simple and complex applications, PowerScope provides developers detailed and accurate information about energy usage.

The most important consideration in the design of PowerScope is the need to gather sufficient information to produce a detailed picture of application activity. The usefulness of a profiling tool is directly related to how definitively it assigns costs to specific application events. Attributing costs in detail enables developers to quickly focus their attention on problem areas in the code. While it is certainly desirable to map energy costs to specific processes, the added detail of mapping energy costs to procedures within each process can provide valuable information. PowerScope therefore reports both sets of information, attributing energy usage to both processes and to procedures within each process. As will be discussed in Section 3.3.1, the specific hardware characteristics of the system being monitored limit the minimum procedure size that can be accurately profiled.

It is also important for PowerScope to monitor the activities and energy use of all processes executing on a computer system. Complex applications often consist of several concurrently executing processes. Further, profiling the activity of only a single process omits critical information about total energy usage. For instance, a task which blocks frequently may expend large amounts of energy on the screen, disk, and network when the processor is idle. Asynchronous activity, such as network interrupts, can also account for a significant portion of energy consumption.
An energy profiler which monitors energy usage only when a specific process is executing will not account for the energy expended by these activities.

Another consideration in PowerScope’s design is that the tool be easily portable between different hardware platforms. The power dissipation characteristics of mobile platforms differ widely, so energy optimizations for one platform may be inappropriate for others. To determine the best design for a particular application, developers may need to profile it on a variety of mobile devices. PowerScope therefore does not require specific hardware to be present on a mobile computer, nor does it depend upon platform-specific knowledge such as device power characteristics. This design minimizes the effort required to generate profiles on different hardware devices.

Finally, PowerScope is designed to minimize the overhead that it imposes on the system it is monitoring. This overhead is reflected both in additional CPU usage and in additional energy expended during execution. Because overhead affects the profile results, minimizing the profiling overhead helps maximize the accuracy of the generated profile. The design of PowerScope includes several optimizations, described in the next section, that reduce its impact on the system being profiled.

[Diagram not reproduced. Panel (a), data collection, shows the profiling computer (Apps, System Monitor), its power source, the digital multimeter, an HP-IB bus, and the data collection computer (Energy Monitor), with labels for the trigger, the PC / PID samples, and the correlated current levels. Panel (b), off-line analysis, shows the Energy Analyzer on the profiling computer taking symbol tables, PC / PID samples, and correlated current levels as input and producing an energy profile.]

This figure shows how PowerScope generates an energy profile. As applications execute on the profiling computer, the System Monitor samples system activity and the Energy Monitor samples power consumption. Later, the Energy Analyzer uses this information to generate an energy profile.

Figure 3.1: PowerScope architecture

3.2 Implementation
3.2.1 Overview
The prototype version of PowerScope, shown in Figure 3.1, uses statistical sampling to profile the energy usage of a computer system. To reduce overhead, profiles are generated by a two-stage process. During the data collection stage, the tool samples both the power consumption and the system activity of the profiling computer. PowerScope then generates an energy profile from this data during a later analysis stage. Because the analysis is performed off-line, it creates no profiling overhead.

During data collection, PowerScope uses two computers: a profiling computer, on which applications execute, and a data collection computer, which is used to reduce overhead. A digital multimeter samples the power consumption of the profiling computer. I


require that this multimeter have an external trigger input and output, as well as the ability to sample DC current or voltage at high frequency. The present implementation uses a Hewlett Packard 3458a digital multimeter, which satisfies both these requirements. The data collection computer controls the multimeter and stores current samples.

An alternate implementation would be to perform measurement and data collection entirely on the profiling computer using an on-board digital multimeter with a PCI or PCMCIA interface. However, this implementation makes it very difficult to differentiate the energy consumed by the profiled applications from the energy used by data collection and by the operation of the on-board multimeter. Further, the present implementation makes switching the measurement equipment to profile different hardware platforms much easier.

The functionality of PowerScope is divided among three software components. Two components, the System Monitor and Energy Monitor, share responsibility for data collection. The System Monitor samples system activity on the profiling computer by periodically recording information which includes the program counter (PC) and process identifier (PID) of the currently executing process. The Energy Monitor runs on the data collection computer, and is responsible for collecting and storing current samples. Because data collection is distributed across two monitor processes, it is essential that some synchronization method ensure that they collect samples closely correlated in time. I have chosen to synchronize the components by having the digital multimeter signal the profiling computer after taking each sample.

The final software component, the Energy Analyzer, uses the raw sample data collected by the monitors to generate the energy profile. The analyzer runs on the profiling computer since it uses the symbol tables of executables and shared libraries to map samples to specific procedures. There is an implicit assumption in this method that the executables being profiled are not modified between the start of profile collection and the running of the off-line analysis tool.
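How the Energy Analyzer combines the two sample streams is not spelled out in this overview. One plausible approach, sketched below purely as an assumption, pairs the i-th PID sample with the i-th current sample and charges the energy of that sampling interval to the sampled process. The function name, the constant supply voltage, the fixed sampling interval, and the sample values are all hypothetical.

```python
from collections import defaultdict


def attribute_energy(pid_samples, current_samples_a, supply_voltage_v, interval_s):
    """Attribute energy (Joules) to each PID.

    Assumes the i-th PID sample and the i-th current sample were taken at
    the same instant, that each sample represents interval_s seconds, and
    that the supply voltage is constant.
    """
    if len(pid_samples) != len(current_samples_a):
        raise ValueError("sample streams must have equal length")
    energy_j = defaultdict(float)
    for pid, current_a in zip(pid_samples, current_samples_a):
        # Energy for one interval: E = V * I * dt
        energy_j[pid] += supply_voltage_v * current_a * interval_s
    return dict(energy_j)


# Hypothetical trace: four samples, 10 V supply, 2 ms between samples.
profile = attribute_energy([1, 1, 7, 0], [0.5, 0.5, 1.0, 0.25], 10.0, 0.002)
```

Mapping samples further down to procedures would additionally require the recorded program counter and the symbol tables mentioned above, which this sketch omits.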

3.2.2 The System Monitor
The System Monitor consists of a device driver which collects sample data and a user-level daemon process which reads the samples from the device driver and writes them to a file. The device driver is currently implemented as a Linux loadable kernel module (LKM), allowing PowerScope to run without any modification to kernel source code. Although the System Monitor currently operates only on the Linux operating system, this design approach should enable it to be relatively portable to other operating systems. The design of the System Monitor is similar to the sampling components of Morph [100] and DCPI [3].

The present implementation samples system activity when triggered by the digital multimeter. Each twelve-byte sample records the value of the program counter (PC) and the process identifier (PID) of the currently executing process, as well as additional information such as whether the system is currently handling an interrupt. This assumes that the profiling computer is a uniprocessor, a reasonable assumption for a mobile computer. Samples are written to a circular buffer residing in kernel memory. This buffer is emptied by the user-level daemon, which writes the samples to a file. The daemon is triggered when the buffer grows more than 7/8 full, or by the end of data collection.

The System Monitor records a small amount of additional information that is used to generate profiles. First, it associates each currently executing process with the pathname of an executable. Then, for each executable it records the memory location of each loaded shared library and associates the library with a pathname. For Linux versions 2.1 and greater, the kernel d_path() routine is used to associate each process or library with a corresponding pathname. For previous versions of Linux in which this method is unavailable, the System Monitor associates each process or library with a device and inode number. In both cases, this mapping is recorded only once for each library or executable. The information is written to the sample buffer during data collection, and is used during off-line analysis to associate each sample with a specific executable image.

    pscope_init (u_int size);
    pscope_read (void* sample, u_int size, u_int* ret_size);
    pscope_start (void);
    pscope_stop (void);

Figure 3.2: PowerScope API

The programming interface shown in Figure 3.2 allows applications to control profiling. The API is implemented as a user-level library which marshals arguments and calls ioctl operations on the PowerScope device driver. The user-level daemon calls pscope_init() to set the size of the kernel sample buffer. Since there is a tension between excessive memory usage and frequent reading of the buffer by the daemon, the buffer size has been left flexible to allow efficient profiling of different workloads. The daemon calls pscope_read() to read samples out of the buffer. The pscope_start() and pscope_stop() system calls allow application programs to precisely indicate the period of sample collection. Multiple sets of samples may be collected one after the other; each sample set is delineated by start and end markers written into the sample buffer.
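The buffering scheme described in this section, a circular buffer that the daemon drains once it is more than 7/8 full, can be sketched in user-level code. This is an illustrative model only: the real buffer resides in kernel memory, and the class name, method names, and the choice to raise an error on overflow are assumptions rather than details from the text.

```python
class SampleBuffer:
    """Toy fixed-capacity circular buffer of samples."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.head = 0    # index of the oldest unread sample
        self.count = 0   # number of unread samples

    def write(self, sample):
        """Store one sample; return True when the daemon should be woken,
        i.e. when the buffer is more than 7/8 full."""
        if self.count == self.capacity:
            # Assumed policy; the text does not say what happens on overflow.
            raise OverflowError("buffer full: samples would be lost")
        self.slots[(self.head + self.count) % self.capacity] = sample
        self.count += 1
        return self.count * 8 > self.capacity * 7

    def drain(self):
        """Return all unread samples in arrival order and mark them read."""
        out = [self.slots[(self.head + i) % self.capacity]
               for i in range(self.count)]
        self.head = (self.head + self.count) % self.capacity
        self.count = 0
        return out
```

A larger capacity means the daemon is woken less often at the cost of more memory, which is the tension the text gives as the reason for leaving the buffer size configurable through pscope_init().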

3.2.3 The Energy Monitor
The Energy Monitor runs on the data collection computer and communicates with the digital multimeter. There is no specific operating system requirement for the data collection computer; it currently runs Windows 95 to take advantage of manufacturer-provided device drivers for the multimeter.


The Energy Monitor configures the multimeter to periodically sample the power usage of the profiling computer. The specific method of power measurement depends upon the system being profiled. For many laptop computers, the simplest method is to sample the current drawn through the laptop’s external power source. Usually, the voltage variation is extremely small; for example, it is less than 0.25% for the IBM 701C and 560X laptops. Therefore, current samples alone are sufficient to determine the energy usage of the system. The battery is removed from the laptop while measurements are taken to avoid extraneous power drain caused by charging. Current samples are transmitted asynchronously to the Energy Monitor, which stores them in a file for later analysis.

An alternate method can be employed for systems such as the Compaq Itsy v1.5 pocket computer that provide internal precision resistors for power measurement [92]. For the Itsy, the Energy Monitor configures the multimeter to measure the instantaneous differential voltage, $V_{diff}$, across a precision resistor of resistance $R$ located in the main power circuit. The instantaneous current, $I$, can therefore be calculated as $I = V_{diff} / R$. Since the voltage being supplied to the computer, $V_{supp}$, does not vary significantly, these measurements are sufficient to calculate instantaneous power usage, $P$, as $P = V_{supp} \times V_{diff} / R$. Further, because the Itsy contains additional internal resistors, the same method can be used to profile the isolated power usage of Itsy subsystems.

The above method is also useful when the maximum current drawn by the profiling computer exceeds the rated capacity of the measurement equipment. In such cases, PowerScope can measure the voltage drop across a precision resistor inserted between the profiling computer and its external power supply.

Sample collection is driven by the multimeter clock. Synchronization with the System Monitor is provided by connecting the multimeter’s external trigger input and output to I/O pins on the profiling computer.
The specific pins are platform-specific; for example, I use parallel port pins for the IBM 560X laptop and general purpose I/O pins for the Itsy. Immediately after the multimeter takes a power sample, it toggles the value of an input pin. This causes a system interrupt on the profiling computer, during which the System Monitor samples system activity. Upon completion, the System Monitor triggers the next sample by toggling an output pin (unless profiling has been halted by the pscope_stop system call). The multimeter buffers this trigger until the time to take the next sample arrives. This method ensures that the power samples reflect application activity, rather than the activity of the System Monitor.

The original PowerScope design used the clock of the profiling computer to drive sample collection. Although simpler to implement, that design had the disadvantage of biasing the profile values of activities correlated with the system clock. Since PowerScope drives sample collection from the multimeter, the lack of synchronization between the multimeter and profiling computer clocks introduces a natural jitter that makes clock-related bias very unlikely. Using the multimeter clock also allows PowerScope to generate interrupts at a finer granularity than that allowed by using kernel clock interrupts.

An alternative approach would be to trigger interrupts using processor performance counters such as those found on the StrongARM 1100 and Pentium II chips. I rejected this approach due to portability concerns. Some processors, such as the Pentium chip used in the IBM 560X laptop, lack performance counters. Further, methods for accessing performance counters vary by processor family, and thus require architecture-specific code.

The user may specify the sample frequency as a parameter when the Energy Monitor is started. With the multimeter currently being used, the maximum sample frequency is approximately 700 samples per second.
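The resistor-based measurement described in this section amounts to two applications of Ohm's law, which the following Python sketch makes explicit. The function names and the numeric values (supply voltage, resistance, differential voltage) are hypothetical and chosen only for illustration; they are not measurements from the Itsy.

```python
def instantaneous_current(v_diff, resistance_ohms):
    """Current through the precision resistor, from the measured voltage drop."""
    return v_diff / resistance_ohms


def instantaneous_power(v_supply, v_diff, resistance_ohms):
    """Power drawn by the computer, assuming a steady supply voltage."""
    return v_supply * instantaneous_current(v_diff, resistance_ohms)


# Hypothetical example values, not taken from the thesis:
V_SUPPLY = 3.0       # volts supplied to the computer (assumed constant)
RESISTANCE = 0.02    # ohms, precision resistor in the power circuit
V_DIFF = 0.005       # volts measured across the resistor

current = instantaneous_current(V_DIFF, RESISTANCE)        # roughly 0.25 A
power = instantaneous_power(V_SUPPLY, V_DIFF, RESISTANCE)  # roughly 0.75 W
```

Because the supply voltage is treated as constant, only the differential voltage needs to be sampled at each trigger.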

3.2.4 The Energy Analyzer
The Energy Analyzer generates an energy profile of system activity. Recall that total energy usage can be calculated by integrating the product of the instantaneous current, $I_t$, and voltage, $V_t$, over time, as follows:

$$E = \int I_t V_t \, dt \qquad (3.1)$$

This value can be approximated by simultaneously sampling both current and voltage at regular intervals of time $\Delta t$. Further, in the systems which I have measured, $V_t$ is constant within the limits of accuracy for which I am striving. PowerScope therefore calculates total energy over $n$ samples using a single measured voltage value, $V_{meas}$, as follows:

$$E \approx V_{meas} \sum_{t=1}^{n} I_t \, \Delta t \qquad (3.2)$$

The Energy Analyzer reads the raw data generated by the monitors and associates each current sample collected by the Energy Monitor with the corresponding sample collected by the System Monitor. It assigns each sample to a process bucket using the recorded PID value. Samples that occurred during the handling of an asynchronous interrupt, such as the receipt of a network packet, are not attributed to the currently executing process but are instead attributed to a bucket specific to the interrupt handler. If no process was executing when the sample was taken, the sample is attributed to a kernel bucket. The energy usage of each process is calculated as in Equation 3.2 by summing the current samples in each bucket and multiplying by the measured voltage ($V_{meas}$) and the sample interval ($\Delta t$).

The Energy Analyzer then generates a summary of energy usage by process, such as the one shown in Figure 3.3(a). Each entry displays the total time spent executing the process, calculated by multiplying the total number of samples that occurred while the process was executing by the sample period. Each entry also displays the total energy usage of the process and its average power usage, which is calculated by dividing energy usage by execution time.
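The bucket-based calculation can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the Energy Analyzer's code: the tuple layout (pid, interrupt name, current) and the names bucket_for and energy_profile are assumptions made for the example.

```python
from collections import defaultdict


def bucket_for(pid, interrupt_name):
    """Choose the bucket for one sample (hypothetical layout)."""
    if interrupt_name is not None:
        return "Interrupts-" + interrupt_name   # interrupt handler bucket
    if pid is None:
        return "kernel"                         # no process was executing
    return pid


def energy_profile(samples, v_meas, dt):
    """Return {bucket: (elapsed_s, energy_J, avg_power_W)}.

    samples: iterable of (pid, interrupt_name, current_amps)
    v_meas:  single measured voltage, in volts
    dt:      sample interval, in seconds
    """
    current_sum = defaultdict(float)
    count = defaultdict(int)
    for pid, interrupt_name, current in samples:
        bucket = bucket_for(pid, interrupt_name)
        current_sum[bucket] += current
        count[bucket] += 1

    profile = {}
    for bucket, total_current in current_sum.items():
        elapsed = count[bucket] * dt             # samples x sample period
        energy = v_meas * total_current * dt     # Equation 3.2, per bucket
        profile[bucket] = (elapsed, energy, energy / elapsed)
    return profile


# Hypothetical samples: (pid, interrupt name, current in amps)
samples = [(101, None, 1.0), (101, None, 0.5),
           (None, None, 0.25), (101, "Wavelan", 0.75)]
profile = energy_profile(samples, v_meas=10.0, dt=0.5)
```

Note that the sample taken during the network interrupt is charged to the interrupt bucket even though process 101 was the current process at the time.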


Energy Usage by Process:

                                 Elapsed       Total     Average
Process                         Time (s)  Energy (J)   Power (W)
----------------------------  ----------  ----------  ----------
/obj/odyssey/bin/janus            40.521     489.522      12.081
kernel                            40.572     301.210       7.424
Interrupts-Wavelan                27.654     296.287      10.714
/obj/odyssey/bin/xanim            18.073     218.458      12.087
/usr/X11R6/bin/XF86_SVGA          13.369     162.659      12.167
/obj/odyssey/bin/viceroy          11.730     141.101      12.029
/obj/odyssey/bin/editor            2.130      25.087      11.776
/usr/bin/netscape3                 1.495      17.791      11.901

(a) Partial summary of energy usage by process

Energy Usage Detail for process /obj/odyssey/bin/viceroy
User-level procedures:

                                 Elapsed       Total     Average
Procedure                       Time (s)  Energy (J)   Power (W)
----------------------------  ----------  ----------  ----------
Internal_Signal                    0.210       2.585      12.327
ExaminePacket                      0.165       1.939      11.763
Dispatcher                         0.160       1.872      11.693
sftp_DataArrived                   0.106       1.285      12.162
IOMGR_CheckDescriptors             0.096       1.159      12.064
IOMGR_Select                       0.078       0.955      12.177

(b) Partial detail of process energy usage

This figure shows a sample energy profile for a computer running multiple concurrent applications. Part (a) shows a portion of the summary of energy usage by process. Part (b) shows a portion of the detailed profile for a single process.

Figure 3.3: Sample energy profile


The Energy Analyzer repeats the above steps for each process to determine the energy usage by procedure. The process and shared library information stored by the System Monitor is used to reconstruct the memory address of each procedure from the symbol tables of executables and shared libraries. Then, the PC value of each sample is used to place the sample in a procedure bucket. When the profile is generated, procedures that reside in shared libraries and kernel procedures are displayed separately. Figure 3.3(b) shows a partial profile of one typical process.
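Placing a sample in a procedure bucket is essentially a search for the last symbol whose start address does not exceed the sampled PC. The Python sketch below shows one way to do this with a sorted symbol table and binary search. The addresses are hypothetical, the names build_index and procedure_for_pc are invented for the example, and the sketch assumes that symbol addresses have already been adjusted for the load address of the executable or shared library. Since it has no procedure end addresses, a PC beyond the last procedure is attributed to that last symbol.

```python
import bisect


def build_index(symbols):
    """Sort (start_address, name) pairs and split them into parallel lists."""
    ordered = sorted(symbols)
    starts = [address for address, _ in ordered]
    names = [name for _, name in ordered]
    return starts, names


def procedure_for_pc(pc, starts, names):
    """Return the procedure whose start address is the closest one at or below pc."""
    i = bisect.bisect_right(starts, pc) - 1
    return names[i] if i >= 0 else None


# Hypothetical symbol table; addresses are invented for illustration.
starts, names = build_index([
    (0x2000, "IOMGR_Select"),
    (0x1000, "Dispatcher"),
    (0x1400, "ExaminePacket"),
])
bucket = procedure_for_pc(0x13FF, starts, names)  # falls inside Dispatcher
```

Each lookup takes time logarithmic in the number of symbols, which matters when a profile contains many thousands of samples.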

3.3 Validation
For PowerScope to be effective, it must accurately determine the energy cost of processes and procedures. Further, it must operate with a minimum of overhead on the system being measured to avoid significantly perturbing the profile results. I created benchmarks to assess how successful PowerScope is in meeting both of these goals. Each benchmark was run on two hardware platforms, the Compaq Itsy v1.5 pocket computer and the IBM ThinkPad 560X laptop computer.

3.3.1 Accuracy
There are several factors which potentially limit PowerScope’s accuracy. First, the digital multimeter’s power measurements are not truly instantaneous: the multimeter’s A/D converter must measure the input signal over a period of time. However, this period, or integration time, is normally quite small. In the case of the HP 3458a multimeter used for these experiments, the minimum integration time is only 1.4 µs. Second, there will be some capacitance in the computer system being measured. High-frequency changes in power usage may not be measurable at the point in the circuit where the multimeter probes are attached. Finally, there is a delay between the time when the multimeter takes a measurement and the time when the corresponding kernel sample is taken; this delay includes time to propagate an electrical signal to the profiling computer and time to handle the corresponding hardware interrupt. If a procedure is of sufficiently short duration, a sample taken during its execution may be incorrectly attributed to a procedure which executes later. Combined, these factors limit PowerScope’s accuracy: there will be some minimum event duration below which PowerScope will be unable to accurately determine the event’s power usage.

I measured the minimum event duration by running a benchmark which alternates execution between two different procedures. Each procedure has a known power usage and runs for a configurable length of time. When these procedures are of sufficiently long duration, for example, one second, PowerScope can accurately determine the power usage of each procedure. However, as the duration of the two procedures is shortened, PowerScope will eventually be unable to successfully determine their individual power usages. To ensure maximum accuracy, I used the highest sampling rate supported by my current measurement equipment for these measurements, approximately 700 samples per second.


[Graph (a), PowerScope accuracy for ThinkPad 560X: reported power (W, axis from 7 to 9) versus procedure length (ms, log scale from 0.001 to 1000) for the Additions and Multiplications procedures.]

[Graph (b), PowerScope accuracy for Itsy v1.5: reported power (W, axis from 0.8 to 1.2) versus procedure length (ms, log scale from 0.001 to 1000) for the Copies and Additions procedures.]

This figure shows PowerScope’s accuracy as a function of the length of the event being measured. Each graph shows the power usage reported for two different procedures which execute alternately. As the procedure length is reduced, PowerScope is eventually unable to distinguish the individual power usage of the two procedures. The measurements in the top graph were performed on the IBM ThinkPad 560X laptop and the measurements in the bottom graph were performed on the Compaq Itsy v1.5. Each point represents the mean of ten trials; the (barely noticeable) error bars in each graph show 90% confidence intervals. Note that procedure length, on the x-axis, is displayed using a log scale.

Figure 3.4: PowerScope accuracy


Figure 3.4(a) shows the results of running the benchmark on the 560X laptop. The figure shows the power usage of two procedures reported by PowerScope for a variety of procedure durations. The first procedure performs additions in an unrolled loop and has a power usage of 8.04 Watts (measured with a duration of 1 second). The second procedure performs multiplications in an unrolled loop and has a power usage of 6.97 Watts. While it may seem unintuitive that multiplication requires less power than addition, fewer multiplication instructions execute per unit of time, meaning that the total energy needed to perform a multiplication is higher. PowerScope correctly reports the individual power usage of each procedure within experimental error for durations of 10 ms. When the procedure length is set to 1 ms, PowerScope reports slightly inaccurate results (within 1% of the correct value). As the procedure duration is further decreased, PowerScope’s accuracy also decreases. At a duration of 100 µs, PowerScope is unable to distinguish the power usage of individual procedures. In this case, the limiting factor is probably the capacitance of the laptop computer.

Figure 3.4(b) shows the results of running the benchmark on the Itsy v1.5. Because the power used to perform multiplications on the Itsy is very similar to the power used to perform additions, the benchmark replaces the multiplication procedure with one that copies data from one memory location to another. The copies are performed in an unrolled loop and all memory references hit in the first-level data cache. The power usage of the addition procedure is 0.85 Watts, and the power usage of the copy procedure is 1.08 Watts. On the Itsy, PowerScope correctly reports individual power usage within experimental error for durations of 1 ms. The reported values are slightly inaccurate with a procedure length of 100 µs (within 3% of the correct value). Interestingly, at 10 µs, PowerScope reports a higher power usage for the addition procedure than for the copy procedure. Because these results are significant within experimental error, they strongly indicate that the power measurements are being perturbed by the latency between the time when measurements are taken and the time when the System Monitor samples system activity on the profiling computer. Power samples that should be attributed to one procedure are instead being incorrectly attributed to the other.

The preceding measurements cannot detect one possible source of inaccuracy: the effect of variation in the sampling frequency. To quantify this potential effect, I measured the power usage of the IBM 560X laptop while it executed the benchmark application for approximately 10 seconds using a procedure duration of 1 ms. I sampled power usage at various frequencies using the HP 3458a multimeter. Since I did not use PowerScope for these experiments, any variation in power usage can be attributed to variation in the sampling frequency. Figure 3.5 shows the results of these experiments for five different frequencies between 80 and 2309 samples per second (the maximum frequency for the multimeter without PowerScope). The measured power usage varies by less than 1 mW. Thus, the choice of sampling frequency does not significantly impact measurement results.

[Graph: Power (Watts, axis from 0 to 10) versus sampling frequency (Samples / second, axis from 0 to 2500).]

This figure shows that variation in the sample frequency does not impact power measurements. It shows the power usage of an IBM 560X laptop executing a benchmark application, measured at several different sampling frequencies. Each point represents the mean of five trials; 90% confidence intervals are not visible on this graph.

Figure 3.5: Effect of variation in the sample frequency

3.3.2 Overhead
Running PowerScope imposes a small overhead on the system being profiled due to the activity of the System Monitor. The impact can be expressed in terms of both CPU usage and additional energy consumption on the profiling computer.

To determine PowerScope’s CPU overhead, I measured the execution time of the benchmark described in Section 3.3.1 for a variety of sampling rates, and compared the results to the execution time of the benchmark when PowerScope was not running. For the 560X laptop, latency was measured using the Pentium cycle counter. For the Itsy, the Linux gettimeofday system call was used. This benchmark does not include the cost of periodically writing data to a file for long-running profiles. However, because the file write is amortized across a large number of samples, the additional CPU cost should be quite low.

Figure 3.6 shows the results of these experiments for the two platforms. In both cases, PowerScope’s CPU overhead is quite low: less than 0.6% at the maximum sampling rate of the multimeter. Note that although two measurements in Figure 3.6(b) show a negative overhead, the upper bound of each measurement’s 90% confidence interval is greater than zero, meaning that the anomalies can likely be attributed to experimental error.

[Graph (a), CPU overhead for ThinkPad 560X: relative overhead (axis from -0.002 to 0.006) versus sampling frequency (Samples / second, axis from 0 to 800).]

[Graph (b), CPU overhead for Itsy v1.5: relative overhead (axis from -0.002 to 0.006) versus sampling frequency (Samples / second, axis from 0 to 800).]

This figure shows PowerScope’s relative CPU overhead as a function of the sampling frequency. Both graphs show the additional time required to complete a fixed amount of calculations while PowerScope is used to profile energy consumption. The measurements in the top graph were performed on the IBM ThinkPad 560X laptop and the measurements in the bottom graph were performed on the Compaq Itsy v1.5. Each point represents the mean of ten trials; the error bars in each graph show 90% confidence intervals.

Figure 3.6: PowerScope CPU overhead

To determine PowerScope’s energy overhead, I used PowerScope to measure the energy usage of the benchmark described in Section 3.3.1 for a variety of sampling rates. For comparison, I measured the energy consumption of the benchmark without PowerScope by using the digital multimeter to measure the energy consumption of the profiling computer during a period when the benchmark was the only activity executing. Because there is a marked difference between energy consumption depending upon whether or not the benchmark is running, it was possible to manually identify those samples collected when the benchmark was executing and to calculate the average power consumption of the benchmark. The energy usage of the benchmark was then calculated by multiplying its average power usage by its measured execution time.

[Graph (a), Energy overhead for ThinkPad 560X: measured overhead (axis from 0.000 to 0.012) versus sampling frequency (Samples / second, axis from 0 to 800).]

[Graph (b), Energy overhead for Itsy v1.5: measured overhead (axis from 0.000 to 0.012) versus sampling frequency (Samples / second, axis from 0 to 800).]

This figure shows PowerScope’s measured relative energy overhead as a function of the sampling frequency. Both graphs show the amount of additional energy usage reported by PowerScope relative to baseline energy consumption without PowerScope. The measurements in the top graph were performed on the IBM ThinkPad 560X laptop and the measurements in the bottom graph were performed on the Compaq Itsy v1.5. Each point represents the mean of ten trials; the error bars in each graph show 90% confidence intervals.

Figure 3.7: PowerScope energy overhead

Figure 3.7 shows the results of these experiments for the two platforms. In both cases, PowerScope’s energy overhead is low: about 1.0% for the ThinkPad 560X and 1.3% for the Itsy at the maximum sampling rate. Like the CPU benchmark, this value does not include the cost of periodically writing data to a file for long-running profiles. While the laptop results show that energy overhead increases fairly regularly with the sample rate, the Itsy results are decidedly more irregular. While it is unclear precisely what leads to this effect, it is possible that different sampling frequencies induce slightly different cache effects on the benchmark application. Such cache effects would be much more noticeable on the Itsy since its cache is smaller and memory usage makes up a much larger percentage of its overall power budget.
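The overhead figures in this section come down to two simple calculations, sketched below in Python. The function names and the numeric values are hypothetical and serve only to show the arithmetic: the baseline energy is the average power of the samples taken while the benchmark ran, multiplied by its execution time, and relative overhead is the fractional increase over the value obtained without PowerScope.

```python
def benchmark_energy(power_samples_watts, execution_time_s):
    """Energy in joules: average sampled power times measured execution time."""
    average_power = sum(power_samples_watts) / len(power_samples_watts)
    return average_power * execution_time_s


def relative_overhead(with_profiler, without_profiler):
    """Fractional increase in time or energy caused by profiling."""
    return (with_profiler - without_profiler) / without_profiler


# Hypothetical numbers, not results from the thesis:
baseline_energy = benchmark_energy([8.0, 8.0, 8.0], execution_time_s=10.0)
profiled_energy = 80.8  # joules, as if reported while profiling
overhead = relative_overhead(profiled_energy, baseline_energy)  # about 1%
```

The same relative_overhead function applies to CPU overhead when its arguments are execution times rather than energies.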

3.4 Summary
This chapter has described the design and implementation of the PowerScope energy profiler. PowerScope helps developers design energy-efficient software by mapping energy consumption to specific processes executing on a computer system, and to individual procedures within those processes.

There are several important considerations in the design of an energy profiler. First, it should map energy consumption to code components as accurately as possible. Second, it should report the energy usage of all activities occurring on the computer system during the profiling period. Third, it should maximize portability to support profiling on a variety of hardware platforms. Finally, it should minimize the amount of overhead imposed during the profiling period.

Evaluation of PowerScope shows that it is successful in meeting these goals. Depending upon the system being profiled, PowerScope can accurately attribute energy to events with durations as small as 100 µs. Further, profiles such as the one in Figure 3.3 report the energy consumption of all processes that execute during the profiling period. The results in Section 3.3 show that PowerScope can generate profiles on two very different computing platforms. Finally, PowerScope’s overhead is quite low. On the platforms evaluated, its CPU overhead was at most 0.6% and its energy overhead was at most 1.3%.


Chapter 4 Energy-aware adaptation
In order to design energy-aware software, it is first necessary to understand how software design choices impact system energy use. This need motivated me to perform a detailed study of energy usage for several applications that might commonly be found in mobile computing environments. In this chapter, I discuss this study, its results, and the implications for energy-aware application and operating system design.

4.1 Goals of the study
The primary goal of the study was to assess the feasibility of energy-aware adaptation. For adaptation strategies to be effective, it is crucial that changes in application behavior yield significant energy savings. If the potential savings are meager, then developers will not modify applications to make them energy-aware. Further, the resulting energy savings should be predictable. The more accurately an adaptive system can project the impact of potential changes, the quicker it can converge upon the optimum balance between application quality and energy conservation.

In this study, the main dimension of energy-aware adaptation is the tradeoff between application fidelity and energy use. Since fidelity is an application-dependent metric, it was necessary to measure several different applications as they executed at different levels of fidelity. The greater the difference in energy use, the more effective energy-aware adaptation is likely to be for a given application.

Although it was not my main focus, I also studied another dimension of energy-aware adaptation: the tradeoff between execution location and energy use. Although it seems intuitive that offloading computation to a server can reduce client energy usage, this is not always the case. Even though the energy used by the client’s CPU will decrease, this benefit may be offset by increased network energy use. Alternatively, remote execution may prove to be slower than local execution when communication needs are significant, creating a tradeoff between energy conservation and performance.

A secondary goal of the study was to assess the impact of existing hardware power-management strategies, such as spinning down the hard drive, disabling wireless network receivers, and turning off the display. I measured both the stand-alone impact of these strategies, as well as their impact when combined with energy-aware adaptation. I was especially interested in confirming that the energy savings from energy-aware adaptation could enhance those achievable through hardware power management. Although these distinct approaches to energy savings seem composable, it was important to verify this experimentally.

4.2 Methodology
I measured the energy used by four applications: a video player, a speech recognizer, a map viewer, and a Web browser. My selection of these particular applications was driven by several factors. First, these are applications that are commonly used when mobile. Speech recognition enables hands-free operation, while map viewing assists navigation. It is also reasonable to expect that mobile access of Web and video data will increase as wireless bandwidth becomes more plentiful. Second, each application has at least one definable dimension of fidelity, allowing me to study the relationship between fidelity and energy use.

The final factor in selecting these applications was their extensibility. All ran on Linux, and three had freely available source code. The fourth, Netscape, had not yet released source code, but had a well-defined interface for extension. In the study, source-code availability allowed me to use PowerScope to gain a more detailed picture of application energy use. The large degree of extensibility allowed me to easily modify applications to support multiple levels of fidelity. However, making applications energy-aware does not always require application or operating system source code, as will be seen in the next chapter.

I used the IBM 560X laptop as the primary platform for evaluation. Since the video player, map viewer, and Web browser have a considerable amount of platform-specific code, it would have been prohibitively time-consuming to port them to the Itsy. However, the speech recognizer proved relatively easy to port; I therefore measured its energy consumption for both platforms and compared the results.

I first observed the applications as they operated in isolation, and then as they operated concurrently. In each experimental trial, the fidelity of an application was fixed at a constant value. I explored sensitivity of energy consumption to data fidelity by using four data objects for each application: that is, four video clips, four speech utterances, four maps, and four Web images. I first measured the baseline energy usage for each object at highest fidelity with hardware power management disabled. Second, I measured energy usage with hardware power management enabled. Then, I successively lowered the fidelity of the application, measuring energy usage at each fidelity with hardware power management enabled.

This sequence of measurements is directly reflected in the format of the graphs presenting the results: Figures 4.2, 4.5, 4.10, and 4.15. Since a considerable amount of data is condensed into these graphs, I explain their format here even though their individual contents will not be meaningful until the detailed discussions in Sections 4.4 through 4.7.


For example, consider Figure 4.2. There are six bars in each of the four data sets on the x-axis; each data set corresponds to a different video clip. The height of each bar shows total energy usage, and the shadings within each bar show energy usage by software component. The component labeled “Idle” aggregates samples that occurred while executing the kernel idle procedure, effectively a Pentium hlt instruction. The component labeled “WaveLAN” aggregates samples that occurred during wireless network interrupts.

For each data set, the first and second bars, labeled “Baseline” and “Hardware-Only Power Mgmt.”, show energy usage at full fidelity without and with hardware power management, respectively. The difference between the first two bars gives the application-specific effectiveness of hardware power management. Each of the remaining bars shows the energy usage at a different, reduced fidelity level with hardware power management enabled. The difference between one of these bars and the first bar (“Baseline”) gives the combined benefit of hardware power management and fidelity reduction. The difference between one of these bars and the second one (“Hardware-Only Power Mgmt.”) gives the additional benefit achieved by fidelity reduction above and beyond the benefit achieved by hardware power management.

The measurements for all bars except “Baseline” were obtained while powering down as many hardware components as possible for each application. After ten seconds of inactivity, I transitioned the disk to standby mode. Further, I modified the network communication package used to place the wireless network interface in standby mode except during remote procedure calls or bulk transfers. Finally, I turned off the display during the speech application. Since the other applications were interactive, the display was continuously enabled during their operation.

These are the maximally aggressive power management strategies that could be employed with the computer used in the study. However, more recent mobile computers support further dimensions of power management; for example, several mobile processors can reduce their power and energy requirements by decreasing the CPU clock frequency. Hence, the effectiveness of hardware power management appears to be increasing with time. As it will be shown in this chapter that hardware power management and fidelity reduction are synergistic, it is reasonable to expect that the benefits of fidelity reduction will continue to increase in the future as hardware power management becomes more effective.
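The three comparisons between bars described above can be written as a short Python sketch. The function names and the energy values are hypothetical and are not data from the study; the sketch only shows which bar serves as the reference for each percentage.

```python
def savings(reference_joules, measured_joules):
    """Fractional energy reduction of a measurement relative to a reference bar."""
    return (reference_joules - measured_joules) / reference_joules


def summarize(baseline, hardware_only, reduced_fidelity):
    """Compare one reduced-fidelity bar against the two reference bars."""
    return {
        # Effectiveness of hardware power management alone.
        "hardware_only": savings(baseline, hardware_only),
        # Combined benefit of hardware power management and fidelity reduction.
        "combined": savings(baseline, reduced_fidelity),
        # Additional benefit of fidelity reduction beyond hardware power management.
        "fidelity_beyond_hardware": savings(hardware_only, reduced_fidelity),
    }


# Hypothetical energy values in joules, chosen only for illustration:
result = summarize(baseline=2000.0, hardware_only=1800.0, reduced_fidelity=1500.0)
```

With these made-up values, hardware power management alone saves 10%, the combination saves 25% relative to the baseline, and fidelity reduction saves roughly 17% relative to the hardware-only bar. Percentages quoted relative to different reference bars are therefore not directly additive.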

4.3 Experimental setup
For this study, I used the ThinkPad 560X described in Section 2.2.1 as the client. The client ran the Linux 2.2 operating system. All servers were 200 MHz Pentium Pro desktop computers with 64 MB of memory. The client communicated with servers over a 2 Mb/s wireless WaveLAN network operating at 900 MHz. I measured application and system energy use with PowerScope, sampling at a rate of approximately 600 times per second. To avoid confounding effects due to non-ideal battery behavior, the client used an external power supply. Further, to eliminate the effects of charging, the client’s battery was removed.


4.4 Video player

[Diagram with components labeled Xanim, Viceroy, Video Warden, and Video Server.]

Figure 4.1: Odyssey video player

4.4.1 Description
I first measured the impact of fidelity on the video application shown in Figure 4.1. Xanim fetches video data from a server through Odyssey and displays it on the client. It supports two dimensions of fidelity: varying the amount of lossy compression used to encode a video clip, and varying the size of the window in which it is displayed. There are multiple tracks of each video clip on the server, each generated off-line from the full fidelity video clip using Adobe Premiere. They are identical to the original except for size and the level of lossy compression used in frame encoding.

4.4.2 Results
Figure 4.2 shows the total energy used to display four videos at different fidelities. At the baseline fidelity, much energy is consumed while the processor is idle because of the limited bandwidth of the wireless network: not enough video data is transmitted to saturate the processor. Most of the remaining energy is consumed by asynchronous network interrupts, the Xanim video player, and the X server.

For the four video clips, hardware-only power management reduces energy consumption by a mere 9–10%. There is little opportunity to place the network in standby mode since it is nearly saturated. Most of the reduction is due to disk power management: the disk remains in standby mode for the entire duration of an experiment.

The bars labeled Premiere-B and Premiere-C in Figure 4.2 show the impact of lossy compression. Whereas the baseline video is encoded at QuickTime/Cinepak quality level 2, Premiere-B and Premiere-C are encoded at quality levels 1 and 0 respectively. Premiere-C, the highest level of compression, consumes 16–17% less energy than hardware-only power management. Note that these gains are understated due to the bandwidth limitation imposed by the wireless network. With a higher-bandwidth network, I could raise baseline fidelity and thus transmit better video quality when energy is plentiful. The relative energy savings of Premiere-C would then be higher.

Figure 4.2: Energy impact of fidelity for video playing
[Bar chart: energy (Joules) for Videos 1–4. Bars per video: Baseline, Hardware-Only Power Mgmt., Premiere-B, Premiere-C, Reduced Window, Combined. Shadings: Idle, Xanim, X Server, Odyssey, WaveLAN, Kernel.]
This figure shows the total energy used to display four QuickTime/Cinepak videos from 127 to 226 seconds in length, ordered from right to left above. For each video, the first bar shows total energy usage without hardware power management or fidelity reduction. The second bar shows the impact of hardware power management alone. The next two show the impact of lossy compression. The fifth shows the impact of reducing the size of the display window. The final bar shows the combined effect of lossy compression and window size reduction. The shadings within each bar detail energy usage by software component. Each value is the mean of five trials; the error bars show 90% confidence intervals.

By examining the shadings of each bar in Figure 4.2, it can be seen that compression significantly reduces the energy used by Xanim, Odyssey, and the WaveLAN device driver. However, the energy used by the X server is almost completely unaffected by compression. I conjectured that this is because video frames are decoded before they are given to the X server, and the size of this decoded data is independent of the level of lossy compression.

To validate this conjecture, I measured the effect of halving both the height and width of the display window, effectively introducing a new dimension of fidelity. As Figure 4.2 shows, shrinking the window size reduces energy consumption 19–20% beyond hardware-only power management. The shadings on the bars confirm that reducing window size significantly decreases X server energy usage. In fact, within the bounds of experimental error, X server energy consumption is proportional to window area.

Figure 4.3: Predicting video player energy use
[Plot: energy (Joules) versus video length (seconds), with one linear fit per fidelity level: Hardware-Only Power Mgmt., Premiere-B, Premiere-C, Reduced Window, Combined.]
This figure shows the relationship between system energy use, data fidelity, and video length. For each level of data fidelity, four data points show the total energy used to display the four videos from Figure 4.2; the corresponding line represents the best linear fit through these points. All measurements were taken with hardware power management enabled. The (barely noticeable) error bars show 90% confidence intervals for energy use.

Finally, I examined the effect of combining Premiere-C encoding with a display window of half the baseline height and width. This results in a 28–30% reduction in energy usage relative to hardware-only power management. Relative to baseline, using all the techniques (hardware power management, lossy encoding, and reducing the window size) together yields about a 35% reduction.

From the viewpoint of further energy reduction, the rightmost bar of each data set in Figure 4.2 seems to offer a pessimistic message: there is little to be gained by further efforts to reduce fidelity. Virtually all energy usage at this fidelity level occurs when the processor is idle. Fortunately, this is precisely where advances in hardware power management can be of the most help. For example, consider a modern mobile processor, such as Transmeta’s Crusoe chip [41], which can reduce its clock frequency to save power and energy. As the fidelity of the video is reduced, the processor can operate at correspondingly lower clock frequencies since the needed computation per frame is smaller. With such a processor, the total energy used by the lowest fidelity will be significantly reduced.

Figure 4.4: Odyssey speech recognizer
[Diagram components: Viceroy, Speech Front-End, Speech Warden, Local Janus Server, Remote Janus Server]

Figure 4.3 shows video player energy use as a function of video length for five different levels of data fidelity: baseline, Premiere-B, Premiere-C, reduced window size, and the combination of Premiere-C and reduced window size. In each case, hardware power management is enabled. For each fidelity level, four data points show the total energy used to play each of the videos in the study, and the corresponding line shows the best linear fit through these points. From the data, it is clear that the linear model is a good fit: the coefficient of determination ($R^2$) is greater than 99% for every fidelity. In addition, the maximum relative error for any data point is 2.3%. Thus, if an adaptive system were to be given a new video and could determine its length, it is reasonable to expect that it would be able to accurately predict the energy needed to display the video at each fidelity. Of course, these results apply directly only to the fidelities investigated in this study; it is possible that different encoding schemes may prove to be less predictable.
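The prediction approach described above (a linear fit of energy against video length, judged by its coefficient of determination and maximum relative error) can be sketched in a few lines of Python. This is only an illustration: the function names and the sample measurements are hypothetical and are not taken from the study's data.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def max_relative_error(xs, ys, slope, intercept):
    """Largest |predicted - measured| / measured over all points."""
    return max(abs(slope * x + intercept - y) / y for x, y in zip(xs, ys))


# Hypothetical (video length in seconds, energy in Joules) pairs for one
# fidelity level; these are NOT measurements from the study.
lengths = [127.0, 150.0, 190.0, 226.0]
energies = [1020.0, 1195.0, 1530.0, 1800.0]

slope, intercept = fit_line(lengths, energies)
fit_quality = r_squared(lengths, energies, slope, intercept)
worst_error = max_relative_error(lengths, energies, slope, intercept)

# Energy estimate for a new clip whose length (180 s) is known in advance.
predicted_energy = slope * 180.0 + intercept
```

An adaptive system would keep one such (slope, intercept) pair per fidelity level and evaluate each of them for the length of the new video.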

4.5 Speech recognizer
4.5.1 Description
The second application is an adaptive speech recognizer. As shown in Figure 4.4, it consists of a front-end that generates a speech waveform from a spoken utterance and submits it via Odyssey to a local or remote instance of the Janus speech recognition system [94]. Local recognition avoids network transmission and is unavoidable if the client is disconnected. In contrast, remote recognition incurs the delay and energy cost of network communication but can exploit the CPU, memory, and energy resources of a remote server that is likely to be operating from a power outlet rather than a battery. The system also supports a hybrid mode of operation in which the first phase of recognition is performed locally, resulting in a compact intermediate representation that is shipped to the remote server for completion of the recognition. In effect, the hybrid mode uses the first phase of recognition as a type-specific compression technique that yields a factor of five reduction in data volume with minimal computational overhead.

Fidelity is lowered in this application by using a reduced vocabulary and a less complex acoustic model. This substantially reduces the memory footprint and processing required for recognition, but degrades recognition quality. The system alerts the user of fidelity transitions using a synthesized voice. The use of low fidelity is most compelling in the case of local recognition on a resource-poor disconnected client, although it can also be used in hybrid and remote cases.

Although reducing fidelity limits the number of words available, the word-error rate may not increase. Intuitively, this is because the recognizer makes fewer mistakes when choosing from a smaller set of words in the reduced vocabulary. This helps counterbalance the effects of reducing the sophistication of the acoustic model.

Figure 4.5: Energy impact of fidelity for speech recognition
[Bar chart: energy (Joules) for Utterances 1–4. Bars per utterance: Baseline, Hardware-Only Power Mgmt., Reduced Model, Remote, Remote - Reduced Model, Hybrid, Hybrid - Reduced Model. Shadings: Idle, Janus, Odyssey, WaveLAN, Kernel.]
This figure shows the energy used to recognize four spoken utterances from one to seven seconds in length, ordered from right to left above. For each utterance, the first bar shows energy consumption without hardware power management or fidelity reduction. The second bar shows the impact of hardware power management alone. The remaining bars show the additional savings realized by adaptive strategies. The shadings within each bar detail energy usage by activity. Each measurement is the mean of five trials; the error bars show 90% confidence intervals.

4.5. SPEECH RECOGNIZER
Figure 4.6: Predicting speech recognition energy use
[Plot: energy (Joules) versus utterance length (seconds), with one linear fit per fidelity level: Hardware-Only Power Mgmt., Reduced Model.]
This figure shows the relationship between system energy use, data fidelity, and utterance length. For each level of data fidelity, ten data points show the total energy used to recognize different utterances (including the ones from Figure 4.5); the corresponding line represents the best linear fit through these points. All measurements were taken with hardware power management enabled. The (barely noticeable) error bars show 90% confidence intervals for energy use.

4.5.2 Results
Figure 4.5 presents measurements of client energy usage when recognizing four pre-recorded utterances using local, remote, and hybrid strategies at high and low fidelity. The baseline measurements correspond to local recognition at high fidelity without hardware power management. Since speech recognition is compute-intensive, almost all the energy in this case is consumed by Janus.

Hardware power management reduces client energy usage by 33–34%. Such a substantial reduction is possible because the display can be turned off and both the network and disk can be placed in standby mode for the entire duration of an experiment. This assumes that user interactions occur solely through speech, and that disk accesses can be avoided because the vocabulary, language model and acoustic model fit entirely in physical memory. More complex recognition tasks may trigger disk activity and hence show less benefit from hardware power management.

Lowering fidelity by using a reduced speech model results in a 25–46% reduction in energy consumption relative to using hardware power management alone. This corresponds to a 50–65% reduction relative to the baseline.

Remote recognition at full fidelity reduces energy usage by 33–44% relative to using hardware power management alone. If fidelity is also reduced, the corresponding savings is 42–65%. These figures are comparable to the energy savings for remote execution reported in the literature for other compute-intensive tasks [69, 77]. As the shadings in the fourth and fifth bars of each data set in Figure 4.5 indicate, most of the energy consumed by the client in remote recognition occurs with the processor idle; much of this is time spent waiting for a reply from the server. Lowering fidelity speeds recognition at the server, thus shortening this interval and yielding additional energy savings.

Hybrid recognition offers slightly greater energy savings than remote recognition: 47–55% at full fidelity, and 53–70% at low fidelity, both relative to hardware-only power management. Hybrid recognition increases the fraction of energy used by the local Janus code, but this is more than offset by the reduction in network transmission and idle time.

Overall, the net effect of combining hardware power management with hybrid, low-fidelity recognition is a 69–80% reduction in energy usage relative to the baseline. In practice, the optimal strategy will depend on resource availability and the user’s tolerance for low-fidelity recognition. Chapter 7 explores this issue in greater detail.

Figure 4.6 shows speech recognizer energy use as a function of utterance length for full and reduced fidelity with hardware power management enabled. For both fidelity levels, each data point shows the total energy used to recognize one of a set of ten spoken utterances; this set includes the four utterances from Figure 4.5. The corresponding lines show the best linear fit through these points. The linear model for the baseline fidelity produces a reasonable fit (93.7% coefficient of determination), while the fit is considerably better for the reduced fidelity (99.8%). At the baseline fidelity, the relative error for six of the ten utterances is less than 10%; at the reduced fidelity, nine utterances have relative error less than 10%. While speech recognition is less predictable than video display, these results are quite encouraging: simple linear models appear to do a reasonably good job of predicting energy use.

4.5.3 Results for Itsy v1.5
I have ported Janus to the Itsy v1.5 pocket computer in order to explore the impact of different hardware platforms on energy-aware adaptation. Figure 4.7 presents measurements of client energy usage for the same four pre-recorded utterances when the speech front-end executes on the Itsy v1.5. Since the Itsy does not have a PCMCIA interface, a serial link was used as the network transport.

Because the Itsy is an experimental platform, there exists less system support for hardware power management than on the ThinkPad laptop. Thus, the results in Figure 4.7 do not reflect the possibility that one could disable the Itsy’s display and serial link when they are not in use, or that one could reduce the processor clock frequency when CPU load is low.

Figure 4.7: Energy impact of fidelity for speech recognition on the Itsy v1.5
[Bar chart: energy (Joules) for Utterances 1–4. Bars per utterance: Baseline, Reduced Model, Remote, Remote - Reduced Model, Hybrid, Hybrid - Reduced Model. Shadings: Idle, Janus, Odyssey, Kernel.]
This figure shows the energy used to recognize four spoken utterances from one to seven seconds in length, ordered from right to left above. The client machine which executes the recognitions is an Itsy v1.5 pocket computer; the results can be contrasted with Figure 4.5, in which the client machine is an IBM ThinkPad 560X laptop. The first bar shows energy consumption without fidelity reduction. The remaining bars show the additional savings realized by adaptive strategies. The shadings within each bar detail energy usage by activity. Each measurement is the mean of five trials; the (barely noticeable) error bars show 90% confidence intervals.

Contrasting Figures 4.5 and 4.7 shows that the Itsy consumes more energy than the ThinkPad to perform local speech recognition, but significantly less energy to perform hybrid and remote recognition. Although the Itsy is considerably more energy-efficient than the ThinkPad, its StrongARM processor is slower. More significantly, the Janus recognizer performs a large number of floating-point operations, which are emulated in software on the Itsy. These factors combine to make local speech recognition prohibitively slow. Although average power usage is much lower on the Itsy, the total energy needed for local recognition is higher.

Use of the reduced quality speech model for local recognition provides considerable benefit on the Itsy, reducing energy usage by 44–49%. During hybrid and remote execution, use of the reduced speech model has little impact. Although the recognition completes more quickly on the remote server, the Itsy expends very little energy during the additional time it must wait for full-quality recognition because its background power usage is very small.

Remote recognition uses much less energy on the Itsy. The energy savings from remote execution range from 92–94% for full-quality recognition and from 86–90% for reduced-quality recognition. Two factors contribute to this large savings: the lack of floating-point support makes local processing less attractive, and the energy-efficiency of the hardware platform means that very little energy is expended waiting for remote processing requests to complete.

Hybrid recognition uses more energy than remote recognition on the Itsy, even though the operation completes in less time. The energy impact of performing the first stage of recognition locally is greater than the total amount of energy spent waiting for recognition to complete during fully-remote execution. This has two important implications for remote execution. First, it illustrates that tradeoffs between performance and energy conservation exist for common applications. Second, it shows that remote execution decisions must consider both the client and its available resources when deciding where to locate functionality.

The speech recognition results give some evidence that the effectiveness of energy-aware adaptation will increase as mobile hardware becomes more energy-efficient. Intuitively, modifying application behavior can only affect dynamic power usage; background power usage is fixed by device energy characteristics. Energy-efficient hardware platforms, which have lower background power usage and a greater ratio of maximum dynamic to background power usage, will benefit more. The results in Figures 4.5 and 4.7 bear this out. The maximum energy savings due to fidelity reduction and remote execution is only 64–78% on the ThinkPad, but 92–94% on the more energy-efficient Itsy. However, these results are biased by the Itsy’s lack of floating-point support, which contributes to some of the energy differential. Figure 4.8 highlights the difference between the two platforms by showing the energy usage of each fidelity and remote execution location normalized to local energy usage at full fidelity.

Execution Location   Fidelity   ThinkPad 560X        Itsy v1.5
                                Normalized Energy    Normalized Energy
Local                Full       1.00                 1.00
Local                Reduced    0.64                 0.53
Remote               Full       0.61                 0.06
Remote               Reduced    0.47                 0.06
Hybrid               Full       0.48                 0.09
Hybrid               Reduced    0.36                 0.08

Figure 4.8: Comparison of per-platform speech recognition energy use
This figure compares speech recognition energy usage on the Itsy v1.5 and ThinkPad 560X. Each row shows the average energy used to recognize an utterance on the two platforms with hardware power management enabled. Each data value is the geometric mean of energy usage, normalized to local energy usage at full fidelity, for the four utterances shown in Figures 4.5 and 4.7.
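To make concrete the point that the best execution location depends on the client platform, the following sketch picks the lowest-energy strategy from the normalized values in Figure 4.8. The table entries come from that figure; the function, its parameters, and the tie-breaking rule are hypothetical and are not Odyssey's actual decision policy, which Chapter 7 discusses.

```python
# Normalized energy from Figure 4.8: (location, fidelity) -> energy relative
# to local, full-fidelity recognition on the same platform.
NORMALIZED_ENERGY = {
    "ThinkPad 560X": {
        ("local", "full"): 1.00, ("local", "reduced"): 0.64,
        ("remote", "full"): 0.61, ("remote", "reduced"): 0.47,
        ("hybrid", "full"): 0.48, ("hybrid", "reduced"): 0.36,
    },
    "Itsy v1.5": {
        ("local", "full"): 1.00, ("local", "reduced"): 0.53,
        ("remote", "full"): 0.06, ("remote", "reduced"): 0.06,
        ("hybrid", "full"): 0.09, ("hybrid", "reduced"): 0.08,
    },
}


def cheapest_strategy(platform, connected=True, allow_reduced=True):
    """Hypothetical policy: return the lowest-energy feasible strategy.

    A disconnected client can only recognize locally. Ties in energy are
    broken in favor of full fidelity.
    """
    candidates = [
        (energy, fidelity != "full", location, fidelity)
        for (location, fidelity), energy in NORMALIZED_ENERGY[platform].items()
        if (connected or location == "local")
        and (allow_reduced or fidelity == "full")
    ]
    energy, _, location, fidelity = min(candidates)
    return location, fidelity, energy


for name in NORMALIZED_ENERGY:
    print(name, cheapest_strategy(name))
```

Under these assumptions the ThinkPad favors hybrid recognition with the reduced model, while the Itsy favors fully remote recognition, which matches the discussion above.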

Figure 4.9: Odyssey map viewer
[Diagram components: Viceroy, Anvil, Map Warden, Map Server]

4.6 Map viewer
4.6.1 Description
The third application that I measured was an adaptive map viewer named Anvil. As shown in Figure 4.9, Anvil fetches maps from a remote server via Odyssey and displays them on the client. Fidelity can be lowered in two ways: filtering and cropping. Filtering eliminates fine detail and less important features (such as secondary roads) from a map. Cropping preserves detail, but restricts data to a geographic subset of the original map. The client annotates the map request with the desired amount of filtering and cropping. The server performs any requested operations before transmitting the map data to the client.

4.6.2 Results
I measured the energy used by the client to fetch and display maps of four different cities. Viewing a map differs from the two previous applications in that a user typically needs a non-trivial amount of time to absorb the contents of a map after it has been displayed. This period, which I will refer to as think time, should logically be viewed as part of the application’s execution since energy is consumed to keep the map visible. In contrast, the user needs negligible time after the display of the last video frame or the recognition of an utterance to complete use of the video or speech application.

Think time is likely to depend on both the user and the map being displayed. My approach to handling this variability was to use an initial value of 5 seconds and then perform sensitivity analysis for think times of 0, 10 and 20 seconds. For brevity, Figure 4.10 only presents detailed results for the 5 second case; for other think times, I present only the summary information in Figure 4.11.

The baseline bars in Figure 4.10 show that most of the energy is consumed while the kernel executes the idle procedure; a significant portion of this can be attributed to background power usage during the five seconds of think time. The shadings on the bars indicate that network communication is a second significant drain on energy. The comparatively larger confidence intervals for this application result from variation in the time required to fetch a map over the wireless network.

Figure 4.10: Energy impact of fidelity for map viewing
[Bar chart: energy (Joules) for maps of San Jose, Allentown, Boston, and Pittsburgh. Bars per map: Baseline, Hardware-Only Power Mgmt., Minor Road Filter, Secondary Road Filter, Cropped, Cropped - Minor Road Filter, Cropped - Secondary Road Filter. Shadings: Idle, Anvil, X Server, Odyssey, WaveLAN, Kernel.]
This figure shows the energy used to view U.S.G.S. maps of four cities. For each map, the first bar shows energy usage without hardware power management or fidelity reduction, with a 5 second think time. The second bar shows the impact of hardware power management alone. The remaining bars show the additional savings realized by degrading map fidelity. The shadings within each bar detail energy usage by activity. Each measurement is the mean of ten trials; the error bars are 90% confidence intervals.

Hardware power management reduces energy consumption by about 9–19% relative to the baseline. Although there is little opportunity for network power management while the map is being fetched, the network can remain in standby mode during think time. Since the disk is never used, it can always remain in standby mode.

The third and fourth bars of each data set show the effect of fidelity reduction through two levels of filtering. One filter omits minor roads, while the more aggressive filter omits both minor and secondary roads. The savings from the minor road filter range from 6–51% relative to hardware-only power management. The corresponding figure for the secondary road filter is 23–55%.

The fifth bar of each data set shows the effect of lowering fidelity by cropping a map to half its original height and width. Energy usage at this fidelity is 14–49% less than with hardware-only power management. In other words, cropping is less effective than filtering for these particular maps. Combining cropping with filtering results in an energy savings of 36–66% relative to hardware-only power management, as shown by the rightmost bars of each data set. Relative to the baseline, this is a reduction of 46–70%. There is little savings left to be extracted through software optimization: almost all the energy is consumed in the idle state.

After examining energy usage with 5 seconds of think time, I repeated the above experiments with think times of 0, 10, and 20 seconds. At any given fidelity, energy usage, $E_t$, increases with think time, $t$. I expected a linear relationship: $E_t = E_0 + t \cdot P_B$, where $E_0$ is the energy usage for this fidelity at zero think time and $P_B$ is the background power consumption on the client (5.6 Watts from Figure 2.1). Figure 4.11 confirms that a linear model is indeed a good fit. This figure plots the energy usage for four different values of think time for three cases: baseline, hardware power management alone, and lowest fidelity combined with hardware power management. The divergent lines for the first two cases show that the energy reduction from hardware power management scales linearly with think time. The parallel lines for the second and third cases show that fidelity reduction achieves a constant benefit, independent of think time. The complementary nature of these two approaches is thus well illustrated by these measurements.

Figure 4.11: Effect of user think time for map viewing
[Plot: energy (Joules) versus think time (seconds), with linear models for Baseline, Hardware-Only Power Mgmt., and Lowest Fidelity.]
This figure shows how the energy used to view the San Jose map from Figure 4.10 varies with think time. The data points show measured energy usage. The solid, dashed and dotted lines represent linear models of energy usage for the baseline, hardware-only power management and lowest fidelity cases. The latter combines filtering and cropping, as in the rightmost bars of Figure 4.10. Each measurement is the mean of ten trials; the error bars are 90% confidence intervals.

Figure 4.12 shows map viewer energy use as a function of the undistilled map size for three fidelity levels: baseline, minor road filtering, and the combination of minor road filtering, secondary road filtering, and cropping. (The remaining fidelity levels are omitted for clarity.) For each fidelity, six data points show the total energy used to fetch and display different city maps, including the four from Figure 4.10. Since I have already shown that the energy cost of user think time is quite predictable, these results assume zero think time, and thereby isolate the variance in the energy cost of fetching and displaying maps.

Figure 4.12: Predicting map viewer energy use
[Plot: energy (Joules) versus undistilled map size (KB), with one linear fit per fidelity level: Hardware-Only Power Mgmt., Minor Road Filter, Combined.]
This figure shows the relationship between system energy use, data fidelity, and undistilled map size. For each level of data fidelity, six data points show the total energy used to view six maps (including the ones from Figure 4.10); the corresponding line represents the best linear fit through these points. All measurements were taken with hardware power management enabled. The error bars show 90% confidence intervals for energy use.

Visual inspection of the linear fits reveals that energy usage corresponds closely to image size only in the baseline case. Whereas the baseline has a coefficient of determination of greater than 99%, the minor road filter and combined fidelities have $R^2$ values of 36% and 69% respectively. Clearly, the undistilled map size does not remain a good predictor when map fidelity is reduced.

When simple models do not yield good predictions, it is possible that additional information will give more accurate results. In the specific example of the map viewer, the percentage of features omitted by a given filter may vary widely from map to map. However, if the map server were to store summary information with each map listing the occurrence of different feature types, one could anticipate the effectiveness of a filter. This would allow an adaptive system to determine the number of map features that would be fetched at various levels of filtering.

Figure 4.13: Predicting map viewer energy use by number of features
[Two plots of energy (Joules) versus number of map features, each with linear fits. The left plot shows Hardware-Only Power Mgmt. and Minor Road Filter; the right plot shows Combined.]
This figure shows the relationship between system energy use, data fidelity, and the number of map features. For each level of data fidelity, six data points show the total energy used to view the six maps from Figure 4.12; the corresponding line represents the best linear fit through these points. All measurements were taken with hardware power management enabled. The error bars show 90% confidence intervals for energy use.

Figure 4.13 shows the benefit of this additional information. It displays map viewer energy use as a function of fidelity and the number of features fetched. For clarity, the data is displayed in two graphs: the graph on the left shows the baseline and minor road filter fidelities, while the other graph shows the combined fidelity. Knowledge of the number of features that will be omitted by the minor road filter makes predictions much more accurate: the $R^2$ value for the minor road filter fidelity is now 99%. Unfortunately, although the linear fit for the combined fidelity improves, it is still quite poor with an $R^2$ value of 79%. Although the model can anticipate the effect of filtering, it is still unable to cope with variation in the number of features removed by cropping.
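The linear think-time model used earlier in this section (energy at think time t equals the zero-think-time energy plus t times the idle power) can be sketched as follows. Only the 5.6 W background power comes from the text. The zero-think-time energies and the lower idle power under hardware power management are hypothetical values, chosen to show why the hardware power management saving grows with think time while the fidelity saving stays constant.

```python
BACKGROUND_POWER_W = 5.6  # client background power quoted in the text


def energy_at_think_time(e0_joules, think_time_s, idle_power_w=BACKGROUND_POWER_W):
    """Linear model E_t = E_0 + t * P_B, with the result in Joules."""
    return e0_joules + think_time_s * idle_power_w


# Hypothetical parameters per case: (E_0 in Joules, idle power in Watts).
# These are illustrative assumptions, not measured values.
CASES = {
    "baseline": (50.0, BACKGROUND_POWER_W),
    "hardware-only": (48.0, 4.0),    # assumed lower idle power in standby
    "lowest fidelity": (20.0, 4.0),  # same idle power, smaller E_0
}

for t in (0, 5, 10, 20):
    e = {name: energy_at_think_time(e0, t, p) for name, (e0, p) in CASES.items()}
    hw_saving = e["baseline"] - e["hardware-only"]               # grows with t
    fidelity_saving = e["hardware-only"] - e["lowest fidelity"]  # constant in t
    print(f"t={t:2d} s  hw saving={hw_saving:5.1f} J  "
          f"fidelity saving={fidelity_saving:5.1f} J")
```

Because the two power-managed cases share a slope, their lines are parallel; the baseline has a steeper slope, so its line diverges from theirs as think time grows.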

4.7 Web browser
4.7.1 Description
The fourth application is an adaptive Web browser based on Netscape Navigator, as shown in Figure 4.14. In this application, Odyssey and a distillation server located on either side of a variable-quality network mediate access to Web servers. Requests from an unmodified Netscape browser are routed to a proxy on the client that interacts with Odyssey. After annotating the request with the desired level of fidelity, Odyssey forwards it to the distillation server, which transcodes images to lower fidelity using lossy JPEG compression. This is similar to the strategy described by Fox et al. [23], except that control of fidelity is at the client.

Figure 4.14: Odyssey Web browser
[Diagram components: Netscape, Proxy, Viceroy, Web Warden, Distillation Server, link to Web servers]

4.7.2 Results
As with the map application, a user needs some time after an image is displayed to absorb its contents. I therefore include energy consumed during user think time as part of the application. I use a baseline value of 5 seconds and perform sensitivity analysis for 0, 10, and 20 seconds. Figure 4.15 presents measurements of the energy used to fetch and display four GIF images of varying sizes. Hardware power management achieves reductions of 22–26%. The shadings on the ?rst and second bars of each data set indicate that most of this savings occurs in the idle state, probably during think time. The energy bene?ts of ?delity reduction are disappointing. As Figure 4.15 shows, the energy used at the lowest ?delity is merely 4–14% lower than with hardware power management alone; relative to baseline, this is a reduction of 29–34%. The maximum bene?t of ?delity reduction is severely limited because the relative amount of energy used during think time (28 Watts) is much greater than the energy used to fetch and display an image, (2–16 Watts). Thus, even if ?delity reduction could completely eliminate the energy used to fetch and display an image, energy use would drop only 9–36%. Although Web ?delity reduction shows little bene?t in this study, it may be quite useful in other environments. For example, a more energy-ef?cient mobile device would use less energy during think time, increasing the possible bene?t of ?delity reduction. Similarly, if a high-speed network were unavailable, the energy bene?t of distillation would increase. The effect of varying think time is shown in Figure 4.16. The linear model introduced

4.7. WEB BROWSER
Idle Netscape Cellophane X Server Odyssey WaveLAN Kernel

49

60

Energy (Joules)

40

20

0

Image 1

JP G-50 E JP G-25 EG -5

eli Ha ne Po rdw we are JP r M -O E JP G-75 gmt. nly E

Ba

s

This figure shows the energy used to display four GIF images from 110 B to 175 KB in size, ordered from right to left above. For each image, the first data bar shows energy consumption at highest fidelity without hardware power management, assuming a think time of five seconds. The second bar shows the impact of hardware power management alone. The remaining bars show energy usage as fidelity is lowered through increasingly aggressive lossy JPEG compression. The shadings within each bar detail energy usage by activity. Each measurement is the mean of ten trials; the error bars show 90% confidence intervals.

Figure 4.15: Energy impact of fidelity for Web browsing

in Section 4.6.2 fits observations well for all three cases: baseline, hardware-only power management, and lowest fidelity. The close spacing of the lines for the two latter cases reflects the small energy savings available through fidelity reduction. The divergence of the lines for the first two cases shows the importance of hardware power management during think time.

[Plot: energy (Joules) versus think time (seconds); series: Baseline, Hardware-Only Power Mgmt., Lowest Fidelity]

This figure shows how the energy used to display Image 1 from Figure 4.15 varies with user think time. The data points on the graph show measured energy usage for user think times of 0, 5, 10, and 20 seconds. The solid, dashed and dotted lines represent linear models of energy consumption for the baseline, hardware-only power management, and lowest fidelity cases. Each measurement represents the mean of ten trials; the error bars are 90% confidence intervals.

Figure 4.16: Effect of user think time for Web browsing

Figure 4.17 shows Web browser energy use as a function of the undistilled image size for three fidelity levels: baseline, JPEG 50, and JPEG 5. For each fidelity, seven data points show the total energy used to fetch and display different images, including the four from Figure 4.15. As with the map application, these results include no user think-time, thereby isolating the variation in energy used to fetch and display images. For this application, simple linear models yield reasonable predictions: the R² values for baseline, JPEG 50, and JPEG 5 are 91%, 93%, and 98% respectively.

A related study of Netscape energy usage [64] looks at the relationship between energy use and image size in more detail. It attributes some of the variation in energy use to the Netscape application, speculating that the scheduling behavior of Netscape’s user-level threading package leads to non-deterministic effects. The study also finds that different images vary slightly in the amount of compression achieved for equivalent levels of JPEG quality.
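The linear models used for think time and image size can be obtained with an ordinary least-squares fit. The following Python sketch is illustrative only: the function name and the sample points are hypothetical and are not measurements from this study. For the think-time model, the slope has units of Joules per second, so it can be read as the average power drawn while the user is thinking.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical data: think time in seconds, energy in Joules.
think_times = [0.0, 5.0, 10.0, 20.0]
energies = [12.0, 47.0, 82.0, 152.0]
slope, intercept, r_squared = fit_line(think_times, energies)
print(f"energy = {intercept:.1f} J + {slope:.1f} W * think_time, R^2 = {r_squared:.2f}")
```

The same function applies to the image-size models by passing undistilled image sizes as xs.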

4.8 Effect of concurrency
How does concurrent execution affect energy usage? One can imagine situations in which total energy usage goes down when two applications execute concurrently rather than sequentially. For example, once the screen has been turned on for one application, no additional energy is required to keep it on for the second. One can also envision situations in which concurrent applications interfere with each other in ways that increase energy usage. For example, if physical memory size is inadequate to accommodate the working sets of two applications, their concurrent execution will trigger higher paging activity, possibly leading to increased energy usage. Clearly, the impact of concurrency can vary depending on the applications, their interleaving, and the machine on which they run.

[Plot: energy (Joules) versus undistilled image size (KB); series: Hardware-Only Power Mgmt., JPEG 50, JPEG 5]

Figure 4.17: Predicting Web browser energy use

This figure shows the relationship between system energy use, data fidelity, and undistilled image size. For each level of data fidelity, six data points show the total energy used to fetch and display different GIF images (including the ones from Figure 4.15); the corresponding line represents the best linear fit through these points. All measurements were taken with hardware power management enabled. The error bars show 90% confidence intervals for energy use.

What is the effect of lowering fidelity? The measurements reported in Sections 4.4 to 4.7 indicate that lowering fidelity tends to increase the fraction of energy consumption attributable to the idle state. Concurrency allows background energy consumption to be amortized across applications. It is therefore possible in some cases for concurrency to enhance the benefit of lowering fidelity.

To confirm this intuition, I compared the energy usage of a composite application when executing in isolation and when executing concurrently with the video application described in Section 4.4. The composite application consists of six iterations of a loop that involves the speech, Web, and map applications described in Sections 4.5 to 4.7. The loop consists of local recognition of two speech utterances, access of a Web page, access of a map, and five seconds of think time. The composite application models a user searching for Web and map information using speech commands, while the video application models a background newsfeed. This experiment takes between 80 and 160 seconds.

Figure 4.18 presents the results of the experiments for three cases: baseline, hardware-only power management, and minimal fidelity. In the first two cases, all applications ran at full fidelity; in the third case, all ran at lowest fidelity. For each data set, the left bar shows energy usage for the composite application in isolation, while the right bar shows energy usage during concurrent execution. For the baseline case, the addition of the video application consumes 53% more energy. But with hardware power management, it consumes 64% more energy. This difference is due to the fact that concurrency reduces opportunities for powering down the network and disk. For the minimum fidelity case, the second application only adds 18% more energy. The significant background power usage of the client, which limits the effectiveness of lowering fidelity, is amortized by the second application. In other words, for this workload, concurrency does indeed enhance the energy impact of lowering fidelity.

[Stacked bar chart: energy (Joules) for the Baseline, Hardware-Only Power Mgmt., and Lowest Fidelity cases; bar segments: Idle, Video, Speech, Map, Web, X Server, Odyssey, WaveLAN, Other]

Each data set in this figure compares energy usage for the composite application described in Section 4.8 in isolation (left bar), with total energy usage when a video application runs concurrently (right bar). Each measurement is the mean of five trials; the error bars are 90% confidence intervals.

Figure 4.18: Effect of concurrent applications

Figure 4.19 provides more detail by showing background and dynamic energy use for each application. As described in Section 2.1, I define background energy as the amount of energy that would have been used if the computer had remained idle instead of executing the application. Background energy is application-independent; it represents the cost of operating hardware components in their lowest power states; for example, keeping the display backlit and the disk in standby mode. I define dynamic energy use to be the amount of energy consumed by an application above and beyond its background energy use. Thus, dynamic energy use captures the application-specific component of the energy usage.

            Composite Only Energy (J)     Video Only Energy (J)         Concurrent Energy (J)
Scenario    Bkgd.     Dyn.     Total      Bkgd.     Dyn.     Total      Bkgd.     Dyn.     Total
Baseline    819.9     261.5    1081.4     1148.1    260.5    1408.6     1148.1    503.8    1651.9
Hardware    622.0     320.6     942.6      841.4    435.5    1276.9      841.4    708.4    1549.8
Lowest      471.5     192.6     664.0      505.3     50.1     555.4      505.3    279.2     784.5

This table displays the background and dynamic energy usage for the results shown in Figure 4.18. It shows energy usage for the composite application and background video feed in isolation, and then the energy usage when the two applications execute concurrently. The three rows show energy use at full fidelity with hardware power management disabled, at full fidelity with power management enabled, and at lowest fidelity with power management enabled.

Figure 4.19: Background and dynamic energy use for concurrent applications

As Figure 4.19 confirms, dynamic energy use is a much better metric than total energy use when projecting the effect of executing two applications concurrently. The dynamic energy use when the two applications are executed concurrently is roughly equivalent to the sum of each application’s dynamic energy use when executed independently. Of course, the correlation is not perfect: the actual combined dynamic energy use varies 4–13% from the sum of the individual dynamic energy usages. This variation can be attributed to several factors, such as the ability to amortize transition costs for hardware components (e.g., spinning up the disk) across multiple applications, and memory effects when the working sets of all applications do not fit in physical memory.

Unfortunately, one would often like to project total rather than dynamic energy usage. Although background power levels are easily determined for a hardware platform, the calculation of background energy requires one to project how long an application will execute. The concurrent execution latency will often depend upon contention for shared resources such as CPU, network, and disk. Therefore, accurately projecting total concurrent energy usage requires knowledge of availability and application use of these shared resources.
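The percentages quoted in this section can be recomputed from Figure 4.19. The Python sketch below is a worked check, not part of the original measurement tooling. It assumes that the discrepancy between predicted and measured dynamic energy is normalized by the measured concurrent value, a choice that reproduces the 4–13% range, and that totals are computed as background plus dynamic energy.

```python
# (background J, dynamic J) per scenario, copied from Figure 4.19.
composite = {"Baseline": (819.9, 261.5), "Hardware": (622.0, 320.6), "Lowest": (471.5, 192.6)}
video = {"Baseline": (1148.1, 260.5), "Hardware": (841.4, 435.5), "Lowest": (505.3, 50.1)}
concurrent = {"Baseline": (1148.1, 503.8), "Hardware": (841.4, 708.4), "Lowest": (505.3, 279.2)}


def total(entry):
    """Total energy is background plus dynamic energy."""
    background, dynamic = entry
    return background + dynamic


results = {}
for scenario in composite:
    # Prediction: concurrent dynamic energy is the sum of the individual values.
    predicted_dyn = composite[scenario][1] + video[scenario][1]
    measured_dyn = concurrent[scenario][1]
    dyn_error = abs(measured_dyn - predicted_dyn) / measured_dyn
    # Extra total energy caused by adding the video application.
    added = total(concurrent[scenario]) / total(composite[scenario]) - 1.0
    results[scenario] = (dyn_error, added)
    print(f"{scenario}: dynamic prediction off by {dyn_error:.1%}, "
          f"video adds {added:.1%} total energy")
```

Summing total energies instead (1081.4 J + 1408.6 J in the baseline case) would overestimate the measured 1651.9 J by about 50%, because background energy would be counted twice.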

4.9 Summary
My primary goal in performing this study was to determine whether lowering data fidelity yields significant energy savings. The results of Sections 4.4 to 4.7 confirm that such savings are indeed available over a broad range of applications relevant to mobile computing. Further, those results show that lowering fidelity can be effectively combined with hardware power management. Section 4.8 extends these results by showing that concurrency can magnify the benefits of lowering fidelity.

In addition, the results of Section 4.5 show that remote resources can sometimes be profitably employed to reduce the energy usage of a mobile client. However, the results also provide a caution: performing computation remotely does not always lead to reduced client energy use. The hybrid mode of speech recognition uses less energy than the fully remote mode on the ThinkPad because the energy cost of performing a small amount of computation locally is outweighed by the benefit of decreased energy usage for network transmission.

Application   Think Time (s)   Baseline   Hardware Power Mgmt.   Fidelity Reduction   Combined
Video         N/A              1.00       0.90–0.91              0.84–0.84            0.65–0.65
Speech        N/A              1.00       0.66–0.67              0.22–0.36            0.20–0.31
Map           0                1.00       0.80–1.01              0.06–0.13            0.07–0.18
Map           5                1.00       0.81–0.91              0.38–0.67            0.31–0.54
Map           10               1.00       0.74–0.84              0.53–0.77            0.42–0.58
Map           20               1.00       0.76–0.78              0.69–0.89            0.51–0.67
Web           0                1.00       0.85–1.06              0.40–0.75            0.32–0.54
Web           5                1.00       0.74–0.78              0.88–0.97            0.66–0.71
Web           10               1.00       0.75–0.78              0.93–0.98            0.70–0.74
Web           20               1.00       0.74–0.77              0.96–0.99            0.72–0.73

This table summarizes the impact of data fidelity on application energy consumption. Each entry shows the minimum and maximum measured energy consumption on the IBM ThinkPad 560X for four data objects. The entries are normalized to baseline measurements of full fidelity objects with no power management. This data was extracted from Figures 4.2, 4.5, 4.10, and 4.15.

Figure 4.20: Summary of the energy impact of fidelity

The study also reveals that it is often possible to predict the energy impact of fidelity reduction. For the video, speech, and Web applications, simple linear models based on fidelity and input data size provide good fits for application energy use. This means that an adaptive system could observe an application execute on several data objects at a given fidelity level, construct a simple model, and use that model to project how much energy will be used when the application operates on new objects at that fidelity level.

The map application shows that achieving accurate energy predictions sometimes requires more work. Filtering based upon feature type yields considerable variation in energy reduction. This variation can be accurately modeled, however, if the server stores a summary with each map listing the number of occurrences of each feature type. Cropping introduces still more variation in energy use. Thus, the map application confirms that there can often be a tradeoff between complexity and accuracy in predicting energy use.
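The observe, model, and project cycle described above could be sketched as follows in Python. This is a hypothetical illustration rather than code from Odyssey or from this study: the class name, its methods, and the sample observations are all assumptions. It keeps one linear model per fidelity level and updates running sums as each observation arrives, so no history needs to be stored.

```python
class EnergyPredictor:
    """Per-fidelity linear model: energy = intercept + slope * input_size."""

    def __init__(self):
        # fidelity -> [count, sum_x, sum_y, sum_xx, sum_xy]
        self._sums = {}

    def observe(self, fidelity, input_size, energy_joules):
        """Record one measured execution at the given fidelity level."""
        s = self._sums.setdefault(fidelity, [0, 0.0, 0.0, 0.0, 0.0])
        s[0] += 1
        s[1] += input_size
        s[2] += energy_joules
        s[3] += input_size * input_size
        s[4] += input_size * energy_joules

    def predict(self, fidelity, input_size):
        """Project energy for a new object, or None if the model is not ready."""
        s = self._sums.get(fidelity)
        if s is None or s[0] < 2:
            return None
        n, sx, sy, sxx, sxy = s
        denom = n * sxx - sx * sx
        if denom == 0:
            return None  # all observed sizes identical; slope is undefined
        slope = (n * sxy - sx * sy) / denom
        intercept = (sy - slope * sx) / n
        return intercept + slope * input_size


# Hypothetical observations: (image size in KB, energy in Joules) at one fidelity.
predictor = EnergyPredictor()
for size_kb, joules in [(10, 3.0), (50, 7.0), (150, 17.0)]:
    predictor.observe("JPEG 50", size_kb, joules)
print(predictor.predict("JPEG 50", 100))
```

A predictor for the map application would need additional inputs, such as the per-feature-type counts mentioned above, which is where the tradeoff between complexity and accuracy appears.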


At the next level of detail, Figure 4.20 summarizes the results of Sections 4.4 to 4.7. For clarity, the data in each row is normalized to the baseline values. The key messages of Figure 4.20 are:

• There is significant variation in the effectiveness of fidelity reduction across data objects. The reduction can span a range as broad as 29% (0.38–0.67 for the map viewer, at think time 5). The video player is the only application that shows little variation across data objects. As mentioned above, this variation is often quite predictable with the use of simple linear models.

• There is considerable variation in the effectiveness of fidelity reduction across applications. Holding think time constant at 5 seconds and averaging across data objects, the energy usage for the four applications at lowest fidelity is 0.84, 0.28, 0.51 and 0.93 relative to their baseline values. The mean is 0.64, corresponding to an average savings of 36%.

• Combining hardware power management with lowered fidelity can sometimes reduce energy usage below the product of the individual reductions. This is seen most easily in the case of the video application, where the last column is 0.65 rather than the expected value of 0.76, obtained by multiplying 0.9 and 0.84. Intuitively, this is because reducing fidelity decreases hardware utilization, thereby increasing the opportunity for hardware power management.
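The arithmetic behind the last two messages can be checked directly. The short Python sketch below uses only the normalized values quoted above; the variable names are illustrative.

```python
# Normalized energy at lowest fidelity (think time 5 s where applicable).
lowest_fidelity = {"video": 0.84, "speech": 0.28, "map": 0.51, "web": 0.93}
mean_usage = sum(lowest_fidelity.values()) / len(lowest_fidelity)
average_savings = 1.0 - mean_usage

# Video application: if the two techniques acted independently, the combined
# normalized energy would be the product of the individual values.
hardware_only, fidelity_only, combined = 0.90, 0.84, 0.65
product = hardware_only * fidelity_only

print(f"mean usage {mean_usage:.2f}, average savings {average_savings:.0%}")
print(f"independent product {product:.2f} versus measured {combined:.2f}")
```

The measured combined value falls below the product, which is the sense in which the two techniques reinforce each other for the video application.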


Chapter 5 A proxy approach for closed-source environments
The previous chapter showed that energy-aware applications can often significantly extend the battery lifetimes of the laptop computers on which they operate by trading fidelity for reduced energy usage. However, since only multimedia, source-code available applications running on the Linux operating system were studied, it is not clear whether energy-aware adaptation can help the Windows office applications that users commonly run on laptop computers. Thus, several important questions remain: Can one show significant energy reductions for office applications commonly executed on laptop computers? Is it possible to add energy-awareness without access to application source code? Is this approach valid for applications executing on source-code unavailable operating systems (i.e. Windows)? In this chapter, I will answer these questions by studying the potential benefits of energy-aware adaptation for Microsoft’s popular PowerPoint 2000 application.

5.1 Overview
As with many office applications, PowerPoint enables users to include increasing amounts of rich multimedia content in their documents, for example, charts, graphs, and images. Since these objects tend to be quite large, the processor, network, and disk activity needed to manipulate them accounts for significant energy expenditure. Yet, when editing a presentation, a user may only need to modify and view a small subset of these objects. Thus, it may be possible to significantly reduce PowerPoint energy consumption by presenting the user with a distilled version of a presentation: one which contains only the information that the user is interested in viewing or editing.

One can use Puppeteer [14], component-based middleware developed by Rice University, to perform such distillation. Puppeteer takes advantage of well-defined interfaces exported by applications to change behavior without source code modification. Using Puppeteer, one can create a distilled version of a presentation which initially omits all multimedia content not on the first or master slide. Placeholders are inserted into the document to represent omitted objects. Later, if the user wishes to edit or view a component, she may click on the placeholder, and Puppeteer will load the component and dynamically insert it into the document. Thus, when the user edits only a subset of a document, the potential for significant energy savings exists.

[Diagram of four tiers. Tier 1: the applications on Client 1 and Client 2. Tier 2: a Puppeteer client proxy for each application, attached through the application's API. Tier 3: the Puppeteer server proxy, connected to the client proxies by low-bandwidth links. Tier 4: Data Servers 1, 2, and 3, connected to the server proxy by high-bandwidth links.]

Figure 5.1: Puppeteer architecture

In the remainder of this chapter, I explore the feasibility of adding energy-aware adaptation to PowerPoint. The next section provides a more detailed description of Puppeteer, and Section 5.3 describes my energy measurement methodology for the Windows platform. Section 5.4 measures the impact of energy-aware adaptation while loading, editing, and saving PowerPoint documents, and Section 5.5 summarizes the results.

5.2 Puppeteer
Puppeteer adapts the behavior of component-based applications, such as Microsoft’s PowerPoint and Internet Explorer, in response to variation in resource availability in mobile environments. While Puppeteer’s design goals are similar to Odyssey’s, its implementation is quite different. It uses the exported APIs of applications and the structured nature of the documents they manipulate to change application behavior without source-code modification. Puppeteer thus acts as a proxy for an application. Puppeteer currently supports two forms of adaptation: subsetting and versioning. Subsetting loads only a portion of the elements of a document. Versioning loads different versions of some elements, for example low-resolution distillations of images.


Commercial applications such as Microsoft’s Office suite are ideally suited for Puppeteer. These applications have well-defined document structures, allowing Puppeteer to parse documents to implement specific policies such as degrading all images of a certain size. Further, such applications have rich external interfaces that enable Puppeteer to incrementally modify loaded documents. For example, Puppeteer can insert additional data or higher-resolution versions of images using such APIs.

Figure 5.1 shows the Puppeteer architecture. There are four tiers: applications, Puppeteer client proxies, Puppeteer server proxies, and data servers. Data servers are arbitrary repositories of data such as Web servers, file servers, or databases. All communication between applications and data servers passes through the Puppeteer client and server proxies. Applications and data servers remain completely unmodified; adaptation is performed by client and server proxies working in concert.

Puppeteer adjusts to reduced bandwidth availability between a mobile client and data servers by loading degraded versions of documents. Application-specific policies determine which subset of components should be fetched and which should be degraded. Puppeteer parses the document to uncover the structure of the data, fetches selected components at specific fidelity levels, and updates the application with the newly fetched data. Once a degraded document version is loaded, the application returns control to the user. Although the application believes that it has loaded the original, full-quality document, Puppeteer maintains a list of degraded and omitted components. While editing or viewing a document, a user may double-click on a degraded component or on a placeholder for an omitted component. Puppeteer then fetches the component at its highest quality and inserts it into the document using the application’s external programming interface.

Puppeteer is currently being modified to allow users to save changes made to degraded versions of documents. Since Puppeteer understands document structure, it can parse the modified version on the client and send to the server only those components which have been modified. The server proxy can then merge the modified components into the full-quality version and save the resulting document.
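The subsetting policy used later in this chapter, which omits multimedia content not on the first or master slide and fetches it when the user clicks a placeholder, can be sketched in a few lines of Python. This sketch does not use Puppeteer's actual interfaces: the Component and DistilledDocument types, the field names, and the convention that slide 0 denotes the master slide are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    slide: int           # assumed convention: 0 = master slide, 1 = first slide
    is_multimedia: bool


@dataclass
class DistilledDocument:
    loaded: list = field(default_factory=list)
    omitted: dict = field(default_factory=dict)  # placeholder id -> Component

    def fetch_on_demand(self, placeholder_id):
        """Models a click on a placeholder: load the omitted component."""
        component = self.omitted.pop(placeholder_id)
        self.loaded.append(component)
        return component


def distill(components):
    """Omit multimedia components that are not on the first or master slide."""
    doc = DistilledDocument()
    for c in components:
        if c.is_multimedia and c.slide not in (0, 1):
            doc.omitted["placeholder:" + c.name] = c
        else:
            doc.loaded.append(c)
    return doc


# Hypothetical presentation contents.
doc = distill([
    Component("logo", 0, True),
    Component("title-text", 1, False),
    Component("chart", 3, True),
    Component("photo", 5, True),
])
doc.fetch_on_demand("placeholder:chart")
```

A versioning policy would differ only in what it stores for each omitted component: a degraded rendition instead of a placeholder.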

5.3 Measurement methodology
All measurements reported in this chapter were collected in a client-server environment similar to the one used in the previous chapter. PowerPoint executes on the client: an IBM 560X laptop, as described in Section 2.2.1. The server is a 400 MHz Pentium II desktop with 128 MB of memory. Both machines run the Windows NT 4.0 operating system. The machines communicate using a 2 Mb/s 2.5 GHz Lucent WaveLAN network. I added an additional 32 MB of memory to the client, bringing the total to 96 MB; this was necessary to run PowerPoint, Puppeteer, and Windows NT without thrashing.

Since PowerPoint source code is unavailable, it made little sense to port PowerScope to Windows to perform these measurements. Instead, I simply measured the amount of total energy consumed by the laptop between two points in program execution. The hardware setup used for Windows measurements is identical to that used for PowerScope measurements. I attached the probes of the HP3458a digital multimeter to the external power input of the client laptop and removed the battery to eliminate the effects of charging. I also connected the output pin of the client’s parallel port to the multimeter’s external trigger.

Presentation   Document Identifier   Full-Quality Size (MB)   Distilled Size (MB)   Ratio
A              24666                 15.02                    1.67                  0.11
B              24758                 11.42                    0.47                  0.04
C              24890                  7.26                    0.83                  0.11
D              26295                  3.11                    3.11                  1.00
E              26141                  2.23                    1.32                  0.59
F              24773                  1.72                    0.11                  0.07
G              25189                  1.07                    0.36                  0.34
H              26388                  0.87                    0.75                  0.86
I              26140                  0.20                    0.20                  1.00
J              25132                  0.08                    0.08                  1.00

Figure 5.2: Sizes of sample presentations

I created a dynamic library that allows applications to precisely indicate the start and end of each measurement. An application first calls the start_measuring function, which records the current time and toggles the parallel port pin. Once the pin is toggled, the multimeter samples current levels at its maximum rate of 1357.5 times per second. When the event being measured completes, the application calls the stop_measuring function, which returns the elapsed time since the start_measuring function was called.

To calculate total energy usage, I first derive the number of samples, n, that were taken before stop_measuring was called by multiplying the elapsed measurement time by the sample rate. The mean of the first n samples is the average current level. Multiplying this value by the measured voltage for the laptop power supply yields the average power usage. This is multiplied by the elapsed time to calculate total energy usage.

I assume aggressive power management policies. All measurements were taken using a disk-spindown threshold of 30 seconds (the minimum allowed by Windows NT). The wireless network uses standard 802.11 power management. However, the display is not disabled during measurements since PowerPoint is interactive.
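The energy calculation described in this section can be written out as a short Python sketch. Only the sampling rate comes from the text; the function name, the sample values, and the 16 V supply voltage in the usage example are hypothetical.

```python
SAMPLE_RATE_HZ = 1357.5  # maximum multimeter sampling rate stated in the text


def total_energy_joules(current_samples_amps, elapsed_seconds, supply_volts):
    """Energy between the start and stop calls of one measurement.

    n = elapsed time * sample rate; the mean of the first n current samples
    times the supply voltage gives average power; power * time gives energy.
    """
    n = int(elapsed_seconds * SAMPLE_RATE_HZ)
    n = min(n, len(current_samples_amps))  # guard against a short capture
    if n == 0:
        raise ValueError("no samples fall inside the measurement window")
    mean_current = sum(current_samples_amps[:n]) / n
    mean_power_watts = mean_current * supply_volts
    return mean_power_watts * elapsed_seconds


# Hypothetical capture: a constant 0.5 A draw, measured for 2 seconds at 16 V.
samples = [0.5] * 3000  # the multimeter kept sampling after the stop call
energy = total_energy_joules(samples, elapsed_seconds=2.0, supply_volts=16.0)
print(f"{energy:.1f} J")
```

Truncating to the first n samples matters because the multimeter keeps sampling after the measured event ends; including the later samples would bias the mean toward idle current.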

5.4 Benefits of PowerPoint adaptation
5.4.1 Loading presentations
I first examined the potential benefit of loading distilled PowerPoint presentations. I measured the energy used to fetch presentations from the remote server over the wireless network and render them on the client.

[Bar chart: total energy (Joules) for presentations A through J; bars: native, full, distilled]

This figure shows the energy used to load ten PowerPoint presentations from a remote server. Native mode loads presentations from an Apache Web server. Full and distilled modes load presentations from a Puppeteer server, with full mode loading the entire presentation and distilled mode loading a lower-quality version. Each bar represents the mean of five trials; 90% confidence intervals are sufficiently small that they would not be visible on the graph.

Figure 5.3: Energy used to load presentations

I chose a sample set of documents from a database of 1900 PowerPoint presentations gathered from the Web as described by de Lara et al. [13]. From the database, I selected ten documents relatively evenly distributed in size. Figure 5.2 shows the sizes of these documents, as well as the size reductions achieved by distillation. For experimental repeatability, it also lists an identifier that uniquely identifies each document within the presentation database. The specific distillation policy used in these experiments omits all multimedia data that is not contained on either the first or master slide. As might be expected, larger documents tend to have the most multimedia content, although there is considerable variation in the data. For three documents (D, I, and J), distillation does not reduce document size at all.

For each document, I first measured the energy used by PowerPoint to load the presentation from a remote server (I will refer to this as “native mode”). In this case, the document is loaded from an Apache Web server. I also investigated the cost of loading documents from a remote NT file system but found that the latency and energy expenditure was significantly greater than using the Web server.

I then measured energy used by PowerPoint when the document was loaded from the same server using Puppeteer. In this case, the document is served from a Puppeteer server proxy. I measured two modes of operation: “full” mode, in which the entire document is loaded, and “distilled” mode, in which a reduced-quality version is loaded. Figure 5.3 shows the total energy used to fetch the documents using native, full, and distilled modes. In Figure 5.4, I show the relative impact for each document by normalizing each value to the energy used by native mode.

[Bar chart: normalized energy for presentations A through J; bars: native, full, distilled]

This figure shows the relative energy used to load ten PowerPoint presentations from a remote server, as described in Figure 5.3. For each data set, results are normalized to the amount of energy used to load the document in native mode.

Figure 5.4: Normalized energy used to load presentations

The energy savings achieved by distillation vary widely. Loading a distilled version of document A uses only 13% as much energy as native mode, while distilling document J uses 137% more energy. On average, loading a distilled version of a document uses 60% of the energy used by native mode.

It is interesting to note that full mode can sometimes use less energy to fetch a document than native mode. This is because fetching a presentation with Puppeteer tends to use less power than native mode. Thus, even though native mode takes less time to fetch a document, its total energy usage can sometimes be greater. Without source code, it is impossible to know for certain why Puppeteer power usage is lower than native mode. One possibility is more efficient scheduling of network transmissions.

The results in Figures 5.3 and 5.4 show that while most documents benefit from distillation, some suffer an energy penalty. This indicates that Puppeteer might profit by predicting whether a document will benefit from distillation. If it predicts that a document will benefit, it could distill and fetch it; otherwise, it could forego distillation and fetch the entire document.

[Bar chart: total energy (Joules) for presentations A, B, C, E, F, G, H, I, and J; bars: full-quality, distilled]

This figure shows the amount of energy needed to page through a presentation. For each data set, the left bar shows energy use for a full-quality presentation, and the right bar shows energy use for a reduced-quality version of the same presentation. Document D is omitted from this experiment since it contains only a single slide. Each bar represents the mean of five trials; the error bars show 90% confidence intervals.

Figure 5.5: Energy used to page through presentations

One possible prediction method is to distill only presentations that have a size greater than a fixed threshold, reasoning that small documents are unlikely to contain significant multimedia content. Analysis of the documents used in this study suggests that a reasonable threshold is 0.5 MB. A strategy of distilling only presentations larger than 0.5 MB uses 52% of the energy of native mode to load the ten documents.

Another possible prediction method is to distill only those documents with a percentage of multimedia content greater than a threshold value. As shown in Figure 5.2, distillation does not reduce the sizes of three documents. If Puppeteer does not distill these documents, it uses only 51% of the energy of native mode to fetch the ten documents.
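Both prediction methods can be applied to the documents in Figure 5.2. In the Python sketch below, the sizes come from that figure and the 0.5 MB threshold from the text. The function names are hypothetical, and the second policy is an assumption: it uses the ratio of distilled size to full-quality size as a stand-in for the percentage of multimedia content, with an arbitrary cutoff of 0.99.

```python
# presentation -> (full-quality size in MB, distilled size in MB), from Figure 5.2
presentations = {
    "A": (15.02, 1.67), "B": (11.42, 0.47), "C": (7.26, 0.83), "D": (3.11, 3.11),
    "E": (2.23, 1.32), "F": (1.72, 0.11), "G": (1.07, 0.36), "H": (0.87, 0.75),
    "I": (0.20, 0.20), "J": (0.08, 0.08),
}


def distill_by_size(full_mb, threshold_mb=0.5):
    """Policy 1: distill only documents larger than a fixed size threshold."""
    return full_mb > threshold_mb


def distill_by_ratio(full_mb, distilled_mb, max_ratio=0.99):
    """Policy 2 (assumed proxy): distill only if distillation shrinks the file."""
    return distilled_mb / full_mb <= max_ratio


skipped_by_size = [name for name, (full, _) in presentations.items()
                   if not distill_by_size(full)]
skipped_by_ratio = [name for name, (full, small) in presentations.items()
                    if not distill_by_ratio(full, small)]
print("size policy fetches in full:", skipped_by_size)
print("ratio policy fetches in full:", skipped_by_ratio)
```

The two policies differ only on document D, which is large but contains nothing that the distillation policy removes. The size policy needs nothing beyond the file size, whereas the ratio policy requires the server proxy to parse the document first.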

5.4.2 Editing presentations
I next measured how document distillation affects the energy needed to edit a presentation. While it is somewhat intuitive that loading a smaller, distilled version of a document can require less energy, it is less clear that distillation also reduces energy usage while the document is being displayed or edited. Naturally, energy usage depends upon which activities a user performs. While a definitive measurement of potential savings requires a detailed analysis of user behavior, one can estimate such savings by looking at the energy used to perform common activities.

[Bar chart: total energy (Joules) for presentations A, B, C, E, F, G, H, I, and J; bars: full-quality 1st pass, full-quality 2nd pass, distilled 1st pass, distilled 2nd pass]

This figure shows the amount of energy needed to page through a presentation a second time. The energy needed to page through each presentation the first time is also shown for comparison. For each data set, the left two bars show energy use for a full-quality presentation, and the right two bars show energy use for a reduced-quality version. Document D is omitted from this experiment since it contains only a single slide. Each bar represents the mean of five trials; the error bars show 90% confidence intervals.

Figure 5.6: Energy used to re-page through presentations

One very common activity is paging through the slides in a presentation. I created a Visual Basic program to simulate the actions of a user performing this activity. The program loads the first slide, then sends PageDown keystrokes to PowerPoint until all remaining slides have been displayed. After sending each keystroke, the program waits for the new slide to render, then pauses for a second to simulate user think-time. I measured the energy used to page through both the full-quality and distilled versions of each document. Figure 5.5 presents these results for nine of the ten presentations in the sample set; presentation D is omitted because it contains only a single slide.

As shown by the difference in height between each pair of bars in Figure 5.5, distilling a document with large amounts of multimedia content can significantly reduce the energy needed to page through the document. Energy savings range from 1% to 30%, with an average of 13%.

After PowerPoint displays a slide, it appears to cache data allowing it to quickly rerender the slide, thereby reducing the energy needed for redisplay. This effect is shown in Figure 5.6, which displays the energy used to page through each document a second time. For ease of comparison, Figure 5.6 also shows the energy needed to page through each document the first time. Comparing the heights of corresponding bars shows that subsequent slide renderings use less energy than the initial renderings. Thus, the benefit of distillation is smaller on subsequent traversals of the document: ranging from negligible to 20% with an average value of 5%.

[Two bar charts of total energy (Joules): (a) no interval, (b) 100 ms. interval; bars: none, auto-correct, spell-checking, grammar-checking, office assistant]

This figure shows the amount of energy needed to perform background activities during text entry. The graph on the left shows energy use when text is entered without pause, and the graph on the right shows energy use with a 100 ms. pause between characters. Each bar shows the cumulative effect of performing background activities, so the leftmost bar in each graph was measured while no background activities were being performed, and the rightmost bar in each graph was measured while all background activities were being performed. Each bar represents the mean of five trials; the error bars show 90% confidence intervals.

Figure 5.7: Energy used by background activities during text entry

5.4.3 Background activities
I next measured the energy used to perform background activities such as auto-correction and spell-checking. Whenever a user enters text, PowerPoint may perform background processing to analyze the input and offer advice and corrections to the user. When battery levels are critical, such background processing could be disabled to extend battery lifetime. I measured the effect of auto-correction, spell-checking, style-checking, and providing advice through the Office Assistant (paperclip).

I created a Visual Basic program which enters a fixed amount of text on a blank slide. The program sends keystrokes to PowerPoint, pausing for a specified amount of time between each keystroke. Figure 5.7(a) shows the energy used to enter text with no pause between keystrokes; Figure 5.7(b) shows energy usage with a 100 ms. pause between keystrokes. I first measured energy usage with no background activities enabled, and then successively enabled auto-correction, spell-checking, style-checking, and the Office Assistant. Thus, the difference between any bar in Figure 5.7 and the bar to its left shows the amount of additional energy used to perform a specific background activity. For example, the difference between the first two bars in each graph shows the effect of auto-correction.

Figure 5.7 shows that auto-correction expends negligible energy when entering text: the additional energy cannot be distinguished from experimental error. Spell-checking and style-checking incur a small additional cost. With no pause between entering characters, employing these options adds a 5.0% energy overhead; with a 100 ms. pause between characters, the overhead is 3.3%.

The Office Assistant incurs a more significant energy penalty. With no pause between typing characters, enabling the Assistant leads to a 9.1% increase in energy use. With a 100 ms. pause, energy use increases 4.9%. In fact, even when the user is performing no activity, enabling the Office Assistant still consumes an additional 0.3 Watts, increasing power usage 4.4% on the measured system. Adaptively disabling the Office Assistant can therefore lead to a small but significant extension in battery lifetime.
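Because each bar in Figure 5.7 is cumulative, the cost of one background activity is the difference between adjacent bars. The Python sketch below shows that bookkeeping. The energy values are hypothetical placeholders, not the measured data, and expressing each increment relative to the preceding bar is an assumption about how the overhead percentages are defined.

```python
def incremental_costs(cumulative):
    """Per-activity cost from cumulative measurements.

    cumulative: ordered list of (label, energy in Joules), where each entry
    was measured with one more background activity enabled than the last.
    Returns a list of (label, extra Joules, extra as a fraction of previous bar).
    """
    costs = []
    for (_, previous), (label, current) in zip(cumulative, cumulative[1:]):
        extra = current - previous
        costs.append((label, extra, extra / previous))
    return costs


# Hypothetical cumulative energies in Joules.
bars = [
    ("none", 200.0),
    ("auto-correct", 200.0),
    ("spell-checking", 206.0),
    ("style-checking", 210.0),
    ("office assistant", 231.0),
]
costs = incremental_costs(bars)
for label, extra, fraction in costs:
    print(f"{label}: +{extra:.1f} J ({fraction:.1%})")
```

An adaptive system could rank activities by these increments and disable the most expensive ones first when battery levels become critical.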

5.4.4 Autosave
Autosave frequency is another potential dimension of energy-aware adaptation. After a document is modified, PowerPoint periodically saves an AutoRecovery file to disk in order to preserve edits in the event of a system or application crash. Autosave may be optionally enabled or disabled; if it is enabled, the frequency of autosave is a configurable parameter. Since autosave is performed as a background activity, it often will have little effect upon perceived application performance. However, the energy cost is not negligible: the disk must be spun up, and, for large documents, a considerable amount of data must be written. Since periodic autosaves over the wireless network would be prohibitively slow, I assumed that documents are stored on local disk.

I created a Visual Basic program to help quantify the energy impact of autosave. The program loads a PowerPoint document, makes a small modification (adds one slide), and then performs no further activity for twenty minutes. To avoid spurious measurement of initial activity associated with loading and modifying the presentation, I waited for ten minutes after the modification was made to the document before I began measuring power usage.

Figure 5.8 shows power usage for three autosave frequencies. For each presentation, the first bar shows power usage when the full-quality version of the document is modified and a one minute autosave frequency is specified. The next bar in each data set shows the effect of a five minute autosave frequency. The final bar shows the effect of disabling autosave; this is the maximum power reduction that can be achieved by modifying autosave parameters. As can be seen by the difference between the first two bars of each data set, changing the autosave frequency from 1 minute to 5 minutes reduces power usage 5–12%, with an average reduction of 8%. The maximum possible benefit is realized when autosave is disabled. As shown by the difference between the first and last bars in each data set, this reduces power usage 7–18% with an average reduction of 11%. Thus, depending upon the user’s willingness to hazard data loss in the event of crashes, autosave frequency is a potentially useful dimension of energy-aware adaptation.

[Figure 5.8 (bar chart): average power in Watts (axis 0 to 8) for presentations A through J, with three bars per presentation: 1 min. autosave, 5 min. autosave, no autosave.]

This figure shows how the frequency of PowerPoint autosave affects power usage. The three bars in each data set show power use with a 1 minute autosave frequency, with a 5 minute autosave frequency, and with autosave disabled. Each bar represents the mean of five trials; the error bars show 90% confidence intervals.

Figure 5.8: Effect of autosave options on application power usage
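A reduction in average power does not translate one-for-one into added battery lifetime. The short Python sketch below makes the conversion explicit for the average reductions reported for Figure 5.8. It assumes an ideal battery with a fixed energy budget and a workload that stays in the measured state (an idle, modified document); the function name `lifetime_extension` is hypothetical and is not part of any tool described here.

```python
def lifetime_extension(power_reduction: float) -> float:
    """Fractional gain in battery lifetime for a fractional cut in average power.

    Assumes lifetime = E / P for a fixed energy budget E (ideal battery),
    so cutting P by a fraction r stretches lifetime by 1 / (1 - r) - 1.
    """
    if not 0.0 <= power_reduction < 1.0:
        raise ValueError("power_reduction must be in [0, 1)")
    return 1.0 / (1.0 - power_reduction) - 1.0


# Average power reductions quoted in the text for Figure 5.8.
for label, reduction in [("5 min. autosave", 0.08), ("no autosave", 0.11)]:
    print(f"{label}: about {lifetime_extension(reduction):.1%} longer lifetime")
```

Under these assumptions the gain in lifetime is slightly larger than the reduction in power, since the same energy is spread over a lower drain rate.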

5.5 Summary
I began this chapter by asking several questions: Can one show significant energy reductions for office applications commonly executed on laptop computers? Is it possible to add energy awareness without access to application source code? Is this approach valid for applications executing on source-code unavailable operating systems (i.e. Windows)? As the results of Section 5.4 show, the answer to all these questions is “yes”. Using Puppeteer’s component-based adaptation approach, one can add energy-awareness without application or operating system source code. Distillation reduces the energy needed to load PowerPoint presentations from a remote server by 49%. Distillation can also lead to significant energy savings when editing and saving presentations. Finally, I showed how modifications such as disabling the Office Assistant and lowering autosave frequency can lead to further energy conservation.


Chapter 6 System support for energy-aware adaptation
The previous two chapters demonstrated the feasibility of energy-aware adaptation. They showed that applications can modify their behavior to significantly reduce the energy usage of the platforms on which they execute. However, since energy-aware adaptation comes at the cost of degraded fidelity, applications should only adapt their behavior when necessary. This chapter explores how system support can guide applications to make appropriate adaptation decisions. The next section explores goal-directed adaptation, a feedback technique which estimates the importance of reducing energy usage. By measuring energy supply and demand, the operating system is able to determine the correct balance between energy conservation and application fidelity. Section 6.2 then shows how the system can improve the agility of adaptation decisions by learning from a history of past application energy usage.

6.1 Goal-directed adaptation
Current operating systems provide little support for energy management. Typically, the system reports the expected remaining battery lifetime at the current rate of energy usage. In addition, the system usually provides a limited number of device-specific parameters that modify hardware power management policies. Users who wish to extend their computer’s battery lifetimes must manually adjust several parameters such as the screen brightness and disk spin-down timeout to select the correct level of energy conservation. In order to set these parameters correctly, they must develop an intuitive notion of how each parameter affects energy use. Further, they must periodically observe the estimate for remaining battery lifetime in order to verify that the chosen settings are correct.

Goal-directed adaptation inverts this process by making energy management the responsibility of the operating system. The user specifies only the desired battery lifetime. The system assumes responsibility for making the correct tradeoffs between quality and energy conservation that ensure that the battery lasts for the specified duration. Rather than adjusting power management parameters, the system adjusts application fidelity, using feedback to determine correct settings. The system also performs the task of monitoring remaining battery lifetime to ensure that the chosen fidelity settings are correct.

6.1.1 Design considerations
The most important consideration in the design of goal-directed adaptation is ensuring that the specified time goal is met whenever feasible. Clearly, users will only trust the system to manage their battery energy if it proves that it can reliably meet the specified goals. When a user specifies an infeasible duration, one so large that the available energy is inadequate even if all applications run at lowest fidelity, the system should detect the problem and alert the user as soon as possible.

An important secondary goal is providing the best user experience possible. This translates into two requirements: first, applications should offer as high a fidelity as possible at all times; second, the user should not be jarred by frequent adaptations. Goal-directed adaptation balances these opposing concerns by striving to provide high average fidelity while using hysteresis to reduce the frequency of fidelity changes.

When the user specifies a battery lifetime, it is important to recognize that the duration represents the amount of work the user wishes to accomplish while on battery power. Adaptive strategies that extend battery lifetime but accomplish less total work should therefore be avoided. For example, consider a workload of scientific calculations performed on the Itsy v1.5. Reducing processor clock frequency decreases the average power used to perform a calculation, but increases total energy consumption. Thus, although battery lifetime is extended, the total number of calculations performed will be decreased. This scenario makes it clear that fidelity should only be reduced when applications use less total energy per unit of work at the reduced fidelity. All of the fidelity reductions studied in the previous two chapters have this property.
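The distinction between longer battery lifetime and more work accomplished can be illustrated with a short Python sketch. The power and throughput numbers below are invented for illustration and are not measurements of the Itsy v1.5; the names `OperatingPoint`, `total_work`, and `worth_adopting` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    name: str
    avg_power_watts: float     # average power while running the workload
    work_units_per_sec: float  # throughput, e.g. calculations per second

    @property
    def joules_per_unit(self) -> float:
        """Energy spent per unit of work: the quantity that must go down."""
        return self.avg_power_watts / self.work_units_per_sec


def lifetime_seconds(point: OperatingPoint, battery_joules: float) -> float:
    return battery_joules / point.avg_power_watts


def total_work(point: OperatingPoint, battery_joules: float) -> float:
    return battery_joules / point.joules_per_unit


def worth_adopting(current: OperatingPoint, candidate: OperatingPoint) -> bool:
    """A setting is only worthwhile if it lowers energy per unit of work."""
    return candidate.joules_per_unit < current.joules_per_unit


# Hypothetical numbers: the slow clock draws less power but does far less work.
full_speed = OperatingPoint("full clock", avg_power_watts=2.0, work_units_per_sec=100.0)
slow_clock = OperatingPoint("slow clock", avg_power_watts=1.5, work_units_per_sec=60.0)

for point in (full_speed, slow_clock):
    print(point.name,
          f"lifetime={lifetime_seconds(point, 1000.0):.0f} s",
          f"work={total_work(point, 1000.0):.0f} units")
print("adopt slow clock?", worth_adopting(full_speed, slow_clock))
```

With these made-up values the slower clock lasts longer on the same battery yet completes fewer calculations, which is exactly the kind of adaptation the design rules out.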

6.1.2 Implementation
The Odyssey platform for mobile computing, described in Section 2.3, provides the basis for implementing goal-directed adaptation. I created a simple user interface that allows users to specify goals for battery duration and receive feedback about the feasibility of the specified goals. Further, I modified Odyssey to perform three tasks periodically. First, Odyssey determines the residual energy available in the battery. Second, it predicts future energy demand. Third, based on these two pieces of information, it decides if applications should change fidelity and notifies them accordingly.


Figure 6.1: User interface for goal-directed adaptation

User interface

Often the estimate for needed battery lifetime will be driven by external criteria, for example the expected duration of a flight, commute, or meeting. Since the exact duration of such events is often unknown, Odyssey allows the user to respecify the time goal when necessary. Whenever this happens, Odyssey either adapts to meet the new goal or notifies the user that it is infeasible.

Figure 6.1 shows the user interface for goal-directed adaptation. This interface is designed to be as simple as possible so as to avoid unnecessary user distraction. The user specifies the goal for battery duration by adjusting the slider in the middle of the dialog. When conditions change, the user may readjust the slider to specify a new battery goal. As time passes and the battery drains, the slider moves to the left to express the change in expected remaining battery lifetime. For example, the dialog in Figure 6.1 shows that the expected remaining battery lifetime is 138 minutes. After one hour, the dialog will show 78 minutes of remaining lifetime, assuming that the user does not adjust the battery goal in the meantime. While current operating systems provide similar dialogs, they are output-only; they do not provide users with the ability to change the expected battery lifetime.

The dialog also displays the state of battery management to the user. Figure 6.1 shows the normal state; the battery lifetime is being managed by Odyssey and the specified goal appears feasible. If Odyssey determines that the goal is infeasible, the interface changes the dialog title and displays a red background. This warns the user that the battery will expire sooner than the specified time; the user may then decrease activity or specify a shorter duration. Similarly, when Odyssey determines that the battery will last longer than the specified duration even if all applications execute at their highest fidelities, the interface changes the dialog title and displays a green background. If the mobile computer is connected to wall-power, the slider is disabled since goal-directed adaptation is only meaningful when the battery is discharging.

Determining residual energy

Odyssey determines residual energy by measuring the amount of charge in the battery. Many modern mobile computers ship with a Smart Battery [79], a gas gauge chip which reports detailed information about battery state and power usage. On such platforms, Odyssey periodically queries the Smart Battery chip to determine remaining battery capacity. The specific query interface may be machine-dependent. For example, the Itsy v2.2 contains a Dallas Semiconductor DS2437 Smart Battery chip [12]. Odyssey queries the battery through a one-wire bus connected to a general purpose input/output (GPIO) pin of the StrongArm 1100 processor. A Linux device driver for this chip was developed by Compaq Western Research Lab. Once per minute, Odyssey performs an ioctl on the device to query remaining capacity.

The Advanced Configuration and Power Interface (ACPI) specification [36] provides a more standard interface to Smart Battery information. The Linux ACPI driver exports battery status information through the /proc interface. On the IBM ThinkPad T20 laptop, Odyssey reads the remaining capacity through this interface once every ten seconds.

While the above methods are best for deployed systems, they may not be ideal in laboratory settings. Older mobile computers such as the Itsy v1.5 and IBM ThinkPad 560X lack the necessary hardware support for measuring battery capacity. Additionally, since an actual battery must be used to obtain measurements, it is difficult to control for non-ideal battery behavior. Finally, running large numbers of experiments is difficult because one must recharge the battery after each trial. To facilitate evaluation, Odyssey can optionally use a modulated energy supply.
At the beginning of evaluation, the initial value for the modulated supply is specified to Odyssey. As the evaluation proceeds, a digital multimeter samples the actual power usage of the mobile computer and transmits the samples to Odyssey. When it receives a new sample, Odyssey assumes constant power consumption between samples and decrements the modulated energy supply by the product of the sampled power usage and the sampling period. When the value of the modulated energy supply reaches zero, Odyssey reports that the battery has expired.

Predicting future energy demand

To predict energy demand, Odyssey assumes that future behavior will be similar to recently observed behavior. Odyssey uses smoothed observations of present and past power usage to predict future power use. This approach is in contrast to requiring applications to explicitly declare their future energy usage, an approach that places unnecessary burden on applications and is unlikely to be accurate.
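The bookkeeping behind the modulated energy supply described earlier in this section can be sketched in a few lines of Python. This is an illustrative model, not Odyssey's code: the class name `ModulatedEnergySupply` and its methods are hypothetical, and the multimeter is represented only by the power samples passed in.

```python
class ModulatedEnergySupply:
    """Simulated battery that is drained by externally measured power samples."""

    def __init__(self, initial_joules: float, sample_period_s: float) -> None:
        self.residual_joules = initial_joules
        self.sample_period_s = sample_period_s

    def on_power_sample(self, watts: float) -> float:
        """Account for one sample, assuming constant power over the period.

        The supply is decremented by (sampled power) x (sampling period)
        and is never allowed to go below zero.
        """
        used = watts * self.sample_period_s
        self.residual_joules = max(0.0, self.residual_joules - used)
        return self.residual_joules

    @property
    def expired(self) -> bool:
        """True once the simulated battery has been fully drained."""
        return self.residual_joules <= 0.0


# Example: a 12,000 Joule supply drained by a steady, made-up 10 W load.
supply = ModulatedEnergySupply(initial_joules=12_000.0, sample_period_s=1.0)
elapsed_s = 0
while not supply.expired:
    supply.on_power_sample(10.0)
    elapsed_s += 1
print(f"simulated battery expired after {elapsed_s} s")
```

Because the supply is only a counter, each trial can start from the same initial value without recharging a real battery, which is the property the evaluation relies on.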


Odyssey uses methods similar to those described in the previous section to observe power usage. On the Itsy v2.2, Odyssey queries the Smart Battery to measure current and voltage levels once per second. It multiplies the two values to calculate power usage. On the ThinkPad T20, an artifact of the Linux ACPI driver prevents Odyssey from querying current and voltage levels. Instead, Odyssey estimates power usage by periodically sampling remaining battery capacity and dividing the difference in capacity by the sample period. When the battery supply is modulated, Odyssey uses the power samples collected by the external digital multimeter.

To smooth power estimates, Odyssey uses an exponential weighted average of the form:

\[ \text{new} = (1 - \alpha)\,(\text{this sample}) + \alpha\,(\text{old}) \tag{6.1} \]

where $\alpha$ is the gain, a parameter that determines the relative weights of current and past power usage. Once future power usage has been estimated, it is multiplied by the time remaining until the goal to obtain future energy demand.

Odyssey varies $\alpha$ as energy drains, thus changing the tradeoff between agility and stability. When the goal is distant, Odyssey uses a large $\alpha$. This biases adaptation toward stability by reducing the number of fidelity changes; there is ample time to make adjustments later, if necessary. As the goal nears, Odyssey decreases $\alpha$ so that adaptation is biased toward agility. Applications now respond more rapidly, since the margin for error is small.

Currently, Odyssey sets $\alpha$ so that the half-life of the decay function is approximately 10% of the time remaining until the goal. For example, if 30 minutes remain, $\alpha$ is chosen so that the present estimate will be weighted equally with more recent samples after approximately 3 minutes have passed. Specifically, if one sample is collected per second, and $T$ is the time remaining to the goal, measured in seconds:

\[ \alpha = 2^{-1/(0.10\,T)} \tag{6.2} \]
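The smoothing step and the variable gain can be sketched in Python as follows. This is an illustration under stated assumptions rather than Odyssey's implementation: the function names are hypothetical, and the gain is derived from the half-life rule in the text (the weight of the old estimate halves after 10% of the remaining time, at one sample per second).

```python
def gain(seconds_to_goal: float,
         sample_period_s: float = 1.0,
         half_life_fraction: float = 0.10) -> float:
    """Gain alpha whose decay half-life is a fixed fraction of the time left.

    After n = half_life_fraction * T / period samples, the old estimate's
    weight alpha**n should equal 1/2, hence alpha = 2 ** (-1 / n).
    """
    half_life_samples = half_life_fraction * seconds_to_goal / sample_period_s
    if half_life_samples <= 0.0:
        return 0.0  # no time left: trust only the newest sample
    return 2.0 ** (-1.0 / half_life_samples)


def smooth(old_watts: float, sample_watts: float, alpha: float) -> float:
    """Exponentially weighted average: large alpha favors the old estimate."""
    return (1.0 - alpha) * sample_watts + alpha * old_watts


def predicted_demand_joules(smoothed_watts: float, seconds_to_goal: float) -> float:
    """Future energy demand = smoothed power x time remaining until the goal."""
    return smoothed_watts * seconds_to_goal


# Example: 30 minutes remain, so the half-life is 3 minutes (180 samples).
alpha_far = gain(30 * 60)
alpha_near = gain(2 * 60)
estimate = smooth(old_watts=8.0, sample_watts=10.0, alpha=alpha_far)
print(f"alpha with 30 min left: {alpha_far:.5f}")
print(f"alpha with  2 min left: {alpha_near:.5f}")
print(f"demand: {predicted_demand_joules(estimate, 30 * 60):.0f} J")
```

The two printed gains show the intended behavior: the gain stays close to 1 while the goal is distant (stable estimates) and drops as the goal approaches (agile estimates).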

The choice of 10% is based on a sensitivity analysis, discussed in Section 6.1.4.

Triggering adaptation

When predicted demand exceeds residual energy, Odyssey advises applications to reduce their energy usage. Conversely, when residual energy significantly exceeds predicted demand, applications are advised to increase fidelity. In this chapter, fidelity reduction is the sole method of energy conservation; Chapter 7 describes the system support necessary to support remote execution as a further dimension of energy conservation.

The amount by which supply must exceed demand to trigger fidelity improvement is indicative of the level of hysteresis in Odyssey’s adaptation strategy. This value is the sum of two empirically-derived components: a variable component, 5% of residual energy, and a constant component, 1% of the initial energy available. The variable component reflects the bias toward stability when energy is plentiful and toward agility when it is scarce; the constant component biases against fidelity improvements when residual energy is low. As a guard against excessive adaptation due to energy transients, Odyssey caps fidelity improvements at a maximum rate of once every 15 seconds.

To meet the specified goal for battery duration, Odyssey tries to keep power demand within the zone of hysteresis. Whenever power demand exceeds those bounds, adaptation is necessary. Odyssey then determines which applications should change fidelity, as well as the magnitude of change needed for each application. Ideally, the set of fidelity changes should modify power demand so that it once again enters the zone of hysteresis. Often, there will be many possible sets of adaptations that yield the needed change in power demand. In these cases, Odyssey must refer to an adaptation policy to decide which alternative is best.

Although one can implement many different adaptation policies, my initial prototype used a simple one based on user-specified priorities. Each application specifies a list of supported fidelities; lower fidelity levels are assumed to use less energy than higher fidelities. The user assigns static priorities to arbitrate between applications. Odyssey always degrades a lower-priority application before degrading a higher-priority one; upgrades occur in reverse order. Once the lowest-priority application is degraded to its lowest fidelity level, Odyssey degrades the fidelity of the next lowest-priority application. After an application adapts, Odyssey verifies whether the change in power demand is sufficient. It resets the estimate of power demand to a value that is precisely in the middle of the zone of hysteresis; this represents an optimistic assumption that the change in fidelity will prove sufficient.
As Odyssey takes new power measurements, the estimate of power demand will either stay within the zone of hysteresis, indicating that the change in fidelity was correct, or stray outside the bounds, indicating that additional adaptation is necessary. In the latter case, Odyssey will continue to adapt application behavior until it reaches a value where power demand stays within the zone of hysteresis.

In the rest of this chapter, this adaptation policy will be referred to as the incremental policy, because it changes fidelity one step at a time. The incremental strategy is sufficient to evaluate goal-directed adaptation. However, it is clearly inadequate to model many of the policies that a user may wish to express. Section 6.2 describes the system support necessary to support more detailed policies.
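The trigger and the incremental policy described above can be sketched as follows. The Python code is a simplified model, not Odyssey's implementation: all names are hypothetical, priorities are encoded as integers where a larger value means a more important application, and the zone of hysteresis is expressed in energy terms as the band between the residual energy and the residual energy minus the margin.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MIN_UPGRADE_INTERVAL_S = 15.0  # cap on fidelity improvements, from the text


@dataclass
class App:
    name: str
    priority: int    # assumed encoding: larger means more important
    num_levels: int  # fidelity levels run from 0 (lowest) to num_levels - 1
    level: int       # current fidelity level


def hysteresis_margin(residual_j: float, initial_j: float) -> float:
    """5% of residual energy plus 1% of the initial energy."""
    return 0.05 * residual_j + 0.01 * initial_j


def decide(residual_j: float, demand_j: float, initial_j: float) -> str:
    if demand_j > residual_j:
        return "degrade"
    if residual_j - demand_j > hysteresis_margin(residual_j, initial_j):
        return "upgrade"
    return "hold"


def degrade_one_step(apps: List[App]) -> Optional[App]:
    """Lower the lowest-priority application that can still be degraded."""
    for app in sorted(apps, key=lambda a: a.priority):
        if app.level > 0:
            app.level -= 1
            return app
    return None  # everything is at lowest fidelity; the goal may be infeasible


def upgrade_one_step(apps: List[App]) -> Optional[App]:
    """Upgrades occur in reverse order: highest priority first."""
    for app in sorted(apps, key=lambda a: a.priority, reverse=True):
        if app.level < app.num_levels - 1:
            app.level += 1
            return app
    return None


def reset_demand(residual_j: float, initial_j: float) -> float:
    """Optimistic estimate placed in the middle of the zone of hysteresis."""
    return residual_j - hysteresis_margin(residual_j, initial_j) / 2.0


def adapt(apps: List[App], residual_j: float, demand_j: float, initial_j: float,
          now_s: float, last_upgrade_s: float) -> Tuple[Optional[App], float]:
    """One decision step; returns the changed app (if any) and the upgrade time."""
    action = decide(residual_j, demand_j, initial_j)
    if action == "degrade":
        return degrade_one_step(apps), last_upgrade_s
    if action == "upgrade" and now_s - last_upgrade_s >= MIN_UPGRADE_INTERVAL_S:
        changed = upgrade_one_step(apps)
        return changed, (now_s if changed is not None else last_upgrade_s)
    return None, last_upgrade_s
```

A caller would invoke `adapt` each time a new demand estimate is available and, whenever an application changes fidelity, replace its demand estimate with `reset_demand(...)` before taking further samples. Degradation is deliberately not rate-limited in this sketch, since the text only describes a cap on improvements.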

6.1.3 Basic validation
Experiment design

To validate goal-directed adaptation, I executed the two energy-aware applications described in Section 4.8: a composite application involving speech recognition, map viewing and Web access, run concurrently with a background video application. To obtain a continuous workload, the composite application executes every 25 seconds. This has the effect of holding the amount of work (number of recognitions, image views, and map views) constant over time.


The video application supports four fidelity levels: full-quality, Premiere-B, Premiere-C, and the combination of Premiere-C and reduction of the display window. The speech recognition component of the composite application supports full and reduced quality recognition on the local machine. The map component has four fidelities: full-quality, minor road filtering, secondary road filtering, and the combination of secondary road filtering and cropping. The Web component supports the five fidelities shown in Figure 4.15. I prioritized these components so that speech had the lowest priority, and video, map, and Web had successively higher priority.

Client applications execute on the IBM 560X laptop, and communicate with servers using a 900 MHz 2 Mb/s Lucent WaveLAN wireless network. To isolate the performance of goal-directed adaptation, Odyssey uses a modulated energy supply. In addition, Odyssey uses the incremental adaptation policy described in the previous section.

At the beginning of each experiment, I provided Odyssey with an initial energy value and a time goal. I then executed the applications, allowing them to adapt under Odyssey’s direction until the time goal was reached or the residual energy dropped to zero. The former outcome represents a successful completion of the experiment, while the latter represents a failure. I noted the total number of adaptations of each application during the experiment. I also noted the residual energy at the end of the experiment; a large value suggests that Odyssey may have been too conservative in its adaptation decisions and that average fidelity could have been higher.

All experiments used a 12,000 Joule modulated energy supply. This lasts 19:27 minutes when applications operate at highest fidelity, and 27:06 minutes at lowest fidelity. The difference between the two values represents a 39.3% extension in battery life.
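The arithmetic behind these figures can be checked with a few lines of Python. The sketch below only restates numbers given in the text; the helper name `to_seconds` is hypothetical, and the average power values it derives are implied by the stated supply and durations rather than reported measurements.

```python
def to_seconds(mm_ss: str) -> int:
    """Convert a 'minutes:seconds' string such as '19:27' to seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


SUPPLY_J = 12_000.0                  # modulated energy supply used in all trials
highest_s = to_seconds("19:27")      # lifetime at highest fidelity
lowest_s = to_seconds("27:06")       # lifetime at lowest fidelity

extension = lowest_s / highest_s - 1.0
avg_power_highest_w = SUPPLY_J / highest_s
avg_power_lowest_w = SUPPLY_J / lowest_s

print(f"extension in battery life: {extension:.1%}")
print(f"implied average power, highest fidelity: {avg_power_highest_w:.1f} W")
print(f"implied average power, lowest fidelity:  {avg_power_lowest_w:.1f} W")
```

The two durations bound the feasible goals for this workload: a goal shorter than the highest-fidelity lifetime needs no adaptation, and a goal longer than the lowest-fidelity lifetime cannot be met.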
I deliberately chose a small initial energy value so that I could perform a large number of experiments in a reasonable amount of time. The value of 12,000 Joules is only about 14% of the nominal energy in the IBM 560X battery. Extrapolating to full nominal energy, the workload would run for 2:18 hours at highest fidelity, and 3:13 hours at lowest fidelity.

Results

Figures 6.2 and 6.3 show detailed results from two typical experiments: one with a 20 minute goal, and the other with a 26 minute goal. Figure 6.2 shows how the supply of energy and Odyssey’s estimate of future demand change over time during the two experiments. In both trials, Odyssey meets the specified goal for battery duration. In addition, once the time goal is reached, residual energy is quite low. The graph also confirms that estimated demand tracks supply closely. The most visible difference between the two trials is the slope of the supply and demand lines. After an initial adaptation period of approximately three minutes, Odyssey adjusts power usage in the 26 minute trial in order to extend battery lifetime.

The four graphs of Figure 6.3 show how the fidelity of each application varies during the two experiments. For the 20 minute goal, the high priority Web and map applications remain at full fidelity throughout the experiment; the video degrades slightly; and speech runs mostly at low fidelity. For the 26 minute goal, the highest priority Web application runs mostly at the highest fidelity, while the other three applications run mostly at their lowest fidelities. The difference in fidelity levels between the two trials, shown most clearly for the video application, accounts for the difference in power demand shown in Figure 6.2. For both time goals, applications adapt less frequently at the beginning of the experiment and more frequently as the time goal nears. This shows the effect of the variable gain which makes the system stable when the goal is distant and agile when the goal is near.

Figure 6.4 summarizes the results of five trials for each time goal of 20, 22, 24, and 26 minutes. These results confirm that Odyssey is doing a good job of energy adaptation. The desired goal was met in every trial. In all cases, residual energy was very low: the largest average residue, for the 20 minute experiment, is still only 1.2% of the initial energy value. The average number of adaptations by applications is generally low, but there are some cases where it is high. However, the cases that exhibit a high number of adaptations are an artifact of the small initial energy value, since the system is designed to exhibit greater agility when energy is scarce.

[Figure 6.2 (line plot): energy in Joules (axis 0 to 15000) versus elapsed time in seconds (axis 0 to 1500), with four curves: supply and demand for the 26 minute goal, and supply and demand for the 20 minute goal.]

This figure shows how Odyssey meets user-specified goals for battery durations of 20 and 26 minutes when running the composite and video applications described in Section 4.8. It shows how the modulated supply of energy and the estimated energy demand change over time.

Figure 6.2: Example of goal-directed adaptation (supply and demand)

[Figure 6.3 (four stacked plots): Speech Fidelity, Video Fidelity, Map Fidelity, and Web Fidelity, each ranging from Min to Max, versus elapsed time in seconds (axis 0 to 1500), for the 26 minute goal and the 20 minute goal.]

This figure shows how Odyssey meets user-specified goals for battery durations of 20 and 26 minutes when running the composite and video applications described in Section 4.8. It shows changes in application fidelity. The applications are prioritized with speech having the lowest priority, and video, map, and Web having successively higher priority.

Figure 6.3: Example of goal-directed adaptation (application fidelity)

Specified     Goal   Residue                      Number of Adaptations
Duration (s)  Met    Energy (J)     Time (s)      Speech       Video        Map          Web
1200          100%   145.2 (25.3)   15.3 (1.9)    10.8 (1.6)   11.0 (4.0)   0.4 (0.9)    0.0 (0.0)
1320          100%   107.5 (61.5)   12.9 (7.2)    2.8 (0.4)    28.2 (5.2)   1.6 (2.6)    0.0 (0.0)
1440          100%   101.2 (22.3)   13.0 (4.5)    5.0 (7.9)    22.6 (9.8)   9.6 (3.8)    1.2 (1.8)
1560          100%   60.2 (28.7)    8.7 (5.9)     1.0 (0.0)    6.0 (2.8)    15.4 (4.6)   7.6 (5.9)

This figure shows system behavior when the composite application executes concurrently with the video player. Each experiment uses a 12,000 Joule modulated energy supply. Each row shows the result of specifying a different battery-duration goal. The second column shows the percentage of trials in which
