Sample, Neal (2005) Scheduling And Resource Allocation In Autonomous Service Networks. PhD thesis, Stanford University.
Fee-based computing models are gaining attention in both cooperative and commercial computing environments. Grid computing and web services are models where organizations can charge a fee or trade resources with clients for use of the service, making it possible for clients to use resources and services that are too expensive, too esoteric, or simply inefficient for individual clients to manage or maintain. Unfortunately, benefits of the Grid model come with a loss of control over job execution and real uncertainty about job completion: distributed services are not under the control of the client. Clients cannot control resource allocation to recover from inaccurate runtime estimates, runtime delays, and execution hazards. We introduce 'surety' as the probability that a composed program will finish execution within a deadline window forecast by the service provider. Short runtime services with a high surety (high confidence of success) would be more valuable than services with longer running times or lower surety. Client selection of service providers depends on what values they place on time, cost, and surety, simultaneously. Earlier schedulers treated the remote service problem as a multivariate optimization accounting for only two variables: cost and time. We extend these schedulers by accounting for the uncertainty introduced when services are not under the client's control. Ultimately, control over the service remains at the provider's site. This is the distinction from software models where components are subservient to calling routines. Distributed computation schedulers often assume a cooperative environment where delays are rare, and that initial estimates come from oracles. Our work addresses systems that may be rife with uncertainty, affecting the reliability of schedules generated a priori. We present a new scheduler (the Surety-Based Scheduler (SBS)) that uses a statistical notion of surety to mitigate uncertainty in distributed computing environments and to select strategies to recover from runtime hazards. The goal of our scheduler is to take programs composed of multiple distributed services and completes them within client specified soft deadlines by targeting a minimum level of surety throughout program execution. We demonstrate scheduling, monitoring runtime progress, and schedule repair when surety drops below a threshold value. Monitoring coupled with reactive rescheduling is the key to providing clients with the surety of distributed job completion, similar to performance they would expect from a program running on local resources. By making programs composed of distributed services more reliable, these compositions become an increasingly viable solution for a wide range of problems, and become an appropriate solution for a larger class of clients. Our work is broadly applicable to any system where estimates may be gathered a priori and where clients may monitor runtime progress. In this work, we cover several broad contributions to service computing, including the CLAM compositional language, the CPAM runtime, an address of data extraction and mediation considerations, and (of course) scheduling and rescheduling in uncertain environments.
|Item Type:||Thesis (PhD)|
Computer Science > Distributed Systems
Computer Science > Data Integration and Mediation
|Related URLs:||Project Homepage, Project Homepage||http://infolab.stanford.edu/, http://infolab.stanford.edu/CHAIMS/|
|Deposited By:||Import Account|
|Deposited On:||02 May 2005 17:00|
|Last Modified:||22 Dec 2008 18:38|
Repository Staff Only: item control page