Research

Programming Language Design for Service-Oriented Systems

Salman Ahmad. MIT EECS Dissertation.

Abstract: Designing systems in a service-oriented manner, in which application features are decoupled and run as independently executing services over a network, is becoming more commonplace and popular. Service-oriented programming provides a natural way to model and manage many types of systems and allows software development teams to achieve operational flexibility, scalability, and reliability in a cost-effective manner. In particular, it has been used quite successfully for Web and mobile applications. However, building, deploying, and maintaining service-oriented systems is challenging and requires extensive planning, more effort during development, a detailed understanding of advanced networking techniques, and the use of complicated concurrent programming.

This thesis presents a new programming language called Silo. Silo integrates features that address key conceptual and pragmatic needs of service-oriented systems that, holistically, are not easily satisfied by existing languages. Broadly, these needs include: a unified distributed programming model, a simple yet efficient construct for concurrency, a familiar yet extensible syntax, and the ability to interoperate with a rich ecosystem of libraries and tools.

In this dissertation, I describe how Silo’s features, constructs, and conventions satisfy these needs. Then, I present various compiler and runtime techniques used in Silo’s implementation. Lastly, I provide a demonstration, through a variety of programming patterns and applications, of how Silo facilitates the design, implementation, and management of service-oriented systems.

Dissertation. MIT EECS 2014.

The Dog Programming Language

Salman Ahmad and Sep Kamvar

Abstract: Jabberwocky is a social computing stack that consists of three components: a human and machine resource management system called Dormouse, a parallel programming framework for human and machine computation called ManReduce, and a high-level programming language on top of ManReduce called Dog. Dormouse is designed to enable cross-platform programming languages for social computation, so, for example, programs written for Mechanical Turk can also run on other crowdsourcing platforms. Further, machines and people are both first-class citizens in Dormouse, allowing for natural parallelization and control flows for a broad range of data-intensive applications. Lastly, Dormouse includes notions of real identity, heterogeneity, and social structure. We show that the unique properties of Dormouse enable elegant programming models for complex and useful problems, and we propose two such frameworks. ManReduce is a framework for combining human and machine computation into an intuitive parallel data flow that goes beyond existing frameworks in several important ways, such as enabling functions on arbitrary communication graphs between human and machine clusters. And Dog is a high-level procedural language written on top of ManReduce that focuses on expressivity and reuse.

Paper. UIST 2013.

Webzeitgeist

Ranjitha Kumar, Arvind Satyanarayan, Cesar Torres, Maxine Lim, Salman Ahmad, Scott R. Klemmer, and Jerry O. Talton

Abstract: Advances in data mining and knowledge discovery have transformed the way Web sites are designed. However, while visual presentation is an intrinsic part of the Web, traditional data mining techniques ignore render-time page structures and their attributes. This paper introduces design mining for the Web: using knowledge discovery techniques to understand design demographics, automate design curation, and support data-driven design tools. This idea is manifest in Webzeitgeist, a platform for large-scale design mining comprising a repository of over 100,000 Web pages and 100 million design elements. This paper describes the principles driving design mining, the implementation of the Webzeitgeist architecture, and the new class of data-driven design applications it enables.

Paper. CHI 2013 - Best Paper.

Data-Driven Web Design

Ranjitha Kumar, Jerry O. Talton, Salman Ahmad, and Scott R. Klemmer

Abstract: This short paper summarizes challenges and opportunities of applying machine learning methods to Web design problems, and describes how structured prediction, deep learning, and probabilistic program induction can enable useful interactions for designers. We intend for these techniques to foster new work in data-driven Web design.

Paper. ICML 2012.

Jabberwocky

Salman Ahmad, Alexis Battle, Zahan Malkani, Sep Kamvar

Abstract: Jabberwocky is a social computing stack that consists of three components: a human and machine resource management system called Dormouse, a parallel programming framework for human and machine computation called ManReduce, and a high-level programming language on top of ManReduce called Dog. Dormouse is designed to enable cross-platform programming languages for social computation, so, for example, programs written for Mechanical Turk can also run on other crowdsourcing platforms. Further, machines and people are both first-class citizens in Dormouse, allowing for natural parallelization and control flows for a broad range of data-intensive applications. Lastly, Dormouse includes notions of real identity, heterogeneity, and social structure. We show that the unique properties of Dormouse enable elegant programming models for complex and useful problems, and we propose two such frameworks. ManReduce is a framework for combining human and machine computation into an intuitive parallel data flow that goes beyond existing frameworks in several important ways, such as enabling functions on arbitrary communication graphs between human and machine clusters. And Dog is a high-level procedural language written on top of ManReduce that focuses on expressivity and reuse.

Paper. UIST 2011.

Flexible Tree Matching

Ranjitha Kumar, Jerry O. Talton, Salman Ahmad, Tim Roughgarden, and Scott R. Klemmer

Abstract: Tree-matching problems arise in many computational domains. The literature provides several methods for creating correspondences between labeled trees; however, by definition, tree-matching algorithms rigidly preserve ancestry. That is, once two nodes have been placed in correspondence, their descendants must be matched as well. We introduce flexible tree matching, which relaxes this rigid requirement in favor of a tunable formulation in which the role of hierarchy can be controlled. We show that flexible tree matching is strongly NP-complete, give a stochastic approximation algorithm for the problem, and demonstrate how structured prediction techniques can learn the algorithm’s parameters from a set of example matchings. Finally, we present results from applying the method to tasks in Web design.

Paper. IJCAI 2011.

Bricolage

Ranjitha Kumar, Jerry O. Talton, Salman Ahmad, Tim Roughgarden, and Scott R. Klemmer

Abstract: The Web today provides a corpus of design examples unparalleled in human history. However, leveraging existing designs to produce new pages is currently difficult. We introduce a novel structured-prediction algorithm, Bricolage, for automatically transferring design and content between Web pages. Bricolage learns to create coherent mappings between pages by training on human-generated exemplars. The produced mappings can then be used to automatically transfer the content from one page into the style and layout of another. We show that Bricolage can learn to accurately reproduce human page mappings, and that it provides a general, efficient, and automatic technique for retargeting content between a variety of real Web pages.

Paper. CHI 2011 - Best Paper.

Weblines

Neema Moraveji, Salman Ahmad, Chigusa Kita, Frank Chen, Sep Kamvar

Full Title: Weblines: Enabling the Social Transfer of Web Search Expertise using User-Generated Short-form Timelines

Abstract: Web search encompasses more than fact retrieval; it is a primary entry point for learning. Exploratory search tasks are attempts at such learning and require cognitive, strategic, and interpretive work from the user. The pathways of such searches are likewise complex and nuanced. The present study attempts to enable the human work that goes into conducting exploratory searches to be efficiently captured and transmitted to other learners. By this method, web search expertise can transfer socially and implicitly between users instead of developing individually or through directed learning. The system we deployed uses an existing metaphor, the timeline, to structure insights from searches. We refer to these semantically meaningful representations as ‘weblines’. We deployed a live system to 81 users in three user populations. The resulting weblines were delineated into four types. Successful weblines were those that participants used to iteratively reflect upon the insights of their searches.

Paper. CSCL 2011.

Understanding How Users Map Regions Between Web Pages

Abstract: To assist novice users in creating web pages, we envision a tool that automatically transforms an existing web page into the layout and design of another. Achieving this vision requires a learning algorithm capable of mapping regions between web pages, which in turn requires an in-depth understanding of user behavior. Thus, we seek to understand how users map regions between web pages. We asked two fundamental questions: (1) do users map web pages consistently, and (2) what motivates their mapping decisions? A custom interface, which asked users to select corresponding regions between two web pages, was used in a Mechanical Turk study and in a lab study. We found evidence that users do map web pages consistently and discovered that users tend to create mappings in a manner that preserves the hierarchy of the pages while also pairing semantically salient elements.

Report

Affective Learning Companion

Abstract: Many studies have demonstrated the link between affective and cognitive learning, but this link is often ignored in our “one-size-fits-all” education system. This paper introduces an on-screen virtual agent that can assist educational professionals with achieving a balance between a student’s affective state and cognitive learning: an affective learning companion. The affective learning companion ascertains a user’s emotions and responds in a manner that improves the user’s affective state in the short term and improves learning in the long term. It does so using a platform consisting of three sensors (a skin conductance glove, a pressure-sensitive mouse, and a posture-sensing chair) and facial recognition software. With this system we hope to be able to analyze users and determine their current affective state, including boredom, fatigue, frustration, and excitement.

Paper