The University of Arizona

Justin Cappos

Real World Systems Research

Stork

Stork provides intelligent package management for the real-world systems of today. Over the past five years, software management systems have made only small, incremental improvements, while end systems and servers have changed far more substantially. Worms have become a constant disruption and threat, which further stresses the need for a package management system in which package maintainers can quickly distribute security updates without involving third parties (such as repository maintainers).

In the past few years, virtualization has become a popular technique for separating processes and/or users. This separation provides better security, but it also increases the cost of virtualization. Stork provides most of the security benefits of virtualization while retaining the performance benefits of sharing software between virtual machines. This saves disk space, network bandwidth, and even the amount of memory used on the system.

To learn more about the other features and benefits of Stork, please visit our webpage.

We've also examined the security of different package managers. We looked at ten popular package managers and found that they are vulnerable to a variety of attacks. Furthermore, launching these attacks is trivial: we demonstrate that it is simple to obtain an official mirror for a distribution.

Ryan Globbi at CERT was kind enough to contact the developers to responsibly disclose the vulnerabilities, and also published a blog entry on the subject. There was also a spirited discussion of this research on Slashdot.

San Fermín

San Fermín is a system for aggregating large amounts of data from the nodes of large-scale distributed systems. Each San Fermín node individually computes the aggregated result by swapping data with other nodes to dynamically create its own binomial tree. Nodes that fall behind abort their trees, thereby reducing overhead. Having each node create its own binomial tree makes San Fermín highly resilient to failures and ensures that the internal nodes of the tree have high capacity, thereby reducing completion time.
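The swapping idea can be illustrated with a small simulation. This is only a minimal sketch under strong assumptions, not the San Fermín implementation: it assumes a power-of-two number of nodes, no failures, no slow nodes aborting their trees, and a combine function that is commutative and associative. The function name `swap_aggregate` and the partner rule (the node whose ID differs in one bit) are hypothetical choices made for illustration.

```python
from typing import Callable, List


def swap_aggregate(values: List[int], combine: Callable[[int, int], int]) -> List[int]:
    """Failure-free simulation in which every node ends with the full aggregate."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two number of nodes")
    state = list(values)
    bit = 1
    while bit < n:
        # In each round, node i swaps its partial result with the node whose
        # ID differs in exactly this bit, and both combine what they hold.
        state = [combine(state[i], state[i ^ bit]) for i in range(n)]
        bit <<= 1
    return state


# Eight nodes, each contributing one value; the aggregate here is a sum.
results = swap_aggregate([3, 1, 4, 1, 5, 9, 2, 6], lambda a, b: a + b)
```

After log2(n) rounds, every entry of `results` holds the same aggregate, which mirrors the idea that each node computes the result itself rather than relying on a single root.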

PSKI

Current large-scale authentication and non-repudiation systems offer various security measures, but do not meet the needs of today's Internet-scale applications. Though several designs exist, there have been no significant deployments of Internet-scale security infrastructures. We propose a novel concept called the public-space that makes complete information about digital entities' actions publicly available to every user. It is a structured framework that maintains a large number of entities, their actions, relationships, and histories. Posting such information in public does not endorse the information's correctness, but it does provide users with a quantifiable set of information that enables them to detect faults and make informed security decisions. Combined with traditional cryptographic techniques, the public-space system can support the intrinsic heterogeneity of user security requirements in Internet-scale infrastructures and applications.

Information we have chosen to make public is at: PSKI Project Webpage.

Prodeva

In order to evaluate systems like the PSKI, we need to collect large data sets from semi-anonymous networks with the right properties. We are gathering data from a number of different sources (more to come).

Our project page is at: Prodeva Project Webpage.

BACKS

There are a number of problems when allocating resources on global networks without central administrative control. We provide the underlying mechanisms that are necessary for resource allocation systems to provide users with a secure trading environment. We do this while supporting the policies used on popular testbeds today, proposed future auction systems, and a spectrum of design choices in between.

This project is still in the development phase. More information to come...

Public information is at: BACKS Project Webpage.

Net-X

Net-X focuses on making the Internet data-centric rather than server-centric, as it is today. My primary interest is the security implications of such a system.

More information to come...

Current public information is at: Net-X Project Webpage.

Seaweed Cost Estimation

Seaweed is a scalable query infrastructure supporting on-line aggregation and completeness prediction of data stored on end systems. Seaweed leverages Pastry for failure-resilient metadata replication, query dissemination, and result aggregation.

Unlike some similar solutions (such as PIER), Seaweed does not replicate information. I modified Seaweed so that it can selectively replicate information, either on other peers in the network or by streaming it to a central source. A query engine may retrieve estimates of the cost and benefit of replicating subsets of information. If the cost of replication is outweighed by the benefit of decreased query overhead, decreased query latency, and improved query completion percentage, the information will be replicated in the most appropriate manner.
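The replication decision described above can be sketched as a simple comparison. This is a hedged illustration rather than Seaweed's actual estimator: the class `ReplicationEstimate`, the function `choose_strategy`, the strategy names, and the assumption that cost and all three benefits are expressed in one common unit are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ReplicationEstimate:
    """Hypothetical estimate for one replication strategy, in one common unit."""
    cost: float
    overhead_saving: float
    latency_saving: float
    completeness_gain: float

    def net_benefit(self) -> float:
        benefit = self.overhead_saving + self.latency_saving + self.completeness_gain
        return benefit - self.cost


def choose_strategy(estimates: Dict[str, ReplicationEstimate]) -> Optional[str]:
    """Return the strategy with the largest positive net benefit, or None."""
    best_name, best_net = None, 0.0
    for name, estimate in estimates.items():
        net = estimate.net_benefit()
        if net > best_net:
            best_name, best_net = name, net
    return best_name


# Made-up numbers: replicating on peers pays off, streaming to a central source does not.
estimates = {
    "peers": ReplicationEstimate(cost=4.0, overhead_saving=3.0, latency_saving=2.0, completeness_gain=1.0),
    "central": ReplicationEstimate(cost=9.0, overhead_saving=5.0, latency_saving=2.0, completeness_gain=1.0),
}
decision = choose_strategy(estimates)
```

Returning `None` when no strategy has a positive net benefit corresponds to leaving the information unreplicated.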

Misc Research

Simultaneous Embedding

I worked on the problem of simultaneously embedding a tree and a path. I proved that for arbitrary trees and paths, the tree edges can be straight lines if the path is allowed to be drawn with either circular arcs or 2 bends.

For a cool program (that I didn't write) that lets a user play around with example graphs that have simultaneous embeddings, see this webpage.

DsCats

DsCats focuses on providing students with visual representations of data structures as a teaching aid. Students can use a "VCR"-like interface to move forward and backward through the actions of the data structure. They can also zoom in and out to increase or decrease the level of detail they see. DsCats provides an interface so that students can code new data structures for it and have the benefit of visual debugging of their data structure.
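One way to picture the "VCR"-like stepping is a recorder that keeps a snapshot after every action and moves a cursor over them. The sketch below is only an assumption about how such an interface could work, not DsCats code; the class `StepRecorder` and its methods are hypothetical names, and a plain list used as a stack stands in for the data structure.

```python
from typing import Any, List


class StepRecorder:
    """Stores a snapshot after each action so a viewer can step back and forth."""

    def __init__(self, initial: List[Any]) -> None:
        self._snapshots: List[List[Any]] = [list(initial)]
        self._cursor = 0

    def record(self, state: List[Any]) -> None:
        # Drop any snapshots ahead of the cursor, then append the new one.
        del self._snapshots[self._cursor + 1:]
        self._snapshots.append(list(state))
        self._cursor += 1

    def current(self) -> List[Any]:
        return list(self._snapshots[self._cursor])

    def back(self) -> List[Any]:
        if self._cursor > 0:
            self._cursor -= 1
        return self.current()

    def forward(self) -> List[Any]:
        if self._cursor < len(self._snapshots) - 1:
            self._cursor += 1
        return self.current()


# Record three pushes onto a stack.
stack: List[int] = []
recorder = StepRecorder(stack)
for value in (1, 2, 3):
    stack.append(value)
    recorder.record(stack)
```

Calling `recorder.back()` and `recorder.forward()` then replays the stack's history one action at a time, stopping at either end of the recording.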

Misc. Low Level Code

I've written drivers and other low-level code for devices in the context of HCI-related projects. I've written camera driver code for infrared cameras that is used for gesture recognition in unlit environments. I also wrote mouse driver code that allows arbitrary HCI programs to portably move the mouse pointer on screen.

For the DiamondTouch project, I wrote application-specific code that allows multiple mice to be used independently on a single system. Unfortunately, I didn't have an opportunity to work with the DiamondTouch table itself (so I've never played TetraTetris).