
   
1. Extending the Database

To extend the database with new problem classifications, a user downloads a checklet template, modifies and tests it, and uploads the new checklet to the server, where it is added to the checklet coop. This is illustrated by points 5-7 in Figure 3.

To the best of our knowledge, AλgoVista is the first search engine on the web to allow arbitrary users to upload executable code into its database. This raises a number of security issues that have to be addressed.

Figure 5: Evil and stupid checklets.

Figure 5 shows some examples of hostile checklets. Figure 5 (a) shows an overly general checklet evil1 that was uploaded in an attempt to promote someone's web site. Regardless of the input query, evil1 will always accept and return a link to the bogus site. Checklet evil2 in Figure 5 (b) launches a denial-of-service attack by stealing as many CPU cycles or as much memory as possible. Checklet evil3 in Figure 5 (c) attempts to compromise the security of the AλgoVista server by reading from or writing to the local file system. Finally, checklet stupid1 in Figure 5 (d), while not outright hostile, uses an extremely slow result-checking algorithm, which has effects similar to those of a denial-of-service attack.

AλgoVista checklets are written in Java and are executed with the same security privileges as an applet. This allows us to rely on Java's built-in security features to prevent checklets from compromising the security of the AλgoVista server.

Denial-of-service attacks [14] are more difficult to deal with. While time-outs are used to stop checklets from stealing too many CPU cycles, as far as we are aware, Java does not provide the means to limit the dynamic memory allocation of a process.
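The time-out mechanism mentioned above might look roughly like the following sketch. It is not AλgoVista's actual code: the class `CheckletTimeout`, the method `acceptsWithin`, and the modelling of a checklet as a `Callable<Boolean>` are all assumptions made for illustration. Note also that `Future.cancel(true)` only interrupts the worker thread, so a checklet that ignores interrupts would keep consuming CPU; a real server would need stronger isolation.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration; names are not taken from AλgoVista.
final class CheckletTimeout {

    /**
     * Runs a checklet's check on a worker thread. Returns true only if the
     * check finishes within the time limit and accepts; a time-out, an
     * exception, or a null result all count as rejection.
     */
    static boolean acceptsWithin(Callable<Boolean> check, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "checklet-worker");
            t.setDaemon(true); // a runaway checklet must not keep the JVM alive
            return t;
        });
        Future<Boolean> result = executor.submit(check);
        try {
            return Boolean.TRUE.equals(result.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            result.cancel(true); // interrupts the worker; stops only cooperative code
            return false;
        } catch (ExecutionException e) {
            return false; // the checklet threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Under these assumptions, a slow checklet in the style of stupid1 would simply be reported as rejecting once the limit expires, for example via `CheckletTimeout.acceptsWithin(slowCheck, 100)`. The sketch does nothing about memory consumption, which is the harder problem noted above.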

It is unclear whether there are any strong technical means to prevent attacks by overly general checklets. The same problem plagues keyword search engines such as AltaVista: to promote their own web pages, unscrupulous users will "submit pages with numerous keywords, or with keywords unrelated to the real content of the page" [1]. Currently, we require every checklet to provide a list of accepting examples, as shown in Figure 5 (e). When a checklet is uploaded, AλgoVista ensures that

(a) the checklet accepts every one of the example queries it has provided, and

(b) the checklet only accepts a small fraction of all the example queries provided for all other checklets in the coop.

While not foolproof, this policy provides a reasonable level of security.
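
The two upload checks (a) and (b) can be sketched as follows. This is an illustration under stated assumptions, not the server's implementation: queries are modelled as plain strings, a checklet as a `Predicate<String>`, and the names `UploadPolicy`, `mayJoinCoop`, and `maxForeignFraction` are hypothetical. The text only says "a small fraction", so the threshold is left as a parameter.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of the upload policy; names and types are assumptions.
final class UploadPolicy {

    /**
     * @param checklet           the uploaded checklet, modelled as a predicate on queries
     * @param ownExamples        accepting examples shipped with the checklet
     * @param otherExamples      examples provided for all other checklets in the coop
     * @param maxForeignFraction assumed threshold for "a small fraction", e.g. 0.25
     */
    static boolean mayJoinCoop(Predicate<String> checklet,
                               List<String> ownExamples,
                               List<String> otherExamples,
                               double maxForeignFraction) {
        // Assumption: a checklet without any examples is rejected outright.
        if (ownExamples.isEmpty()) {
            return false;
        }
        // (a) the checklet must accept every example it has provided
        for (String query : ownExamples) {
            if (!checklet.test(query)) {
                return false;
            }
        }
        // (b) it may accept only a small fraction of everyone else's examples
        if (otherExamples.isEmpty()) {
            return true; // nothing to compare against yet
        }
        long accepted = otherExamples.stream().filter(checklet).count();
        return (double) accepted / otherExamples.size() <= maxForeignFraction;
    }
}
```

With this sketch, an always-accepting checklet such as evil1 passes check (a) trivially but fails check (b), because it accepts all of the other checklets' examples. A checklet that is only moderately over-general could still slip under the threshold, which is why the policy is not foolproof.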
Christian S. Collberg
2000-01-27