Censorship Resistant Web Publishing Systems
Swatiben B. Shah
The Internet is a means of limitless exchange of thought and information around the world. It is a robust network built on an unwritten rule of cooperation between publisher and receiver. This property of the Internet makes security a major concern for both parties. For the receiver, security is the ability to access information on the web without suffering any harm. This is provided by current technology, which can filter unwanted content as required, and to some extent by the online publishing act, which controls the flow of published content instead of censoring it. A publisher's security, on the other hand, is not just the protection of documents from prying eyes; it also includes protection of the freedom to communicate. There has never been a universally accepted rule as to what may be published and what may not, and hence the freedom to publish can be controlled by any higher authority or system administrator, who can block or modify a publication in the name of censorship. Because it is straightforward to trace any document back to a specific web server, and usually to an individual, the publisher’s thin shell of privacy is eroded even further. Hence a publisher-friendly web publishing system should create an environment in which the publisher remains anonymous, which protects his privacy; has complete control over his documents, which makes him censorship resistant; and can produce documents that are tamper resistant and fault tolerant.
Anonymity on a distributed file system like the web can mean either hiding the identity of the individual requesting a web page or hiding the identity of the page's author from the retriever. Publisher anonymity is provided by means of a URL rewriting service. An individual who would like to keep his URL “U” anonymous submits it to such a service provider and in turn receives another URL of the form
Where Ek(U) represents URL “U” encrypted with the service provider’s public key. This new URL hides U’s true value and therefore keeps the publisher anonymous. Upon receiving a request for the encrypted URL, the service provider decrypts the encrypted part of the URL with its private key to identify the web page’s true location, retrieves the page, and sends it to the requesting client. A single level of encryption and decryption, using a single pair of public and private keys, may make it easier for an attacker to break through. A network of such service providers working together would make the system more robust. The network would consist of a set of computers, each of which runs an HTTP proxy server and possesses a public/private key pair. Each HTTP proxy server is addressable via a unique URL. An individual wishing to hide the true location of a file “F” first decides on a set of servers through which a request for the file “F” is to be routed. Using the encryption technique, each service provider encrypts the URL it gets and sends it to the next provider in the route. Each service provider knows only how to encrypt and decrypt the URL at hand, which in turn becomes the input to another service provider, and so on. No service provider knows both the original URL and the final encrypted URL, which makes the publisher untraceable.
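The layered rewriting described above can be sketched in Python. This is a toy illustration under loud assumptions: the per-byte textbook RSA with tiny primes is insecure and only stands in for a real public-key scheme, and the host names, the !enc! path marker, and the ToyProxy class are hypothetical names invented for this example rather than the format of any actual rewriting service.

```python
from __future__ import annotations

from dataclasses import dataclass

MARKER = "/!enc!/"  # hypothetical path marker for an encrypted URL


@dataclass(frozen=True)
class ToyProxy:
    """A rewriting proxy with a toy RSA key pair (tiny primes, NOT secure)."""
    host: str
    p: int
    q: int
    e: int

    @property
    def n(self) -> int:
        return self.p * self.q

    @property
    def d(self) -> int:
        # Private exponent: modular inverse of e (requires Python 3.8+).
        return pow(self.e, -1, (self.p - 1) * (self.q - 1))

    def rewrite(self, url: str) -> str:
        """Encrypt url byte by byte with the public key and wrap it."""
        blob = "".join(f"{pow(b, self.e, self.n):04x}" for b in url.encode())
        return f"http://{self.host}{MARKER}{blob}"

    def unwrap(self, blob: str) -> str:
        """Decrypt one layer with the private key."""
        nums = (int(blob[i:i + 4], 16) for i in range(0, len(blob), 4))
        return bytes(pow(c, self.d, self.n) for c in nums).decode()


def hide(url: str, route: list[ToyProxy]) -> str:
    """Add one layer per proxy; the first proxy in route ends up outermost."""
    for proxy in reversed(route):
        url = proxy.rewrite(url)
    return url


def resolve(url: str, proxies: dict[str, ToyProxy]) -> str:
    """Peel layers one at a time, each at the proxy named in the current URL."""
    while MARKER in url:
        address, blob = url.split(MARKER, 1)
        host = address[len("http://"):]
        url = proxies[host].unwrap(blob)
    return url


route = [
    ToyProxy("proxy-a.example", 61, 53, 17),
    ToyProxy("proxy-b.example", 47, 71, 79),
    ToyProxy("proxy-c.example", 89, 97, 5),
]
secret_url = "http://origin.example/report.html"
public_url = hide(secret_url, route)
resolved = resolve(public_url, {proxy.host: proxy for proxy in route})
```

In this sketch each proxy can remove only its own layer: the first proxy learns the address of the second but not the origin, and the last proxy sees the origin but not the URL that was handed out publicly.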
A censorship resistant system has to be designed with the capabilities of its adversaries in mind. A document that is available on only one server is vulnerable not only to a potential adversary but also to natural calamities and to the network connectivity provider. Replicating the data on many servers around the world is one way to overcome this problem. Like the Internet, this service will depend on the cooperation of a large number of systems whose only common element is a protocol. There will be no head office that could be coerced or corrupted, and the diversity of ownership and implementation will provide resilience against both error and attack. One way to implement this is to treat the network as a peer-to-peer network in which any server may join or leave dynamically simply by notifying an existing member, which in turn notifies the next, and so on. This strategy of notifying every other server whenever a change occurs protects the system from the possibility of an adversary joining the group and manipulating the system.
With these technologies put together, the user can publish a document while remaining anonymous and resistant to censorship.
A censorship resistant system has to provide a means of tamper detection, because adversaries can become part of the network and try to manipulate any document stored on their servers. In the worst case, the network may contain many adversaries working together to tamper with a document, and there should be a means to protect against this. Tamper detection is usually accomplished by applying some encryption to the data. A publisher encrypts the content and places a copy of it on many servers. The key used for encrypting the data must itself be protected, so it is encrypted or distributed among the servers in such a way that the complete key for decrypting the document cannot be obtained unless the user has access to a specified number of servers. That number is set based on the size of the network and its credibility. At this point a server has no idea what it is hosting. When a retriever requests a document and it is found to be untampered, the publishing system produces a special URL using a URL rewriting service that is used to recover the data and the shares. Any modification to the stored content or the URL results in a failed tamper check. If all tamper checks fail, the content cannot be read.
To have complete control over published documents, the publisher should also have a means to modify or delete a document. During the initial posting of a document to the web, the publisher generates a password or a unique ID for the document to authenticate it as his own. The servers keep a copy of this ID along with the document and verify it every time the publisher makes changes to the document.
Putting together all these techniques, technologies and algorithms, censorship resistant and tamper resistant systems have been built and are currently being deployed in some parts of the world. The Eternity Service is one of the early designs in this field, a censorship resistant system in which documents are indestructible even by the publisher. Publius, another system of the same kind, goes a step further by allowing the publisher to update and delete documents, but suffers from the drawback that the set of servers maintaining the data has to be static. Another development is Dagster, a tamper resistant system that works without replication: it “intertwines” legitimate and illegitimate data, so that a censor cannot remove objectionable content without simultaneously removing legally protected content. The Dagster system was designed to be as simple as possible, so that it can be easily scaled to numerous single-server and distributed-system models.
Case Study: Publius
Publius is a system designed for publishing content on the web that is censorship resistant and keeps the publisher anonymous. In this system, the encrypted document is stored on a subset of the Publius servers. The encryption key is split and spread across multiple servers. A retriever downloads the encrypted document from one server and key shares from multiple servers, reconstructs the key, and recovers the original document by decrypting it. Publius also allows the publisher to update and delete a document. Publius is based on the assumption that there is a system-wide list of m available, geographically dispersed servers.
Publishing a Document: When a publisher wishes to publish a document F, he first generates a symmetric key K and encrypts F with it. Then he splits K into N parts such that any P of them can reproduce the key, but any P-1 of them give no hint of K (this can be done using Shamir’s secret sharing scheme).
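The key-splitting step can be illustrated with a minimal sketch of Shamir's scheme over a prime field. The function names, the choice of the prime 2^127 - 1, and the treatment of K as an integer smaller than that prime are assumptions made for this example, not details taken from Publius.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the key K must be smaller than this


def split_key(key: int, n: int, p: int) -> list:
    """Split key into n shares so that any p of them recover it."""
    # Random polynomial of degree p-1 whose constant term is the key.
    coeffs = [key] + [secrets.randbelow(PRIME) for _ in range(p - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, n + 1)]


def recover_key(shares: list) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


key = 123456789
shares = split_key(key, n=5, p=3)
recovered = recover_key([shares[0], shares[2], shares[4]])
```

Any three of the five shares reconstruct the key here. With only two shares the interpolation still returns a number, but every candidate key remains equally consistent with those shares, which is the "no hint of K" property.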
For each of the N parts, the publisher computes
name_i = wrap(H(F . part_i))
where ‘.’ represents concatenation, H is a hash function, and wrap is the XOR of the two halves of the hash.
Then, he computes the locations of the servers that will host the document by calculating
location_i = (name_i MOD m) + 1
The publisher now uses these N values of location_i as indexes into the list of servers. On each selected server he publishes the encrypted file (stored in a file called file in a directory named name_i) along with the corresponding part of the key (stored in a file called share in the same directory).
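The name and location computation above can be sketched as follows. The text does not say which hash function H is, so SHA-256 is used here purely as an assumption, and the document bytes, key parts, and server count are placeholder values.

```python
import hashlib


def wrap(digest: bytes) -> bytes:
    """XOR the first half of a hash with the second half."""
    half = len(digest) // 2
    return bytes(a ^ b for a, b in zip(digest[:half], digest[half:]))


def name_for(document: bytes, part: bytes) -> bytes:
    """wrap(H(F . part)), with SHA-256 assumed for H."""
    return wrap(hashlib.sha256(document + part).digest())


def location_for(name: bytes, m: int) -> int:
    """(name MOD m) + 1: a 1-based index into the list of m servers."""
    return int.from_bytes(name, "big") % m + 1


document = b"example document F"
parts = [b"part-1", b"part-2", b"part-3"]  # placeholder key parts
m = 10                                     # size of the server list
names = [name_for(document, part) for part in parts]
locations = [location_for(name, m) for name in names]
```

Because a location is just the name reduced modulo m, two parts can land on the same server; the sketch makes no attempt to handle that case.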
Retrieving a Document: To retrieve a document F published in the system, the retriever must have the Publius URL, U, of F. He parses the name_i values out of U and, for each one, computes the location of the server hosting that file. Next, he selects P of these servers arbitrarily and retrieves the encrypted file and one part of K from each of them. He then combines the parts to form the key K and decrypts the file to get F.
To verify that retrieval went well, the retriever recomputes the name_i values from the F and key parts he retrieved. If all P name_i values match the ones in the URL, he can be satisfied that the document is intact. If any of the name_i values does not match, he starts over with a different set of P key parts and a different encrypted file stored on one of the other N-1 servers, until either the tamper check passes or all possible combinations are exhausted (in which case the document is irretrievable). This is Publius’ tamper-check mechanism.
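A rough sketch of the retrieve-and-verify loop described above. The combine and decrypt callables are hypothetical stand-ins for key reconstruction and symmetric decryption, SHA-256 is assumed for H, and parts[i] and encrypted_copies[i] represent whatever server i returned.

```python
import hashlib
from itertools import combinations
from typing import Callable, Optional, Sequence


def name_for(document: bytes, part: bytes) -> bytes:
    """wrap(H(F . part)) with SHA-256 assumed for H."""
    digest = hashlib.sha256(document + part).digest()
    half = len(digest) // 2
    return bytes(a ^ b for a, b in zip(digest[:half], digest[half:]))


def retrieve(
    url_names: Sequence[bytes],         # names parsed from the Publius URL
    parts: Sequence[bytes],             # key part returned by each server
    encrypted_copies: Sequence[bytes],  # encrypted file returned by each server
    p: int,                             # number of parts needed to rebuild K
    combine: Callable,                  # hypothetical: list of parts -> key
    decrypt: Callable,                  # hypothetical: (ciphertext, key) -> bytes
) -> Optional[bytes]:
    """Try sets of p servers until one passes the tamper check."""
    for chosen in combinations(range(len(parts)), p):
        key = combine([parts[i] for i in chosen])
        for i in chosen:
            document = decrypt(encrypted_copies[i], key)
            # Tamper check: every recomputed name must match the URL.
            if all(name_for(document, parts[j]) == url_names[j] for j in chosen):
                return document
    return None  # all combinations exhausted: the document is irretrievable
```

The loop simply enumerates combinations in order. A real client would more likely fetch lazily and stop contacting servers as soon as one combination verifies.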
Deleting a Document: The system allows the publisher to delete a file he published from all the servers in the system, and it ensures that no one can remove files published by others. To achieve this, just before publishing a file the publisher generates a password, PW. He then sends the encrypted document, the key part, and H(server domain name . PW) to each server that will host the document. To delete the document, the publisher sends H(server domain name . PW) to each hosting server. The server compares the value received with the one stored and, if they match, removes the file.
At the time of publication, the publisher has an option of specifying that a document should be undeletable. This causes all delete and update requests to fail. This mechanism prevents an attacker from trying to delete the file via the Publius Delete protocol.
Updating a Document: The system also allows a publisher to update a file he has published, in such a way that the URL of the file does not change. When a publisher wants to update his file, he specifies a file containing the new content, the original URL, the original password PW, and a new password. The update program first simply publishes the new content, which yields a new URL. Next, the original URL is used to find the N servers hosting the original content. Each of these servers receives the original password and the new URL, compares the password with the stored one and, if they match, creates a file called update in the document's directory and places the new URL in it. When a retriever tries to access a URL and the update file is missing, the document is retrieved as described before; if the update file exists, the servers return the update URL instead.
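The server-side delete and update rules can be sketched with a small in-memory model. ToyServer, its method names, and the dictionary layout are hypothetical, and SHA-256 is assumed for H; the sketch only mirrors the checks described above.

```python
import hashlib
import hmac


def password_token(server_domain: str, password: str) -> bytes:
    """H(server domain name . PW), with SHA-256 assumed for H."""
    return hashlib.sha256((server_domain + password).encode()).digest()


class ToyServer:
    """In-memory stand-in for one hosting server (illustration only)."""

    def __init__(self, domain: str) -> None:
        self.domain = domain
        self.entries = {}  # name -> stored record

    def publish(self, name, encrypted_file, key_part, token, undeletable=False):
        self.entries[name] = {
            "file": encrypted_file, "share": key_part,
            "token": token, "undeletable": undeletable, "update": None,
        }

    def _authorized(self, name, token) -> bool:
        entry = self.entries.get(name)
        if entry is None or entry["undeletable"]:
            return False  # undeletable documents reject delete and update
        return hmac.compare_digest(entry["token"], token)

    def delete(self, name, token) -> bool:
        if not self._authorized(name, token):
            return False
        del self.entries[name]
        return True

    def update(self, name, token, new_url) -> bool:
        if not self._authorized(name, token):
            return False
        self.entries[name]["update"] = new_url  # plays the role of the update file
        return True

    def retrieve(self, name):
        entry = self.entries[name]
        if entry["update"] is not None:
            return ("update", entry["update"])
        return ("file", entry["file"], entry["share"])


server = ToyServer("server1.example")
token = password_token(server.domain, "PW")
server.publish("name-1", b"ciphertext", b"key part", token)
```

Since the stored value depends on the server's own domain name, each server holds a different value for the same PW, so a value seen by one server would presumably not be accepted by another.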
There are many software components that implement the Publius system on top of HTTP.
Publius URLs: Each successfully published document in the system is assigned a Publius URL, which has the following form
http://!publius!/options/encode(name_1) … encode(name_n)
where the encode function generates an ASCII representation of the name_i value.
The options section encodes four fields: the Publius version number, the number of parts needed to form the key, the size of the server list, and the update flag. The update flag determines whether or not the update operation can be performed on the document associated with this URL. The initial !publius! flag distinguishes a Publius URL from other “normal” URLs.
Server Software: A server just needs to install a CGI script available from the Publius website to participate as a Publius server. The client software communicates with the server by executing an HTTP POST operation on the server’s CGI URL, specifying the operation (publish, retrieve, update or delete). The server performs the specified operation and sends a status code back to the client.
Client Software: The client software is an HTTP proxy that implements all the Publius operations. When it encounters a non-Publius URL, it transparently forwards the request to the appropriate server and passes the resulting content back to the browser. When it encounters a Publius URL, it performs the retrieve operation along with error checking and sends the appropriate response back to the browser.
Limitations of Publius
Publius, in its current form, has many limitations concerning its content as well as extent of security provided.
Content Type: Publius supports only static content such as HTML, PDF, images, and PostScript. There is no support for interactive scripting such as CGI, and only limited support for Java applets.
Key-part Deletion or Corruption: If all N copies of the encrypted file are deleted, corrupted or otherwise irretrievable, then it is impossible to recover the original document. Similarly, if N-P+1 key-parts are deleted, corrupted, or cannot be retrieved, it is impossible to recover the key. In either case, the published document is effectively censored.
Redirection Attack: If several malicious server administrators succeed in inserting an update file in a large number of servers (more than P servers) for a given file F, then the retrieve request for F will return the file inserted by the malicious administrators rather than the original document, effectively censoring the document.
During the publication process, the publisher has the option of declaring a URL as updateable. When a client attempts to retrieve content from a non-updateable URL, all update URLs are ignored. This can defeat the redirection attack.
Denial of Service Attack: An adversary could use the system to publish content until the disk space on all servers is exhausted, resulting in a denial of service attack on genuine publishers. A simple countermeasure is to limit each publishing command to 100K, or to charge for space using an anonymous e-cash system.
Threats to Publisher Anonymity: The system does not anonymize all hyperlinks in a published HTML file. Thus, if a published document contains hyperlinks back to the publisher’s Web server, the publisher’s anonymity could be in danger. Also, there is no provision for connection-based anonymity. So an adversary eavesdropping on the network segment between the publisher and the Publius servers could easily determine the publisher’s identity.
Pros and Cons
Censorship resistant systems can be a useful tool in preventing denial of service attacks, like those that shut down Yahoo and eBay earlier this year. They can also be used as a corporate backup system: storing the backup data in different parts of the world makes it practically indestructible, unlike a backup tape in a safe, which is susceptible to natural calamities. These systems, which are supposed to help individuals voice their opinions and to protect democratic values, may also turn out to be a means for criminal activity. Child pornography could be displayed brazenly. Copyrighted material could be republished openly and anonymously. Creating a system that grants freedom of speech and expression, yet works only for things that are universally accepted, is contradictory and impossible. So users have to decide whether protection against privacy and human rights violations is an acceptable trade-off for opening a door to pirated songs and software and other illegal material being left uncensored.
References
1. The Online Cooperative Publishing Act (SafeSurf's Proposal for a Safe Internet Without Censorship).
2. W. Wayt Gibbs, Speech without Accountability, Oct. 2000.
3. Ross J. Anderson, The Eternity Service.
4. Lorrie Cranor, Avi Rubin, Marc Waldman, Publius Censorship Resistant Publishing System, 2000, http://publius.cdt.org/
5. Adi Shamir, How to Share a Secret, http://szabo.best.vwh.net/secret.html