Wednesday, June 15, 2011

What is GSS?

Online file storage is a very hot buzzword; everybody does it or wants to do it. If you add the 'cloud' keyword to the description, the hotness goes way up. We describe the GSS project as the open source cloud file storage platform - can't get hotter than that - but it seems that this description is perceived in many different ways by different people.


So what exactly is GSS?

Let's start with what GSS is not. It is not a distributed file system and it is not a block level storage solution.

A single-sentence definition could be: GSS is a distributed application that implements a file system abstraction with full text search and rich semantics, and supports access via a REST API.

In more than one sentence, GSS:

  • is an application level layer that can scale to multiple servers (for handling fluctuating loads),
  • handles users with unique file name-spaces that contain folders and files, a trash bin, tags, versions, permissions for sharing with users and groups of users, and a powerful full text search service,
  • stores meta-data in a back-end database (we run an RDBMS in production installations and we are implementing NoSQL architectures for truly massive scalability - the source repo already contains a prototype MongoDB branch),
  • stores file bodies in third-party file storage systems, which GSS treats as a black box. For example, (a) in the Pithos service installation, file storage is based on a large scale, redundant, hardware based SAN which the GSS workers access as a mounted file system, while (b) MyNetworkFolders uses Amazon S3 via the AWS API.
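
To make the split between meta-data and file bodies a bit more concrete, here is a minimal sketch in Python. It is not taken from the GSS code base: every name in it (BlobStore, MountedDirStore, InMemoryObjectStore, FileMeta, FileService) is hypothetical, the dictionary only stands in for the back-end database, and the in-memory store only stands in for a remote object store such as S3. The point is just that the application layer keeps the records and hands the bytes to whatever storage sits behind a small interface.

```python
import hashlib
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Protocol, Tuple


class BlobStore(Protocol):
    """Hypothetical interface: opaque storage for file bodies."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class MountedDirStore:
    """File bodies kept on a mounted file system (for example a SAN mount)."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def put(self, key: str, data: bytes) -> None:
        (self.root / key).write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


class InMemoryObjectStore:
    """Stand-in for a remote object store that is reached through an API."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


@dataclass
class FileMeta:
    """Meta-data record; a real system would keep this in a database."""

    owner: str
    path: str
    blob_key: str
    size: int


class FileService:
    """Application layer: meta-data in one place, bodies in a black box."""

    def __init__(self, store: BlobStore) -> None:
        self.store = store
        self.meta: Dict[Tuple[str, str], FileMeta] = {}

    def upload(self, owner: str, path: str, data: bytes) -> FileMeta:
        # Derive an opaque key so the store never sees user-visible names.
        key = hashlib.sha256(f"{owner}:{path}".encode()).hexdigest()
        self.store.put(key, data)
        record = FileMeta(owner, path, key, len(data))
        self.meta[(owner, path)] = record
        return record

    def download(self, owner: str, path: str) -> bytes:
        return self.store.get(self.meta[(owner, path)].blob_key)


# The same service code runs unchanged against either kind of storage.
san_like = FileService(MountedDirStore(Path(tempfile.mkdtemp())))
s3_like = FileService(InMemoryObjectStore())
for service in (san_like, s3_like):
    service.upload("alice", "/docs/notes.txt", b"hello GSS")
```

Swapping the storage system then means writing one more class with put and get; the code that handles users, folders and meta-data does not change.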

I see, but how is it useful to me?

GSS can be useful in various scenarios:
  1. Out of the box, you can easily set up a service that offers an online "file manager" interface to groups of users (within your enterprise or organization, for your customers, or across enterprises). Users can access the service via the GSS web client, Android / iPhone client apps, WebDAV, the Firefox plugin / XUL application, or your custom clients (via the REST API).
  2. You can use the GSS server as is, as the server-side platform that supports the file storage and handling requirements of your custom mobile, desktop or web applications (via the REST API).
  3. You can extend the GSS back-end or just one of the clients to build a new application that implements your own specific requirements.
We are keeping the GSS server generic so that it can serve as a core building block in at least the three cases discussed above, although we plan to add new functionality, mostly related to file-based collaboration. Along these lines, our high level roadmap priorities are (priority decreases from top to bottom):
  • Improve performance, resource utilization efficiency and scalability. Plans to move to a NoSQL back-end and to make efficient use of distributed second-level caching (e.g. Infinispan) fall into this category.
  • Simplify installation and out-of-the-box support in different environments. Currently the code base requires a Shibboleth infrastructure for user authentication. It is a priority to add LDAP-based and simple DB-based login mechanisms and a user admin back-office application.
  • Implement v2 of the API, aiming for simplicity, better usability and performance.
  • Add features and new functionality.
