Saturday, June 04, 2011

Publish Maven site with Amazon S3 and CloudFront

Amazon S3 now supports static website hosting. As a Maven user of ten years, I wondered how easy it would be to deploy a Maven-generated site to Amazon S3 and let the rock-solid storage provider host my project websites.

There are several existing S3 wagon providers, but they all seem to share the same problem: none of them supports directory copy. This is understandable, since before S3's new website hosting feature people mostly expected to deploy artifacts, not websites, to S3. So my first task was to write an AWS S3 wagon that supports directory copy.

With the AWS Java SDK, the task turned out to be as simple as a single class. I made my S3 wagon available in the Maven central repository as org.cyclopsgroup:awss3-maven-wagon:0.1. The source code is hosted at github:jiaqi/cym2/awss3.
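To make "directory copy" concrete, here is a minimal sketch of the bookkeeping it involves: walking a local site directory and working out the S3 object key for every file. This is not the wagon's actual source. The class name SiteUploadPlan, the default directory target/site and the key prefix are assumptions made for illustration, and the upload itself is left out (with the AWS Java SDK, each entry would be handed to a call such as AmazonS3.putObject(bucket, key, file)).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of awss3-maven-wagon.
class SiteUploadPlan {
    // Maps every regular file under siteDir to the object key it would get in the bucket.
    static Map<String, Path> plan(Path siteDir, String keyPrefix) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(siteDir)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        // Drop trailing slashes so the joined key never contains "//".
        String prefix = keyPrefix.replaceAll("/+$", "");
        Map<String, Path> uploads = new TreeMap<>();
        for (Path file : files) {
            StringJoiner key = new StringJoiner("/");
            if (!prefix.isEmpty()) {
                key.add(prefix);
            }
            // S3 keys always use '/', whatever the local file separator is.
            for (Path part : siteDir.relativize(file)) {
                key.add(part.toString());
            }
            uploads.put(key.toString(), file);
        }
        return uploads;
    }

    public static void main(String[] args) throws IOException {
        Path siteDir = Paths.get(args.length > 0 ? args[0] : "target/site");
        String prefix = args.length > 1 ? args[1] : "project/path";
        if (!Files.isDirectory(siteDir)) {
            System.err.println("Not a directory: " + siteDir);
            return;
        }
        plan(siteDir, prefix).forEach((key, file) -> System.out.println(key + " <- " + file));
    }
}
```

Since S3 has no real directories, "copying a directory" simply means creating one object per file, with the relative path folded into the key.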

The next step is to create an S3 bucket in the AWS console. To avoid trouble, I set the bucket name to the future website domain name, according to this discussion. The website feature needs to be explicitly enabled on the bucket. I also created an IAM account with limited permissions just for website management.

Configuring a project to use this wagon works pretty much the same way as with the other S3 wagons. Add the awss3-maven-wagon extension to the build section of the POM:
<extensions>
  <extension>
    <groupId>org.cyclopsgroup</groupId>
    <artifactId>awss3-maven-wagon</artifactId>
    <version>0.1</version>
  </extension>
</extensions>

Set the site distribution management to use s3 as the communication protocol:
<distributionManagement>
  <site>
    <id>my-server-id</id>
    <url>s3://my-s3-bucket/project/path</url>
  </site>
</distributionManagement>
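As a rough illustration of how such a URL can be read, the sketch below splits it into a bucket name and a key prefix. It assumes that the authority part of the URL is the bucket and the path is the prefix under which the site is stored; the class S3SiteUrl is hypothetical and is not taken from the wagon's source.

```java
import java.net.URI;

// Hypothetical value class for illustration only.
class S3SiteUrl {
    final String bucket;
    final String keyPrefix;

    S3SiteUrl(String bucket, String keyPrefix) {
        this.bucket = bucket;
        this.keyPrefix = keyPrefix;
    }

    // Assumption: s3://<bucket>/<key prefix>
    static S3SiteUrl parse(String url) {
        URI uri = URI.create(url);
        if (!"s3".equals(uri.getScheme()) || uri.getAuthority() == null) {
            throw new IllegalArgumentException("Expected s3://<bucket>/<path>, got " + url);
        }
        String path = uri.getPath() == null ? "" : uri.getPath();
        // Object keys do not start with '/', so strip leading and trailing slashes.
        String prefix = path.replaceAll("^/+|/+$", "");
        return new S3SiteUrl(uri.getAuthority(), prefix);
    }

    public static void main(String[] args) {
        S3SiteUrl site = parse("s3://my-s3-bucket/project/path");
        System.out.println("bucket=" + site.bucket + ", prefix=" + site.keyPrefix);
    }
}
```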

Then configure the AWS credentials in settings.xml, using the same server id:
<settings>
  <servers>
    <server>
      <id>my-server-id</id>
      <username>AWS_ACCESS_KEY_ID</username>
      <password>AWS_SECRET_KEY</password>
    </server>
    ......

Now, after running the site:deploy goal, the entire site is uploaded to the S3 bucket. The website is immediately available under the default domain name http://<bucket name>.s3-website-us-east-1.amazonaws.com. S3 does not let me configure which CNAME a bucket should answer to. This is why the bucket name needs to match the domain name, as I plan to create a friendly CNAME under my own domain. Obviously my bucket shares the same IP addresses in the Amazon cloud with other buckets. Without explicit configuration, the only way for S3 to figure out which bucket should serve a request is to match the requested host name against the bucket name.
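The matching rule can be pictured with a small model. This is only my illustration of the behaviour described above, not Amazon's implementation; the class BucketFromHost and the domain www.example.com are made up for the example.

```java
import java.util.Locale;

// Illustrative model only: how a shared website endpoint could pick a bucket from the Host header.
class BucketFromHost {
    static final String WEBSITE_SUFFIX = ".s3-website-us-east-1.amazonaws.com";

    static String bucketFor(String hostHeader) {
        String host = hostHeader.toLowerCase(Locale.ROOT);
        // Ignore an optional ":port" part.
        int colon = host.indexOf(':');
        if (colon >= 0) {
            host = host.substring(0, colon);
        }
        // Default domain name: the bucket is whatever precedes the endpoint suffix.
        if (host.endsWith(WEBSITE_SUFFIX)) {
            return host.substring(0, host.length() - WEBSITE_SUFFIX.length());
        }
        // Custom CNAME: nothing else to go on, so the host name itself must be the bucket name.
        return host;
    }

    public static void main(String[] args) {
        System.out.println(bucketFor("www.example.com" + WEBSITE_SUFFIX));
        System.out.println(bucketFor("www.example.com"));
    }
}
```

Both requests resolve to the same bucket only if the bucket is named exactly like the domain, which is why the naming matters.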

The last piece is CloudFront. There is no reason not to take advantage of it: it's easy to set up, and it accepts CNAME configuration (unlike an S3 website).

One problem with CloudFront is that, since it is not designed to work as a website, a request with a path like http://mysite.com/a/dir is not mapped to the content a/dir/index.html. S3 has no concept of directories: a/dir/index.html and a/dir can be two different objects that coexist in the same bucket, so a request for http://mysite.com/a/dir is mapped to the object a/dir rather than a/dir/index.html. The problem does not exist with S3 static website hosting, but it appears once I hook CloudFront up to the S3 bucket, since that hookup has nothing to do with the website feature anymore. In the end, I had to create an object a/dir containing an HTML redirect page that points to a/dir/index.html to work around the problem.
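For reference, a redirect page of this kind can be generated with a few lines of Java. This is a sketch of the idea rather than the exact page I uploaded; the class RedirectPage and the use of a meta refresh tag are assumptions made for the example.

```java
// Hypothetical generator for the "directory" redirect objects.
class RedirectPage {
    // For the object key "a/dir", the redirect target is "/a/dir/index.html".
    static String targetFor(String dirKey) {
        return "/" + dirKey.replaceAll("^/+|/+$", "") + "/index.html";
    }

    // Minimal escaping so the target is safe inside an HTML attribute.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("\"", "&quot;")
                .replace("<", "&lt;").replace(">", "&gt;");
    }

    static String html(String dirKey) {
        String target = escape(targetFor(dirKey));
        return "<!DOCTYPE html>\n"
            + "<html>\n"
            + "<head>\n"
            + "<meta http-equiv=\"refresh\" content=\"0; url=" + target + "\">\n"
            + "<title>Redirecting</title>\n"
            + "</head>\n"
            + "<body>\n"
            + "<a href=\"" + target + "\">" + target + "</a>\n"
            + "</body>\n"
            + "</html>\n";
    }

    public static void main(String[] args) {
        // The output would be stored as the object "a/dir".
        System.out.print(html("a/dir"));
    }
}
```

The object has to be stored with the content type text/html, otherwise the browser will not interpret it as a page.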