ITsyndicate’s Linux system administrators and AWS specialists, with more than 10 years of experience, have successfully completed numerous projects building AWS environments. Our team would like to share its best practices for implementing scaling and load balancing.
For a WordPress installation in an AWS farm, a load balancer combined with the standard instance scaling features can be used to achieve a properly scalable environment. This setup lets AWS automatically add instances serving WordPress as demand increases and, conversely, reduce costs by removing instances when the volume of use is lower.
The setup uses various AWS features, which are discussed separately below. As in the world generally, there are many ways to do the same thing, and so it is with load-balanced web application architectures too; however, the fundamental set of challenges remains the same. When discussing scalable architecture, the cardinal issue is that one is trying to multiply something which is, by definition, a single thing. One aims to create multiple instances of one application, so that they are more than one, yet still remain one. Complex, isn’t it? Luckily, no one needs to solve the paradox of how one can be two and two can be one. The AWS infrastructure does the magic of providing multiple instances of the one, so that each of them is separate, yet still the same in the sense that every node hosts the very same application.
The key requirement when creating a scalable architecture is real-time synchronisation between the separate nodes. Because any change is instantly replicated to all of the nodes, they can act as one even while being separate. Scalability is such a valuable asset that it easily outweighs the small performance cost of the real-time replication.
The outermost edge of any application in the networked world is the domain name system. It is the very first contact point a user hits when working with an application. The Amazon Route 53 name service provides a rich set of domain registration and DNS features for demanding customers. One of the most favoured features of Route 53 is the ability to distribute users based on their assumed geographical location: web users coming from a known country-specific network block can be routed to dedicated application instances, for example one running nearby.
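Conceptually, a geolocation routing policy is a lookup from the client’s country to an endpoint, with a default record as a fallback. The sketch below is a simplified stand-in for what Route 53 evaluates on the AWS side; the endpoint names are invented purely for illustration:

```python
# Simplified sketch of geolocation-based DNS routing, the policy
# Route 53 applies on AWS's side. Endpoint names are hypothetical.

GEO_ENDPOINTS = {
    "DE": "eu-frontend.example.com",   # German users -> EU node
    "FR": "eu-frontend.example.com",
    "US": "us-frontend.example.com",
}
DEFAULT_ENDPOINT = "global-frontend.example.com"

def resolve(country_code: str) -> str:
    """Return the endpoint a client in the given country would receive."""
    return GEO_ENDPOINTS.get(country_code, DEFAULT_ENDPOINT)
```

A client from an unmapped country simply falls through to the default record, which is also how Route 53 behaves when a geolocation record set includes a default location.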
Once the client has resolved the domain name of the service to a network address, the next point of contact is a load balancer. This one, or several of them, distributes the traffic to a cluster of shared instances (nodes). Typically the traffic at this level is distributed randomly to achieve equal load across the nodes. Alternative distribution schemes can be used, e.g. to drain one node for maintenance, or to prioritise nodes with features distinct within the cluster.
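The two basic distribution schemes, random choice and even rotation, can be sketched in a few lines; the node names below are placeholders, not real instance identifiers:

```python
import itertools
import random

# Placeholder pool standing in for the load balancer's registered nodes.
NODES = ["node-a", "node-b", "node-c"]

_rotation = itertools.cycle(NODES)

def pick_round_robin() -> str:
    """Even rotation: each node receives every third request."""
    return next(_rotation)

def pick_random() -> str:
    """Random choice: per-node load equalises over many requests."""
    return random.choice(NODES)
```

Draining a node for maintenance then amounts to removing it from the pool, after which no new requests reach it while the remaining nodes absorb the traffic.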
The Amazon infrastructure provides solid load balancing features, as IaaS (Infrastructure as a Service), well suited to a WordPress installation. In addition to distributing the load, a load balancer also improves uptime, as nodes can be added and removed seamlessly and in a managed fashion. Potential downtime thus affects only a portion of the users while the instance pool is being reorganised.
The core of the scalable WordPress installation is the auto-scaled Amazon EC2 instance. It provides the basic worker node behind the load balancer and keeps in touch with the other nodes to synchronise any updates to the data and the application across all of them. Done this way, there is virtually no limit to the number of instances hosting the application. As we know, only the sky is the limit, and since we are already up in the cloud, it is easy to see the vastness of the space.
To share files between the instances, a common shared file system is required; our team uses NFS. Essential for a file system cluster is adequate performance and availability for all of the nodes. As application components, e.g. PHP scripts, are read from the file system frequently, a slow file share can become the bottleneck of an otherwise well-performing architecture. In any case, it is essential that a recent and identical set of files is mounted on all instances. The file system may also host user-generated data, such as images and attachments, so a file missing from some of the nodes would cause severe issues for a user who happens to request that resource from that specific node.
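One practical way to check the “recent and identical set of files” requirement is to compute a digest over each mount and compare the results across nodes. This is a minimal sketch, assuming each node runs it against its own local NFS mount point; it is a verification aid, not part of NFS itself:

```python
import hashlib
from pathlib import Path

def tree_digest(root: str) -> str:
    """Hash every file under `root` (relative path plus contents) so that
    two mounts of the same NFS export yield the same digest, and any
    missing or diverging file yields a different one."""
    digest = hashlib.sha256()
    base = Path(root)
    for path in sorted(base.rglob("*")):   # sorted -> deterministic order
        if path.is_file():
            digest.update(str(path.relative_to(base)).encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()
```

If the digests from two nodes differ, one of them is serving stale or missing files and should be taken out of the pool until its mount is consistent again.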
An effective web application needs a good caching solution to ensure proper performance and user experience. The industry-wide approach is to use either Memcached or Redis, both of which are found in the AWS portfolio.
A cache is essential for any high-volume WordPress installation. It provides a fast, modern key-value store available across the instance nodes. With the AWS cache solutions, a fully managed Memcached instance can be provided for the WordPress worker nodes. With the shared cache, all nodes are able to instantaneously retrieve the same information.
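In production the worker nodes would talk to the shared Memcached endpoint through a client library; the in-process stand-in below only sketches the semantics the application relies on, a shared key-value store with per-key expiry:

```python
import time

class SharedCache:
    """In-process stand-in for a shared Memcached endpoint: a key-value
    store with per-key time-to-live, used identically by every node."""

    def __init__(self):
        self._store = {}                      # key -> (value, expires_at)

    def set(self, key, value, ttl=60):
        """Store a value that expires after `ttl` seconds."""
        self._store[key] = (value, time.monotonic() + ttl)

    def get(self, key):
        """Return the cached value, or None if absent or expired."""
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]              # lazy eviction on read
            return None
        return value
```

Because every node reads and writes through the same endpoint, an expensive result computed by one node, say a rendered page fragment, is immediately available to all of the others.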
For a networked, distributed application, a cache is as important as it is for any locally run native software. The cache provides fast access because it is held in memory, in contrast to a database or disk, and all the instance nodes benefit from the same cache contents.
Approaching the essential component of the application, the database, leads us to the final part of the EC2 setup: a replicated MySQL instance. Since the database is the core storage of all content in a web application, in our case WordPress, it is essential that all the nodes see the same data and that the integrity of the data is never compromised.
Databases usually grow in size over time, and complex data structures require a fine-tuned approach. In real-life web application scenarios, a large portion of the database queries read rather than write. For that reason, MySQL and many other popular database clusters were designed around the concept of multiple read-only nodes and a single write node. While this architecture provides virtually unlimited scalability for read-only transactions, it cannot serve more than one write transaction at a time.
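The single-writer / many-readers split described above can be sketched as a small query router; the connection names here are placeholders for real database handles:

```python
import itertools

class QueryRouter:
    """Sketch of read/write splitting: writes go to the single primary,
    reads rotate across the replicas. Names are placeholder handles."""

    WRITE_PREFIXES = ("INSERT", "UPDATE", "DELETE", "REPLACE")

    def __init__(self, writer, readers):
        self.writer = writer
        self._readers = itertools.cycle(readers)

    def route(self, sql: str):
        """Pick the connection a statement should be sent to."""
        if sql.lstrip().upper().startswith(self.WRITE_PREFIXES):
            return self.writer
        return next(self._readers)
```

Adding a replica then scales read capacity linearly, while every write still funnels through the one primary, which is exactly the bottleneck the master-master scheme below is meant to relieve.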
For the WordPress architecture in the AWS cloud, an advanced, modern master-master replication architecture has been chosen. This approach is familiar from modern NoSQL databases, where all nodes share the same dataset but, due to the lack of transactions, the problem of concurrent writes does not surface. Using master-master replication in the AWS hosting architecture for WordPress, both nodes can act as masters without ending up in a transactional deadlock where both parties wait for the same lock to be released.
Essentially, the question is one of distributed transactions, for which all mature database engines provide configurable mechanisms. The inherent problem is the same as, if not harder than, the one discussed for scaling the application instances. With proper use of SQL-level functionality, the application can be sure that information written to the replicated database is available on the other nodes at the proper moment in time. However, as many nodes can write at the same time, independently, in theory the possibility remains that updates conflict and one of them must give way. This is solved at the database level by ensuring that writes are applied in a sequential order, even when they occur across a distributed cluster of independent nodes.
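The sequential ordering of distributed writes can be illustrated with a toy conflict-resolution function: writes from both masters carry a global sequence number, and applying them in that single order makes every node converge on the same final state. The tuple layout is invented for illustration and greatly simplifies what real replication protocols do:

```python
def serialize_writes(writes):
    """Apply concurrent writes from several masters in one global order.

    `writes` is a list of (seq, origin_node, key, value) tuples; sorting
    by (seq, origin_node) gives every node the same sequential history,
    so later sequence numbers win any conflict on the same key.
    """
    state = {}
    for seq, origin, key, value in sorted(writes):
        state[key] = value
    return state
```

Whichever order the tuples arrive in at a given node, the sort yields the same serial history everywhere, which is the property the previous paragraph describes at the database level.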
The chosen master-master replication scheme ensures that the same committed data is readable on both of the nodes and that no dirty data is distributed, in effect implementing a fully serializable distributed transactional environment. In layman’s terms, the architecture ensures that only changes which could have occurred in reality in some sequential order, even when made independently of each other, are committed to the database.
The final component in the architecture is the delivery of the static files the application needs. This is provided by the Amazon CloudFront content delivery service, which enables the efficient delivery of static media files and other content embedded in the WordPress pages. Amazon CloudFront has been designed from the ground up to face the challenges of global businesses.