Citrix ADC on AWS – Deploying HA Setup

Reading Time: 9 minutes

Hello everybody, and sorry for being a little quiet on the blog recently, but COVID-19 happened and there is still a ton of work to do. I guess you can feel me, because we are all sitting in the same boat 🙂 At the beginning of this year I jumped into a cool project where I had to migrate a customer's F5 ICA proxy to a Citrix ADC high availability pair on AWS. I had done this before on Microsoft Azure and learned the hard way that things work differently in a public cloud. Shoutout to Daniel Weppeler & Ben Splittberger: that was some "fun" until everything was running 😉

I expected that building the HA setup on AWS with the knowledge I had collected during the Azure deployment would be easier, but unfortunately everything is different on AWS, and it really took me some time to put the pieces together. While writing this blog post we have already migrated all users to the Citrix Gateway running on AWS (across two zones), and everything is running smoothly, fail-over included.

I guess some readers are now thinking: "Why the hell are you doing high availability in a public cloud? It's so much easier to do GSLB across two zones!" Well, that was out of the question, because the customer had already bought the VPX licenses (Standard > no GSLB), and in addition they didn't want to maintain two dedicated appliances whose configuration is not synchronized across both nodes. Alright, let's get started on how to deploy HA, which is a no-brainer for on-premises deployments but so much different when running your instances on AWS.

Overview

If we take a look at the architecture diagram from the Citrix documentation, we can see that the VPX instance has three network interfaces attached. Don't try a one-arm deployment, it will not work! Many people like one-arm configurations because they are the easiest and you don't need to think about routing or maybe even make use of PBR. On AWS, however, a three-arm configuration is the recommended way. This means we need to create the following networks and attach them to the instance:

1.) Management
2.) VIP (Client Facing)
3.) SNIP (Backend Communication)

From a technical perspective it would be enough to work with two interfaces (Management & Data Traffic), but it is recommended to segregate public and private networks. I never tried the two-arm option and focused on the three-arm configuration. In addition, it is important to understand that there is a huge difference between:

a.) deploy high availability in the same zone
b.) deploy high availability across different zones

In this post we are going to focus on option b. Keep in mind that before firmware 13.0 build 41.x, fail-over was handled differently, with an ENI (Elastic Network Interface) migration on the AWS side. Make sure you are running a newer build, because this method is deprecated. The ENI is nothing more than a network interface attached to the VM; just picture a vNIC on VMware or any other hypervisor. If you are considering working with private IP addresses (VIP), you need firmware 13.0 build 67.39 or later. Don't get confused yet, more about that later.

Ways of Deployment

Before we can start configuring the ADC, we need to provision the instances in our AWS VPC. If you have never heard of a VPC, it stands for "Virtual Private Cloud" and is a logically isolated section where you can run your virtual machines. Let's assume our VPC is located in the segment "10.161.69.0/24". This means we can have subnets in that IP address range. Since we strive for a deployment across two zones, we need the following subnets in our VPC.

Name                              | IPv4 CIDR        | IP Range                      | Description
Management (Private-A)            | 10.161.69.32/28  | 10.161.69.33 – 10.161.69.46   | NSIP
Services (Public-A)               | 10.161.69.0/27   | 10.161.69.1 – 10.161.69.30    | VIPs
Transfer (Private-A)              | 10.161.69.48/28  | 10.161.69.49 – 10.161.69.62   | SNIP
Citrix Infrastructure (Private-A) | 10.161.69.64/26  | 10.161.69.65 – 10.161.69.126  | VDAs
Management (Private-B)            | 10.161.69.160/28 | 10.161.69.161 – 10.161.69.174 | NSIP
Services (Public-B)               | 10.161.69.128/27 | 10.161.69.129 – 10.161.69.158 | VIPs
Transfer (Private-B)              | 10.161.69.176/28 | 10.161.69.177 – 10.161.69.190 | SNIP
Citrix Infrastructure (Private-B) | 10.161.69.192/26 | 10.161.69.193 – 10.161.69.254 | VDAs
Please be aware that you cannot use the first four IP addresses and the last IP address of each subnet, because they are reserved by AWS!

Regarding the provisioning of the instances you have several options; the most common ones are probably a manual deployment from the AWS Marketplace AMI and the CloudFormation templates provided by Citrix.

I am not going into detail about the provisioning process itself, but if your goal is to deploy an HA pair across two AWS zones you need to work with HA INC mode. If you have never heard of INC mode, let me give you a short summary: if your NSIPs are located in different subnets, you need to enable INC (Independent Network Configuration) mode when creating the high availability setup. With INC mode enabled, not everything in the configuration is synchronized between the two nodes; exceptions are for example:

  • Subnet IPs
  • Routes
  • VLANS
  • Route Monitors

Since our primary and secondary nodes are hosted in different availability zones (i.e. different data centers), you need to work with INC. When building an HA pair in a single AWS zone you will not need INC. Just make sure that you do not mix up the order of the instance interfaces.

Working

VPX Instance #1

eth1: Management (NSIP)
eth2: Frontend (VIP)
eth3: Backend (SNIP)

VPX Instance #2

eth1: Management (NSIP)
eth2: Frontend (VIP)
eth3: Backend (SNIP)

Not-Working

VPX Instance #1

eth1: Management (NSIP)
eth2: Frontend (VIP)
eth3: Backend (SNIP)

VPX Instance #2

eth1: Management (NSIP)
eth2: Backend (SNIP)
eth3: Frontend (VIP)

Prerequisites

  • Make sure that you create an IAM role with the needed permissions (a sketch of such a policy follows right after this list). This is mandatory; without it the fail-over in AWS will simply not work. Attach the created role to both ADC instances.
  • The NSIP address of each instance must be configured on the default primary ENI (Elastic Network Interface).
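
To give you an idea of what that role contains, here is a rough sketch of the kind of policy I mean, built only from the permissions referenced throughout this post and its comments (EIP re-association, secondary private IPs, route updates, source/dest check). Treat the exact permission list as an assumption and verify it against the Citrix documentation for your firmware:

    # sketch only - the authoritative permission list is in the Citrix docs
    aws iam create-policy --policy-name citrix-adc-ha-failover --policy-document '{
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Action": [
          "ec2:DescribeInstances",
          "ec2:DescribeNetworkInterfaces",
          "ec2:DescribeRouteTables",
          "ec2:AssociateAddress",
          "ec2:DisassociateAddress",
          "ec2:AssignPrivateIpAddresses",
          "ec2:UnassignPrivateIpAddresses",
          "ec2:ModifyNetworkInterfaceAttribute",
          "ec2:CreateRoute",
          "ec2:DeleteRoute"
        ],
        "Resource": "*"
      }]
    }'
    # afterwards create a role for the EC2 service, attach this policy,
    # and attach the role to both ADC instances as an instance profile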

High Availability with Elastic IP Addresses

You will need to work with Elastic IP addresses (EIP) if you want to publish a VIP to the Internet. In our case this is the Citrix Gateway vServer, which should be reachable from any location around the globe. An Elastic IP address is a reserved, static public IPv4 address that is associated with an EC2 instance. They are called elastic because you can detach and attach them to another instance, which is exactly what we need when a fail-over is initiated and the secondary node is promoted to primary. To make the flow more visual I created the following diagram of the architecture.

(1) A client is accessing the Citrix Gateway service. The user is browsing to “apps.corp.com” which is resolving to the AWS Elastic IP address 18.158.10.199.

(2) The virtual IP which is hosting the Citrix Gateway service is located in the private IP range of the VPC subnet "Services (Public-A)". In this case: 10.161.69.13.

(3) After successful authentication the user launches a published application. The ICA/HDX connection to the VDA is made over the SNIP "10.161.69.53", which is located in the network "Transfer (Private-A)".

(4) If you take a closer look at the VIP (10.161.69.13) you can see that the virtual IP is the same in both zones. To make the routing work we need to use IP sets on the ADC. The IPset allows us to use additional IP addresses for this service. From an AWS perspective you will have the IP "10.161.69.13" on the network interface in eu-central-1a and the IP "10.161.69.148" on the network interface in eu-central-1b.
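
On the ADC CLI the IPset part looks roughly like the following sketch (the vServer name is made up, the addresses are the ones from the diagram, and I assume your build exposes the -ipset parameter on the Gateway vServer):

    # additional VIP for zone B plus the IPset binding
    add ns ip 10.161.69.148 255.255.255.224 -type VIP
    add ipset ipset_gateway
    bind ipset ipset_gateway 10.161.69.148

    # Citrix Gateway vServer on the primary VIP, referencing the IPset
    add vpn vserver vsrv_gateway SSL 10.161.69.13 443 -ipset ipset_gateway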

If an HA fail-over happens, the EIP is migrated to the instance in zone B. We did several tests and the HDX session (TCP-based) reconnects in under 10 seconds.

In this example our backend servers and VDAs are located in the networks "10.161.69.64/26" and "10.161.69.192/26". Make sure that you create the needed routing entries on the ADC. Since we are working with HA-INC, this needs to happen independently in each zone. The gateway IP will always be the gateway address of your transfer network in the respective zone. You can see there are a lot of additional networks in the routing table because most of the VDAs are still running in the on-premises datacenter of the customer.
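
As a sketch, the static routes for the two VDA networks look like this on each node (assuming the AWS VPC router at the first usable address of the respective transfer subnet as next hop):

    # node in zone A (transfer subnet 10.161.69.48/28, VPC router 10.161.69.49)
    add route 10.161.69.64 255.255.255.192 10.161.69.49
    add route 10.161.69.192 255.255.255.192 10.161.69.49

    # node in zone B (transfer subnet 10.161.69.176/28, VPC router 10.161.69.177)
    add route 10.161.69.64 255.255.255.192 10.161.69.177
    add route 10.161.69.192 255.255.255.192 10.161.69.177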

Routing Table in Zone A

Now let's take a closer look at the configuration so you know what the final setup looks like.

  • Here we can see the IP mapping of the Elastic IP address to the VIP
  • Summary of the ADC Instance in Zone A. You can see the assigned private IPv4 addresses
  • Summary of the ADC Instance in Zone B. You can see the assigned private IPv4 addresses
  • Overview of the Citrix Gateway Virtual Servers
  • Handling of IPset

Hints:

  • If you are having issues with the failover process, there is a dedicated logfile available. It is located under /var/log/cloud-ha-daemon.log (see the small sketch after these hints).
  • If you are not configuring an IPset the configuration will not synchronize. Make sure to configure this before checking the secondary node.
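
For a quick health check from the ADC shell, this small sketch combines the commands mentioned in this post and in the comments below:

    # drop from the ADC CLI into the FreeBSD shell
    shell
    # verify the cloud HA daemon is running
    ps -aux | grep cloudha
    # follow the fail-over log
    tail -f /var/log/cloud-ha-daemon.log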

High Availability with Private IP Addresses

In the previous chapter I covered how to publish a VIP to the Internet with the help of Elastic IP addresses. In some use cases you might just want to load balance a service and access it from inside the VPC, or maybe from another subnet that is connected via AWS Direct Connect or a site-to-site VPN. Let's assume our requirement is to not publish the StoreFront load balancer to the Internet; in that case we can work with private IP addresses as well. This can be done on the same ADC appliances, but you will need firmware 13.0 build 67.39 or higher. I assume high availability in INC mode is already configured and the IAM role is applied to both instances. Before we can create an internal private VIP, we need to determine a subnet which can be used for our vServers. The requirement is that this network does not overlap with the VPC network. To translate this to our example: IP addresses from the range "10.161.69.0/24" cannot be used. We need to create a "dummy" network.

Name        | IPv4 CIDR        | IP Range                      | Description
LB-Internal | 10.161.47.192/26 | 10.161.47.193 – 10.161.47.254 | VIPs for internal load balancing

SNIP for this network: 10.161.47.254

It is important to add a SNIP for this LB-Internal network, otherwise the internal communication will break! Do not miss this step, because it is not mentioned in the documentation. If you skip this part and create a Session Profile for Citrix Gateway pointing to the StoreFront LB VIP, you will never be able to reach the StoreFront service and will receive error messages like "internal server error" after the authentication process.
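
Adding that SNIP is a one-liner on the CLI; as a sketch with the address from the table above:

    # SNIP for the dummy LB-Internal network (10.161.47.192/26)
    add ns ip 10.161.47.254 255.255.255.192 -type SNIP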

After the network range is defined we need to create a route inside the VPC which is pointing to the primary ADC instance.
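
From the AWS CLI this could look like the following sketch (the route table and ENI IDs are placeholders; the ENI is the transfer interface of the primary instance):

    # route the dummy LB-Internal network to the transfer ENI of the primary ADC
    aws ec2 create-route \
      --route-table-id rtb-0123456789abcdef0 \
      --destination-cidr-block 10.161.47.192/26 \
      --network-interface-id eni-0123456789abcdef0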

The last configuration on the AWS side is to disable the Source/Dest. Check on the ENI of the primary instance which should be used to access the VIPs. We are going to disable this on the ENI of the network "Transfer (Private-A)". If you are asking yourself why this only needs to be done on the primary instance: during a fail-over this is handled automatically on the secondary (new primary) node, and the routing table is modified as well (IAM role needed).
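
You can do this in the console or, as a sketch, via the AWS CLI (the ENI ID is again a placeholder):

    # disable the source/destination check on the transfer ENI of the primary instance
    aws ec2 modify-network-interface-attribute \
      --network-interface-id eni-0123456789abcdef0 \
      --no-source-dest-check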

With the prerequisites in place we can now create our first load balancing vServer with a private IP address. This can be done without adding IPsets; there is nothing else to take care of.
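
For example, the internal StoreFront load balancer could be created like this (a sketch; the name and VIP are made up from the LB-Internal range, and the service bindings are omitted):

    # internal StoreFront LB vServer on a VIP from the LB-Internal network
    add lb vserver vsrv_storefront_lb SSL 10.161.47.200 443
    # bind your StoreFront services or service group as usual, e.g.
    # bind lb vserver vsrv_storefront_lb svcgrp_storefront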

After creating the LB vServer you can already access the VIP from inside the VPC. For networks outside the VPC, for example a client network in the HQ or a branch office, you need to point the routing for the "LB-Internal" network to AWS. Contact the networking team if it's not you who has control over everything.

Summary

First of all I want to say thank you to Farhan Ali and Arvind Kandula from Citrix, who helped me get the final configuration working. I hope this post helps people out there to understand the deployment architecture on AWS and gives a more realistic view of the deployment steps than the Citrix documentation alone. In the end it is not that hard to configure, but you need to know how everything plays together. If you have any questions or improvements, please feel free to contact me.

31 comments

  1. Hi,
    we did the same configuration last year (but only with one virtual Gateway server) and I can understand how big the difference is between an on-prem HA configuration and the one in AWS, and how much effort it costs.
    I'm interested, how did you configure the second virtual Gateway server? We also wanted to install a second virtual server, but now we have problems with the HA not working anymore.

    Thanks in advance
    Annette

    1. Hello Annette, the Gateway vServer configuration is synced via HA. The key is to bind an IPset to make everything addressable in the second AWS zone.

      1. Hi,
        thanks for your fast answer.
        we have already created an IPset for the productive virtual Gateway server and it worked fine; we could observe the failover multiple times. But now we want to create a second virtual Gateway server with its own EIP, and since then the HA failover doesn't work anymore. I'm not sure if this is the reason.
        The question is: do I have to create a second IPset for the second virtual Gateway server?
        Thanks

        Annette

        1. The IPset is synchronized via ADC HA. Create it on the primary appliance.
          Make sure to add the used VIP for each zone as a secondary private IPv4 address on the AWS instance.

  2. We found we had to add the IP Set to each ADC as they don't sync when using INC. The name of the IP Set must be the same on each ADC too. Lastly, the IP Set needs to be created *before* you add the VIP to the ADC under Traffic Management/Load Balancing Servers. If you added a VIP and then went back to add the IP Set, it would not work during a failover. Each VIP needs its own IP Set as well. Here are some notes I took when we set this up a few years ago:

    1. Login to the Primary ADC
    a. In System\Network\IPSets create an entry called ipset_director (for example)
    b. Add the internal IP address of eth 1 of the Secondary ADC that corresponds to the Director VIP

    2. Login to the Secondary ADC
    a. In System\Network\IPSets create an entry called ipset_director (Note: ipset names must be the same on each ADC!)
    b. Add the internal IP address of eth 1 of the Secondary ADC that corresponds to the Director VIP

    Note: the IPSets on both ADCs must have the same names and reference the same IP addresses!

  3. Hello Julian,

    thanks for sharing this configuration with us. I also only have a standalone ADC running in AWS.
    You mentioned that you have already configured a Citrix ADC HA with ICA proxy in Azure. This is exactly the challenge I am facing right now. Maybe you have some tips for me?

    Regards
    Thomas

  4. Hi,
    I followed all the required configuration on my HA pair across availability zones. I am able to force a failover, and I can see that the secondary VIP which is bound to the IP set becomes active upon failover. However, in my configuration the Elastic IP doesn't reattach to the new primary VPX. Therefore, client traffic to the external-facing VIP remains mapped to the old primary (now secondary) unit, blackholing traffic. Any suggestions?

    1. 1.) Check the logfile: /var/log/cloud-ha-daemon.log
      2.) Make sure you have configured all the needed IAM permissions
      3.) IPset Name is identical on both nodes

  5. thank you for your quick response. As it turns out while troubleshooting a different issue, I realized that the NSIP interfaces didn’t have internet connectivity. So, they weren’t able to make the necessary API calls to re-attach the elastic IP. Once fixed, failover worked right away (within a 5-7 second delay) and I was able to continuously ping my external facing VIP. Thank you Citrixguyblog!!!

  6. Hi, first of all, great job on this blog!

    I'm currently testing a new installation of an HA pair of Citrix ADC in AWS, same AZ, same subnet. The problem I have is that after a failover I'm not able to reach the VIP.

    After reading this blog, you said to define a dummy network not overlapping with the VPC subnet.

    Is there a configuration that I need to do in AWS or on the ADC, or do I only need to use that range when creating the VIP?

    Thanks!

      1. Hi, the first HA pair is in place for the internal LB and it’s working well (Inter AZ / Private subnet). Thanks for the information !

        Now, I’m configuring the external pair in HA Multi AZ on public subnet but I’ve some trouble.

        When you wrote "If you are not configuring an IPset the configuration will not synchronize. Make sure to configure this before checking the secondary node.", do you mean that we need to configure the IPset before configuring HA?

        When I configure the IPset, sometimes it breaks the HA… I suspect that maybe I am missing something in the order of the steps…

        Thanks,

  7. this saved my a** 😉
    The citrix KB was not very helpful but you pointed me in the correct directions.
    Running an HA stack across two AZs. Was missing some IAM policies which were not set by the CloudFormation stack, and a correct default route to the internet. But now it's working fine.
    nearly gone insane. ;P

  8. Hi Julian,

    We followed your guide for deploying ADCs in HA on AWS using private IPs. When the failover is completed, the route table is not being updated to point to the ENI of the secondary ADC. We confirmed the IAMRoles exist and are allocated to the ADCs… any pointers?

    1. Hello Kelvin, did you check the cloud-ha-daemon.log for any hints? Does your IAM role contain "ec2:CreateRoute" and "ec2:DeleteRoute"? Is it working when you modify the route table manually (after a successful failover)?
      Julian

      1. Hi Julian,
        We have checked the cloud-ha log but cannot see any failures. Our IAM role has the Create and Delete permissions also. Connectivity works when we manually update the ENI in the route table.

  9. Hello Everyone,
    A very nice article indeed
    While configuring the HA private IP solution across multiple zones, these are the key points to consider
    * IAM permissions should be proper
    * INC mode enabled
    * Same number of interfaces on both Primary and Secondary and Same Sequencing of Interfaces which is explained below
    * Interface device index numbers must match. AWS console -> Network & Security -> ENI -> Device index number. The number for management will be 0 on both nodes; similarly, the client interface is 1 and the server interface is 2, and so on. So if the primary has device indexes 0,1,2 then the secondary ENIs should also be in the same sequence of 0,1,2 etc.
    * A device index mismatch occurs due to detaching and attaching more than one interface. For example, if you detach an interface and attach the same interface again, the index doesn't change, but if you attach a new interface and detach an old interface, the device index number will change. AWS doesn't automatically correct the device index number. To fix it you need to detach and attach the wrong interface again in the right sequence so that the device index number matches the other HA node.
    * Since the IAM permissions only kick in when the device fails over, the ENI src/dst check should be disabled manually the first time on the primary node interface to which the route is pointing for traffic to flow, OR do an HA failover 2 times and the IAM permissions will allow the ADC to do so on its own.
    * Make sure cloudhadaemon is running in Shell–> ps -aux | grep cloudha
    * Make sure the firmware version is above 13.0.70.x
    * Check the logs in /var/log/cloud-ha-daemon.log

  10. Hi citrixguyblog,
    HA is working in our configuration across two different AZs using INC mode. However, we are unable to access published applications via the StoreFront VIP configured on the custom/non-overlapping subnet. External clients are able to connect via the EIP, authentication to the NetScaler Gateway is successful, published apps are correctly enumerated, and clients are able to download the ICA file. But we are unable to launch any published apps and the connection eventually times out. The main question I have based on your blog: for external clients to connect to published apps via NetScaler Gateway, does the StoreFront VIP need to be in the Public-A subnet, or does it have to be in the custom/non-overlapping subnet?

    1. Hello Marcio. From a technical point of view it shouldn't make a difference whether the VIP is placed in the public or the custom subnet. From a security point of view, StoreFront is an internal service and shouldn't be put in a public network. You are saying the apps and desktops are enumerated and you receive the .ica file. This means StoreFront communication is working and you are probably missing the route/firewall rules to the VDAs.

        I did an nstrace and there is no traffic from the SNIP to the VDAs. If I use the StoreFront LB VIP I can open apps fine.
        If I add the gateway IP to the hosts file so that I am coming from internal, it shows the same behaviour: the ICA file is downloaded but the app does not launch. If I add a VDA IP as a service on port 2598 it goes into the UP state.
        Any ideas where to look?

    2. Marcio, we have the same issue. We get the ICA file, and after clicking on it we see no traffic from the SNIP to the VDAs. After the timeout we get the error "unknown 0".

      Did you find the problem?

      thank you.

  11. Phenomenal article and comments section. I am UP and failing over a Sharefile content switching vserver between AZs in AWS. Thanks!

  12. We used Terraform to configure HA across zones in AWS and all seems fine, but we have an issue where the elastic (public) IP moved from node1 to node2 during an HA failover test and is now stuck on node2. This is true for the VIP that was created with Terraform and for another one we created with IPsets. Any ideas what this could be? Since they are both using the same IAM role, and it was all configured initially with Terraform, and the initial failover worked from node1 to node2, we are not sure if it would be the IAM role. Thanks.

      1. Thanks. I did see errors on one of the nodes and it was DNS. Not really sure why, but the AWS default DNS did not work on one of the nodes. I configured a custom DNS externally and it fixed the issue. The issue was that one of the nodes could not resolve the API URL. Thanks.

  13. Hi Julian,
    Great Article!

    I have a few questions. (Sorry for my ignorance, the ADC setup in AWS is a lot different than on-prem, and this is my first time working on AWS, so it's a lot of trial and error along the way 🙂

    With this proposed solution (HA with Private IP Addresses), is it possible to have for example Primary ADC in zone 1a be able to communicate with backend servers (Storefront, DDC) in both zones 1a and 1b?
    When I try this, ADC is only able to communicate with backend servers in same zone as the ADC and not across AZs.
    From network/VPC perspective, we have proper routing in place across AZs, just that ADC is not able to communicate with backend servers across AZs.

    Also, is it possible to configure DNS servers in ADC that are in a separate VPC?
    ADC – MGMT VPC (192.168.4.0 /24)
    DNS – Apps VPC (192.168.8.0 /24)

    1. Hello Salman, it sounds like you are missing routes on the NetScalers. Do you have routes in place for the networks in zone a and zone b?
      The route table needs to be maintained on both nodes, since you are using HA-INC.

        Thanks Julian, yes I suspected this to be a routing issue within the NetScaler, since the servers in both AZs were able to communicate without any issues.
        I followed this article from Citrix to configure the static routes and modify the default gateway (steps 1 to 5):
        https://support.citrix.com/article/CTX200509/how-to-manually-change-the-default-gateway-from-the-management-network-to-the-data-network-on-the-netscaler-instance-of-a-cloudbridge-40005000-appliance

  14. =======
    daemon logs:
    root@netscaler-az1-ha-new# cat /var/log/cloud-ha-daemon.log
    2024-01-04 18:46:56: INFO: If MultiZone PrivateIp HA is configured then : ec2:DeleteRoute, ec2:CreateRoute, ec2:ModifyNetworkInterfaceAttribute, IAM Permission missing on ADC
    2024-01-04 18:46:56: INFO: Found route in VPC for 192.168.10.10/32 for ENI eni-0f48344ad61a8b612
    2024-01-04 18:46:56: INFO: Route Deleted for 192.168.10.10/32 with eniid eni-0f48344ad61a8b612
    2024-01-04 18:46:56: INFO: Route created for 192.168.10.10/32 with eniid eni-05fded82841e08a11

    -====-
    Post HA failover, the VIP goes down when the secondary becomes primary, reporting the above logs.
    The IAM policy allows the required permissions and CloudTrail logs confirm the same; it seems like the NetScaler is reporting permission issues as seen in the daemon logs.
