Saradhi.Potharaju
AppDynamics Team (Retired)

When the Controller is set up for High Availability, the secondary Controller shows a "Controller disabled on this host" message if a partial (incremental) replication has run but the full replication with the -f flag has not yet completed.

 

When a replication runs without the -f flag, it copies the MySQL data and index files from the primary server to the secondary while the App Server (Glassfish) keeps running on the primary server. The App Server may update tables, and so change the underlying data files, after those files have already been copied to the secondary server. This is an incremental sync.

 

Running a replication with the -f flag stops the Controller on the primary server, ensuring that no files will change, then compares the files between the primary and secondary servers and copies only those that are different. This is the final sync.


Running more incremental syncs beforehand reduces the downtime of the replication with the -f flag, because fewer files are left to copy in the final sync.
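
The "copy only the files that differ" idea behind the final sync can be illustrated with a short Python sketch. This is not the replicate script itself, and the article does not say how the script compares files; the function names and the SHA-256 comparison are assumptions made only for illustration.

```python
import hashlib
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_needing_copy(primary_dir: Path, secondary_dir: Path) -> List[str]:
    """List relative paths that are missing or different on the secondary.

    Hypothetical helper: it only reports what a sync would have to copy.
    It copies nothing and never touches a running database.
    """
    pending = []
    for src in sorted(p for p in primary_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(primary_dir)
        dst = secondary_dir / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            pending.append(rel.as_posix())
    return pending
```

The fewer files such a comparison reports after the last incremental sync, the less there is to copy while the Controller on the primary is stopped.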

 

To ensure MySQL does not start on the secondary server until the final sync is complete, the replicate script deletes the <controller_home>/bin/controller.sh file on the secondary server until the replication is finished. This is why the "Controller disabled on this host" message appears: the Controller cannot be started on the secondary. Once the final replication completes, <controller_home>/bin/controller.sh is automatically copied from the primary to the secondary server, and MySQL on the secondary server is started.
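
As a rough illustration, a host can be checked for this state by looking for the script. The Python sketch below is only an assumption about how such a check might look: the function name is hypothetical, and the path simply follows the <controller_home>/bin/controller.sh location named above.

```python
import os
from pathlib import Path


def controller_start_allowed(controller_home: str) -> bool:
    """Return True if bin/controller.sh exists and is executable.

    Hypothetical check: a missing or non-executable script means the
    Controller cannot be started from this host.
    """
    script = Path(controller_home) / "bin" / "controller.sh"
    return script.is_file() and os.access(script, os.X_OK)
```

A False result on the secondary while a replication is still in progress is the expected state described above, not an error in itself.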

Comments
Curt.Mayer
AppDynamics Team

Be very careful when throwing around terms like 'backup' and 'primary'. These are not servers. They are roles. They switch.

When controller.sh is renamed, it is because of a hard (persistent) failure. You had a failover where the primary was not accessible at the time of the failover. The logic here is that you cannot trust the state of the database on the former primary, so we disable the database from starting and potentially corrupting the system state. On failover, if you can't set the primary database passive, then you disable it to prevent the possibility of a split brain.

Recovery without re-replicate requires all of the following:

1. Make sure skip-slave-start=true is in both db.cnf files. This prevents any replication from running while doing the checks.
2. Make sure that replication is stopped by running 'stop slave' on the primary.
3. Start the secondary database by renaming controller.sh-disabled to controller.sh and running chmod +x on the file.
4. Check the db health.
5. Update global_configuration_local to passive and secondary. Do NOT forget this step.
6. Start the slave on the secondary.
7. After running for a while, seconds behind master will be zero. You can then remove skip-slave-start from db.cnf on both machines and start the slave on the primary.

If you don't understand every word of the foregoing, you can easily trash your installation. Open a ticket if you need help.
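
Two of the read-only checks in this procedure can be sketched in Python: confirming that skip-slave-start is set in a db.cnf, and reading the Seconds_Behind_Master field from the text output of MySQL's SHOW SLAVE STATUS\G. The function names are hypothetical, and the sketch assumes the file contents and the command output have already been captured as strings. It changes nothing on either machine.

```python
from typing import Optional


def has_skip_slave_start(cnf_text: str) -> bool:
    """Return True if an option-file text enables skip-slave-start."""
    for raw in cnf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        # MySQL option files accept both dashes and underscores in names.
        if key.strip().replace("_", "-") == "skip-slave-start":
            return value.strip().lower() in ("", "1", "true", "on")
    return False


def seconds_behind_master(status_text: str) -> Optional[int]:
    """Extract Seconds_Behind_Master from SHOW SLAVE STATUS\\G output.

    Returns None when MySQL reports NULL (replication not running).
    """
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Seconds_Behind_Master":
            value = value.strip()
            return None if value.upper() == "NULL" else int(value)
    raise ValueError("Seconds_Behind_Master not found in status output")
```

A result of 0 from seconds_behind_master corresponds to the "seconds behind master will be zero" condition in the last step; a result of None means replication is not running yet.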

 

Version history
Last update:
‎09-14-2018 12:59 PM