
Valence Instance not restarting correctly


  • Valence Instance not restarting correctly

    Every night a backup program runs; it ends the Valence instances and starts them back up after the backup completes. We have a recurring issue where our live instance of Valence is sometimes down in the morning. Our development and base instances never have this issue; it has only ever been the live one. When I look at the instances in the base environment, it shows the live instance as running, but trying to open it just results in a white screen. The only difference I can think of is that there may be active users on the live environment when it is stopped.

    Any ideas on a possible fix for this?

  • #2
    How do you fix the issue in the morning when you notice you can't access production? Do you just restart the instance again and then it works?


    • #3
      Yeah. As I said, the base instance shows that it's running, but I just stop and start it from there and that fixes the issue every time.

      We added a delay of a couple of minutes in the nightly program between stopping and starting the instance, in case the issue had something to do with a race condition between the two commands, but that does not seem to have fixed anything.
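If the problem is a race between the stop and start commands, a fixed delay only helps when the shutdown happens to finish within it. A rough sketch of the alternative, polling until the instance has actually stopped before starting it again, is below. It is only an illustration: `stop`, `start`, and `is_fully_stopped` are hypothetical callables standing in for whatever the nightly program really runs, not Valence or IBM i APIs.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout: float, interval: float = 5.0,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll predicate until it returns True or timeout seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def restart_instance(stop: Callable[[], None], start: Callable[[], None],
                     is_fully_stopped: Callable[[], bool],
                     timeout: float = 300.0, interval: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> bool:
    """Stop, wait for confirmation that nothing is left running, then start.

    All three callables are hypothetical placeholders supplied by the caller.
    Returns False (without starting) if the shutdown is not confirmed in time.
    """
    stop()
    if not wait_until(is_fully_stopped, timeout, interval, sleep, clock):
        return False  # leftover jobs: do not start on top of them
    start()
    return True
```

Returning False instead of starting anyway gives the nightly program a chance to log or alert that the shutdown never completed, rather than leaving a half-restarted instance to be found in the morning.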


      • #4
        In the mornings before you restart the production instance, have you checked the start times on the production Valence jobs to see if they are indeed the new jobs? Maybe the instance isn't ending all the way and a check of the times would confirm that. Also, have you checked the job logs on the production Valence CGI jobs? There might be something else running that is interfering with the startup of Valence that is resolved by the next morning when you do it manually. After you restart the instance manually, have you checked the Errors app to see if it shows any specific error?

        If your overnight script that starts and stops the instances is doing the exact same thing you are doing manually the next morning then there must be some issue there that only digging into the logs will reveal.
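The start-time check suggested above can be sketched as a small comparison: any job whose start time is earlier than the moment the nightly restart began did not end with the instance. This is a minimal illustration only. `JobInfo` and the job names are made up for the example, and the list of jobs and start times is assumed to have been collected by hand or by some other means.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True)
class JobInfo:
    """Hypothetical record of one active job: its name and when it started."""
    name: str
    started: datetime


def find_stale_jobs(jobs: Iterable[JobInfo], restart_began: datetime) -> List[JobInfo]:
    """Return the jobs that started before the restart began.

    Such jobs survived the shutdown, which suggests the instance
    did not end all the way.
    """
    return [job for job in jobs if job.started < restart_began]


# Example with made-up job names and times.
restart_began = datetime(2024, 1, 10, 1, 0)
jobs = [
    JobInfo("JOB_A", datetime(2024, 1, 8, 7, 45)),   # left over from an earlier start
    JobInfo("JOB_B", datetime(2024, 1, 10, 1, 12)),  # started by the nightly restart
]
stale = find_stale_jobs(jobs, restart_began)
print([job.name for job in stale])
```

An empty result would point away from an incomplete shutdown and back toward the job logs.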


        • #5
          Well, it happened again this morning, so we went through all the active job logs in the live environment but couldn't find anything of note. We also checked the Errors app after restarting, but there was nothing notable there either: just a "Security violation: no session id" warning that happened hours after the restart program would have finished.

          Some of the active jobs did have start times from the last time I had to restart the instance manually, while others were from this morning. So it looks like you might be right about it failing to end all the way. I changed the order of the server shutdowns/startups and added some extra delays after the full shutdown and startup programs, in case a different part of the nightly backup is interfering with the commands.

          Hopefully that will fix it, but I won't know for sure for a while, since this issue only happens sporadically.