Mass communication loss overnight?

  • Jemster
    Automated Home Guru
    • Dec 2018
    • 123

    #16
    Originally posted by DBMandrake View Post
    No, the controller only sends set point changes to HR92's when there is a change in set point. Additionally identical set points in a row do not send out redundant set point changes.

    So if you have 5C at 11pm and then 5C again at 2am (which I do on downstairs zones, to catch any after 11pm manual override I make if I'm staying up late) then the second one won't be sent unless a manual set point change occurred between the two to some other temperature.

    Not sending "redundant" set point changes also causes issues of its own: if an HR92 set point override isn't registered by the controller properly, it won't be cancelled. This is the case with multi-room zones - so if you have a multi-room zone with 5C at 11pm and 5C at 2am, and manually turn the HR92 up at midnight, it will not be turned down to 5C again at 2am, as the controller is not aware of the set point change and thinks there is nothing to revert! This can happen in single room zones occasionally if there is any minor loss of comms. (The workaround is to alternate "duplicate" set points in the schedule slightly, like 5.5/5.0, so that the controller always considers there to be a change in set point that needs to be transmitted.)
    Interesting - I was under the impression the system did a round-robin to bring everything into sync, thus any communication failure on a single update didn't matter because it would get re-sync'd on the next update. Also, if I reboot an HR92, I thought it picked up the correct set point a few minutes afterwards...

    My symptom doesn't quite match up with the Set-Point change 'window of opportunity' as that would only be a few minutes discrepancy, but I've just noticed the switch-back jumps seem to be about an hour after the Set-Point change. I hadn't noticed that before. The hour must be significant in some way to this issue.

    I'm trying to see if the first overrides align with my scheduled setup BUT really helpfully it looks like Honeywell are having problems with the app and I can't get logged in at the moment. Think they are trying to rub salt in the wound...


    • Jemster
      Automated Home Guru
      • Dec 2018
      • 123

      #17
      Ok, App is back. Breaking this down per-room... and given the granularity of the samples is no better than every 5 minutes:


      TV Room is set to go from 20.5 -> 10.0 @ 22:30. It returned to 20.5 at 23:25, and at 00:30 it went to 20.0 (not a temperature anywhere on its schedule) and then at 1:30 stopped reporting temperature. So that's an hour between each jump.

      Living Room is set to go from 20.5 -> 10.0 @ 23:00. It returned to 20.5 at 23:45 and at 00:50 it went to 20.0 (again, not anywhere on its schedule). So 45 minutes, then an hour.

      Bedroom 2 is set to go from 20.0 -> 10.0 @ 23:00. It gave up on temperature reports at 01:25. So that was 2.5 hours.

      Bedroom 1 is set to go from 19.0 -> 10.0 @ 23:00. It went to 20.0 @ 1:40 (no 20 degrees on its schedule). After being reset @ 2:45 it bounced back to 20 again at 3:45... so there's a 2:40 followed by an hour.

      Bedroom 3 is set to go from 20.0 -> 10.0 @ 23:00. It went to 20.0 @ 23:20. After 2:45 reset it bounced back to 20.0 at 3:20. Only coincidence I can see here are both were at 20-past-the-hour.


      Don't know what this shows. From the TV room, living room and bedroom 2, I thought I was on a half-hour problem. But Beds 1 and 3 look to be separate.
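For anyone who wants to check the arithmetic, here is a minimal Python sketch that computes the gaps between the events listed above for each room. The times are copied from this post; the `EVENTS` dictionary and the helper names are purely illustrative, and with samples no finer than every 5 minutes each gap is only approximate.

```python
# Times are taken from the per-room notes above (24-hour clock, may wrap past midnight).
# EVENTS, to_minutes and gaps are illustrative names, not anything from Evohome or Domoticz.
EVENTS = {
    "TV Room":     ["22:30", "23:25", "00:30", "01:30"],
    "Living Room": ["23:00", "23:45", "00:50"],
    "Bedroom 2":   ["23:00", "01:25"],
    "Bedroom 1":   ["23:00", "01:40", "02:45", "03:45"],
    "Bedroom 3":   ["23:00", "23:20", "02:45", "03:20"],
}


def to_minutes(hhmm):
    """Convert 'HH:MM' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def gaps(times):
    """Minutes between consecutive events; the modulo handles the midnight wrap."""
    return [(to_minutes(b) - to_minutes(a)) % 1440 for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    for room, times in EVENTS.items():
        print(f"{room}: {gaps(times)} minutes")
```

Laid out this way it is easier to see which gaps sit near 60 minutes and which do not.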


      Gawd why's it so tough to debug ... and still no ideas on how come one of my DTS92s went off for both incidents.

      How are Honeywell on consumer support via email? Thinking it would be easier to explain written down with my log spreadsheets than to try to do it over the phone.


      • paulockenden
        Automated Home Legend
        • Apr 2015
        • 1719

        #18
        Do you have anything that might be drowning out 868MHz in your house? Wireless speakers / headphones? Baby monitor? Wireless video doorbell?

        Something that streams continuously rather than a device like a weather station that only transmits in short bursts?

        (really clutching at straws here!)

        P.


        • Jabes
          Automated Home Sr Member
          • Aug 2017
          • 68

          #19
          How are you using Domoticz? Do you have an HGI80? If so, you can log the actual messages in debug mode to see what happens over the air, if you are interested.


          • Jemster
            Automated Home Guru
            • Dec 2018
            • 123

            #20
            Originally posted by paulockenden View Post
            Do you have anything that might be drowning out 868MHz in your house? Wireless speakers / headphones? Baby monitor? Wireless video doorbell?

            Something that streams continuously rather than a device like a weather station that only transmits in short bursts?

            (really clutching at straws here!)

            P.
            Nope, the only wireless devices we have (that I can think of) are:

            Loop - this is 868 MHz but others are using without issues and it's only a very minimal use of the spectrum so any collisions would be swiftly resolved
            Cordless BT phone - certainly wasn't in use at the time, but pretty sure it's DECT anyway

            No baby monitors or wireless speakers (beyond a Bluetooth speaker), no video doorbell...

            I appreciate the help with the straws; they're kinda hard to grab hold of.


            Originally posted by Jabes
            How are you using Domoticz? Do you have an HGI80? If so, you can log the actual messages in debug mode to see what happens over the air, if you are interested.
            I don't have the HGI80 so it's the simpler form of integration, i.e. the one built in to Domoticz. I don't believe it gives me any more info than the set point and actual temperature.

            I could upload the Excel files I exported tonight somewhere if anybody's interested in seeing the fine grained detail of each zone...


            • Jemster
              Automated Home Guru
              • Dec 2018
              • 123

              #21
              I've been thinking about this a bit more...

              So the theory is that an extra device transmitting continuously would cause an issue, but one that only transmits momentarily, such as Loop, shouldn't. And I was happy with that as a given; the comms method has to be pretty bulletproof... right??

              However, reading up on the phantom override issue, it seems to me there are many situations where the loss of a single Set Point command from the controller would play absolute havoc with the system. If the controller sends out a set-point, and it is missed by the HR92, then the HR92 will report back a different set point up to an hour later and that'll surely make the controller think it has an override?

              So if Loop happened to be pulling data from its meters at just the moment that the EvoHome controller signalled out the Set Points to the zones, the madness could happen.

              If multiple zones change set-point at a single time, do these happen simultaneously or are they each signalled out at some point in the 4-minute window? i.e. would there be one big burst of 868MHz data or would I have 'n' bursts scattered over the 4 minutes depending on the HR92? Just wondering because at least one zone with an 11pm set-point did switch ok. And one of the trouble zones made its last switch at 10:30...

              Much as I really like Loop, it may be that compatibility is more out of luck than anything else??


              • DBMandrake
                Automated Home Legend
                • Sep 2014
                • 2361

                #22
                Originally posted by Jemster View Post
                Interesting - I was under the impression the system did a round-robin to bring everything into sync, thus any communication failure on a single update didn't matter because it would get re-sync'd on the next update. Also, if I reboot an HR92, I thought it picked up the correct set point a few minutes afterwards...
                Some things are periodically resent, some are not.

                If you make an override directly at an HR92 it sends a set point change to the controller within a few seconds - and every hour (counting from the time the HR92 is booted, not when the set point is changed!) it sends another set point change to the controller to "refresh" it with the same value.

                In fact this hourly set point update is what causes the phantom override if it happens within the window of opportunity between when a scheduled set point change occurs and when the controller actually gets around to sending the update out to the HR92. (which can be up to 4 minutes, so the window of opportunity can be up to 4 minutes within a 60 minute period) When this happens the set point gets reverted to the previous one and it appears to be a manual override. (clock icon)

                So if you reboot the controller at a time when manual overrides are in force, the next hourly HR92 set point transmission to the controller will sync the controller with the status of that set point.

                For BDR91 (and probably the Opentherm bridge as well, but I don't have one to test) a heat demand request is sent from the controller to BDR91 every time there is a change in heat demand. If the heat demand is constant a "refresh" of the last value is sent every 20 minutes. If a BDR91 doesn't receive any updates for 45 minutes the red light will start blinking but it will continue to operate. After 60 minutes the red light will go solid and the relay turns off.
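The BDR91 timings described above can be written down as a small lookup, purely as a sketch. The 20, 45 and 60 minute figures are the ones quoted in this post; the function names and the returned strings are made up for illustration and do not come from any Honeywell firmware or documentation.

```python
# Timings quoted in the post above; treat them as observed behaviour, not a specification.
REFRESH_INTERVAL_MIN = 20   # controller re-sends an unchanged heat demand
BLINK_AFTER_MIN = 45        # red light starts blinking, relay keeps operating
RELAY_OFF_AFTER_MIN = 60    # red light goes solid, relay turns off


def bdr91_state(minutes_since_last_update):
    """Describe what a BDR91 would be doing after this long without hearing the controller."""
    if minutes_since_last_update >= RELAY_OFF_AFTER_MIN:
        return "red light solid, relay off"
    if minutes_since_last_update >= BLINK_AFTER_MIN:
        return "red light blinking, still operating"
    return "normal"


def refreshes_missed(minutes_since_last_update):
    """How many 20-minute refreshes must have been lost by now (hypothetical helper)."""
    return minutes_since_last_update // REFRESH_INTERVAL_MIN


if __name__ == "__main__":
    for minutes in (10, 30, 45, 59, 60):
        print(minutes, bdr91_state(minutes), refreshes_missed(minutes))
```

On these figures the blinking light only appears after two consecutive refreshes have gone missing, so a single lost message should not be visible on the relay.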

                Temperature sensors like DTS92 and HR92 send to the controller periodically - so if one message doesn't go through it just causes a delay in the temperature measurement being updated. The HR92 has a variable update rate for sending temperature readings which sends more often the faster the temperature is changing. I think the DTS92 has a fixed update rate of about every 4 minutes.

                The HR92 sends a heat demand update to the controller every time its heat demand changes (movement of the valve pin position). I'm not 100% sure, but I don't think there is any retransmission of that heat demand if it remains the same for a long period of time. (So a lost heat demand message from HR92 to controller is problematic.)

                In the controller to HR92 direction two things are sent - temperature sensor updates and set point updates. Both are sent on a regular schedule of about once every 4 minutes. The protocol actually allows the controller to specify a custom "time to next transmission" in each message so in theory it could be varied from one transmission to the next but current firmware sends the same (or very similar) delay every time of a little under 4 minutes.

                The temperature sensor transmission from controller to HR92 is made in every 4 minute window; however, the set point change is only sent if a set point change has occurred. Probably the reason for this is that older Evohome controllers and firmware did not support "local override display", so when you made an override with an HR92 the controller was not aware of the override and did not display it. So if it were to keep resending its version of the set point it would keep overriding the override you made on the HR92.
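To make the "only sent on change" behaviour concrete, here is a minimal sketch in Python. It is a simplified model based only on what is described in this thread (identical consecutive set points are not re-sent, and the 5.5/5.0 trick forces a transmission); the function name and the schedule format are assumptions, and the real controller logic is not public.

```python
def transmitted_changes(schedule, current_setpoint):
    """Return the schedule entries a controller would transmit under this simple model.

    schedule is a list of (time, setpoint) pairs in order.
    current_setpoint is what the controller believes the zone is set to beforehand.
    An entry is sent only if it differs from what the controller thinks is in force.
    """
    sent = []
    believed = current_setpoint
    for when, setpoint in schedule:
        if setpoint != believed:
            sent.append((when, setpoint))
            believed = setpoint
    return sent


if __name__ == "__main__":
    # Two identical set points: the 02:00 entry is never transmitted, so an
    # unnoticed manual change at midnight would not be reverted.
    print(transmitted_changes([("23:00", 5.0), ("02:00", 5.0)], current_setpoint=20.0))

    # Workaround from earlier in the thread: alternate the values slightly.
    print(transmitted_changes([("23:00", 5.5), ("02:00", 5.0)], current_setpoint=20.0))
```

The model ignores overrides the controller does know about, which is the case where the second 5C entry would be sent after all.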


                • DBMandrake
                  Automated Home Legend
                  • Sep 2014
                  • 2361

                  #23
                  Originally posted by Jemster View Post
                  Much as I really like Loop, it may be that compatibility is more out of luck than anything else??
                  I've had Evohome for a year without Loop and two years with Loop now and I can't say I've noticed any real difference in problem rate with Evohome. I have had occasional comms issues before and after.

                  My Loop gas sender is in the boiler closet as the gas meter is directly above the boiler, so this means my loop gas sensor is within a metre of all three BDR91's and my CS92A as well.

                  I also have a Bresser 5in1 outdoor weather station on 868.3 MHz on the same side of the house as the boiler, which transmits approximately 6 times per minute. I've only had that since last Christmas and I haven't noticed any worsening of Evohome's performance.

                  As far as I know, devices on 868 MHz are very restricted by regulation in how long they're allowed to transmit - they're only allowed to send very short bursts, not transmit continuously.

                  I've found the biggest source of comms problems to be standing wave related where one or more devices ends up sitting in a standing wave null point resulting from multi-path reflections - a null that may only be present when moveable objects in the house are in a certain location or not in a certain location!

                  One example is that I had placed a portable dehumidifier (with a good chunk of metal in it) on the kitchen floor about a metre from the wall where the BDR91's are mounted, and that caused a complete comms loss to one of the BDR91's (but not the other two nearby ones!) for a couple of hours before I noticed it and moved the dehumidifier. It wasn't even plugged in - it was just sitting in the middle of the floor and not even in direct line of sight between controller and BDR91! (presumably the reflection from it was causing a null)
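As a rough sense of scale for these nulls, the wavelength at 868 MHz can be worked out directly. This is only a back-of-envelope sketch: the half-wavelength spacing between nulls applies to the idealised case of a direct and a reflected wave travelling in opposite directions, and a real room with several reflections will be messier. The function name is illustrative.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def wavelength_m(frequency_hz):
    """Free-space wavelength in metres for a given frequency in hertz."""
    return SPEED_OF_LIGHT_M_PER_S / frequency_hz


if __name__ == "__main__":
    lam = wavelength_m(868.3e6)
    print(f"wavelength:          {lam * 100:.1f} cm")
    # Idealised standing wave: adjacent nulls are half a wavelength apart,
    # and a null sits a quarter wavelength from the nearest peak.
    print(f"null-to-null (ideal): {lam / 2 * 100:.1f} cm")
    print(f"null-to-peak (ideal): {lam / 4 * 100:.1f} cm")
```

So moving a device by something in the region of ten centimetres may be enough to take it out of a null, which fits with only one of three adjacent BDR91s being affected.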
                  Last edited by DBMandrake; 29 April 2019, 08:32 PM.


                  • G4RHL
                    Automated Home Legend
                    • Jan 2015
                    • 1580

                    #24
                    One example is that I had placed a portable dehumidifier (with a good chunk of metal in it) on the kitchen floor about a metre from the wall where the BDR91's are mounted, and that caused a complete comms loss to one of the BDR91's (but not the other two nearby ones!) for a couple of hours before I noticed it and moved the dehumidifier. It wasn't even plugged in - it was just sitting in the middle of the floor and not even in direct line of sight between controller and BDR91! (presumably the reflection from it was causing a null)

                    Just a nearby device or electronic panel with a circuit board can cause issues even when not powered up. My garage door control panel would always prevent another wirelessly operated device from working even when there was no power to the door panel. My Netgear Orbi satellite did not function properly and was intermittent when I first had it near to my TV. I have a radio in my study which, if positioned too close to my iMac, upsets the iMac. The problem is proximity as well as frequency. As DBMandrake suggests, it is worth carefully checking what other devices are nearby that may cause an issue.


                    • Jemster
                      Automated Home Guru
                      • Dec 2018
                      • 123

                      #25
                      Originally posted by DBMandrake View Post
                      If you make an override directly at an HR92 it sends a set point change to the controller within a few seconds - and every hour (counting from the time the HR92 is booted, not when the set point is changed!) it sends another set point change to the controller to "refresh" it with the same value.

                      In fact this hourly set point update is what causes the phantom override if it happens within the window of opportunity between when a scheduled set point change occurs and when the controller actually gets around to sending the update out to the HR92. (which can be up to 4 minutes, so the window of opportunity can be up to 4 minutes within a 60 minute period) When this happens the set point gets reverted to the previous one and it appears to be a manual override. (clock icon)
                      This has taken me a while to get my head around, but yes, I can see how that happens. However, wouldn't the set-point (as known by the controller) remain on the original setting rather than go to the new setting for up to an hour? i.e. step by step we have:

                      1. Controller wants to go from T1 -> T2
                      (delay of up to 4 minutes while controller thinks about sending out T2 Set-Point)
                      2. Controller receives T1 from HR92 hourly refresh
                      3. Controller thinks zone is on override to T1
                      4. Controller finally sends set-point T2 (but still thinks it's at T1 as this queued-send is after the controller internal state mechanism)
                      (delay of up to an hour waiting on HR92 refresh)
                      5. Controller receives T2 from HR92 ... problem is corrected

                      So the controller *thinks* the set-point is T2 only for a couple of minutes between (1) and (2). The rest of the time it thinks the set-point for the zone is T1.

                      This doesn't echo the log files I'm seeing as they record the set-point as being T2 for periods of an hour (approx. mostly). So I'm thinking what's happening in my situation is:

                      1. Controller wants to go from T1 -> T2
                      (delay of up to 4 minutes while controller thinks about sending out T2 Set-Point)
                      2. Controller finally sends set-point T2 <==== THIS MESSAGE IS SCRAMBLED
                      (delay of up to an hour waiting on HR92 refresh, controller thinks zone is happily on T2)
                      3. Controller receives T1 from HR92 ... zone appears to be on override
                      4. User ends up crying, because the set-point to T2 is never re-sent. Has to wait until next scheduled Set Point before anything changes.
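The two sequences above can be replayed with a toy model to compare what the controller believes against what the HR92 is actually set to. This is only a sketch of my reasoning in this post, not of the real firmware: the event names, the `replay` function and the example minute offsets are all made up, and it assumes (as in step 4 of the first list) that the queued T2 send still goes out after the controller has recorded the apparent override.

```python
def replay(events, t1, t2):
    """Replay (minute, kind) events and record (minute, kind, controller_view, hr92_actual).

    kind is one of:
      "schedule"  - controller decides to move from t1 to t2 and queues a send
      "send_ok"   - queued set point reaches the HR92
      "send_lost" - queued set point is transmitted but never received
      "refresh"   - HR92's hourly report; controller adopts whatever the HR92 says
    """
    controller = hr92 = t1
    pending = None
    timeline = []
    for minute, kind in events:
        if kind == "schedule":
            controller, pending = t2, t2
        elif kind == "send_ok" and pending is not None:
            hr92, pending = pending, None
        elif kind == "send_lost":
            pending = None          # nothing is ever re-sent
        elif kind == "refresh":
            controller = hr92       # shows up as an override if it differs
        timeline.append((minute, kind, controller, hr92))
    return timeline


if __name__ == "__main__":
    T1, T2 = 20.5, 10.0
    phantom = [(0, "schedule"), (2, "refresh"), (4, "send_ok"), (62, "refresh")]
    lost = [(0, "schedule"), (4, "send_lost"), (50, "refresh")]
    for name, events in (("phantom override", phantom), ("lost message", lost)):
        print(name)
        for row in replay(events, T1, T2):
            print("  ", row)
```

In the first replay the controller shows T2 for only a couple of minutes; in the second it shows T2 for most of an hour before flipping back, which is closer to what my logs show.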

                      Is the protocol one-way each time? Is it ACK'd by the receiver? i.e. if a collision occurred, would it necessarily be detected and the message re-sent, or is there scope in the protocol for a collision to occur that is not detected?

                      Originally posted by DBMandrake View Post
                      I've had Evohome for a year without Loop and two years with Loop now and I can't say I've noticed any real difference in problem rate with Evohome. I have had occasional comms issues before and after.

                      My Loop gas sender is in the boiler closet as the gas meter is directly above the boiler, so this means my loop gas sensor is within a metre of all three BDR91's and my CS92A as well.

                      I also have a Bresser 5in1 outdoor weather station on 868.3 MHz on the same side of the house as the boiler, which transmits approximately 6 times per minute. I've only had that since last Christmas and I haven't noticed any worsening of Evohome's performance.

                      As far as I know, devices on 868 MHz are very restricted by regulation in how long they're allowed to transmit - they're only allowed to send very short bursts, not transmit continuously.

                      I've found the biggest source of comms problems to be standing wave related where one or more devices ends up sitting in a standing wave null point resulting from multi-path reflections - a null that may only be present when moveable objects in the house are in a certain location or not in a certain location!

                      One example is that I had placed a portable dehumidifier (with a good chunk of metal in it) on the kitchen floor about a metre from the wall where the BDR91's are mounted, and that caused a complete comms loss to one of the BDR91's (but not the other two nearby ones!) for a couple of hours before I noticed it and moved the dehumidifier. It wasn't even plugged in - it was just sitting in the middle of the floor and not even in direct line of sight between controller and BDR91! (presumably the reflection from it was causing a null)
                      My Loop transmitter is outside the front of the house in the gas meter cabinet. My loop receiver is in the hallway in the middle of the house. My EvoHome controller is around the middle of the house also in the Living Room. Initially the Loop receiver was in our 2nd living room but it was too far from the Gas Meter to be reliable. Tricky trying to get all these things within range - our electric meter is almost diagonally opposite our gas meter.

                      If it were an obstruction, the failure point would have to be around the controller (multiple zones, both incidents), and this is situated in a constant location on a bookcase in a room that is not in daily use. There is a light on the bookcase, but it's regularly on and off and doesn't move around, so I would expect a lot more failures if it were this object. However, I will try moving a few things around and see how it goes.
                      Last edited by Jemster; 30 April 2019, 09:31 AM.


                      • Jemster
                        Automated Home Guru
                        • Dec 2018
                        • 123

                        #26
                        Originally posted by G4RHL View Post
                        Originally posted by DBMandrake View Post
                        One example is that I had placed a portable dehumidifier (with a good chunk of metal in it) on the kitchen floor about a metre from the wall where the BDR91's are mounted, and that caused a complete comms loss to one of the BDR91's (but not the other two nearby ones!) for a couple of hours before I noticed it and moved the dehumidifier. It wasn't even plugged in - it was just sitting in the middle of the floor and not even in direct line of sight between controller and BDR91! (presumably the reflection from it was causing a null)
                        Just a nearby device or electronic panel with a circuit board can cause issues even when not powered up. My garage door control panel would always prevent another wirelessly operated device from working even when there was no power to the door panel. My Netgear Orbi satellite did not function properly and was intermittent when I first had it near to my TV. I have a radio in my study which, if positioned too close to my iMac, upsets the iMac. The problem is proximity as well as frequency. As DBMandrake suggests, it is worth carefully checking what other devices are nearby that may cause an issue.
                        I will look into this... so hard to find a central position away from everything - what kind of distance are we talking here? There's nothing within a couple of feet of it in any direction at the moment.


                        • dty
                          Automated Home Ninja
                          • Aug 2016
                          • 489

                          #27
                          Originally posted by Jemster View Post
                          Is the protocol one-way each time? Is it ACK'd by the receiver? i.e. if a collision occurred, would it necessarily be detected and the message re-sent, or is there scope in the protocol for a collision to occur that is not detected?
                          There is no positive acknowledgement in the protocol. Some messages are requests which receive an immediate reply, but most are just sent with the assumption that they've arrived or that the periodic repetition will fix any missed messages.

                          The radios detect if something else is sending, and won't try and transmit at the same time, so the chances of collision are minimal. There is still a small window where two radios both decide that nothing is transmitting and so start transmitting simultaneously, but they should still be able to detect that. Whether the firmware on the device will do anything about it is another matter.


                          • Jemster
                            Automated Home Guru
                            • Dec 2018
                            • 123

                            #28
                            Originally posted by dty View Post
                            There is no positive acknowledgement in the protocol. Some messages are requests which receive an immediate reply, but most are just sent with the assumption that they've arrived or that the periodic repetition will fix any missed messages.

                            The radios detect if something else is sending, and won't try and transmit at the same time, so the chances of collision are minimal. There is still a small window where two radios both decide that nothing is transmitting and so start transmitting simultaneously, but they should still be able to detect that. Whether the firmware on the device will do anything about it is another matter.
                            And if periodic repetition doesn't occur for Set-Points then this is where it could all go wrong. If there's no acknowledgement on a sent message and both transmitters started at the same time, I doubt they'd detect the message being lost. I'm just not familiar with 868MHz, all my development is UDP and TCP based, but it sounds more like a UDP protocol than the guaranteed delivery you get with TCP.
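To put a number on why the lack of repetition matters, here is a tiny sketch. It assumes every transmission is lost independently with the same probability, which is a big simplification, and the 5% loss figure is invented purely for illustration. The 4-minute repeat interval is the one mentioned earlier in the thread.

```python
def p_never_received(p_loss, transmissions):
    """Chance that every one of `transmissions` independent sends is lost."""
    return p_loss ** transmissions


if __name__ == "__main__":
    p_loss = 0.05  # made-up per-message loss rate, for illustration only

    # A scheduled set point change is sent once and never repeated.
    print("set point, sent once:       ", p_never_received(p_loss, 1))

    # Something repeated about every 4 minutes gets roughly 15 attempts in an hour.
    print("repeated for an hour (x15): ", p_never_received(p_loss, 60 // 4))
```

With no ACK and no repeat, whatever the real loss rate is, that is directly the chance of a zone ending up on the wrong set point until the next scheduled change.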

                            I'm wondering if what I have here is a fault in the EvoHome control unit hardware.

                            I'm also wondering if disabling Local Override (I've read about this and assume it's on the Zone configuration menu) would help... although if the set-point message is lost, it's lost. The HR92 would still be wrong and the Controller would just think it was right, which is not much help.

                            I'm also wondering if the DTS92 that went off both times is faulty and started spamming the network, blocking the signals from the controller, before it died.


                            • dty
                              Automated Home Ninja
                              • Aug 2016
                              • 489

                              #29
                              Yes. Honeywell make a thing in their literature about a robust communication protocol, when the reality is that it's anything but. The protocol is completely proprietary, but the UDP analogy is useful.

                              As for the DTS92 spamming, your Domoticz will have logged that (assuming it's using a radio interface, not the Honeywell API). But it seems unlikely that you've got multiple catastrophic failures simultaneously.
