Firmware flaw in recent Stirling SU780XLE -80C freezers

Wow, I never thought I’d learn so much about a freezer company, but here we are. I took a deep dive on this issue with the Stirling SU780XLE ULT freezers. It’s still second-hand (through company reps and people on social media) and I don’t know if I believe everything about the explanations I’ve received (for example, I count roughly 8 instances of freezer firmware getting stuck through various contacts, and I vaguely remember a company rep saying this has happened <= 10 times), but this is my understanding of the situation:

The issue is indeed a firmware problem, and it affects all units produced between ~ Aug 2019 and ~ Sep / Oct 2020. Aug 2019 is when they switched one of their key electronic components to a Beagle Bone (apparently a circuit board akin to a Raspberry Pi). Part of its job is to relay messages from one part of the circuitry to another. The firmware they wrote for it had a flaw where, in certain circumstances that the company still does not understand, one part of the relay stops working, and the other part of the relay just keeps piling up commands that go unexecuted. So that’s the initial issue. There is also supposed to be “watchdog” code that recognizes these types of instances, but this was not working either. Thus, the freezer becomes stuck in the last state it was in before the relay broke. If it was in a “run the engine to cool down the freezer” mode, then it would have been stuck in a state that kept things cold. If it was in a “stay on but don’t do anything b/c it’s cool enough” mode, then it would have been stuck in a state where it didn’t cool the freezer at all. This is the state my freezer was stuck in**.
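To make the failure mode described above concrete, here is a minimal toy sketch in Python. It is not Stirling's firmware; the class, the function, and the 5-second timeout are all made up for illustration. One side of a relay queues commands, the other side executes them, and a watchdog resets the relay if commands are waiting but nothing has been executed for a while.

```python
from collections import deque


class Relay:
    """Toy relay (hypothetical): one side enqueues commands, the other executes them."""

    def __init__(self):
        self.pending = deque()      # commands waiting to be executed
        self.consumer_alive = True  # False simulates the broken half of the relay
        self.last_progress = 0.0    # time at which a command was last executed
        self.state = "idle"         # the last command that was actually executed

    def send(self, command):
        self.pending.append(command)

    def step(self, now):
        # A stalled consumer leaves commands piling up unexecuted,
        # so the state stays frozen at whatever it last was.
        if self.consumer_alive and self.pending:
            self.state = self.pending.popleft()
            self.last_progress = now

    def reset(self, now):
        # Stand-in for rebooting the controller.
        self.pending.clear()
        self.consumer_alive = True
        self.last_progress = now


def watchdog_check(relay, now, timeout=5.0):
    """Reset the relay if commands are waiting but none has run within `timeout`."""
    stalled = bool(relay.pending) and (now - relay.last_progress) > timeout
    if stalled:
        relay.reset(now)
    return stalled


relay = Relay()
relay.send("cool")
relay.step(now=1.0)           # executes "cool"
relay.consumer_alive = False  # the relay breaks
relay.send("idle")
relay.step(now=2.0)           # nothing happens; state is still "cool"
print(relay.state, watchdog_check(relay, now=10.0))
```

In this toy version the watchdog only looks at whether queued commands are making progress, which is the kind of check that would catch the pile-up described above regardless of what originally broke the relay.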

[** I’m actually not 100% convinced on this. My freezer stopped logging temperatures / door openings, etc. at the end of August. If I look at the number of freezer hours, it says ~8,000 hrs (consistent with Oct’19 through Aug’20) rather than the ~10,000 hrs expected for Oct’19 to Nov’20. It is definitely within the realm of possibility that my Stirling has been a zombie for the last 70+ days, and either slowly reached 5*C over time or had a second event over the last weekend that triggered the thaw in its susceptible state.]
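The run-hour reasoning above can be checked with a few lines of Python. This is only a rough sketch: the exact start date is an assumption (I only know the freezer arrived around October 2019, so Oct 1 is a placeholder), and it assumes the hour counter simply accumulates wall-clock time.

```python
from datetime import date


def hours_between(start, end):
    """Whole days between two dates, converted to hours."""
    return (end - start).days * 24


# Assumed start date: the freezer arrived around October 2019.
installed = date(2019, 10, 1)
last_log_entry = date(2020, 8, 26)  # most recent date in the freezer's history
found_thawed = date(2020, 11, 8)    # the day I found it thawed

print(hours_between(installed, last_log_entry))  # 7920, close to the ~8,000 hrs shown
print(hours_between(installed, found_thawed))    # 9696, close to the ~10,000 hrs expected
```

The ~1,800-hour gap between the two numbers corresponds to the 70+ days in which the freezer may have been running without counting or logging anything.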

It sounded like they had seen numerous freezers get stuck in the former mode, which was the less devastating one since it didn’t result in freezer thawing and product loss. They had seen one freezer get stuck in the catastrophic mode before mine, back on Aug 20th. They brought it back to their workspace, and couldn’t recreate the failure. They could, however, artificially break the relay to reproduce the condition, allowing them to create additional firmware that actually triggers the “watchdog” (and other failsafes) to reset the system when it has sensed that things have gone wrong, even though they still don’t know what the original cause of the issue is. The reason the freezers produced after Sep / Oct 2020 are unaffected is that these have already been programmed with the new firmware. My freezer was running firmware 1.2.2 when it encountered the problem, and 1.2.7 after it got updated.

Freezers made / distributed(?) within the last month were pre-programmed with the updated firmware, and are supposedly not susceptible to the GUI freezes. Apparently they’re having trouble updating the firmware in the units b/c the update requires a special 4-pin programming unit that is in short supply due to the pandemic.

I won’t get into the details of my experience with Stirling (it apparently even includes a local rep who contracted COVID). They completely dropped the ball in responding, and they know that (and I’m sure they regret it). What will remain a major stain on this situation is that THEY HAVE KNOWN ABOUT THIS FLAW FOR MONTHS AND DID NOT WARN ANY OF THEIR CUSTOMERS. I received an email ~ 8 days ago saying they were going to schedule firmware updates to “improve engine performance at warmer set points, enhance inverter performance and augment existing functionality to autonomously monitor and maintain freezer operation”. Other customers with susceptible units did not even receive this vague and rather misleading email. My guess is that they chose to try to maintain an untarnished public perception of their company over the well-being of the samples stored by their customers. My suspicion is that their decisions may have been exacerbated by the current demand for -80*C freezers for the SARS-CoV-2 mRNA vaccine cold chain distribution (Stirling has a major deal with UPS, for example), though there is no way I will ever confirm that.

After my catastrophic experience, they bungled their response, and only jumped to action after I tweeted about my experience. I really wanted to like this company, as they are local and not one of the science supply mega-companies (e.g. ThermoFisher). My fledgling lab is still out almost $3k in commercial reagents, and many of my non-commercial reagents and samples were compromised. They did make a special effort to update my firmware today and answer my questions, but I still can’t help but feel like a victim of poor manufacturing and service. All of the effort I’ve put in over the last few days was to get some answers and help others avoid the same situation I was put in.

I’ll post any updates to this page if I learn any more, but I’m now satisfied with my understanding of what happened. Now back to some actual science.

Stirling -80C Freezer Failure

I’m getting really tired of wasting time and brain-power on this, but unlike buying regular consumer goods (like the items on Amazon with hundreds to thousands of reviews), buying and dealing with research equipment is subject to really small sample sizes, so the more information that’s out there the better. Thus, I’ll keep this page as a running log of my experience with Stirling’s XLE Ultra Low Temperature (aka. -80*C) freezer.

TL;DR -> My 1 year-old freezer failed in the most catastrophic way: the firmware froze with the screen displaying -80*C while the contents slowly thawed, and the freezer had reached 5*C by the time I noticed it wasn’t working. No alarms, as the firmware had crashed and was frozen (again, displaying -80*C the whole time). While I’ve had no issue with their mechanics, I suspect their firmware is critically flawed.

Part 1) Discovering that the freezer had failed: I purchased a Stirling Ultracold SU780XLE a little over a year ago (~ October 2019), shortly after I started up my lab at CWRU. I’ve been in labs that had poor experiences with the ThermoFisher TSU series freezers, and the reviews for the Stirling seemed pretty good on Twitter. Furthermore, CWRU has a rebate program with Stirling due to their energy efficiency, and probably also because they are local (they are based in Ohio).

I went into the lab last Sunday evening (Nov 8) to do some work. I went to retrieve something from the Stirling -80*C, and saw that the usual ice on the front of the inner doors was gone. I opened up the inner doors and looked at the shelves, and there was water pooled on every shelf. I looked at some of the most recently preserved cryovials of cells we had temporarily stored on one of the shelves, and they were all liquid. Things had clearly thawed inside the freezer. I closed the outer door and looked at the screen at the top, and it was displaying -80*C. The screen is actually a touchscreen, so I tried to flip through its settings, but it was completely unresponsive to my touch. It became pretty clear to me in that moment that the freezer firmware had crashed with the screen displaying -80*C. Ooof.

The picture I took of the frozen screen, timestamped Sun, Nov 8, 7:25pm.

I pulled the freezer out from the wall, found the on/off switch, and switched it to OFF. The first time, I actually flipped the switch back to ON too soon, as the screen never reset. I’m guessing there must be some short-term battery / capacitor that allows the freezer to keep running through momentary interruptions in power. So I then set it to OFF, waited for the screen to go blank, and then set it back to ON. After booting up, the screen displayed 5*C. So there we go. It was indeed stuck on that screen, and rebooting the firmware got it to show the real temperature again. Which is a VERY BAD real temperature.

The picture I took of the screen after resetting the freezer, timestamped Sun, Nov 8, 7:28pm.

I immediately emailed Stirling (email timestamped Sun, Nov 8, 7:37 PM). I received a response from a customer service representative Mon, Nov 9, 8:01 AM saying “I’m sorry to hear that you are having issues.” and that they were referring me to the service dept. Got an email from the Stirling service department Mon Nov 9, 8:39 AM asking for more information and a picture of the device’s service screen. I replied to this email with all requested information Mon, Nov 9, 10:43 AM. I got an email telling me I was “Incident-7576” on Mon, Nov 9, 11:00 AM. Complete radio silence from them as of writing this section of this post, which is ~ 72 hours later (Thurs, Nov 12, ~ 11:00 AM), even after I sent them a pretty strongly worded email yesterday at 6:00 AM. I’ll follow up on my continued experience interacting with the company in section 3 of this post.

Otherwise, the mechanics for the freezer seemed to be fine. It took me about an hour to mop up all of the water, and look through my boxes to see what had thawed (which was everything except the 15ml conicals, which seemed to have enough mass to them to have not fully thawed). I was still very aggravated and in a bit of shock to have had to deal with this, but still went about my work. Two hours later, the freezer was back down to -30*C. The next morning, it was back at -80*C. So the reset was clearly sufficient to make the freezer operational* again. (*With an asterisk, since it presumably still carries the same firmware glitch that caused the problem in the first place.)

Part 2) Taking stock of my lost items and forming my interpretation of what happened: Over the next couple of days, I had a chance to take stock of everything I had lost during the thaw. Being a new lab (and thus with a ~ 1 year old freezer) we didn’t have a ton of items in there, but they were not inconsequential. The commercial reagents were largely competent bacterial cells, which amounted to ~ $2,110 of lost material. There were also ~ $720 worth of chemicals which, after a freeze-thaw cycle, are of somewhat questionable potency and will likely need to be purchased again before use in publication. There were also dozens of cryovials of cell lines made in house. There were also a few cryovials of cells, dozens of tubes of patient serum, and viral stocks for SARS-CoV-2 research either given by other labs or provided by BEI Resources, which would need to be replaced as we have no backups. While there is no monetary value associated with these reagents, the amount of work-time used in creating them and now replacing them is a major loss.

As a scientist, I think it’s natural for me to try to synthesize all the information I have to piece together what happened. There was no power loss (it was a sunny weekend without any storms, and no other equipment in the lab had any aberrant behavior). Nobody had gone into it for any extended amount of time, especially since it was over the weekend. The last time I had gone into it was Friday afternoon, when it seemed fine. That said, it is very well possible it had already crashed at that time. I don’t think I can visually tell the difference between a freezer at -80*C, -40*C, or maybe even -10*C. Frozen looks frozen. In lieu of any alarms or temperature readings provided by the freezer itself, the only visual clue was going to be water from the thawed ice in the freezer, which by that point was going to be too late.

To see if I could figure out when the freezer may have crashed / failed, I tried going back into the freezer log. This is all the information I could glean from the freezer:

So, uh, that history feature wasn’t all that informative, but still a couple of points I could glean from looking at it.
1) It goes from -80*C in the data points directly preceding the event, to being > 0*C when I restarted it. So it completely stopped logging during the event. This is entirely consistent with the software having crashed, and is the reason it was still showing -80*C on the screen while it had thawed.
2) Uhhhh. I can’t actually figure out what day and time it failed b/c it had apparently logged its most recent operation as August 26th. Clearly it wasn’t August 26th when it had failed, since August 26th was 72 days before Fri, Nov 6, which was the last time I had looked in the freezer before the event, when it was clearly still completely frozen. Weirdly, I didn’t have to tell it what day it was after I reset it, so it must have had an internal clock that knew it was Nov 8th upon the reset. So here’s another indication of there being something glitchy with their firmware.
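The two points above boil down to a stale-log check: if the newest log entry is much older than "now", the controller has stopped logging even if the screen looks fine. Here is a minimal Python sketch of that idea. The function names and the one-hour threshold are my own assumptions, not anything the freezer actually exposes, and I don't know what format its log export takes.

```python
from datetime import datetime, timedelta


def log_is_stale(last_entry, now, max_age=timedelta(hours=1)):
    """True if the newest log entry is older than `max_age` (assumed threshold)."""
    return (now - last_entry) > max_age


def gap_in_days(last_entry, now):
    """Whole days between the newest log entry and `now`."""
    return (now - last_entry).days


# Dates from this post; the times of day are placeholders.
last_entry = datetime(2020, 8, 26)
last_seen_frozen = datetime(2020, 11, 6)

print(log_is_stale(last_entry, last_seen_frozen))  # True
print(gap_in_days(last_entry, last_seen_frozen))   # 72
```

A check like this would have to run somewhere other than the freezer's own firmware to be of any use, since the whole problem is that the firmware itself had stopped doing its job.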

Ironically, I had a separate low-temperature thermometer plugged into it (TraceableLIVE® ULT Thermometer, Item#: LABC3-6510). It really isn’t a bad thermometer, but it eats up batteries and I ran out of disposable AAA batteries, so I was waiting for some rechargeable AAAs to come in from Amazon. (I don’t think it takes a wall plug, which it should, so that it only needs the batteries during power outages!) TBH, they had already come in a week or two ago, but the freezer was operating perfectly fine until this, so it wasn’t high up on my to-do list to charge and replace the batteries and get the secondary thermometer up and running again. In hindsight, a very naive and critical mistake!

Part 3) Stirling’s response to this:

Thurs, Nov 12, 11:00 AM: So far, it’s been pretty nonexistent. I wrote them an email yesterday (Nov 11) saying 1) Everything I’ve seen is telling me this is a catastrophic failure of the freezer itself, so are you going to take responsibility for it? 2) I’m still quite worried about the freezer’s operation, since the glitch that caused this has not been addressed. I have yet to get any non-automated response from them past the most recent email on Nov 9, 11 AM.

Thurs, Nov 12, ~ 5:00 PM: Tweeting about my experience seemed to have escalated things, as I got two phone calls. The first was from the technician handling my case (“Incident-7576”), who asked if anyone had been in touch with me about scheduling the fix on the previous Monday and Tuesday. I said no, this is the first response I had gotten. I also pointed out that I had emailed him yesterday with some questions. Apparently he had not seen the email. So, a rather poorly managed customer and technical service response.

As soon as I got off the phone, the VP of Global Services called me (this is where I think the tweets likely made a difference). Provided apologies (as expected), but I also got to ask for answers to my specific questions. Here are things I learned:
1) “We’re not responsible for sample loss”. So they won’t cover anything that you lose if the freezer fails and thaws, even if it fails in the most spectacularly bad way, completely due to flaws in freezer design or production that torpedo its operation.
2) The mechanics are covered for 7 yrs, but the material and labor warranty is only for 2 yrs. This includes things like “door handles and electronics”, with electronics clearly being the most relevant item here. They offered to extend this warranty to 3 yrs. I don’t think I’m unreasonable to feel like that is a pretty weak gesture based on the freezer failing the way it did.
3) I’ve had people tell me I should ask for a refund to get it replaced. Well, they don’t do that.
4) Apparently there are three parts to their firmware. One of them is called the “Beagle Bone”, which they said is responsible for making the real-time connection between the freezer settings and the parts. Quick google search suggests it’s something like this.

The saga continues. Let’s see what the technicians tomorrow say.

Fri, Nov 13th: Causing a stir on twitter apparently kicked things into action. I also put my detective hat on and I think I figured out what was going on. Too much to bury way down here, so I made a new post.

Miniprep efficiency

The research ramp-down period caused by the SARS-CoV-2 pandemic was a weird time for me / the lab. I sent Sarah to work from home for 10 or so weeks, meaning I had to do the lab work myself if I wanted to make any progress on any of the existing grant-work, or on any of the SARS-CoV-2 research I was trying to boot up. This has resulted in some VERY long weeks over the last few months, as I was really trying to do everything at that point. Cognizant of this, I even started timing myself doing some of the more routine / mundane tasks, to see if I could try to maximize my efficiency. Perhaps the most consistent / predictable of the tasks were minipreps. In particular, I was curious whether doing more minipreps simultaneously saved me time in the long run.

So the short answer was yes. 24 is a very comfortable / logical number for me (I just fill up my mini-centrifuge, and 24 samples fill exactly three 8-tube PCR strips for Sanger later on), and I consistently processed those in about an hour. Doing fewer would be somewhat less efficient, though sometimes you have to do that if you’re in a rush to get some particular clone of a recombinant DNA plasmid. Then again, doing more than 24, while somewhat exhausting, does save me some time overall. Thus, I found out that was a worthwhile strategy to plan for during that period.

That said, I’m very glad to have Sarah back in the lab helping me with some of the wet-lab work again. Not only does it save me time, but also saves me focus; I’ve gotten pretty good at multi-tasking, but I still do hit a limit in terms of the number of DIFFERENT things I can do / think about at the same time.

Plasmid Lineages

Recombinant DNA work is integral to what we’re doing here, so I’ve become extremely organized with keeping track of the constructs we are building. This includes having a record of how sequences from two constructs were stitched together to create a new construct. Here’s a network map showing how one or more different plasmid sequences were combined to create each new construct.

[The series of letters and numbers prefixed with G (for Gibson) are unique identifiers I started giving new constructs when it became clear partway through my postdoc that I was going to need a better way of tracking everything I was building. Those prefixed with A are constructs obtained through Addgene. Those prefixed with R are important constructs I had built before this tracking system, where I had to start giving them identifiers retroactively.]
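For anyone wanting to keep a similar record, here is a minimal Python sketch of how lineage data like this can be stored and queried. It is not the script that made the plot above, and the construct IDs are made up purely for illustration: each construct maps to the parent constructs it was stitched together from, and a small function walks back through the map to collect every ancestor.

```python
# Hypothetical construct IDs, following the G / A / R prefix scheme described above.
# Each new construct maps to the parent constructs it was built from.
parents = {
    "G003": ["A001", "R002"],
    "G010": ["G003"],
    "G015": ["G003", "A007"],
}


def ancestors(construct, parents):
    """Return the set of all constructs that contributed sequence to `construct`."""
    seen = set()
    stack = list(parents.get(construct, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            # Constructs with no recorded parents (e.g. Addgene plasmids) end the walk.
            stack.extend(parents.get(current, []))
    return seen


print(sorted(ancestors("G015", parents)))  # ['A001', 'A007', 'G003', 'R002']
```

The same parent map can be handed to a graph-plotting library to draw a network like the one above, with one edge per parent-to-child relationship.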

Edit 9/1/2020: Even if some of my code / script-writing is kind of haggard, I figure I’ll still publicly post them in case it’s useful for trainees. Thus, you can find the script + data files to recreate the above plot at this page of the lab GitHub.