URGENT

WHERE'S ENG?

reply all

STILL WAITING

WASTING DEVS' TIME

ANSWER NOW

On-call doesn't
have to
feel like this

need answer now · cc'd engineering · another screenshot · SLA slipping · reply all? · leadership ping · repro attached · who owns this · DM from CSM · customer waiting

Slack

#support-escalations

Mara

12 new replies

Northstar is asking again, the export still fails for finance admins, the CSV stops at 42 percent, and they want a technical answer now because their VP is already asking whether we lost data or whether this is another incident we have not acknowledged publicly yet. They say the dashboard hangs, the retries hang, the fallback hangs, the CSV preview hangs, the admin panel hangs, and the only message they see is a spinner with no error. They also want to know whether yesterday's partial export is trustworthy, whether they need to re-run every report, whether they should warn their own finance team, whether this affected scheduled deliveries, whether we have an ETA, whether this is tied to the deploy, and whether they should pause their month-end close until somebody from engineering confirms root cause in writing.

Gmail

Inbox

2m

ops@northstar.dev

RE: customer update needed before leadership escalation call starts

Can you confirm whether this is actual data loss, delayed processing, a permissions regression, or a queue backlog before we answer them? The customer has already asked three different teams for three different explanations and is comparing them side by side. At this point they have screenshots, timestamps, names from previous calls, and a long thread where support said one thing, success said another thing, and somebody from sales promised this would never happen again. They want one final answer with exact impact, exact timeline, exact mitigation, exact next step, and exact wording they can quote internally without us correcting ourselves again in the next email.

Linear

Support Ops

SUP-482

Billing screen blank for workspace admin downgrade path after seat reassignment, cache refresh, role sync retry from last night's deploy, follow-up refresh after invite acceptance, and second session bootstrap when legacy permissions payload returns stale access fields instead of the new billing entitlement map support thought had already been migrated

Investigating

Slack

DM with Luis

Luis

typing...

I need something support-safe to send back in the next few minutes. Not the full story, not an engineering paragraph, just the cause, whether the customer did anything wrong, whether this can spread to other accounts, and exactly what support should tell them to do next. Also tell me whether this is a bug, a configuration mismatch, a role propagation issue, a stale session issue, a browser issue, or an outage issue, because they are asking all of those directly now. Also tell me if we can say fixed, or if we have to say mitigated, or if we have to say still investigating, and if we say still investigating I need one more line that sounds confident enough that they stop asking for engineering to join the thread immediately.

Gmail

Inbox

5m

csm@acmehq.com

Customer asked again and attached more screenshots from multiple admins

Attaching another screenshot set. They say the same blank state is visible on every admin seat, across two browsers, in incognito, after logout, after cache clear, and after a fresh role invite, so they are now assuming the product is broken globally. They also checked staging against production, compared two workspaces, retried from a mobile hotspot, and asked why our status page says operational if their admins still cannot complete billing tasks. They want us to explain why the page sometimes loads, why it sometimes flashes and goes white, why some seats can open it once and then never again, and why the customer had to prove all of this before somebody technical looked at it.

Linear

Infra

INC-221

Webhook retry queue climbing after downstream 503s, auto-disable threshold approaching, support still waiting on a customer-safe explanation, retry workers reprocessing the same payload families, backlog spilling into the next batch window, and alert fatigue making everybody assume someone else already checked whether this is customer-side or platform-side

High priority

Slack

#leadership

Evan

urgent

Need exact blast radius before the next external update, including affected workspaces, failed job counts, first seen timestamp, whether this started before or after the deploy, and who actually has the logs because three people just said they were looking and nobody has posted a real answer yet. Also need whether revenue is at risk, whether renewals are blocked, whether any enterprise accounts are impacted, whether support is sending inconsistent messages, whether legal needs to review wording, whether success needs a script, whether product needs to disable anything temporarily, and whether we have enough confidence to say root cause instead of probable cause in the next update.

Gmail

Inbox

7m

support@brightpath.io

Need approved wording before customer forwards this thread internally

Please avoid saying outage if this is actually a permissions issue, avoid saying permissions issue if this is actually a regression, and avoid implying customer fault unless you are certain, because the customer copied leadership and legal has asked us not to create another contradictory written record. Also avoid saying resolved unless all impacted seats are verified, avoid saying isolated unless we checked the related workspaces, avoid saying expected behavior unless product signed off, and avoid saying temporary because the customer is now asking whether temporary means minutes, hours, days, or until the next deploy. We need wording that is cautious, specific, defensible, and still somehow reassuring enough that they stop escalating this thread to additional executives.

Linear

Core App

APP-903

Viewer role can still navigate to billing route, trigger blank state, create misleading support tickets that look like billing outages instead of access problems, bypass the intended guard on first load when stale client permissions race the newer entitlement response, and continue generating contradictory reproduction steps that make support believe there are multiple bugs when the product is actually failing in one very embarrassing and highly visible way

Todo