BCN Hosted - Notice history

Remote Desktop Platform - Operational

100% - uptime
Jun 2024 · 100%, Jul 2024 · 100.0%, Aug 2024 · 100.0%

Hosted Platform - Operational

100% - uptime
Jun 2024 · 100%, Jul 2024 · 100.0%, Aug 2024 · 100.0%

Hosted Email Services - Operational

100% - uptime
Jun 2024 · 100%, Jul 2024 · 100.0%, Aug 2024 · 100.0%

External Connectivity - Operational

100% - uptime
Jun 2024 · 100%, Jul 2024 · 100.0%, Aug 2024 · 100.0%

Veeam Cloud - Operational

100% - uptime
Jun 2024 · 100%, Jul 2024 · 100.0%, Aug 2024 · 100.0%

Storage Platform - Operational

100% - uptime
Jun 2024 · 100%, Jul 2024 · 100.0%, Aug 2024 · 100.0%

Azure Services - Operational

100% - uptime
Jun 2024 · 100%, Jul 2024 · 100.0%, Aug 2024 · 100.0%

Vendor/3rd Party - Operational

100% - uptime
Jun 2024 · 100%, Jul 2024 · 100.0%, Aug 2024 · 99.28%
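
To put a monthly uptime percentage into more familiar terms, the following is a minimal sketch that converts it into minutes of downtime. It assumes the figure covers the whole calendar month, which this page does not state, and the function name `downtime_minutes` is hypothetical.

```python
import calendar


def downtime_minutes(uptime_percent: float, year: int, month: int) -> float:
    """Convert a monthly uptime percentage into minutes of downtime.

    Assumption: the percentage is measured over the full calendar month.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


# Vendor/3rd Party figure reported for Aug 2024.
aug_downtime = downtime_minutes(99.28, 2024, 8)
print(f"{aug_downtime:.0f} minutes")
```

Under that assumption, 99.28% over a 31-day month works out to roughly 321 minutes, or a little under five and a half hours.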

Notice history

Aug 2024

No notices reported this month

Jul 2024

Unable to access Microsoft 365 Services
  • Resolved

    This incident has been resolved by Microsoft. Their final update, taken from the Azure status history page (Azure status history | Microsoft Azure), is below.

    What happened?

    Between approximately 11:45 UTC and 19:43 UTC on 30 July 2024, a subset of customers may have experienced issues connecting to a subset of Microsoft services globally. Impacted services included Azure App Services, Application Insights, Azure IoT Central, Azure Log Search Alerts, Azure Policy, as well as the Azure portal itself and a subset of Microsoft 365 and Microsoft Purview services.

    What do we know so far?

    An unexpected usage spike resulted in Azure Front Door (AFD) and Azure Content Delivery Network (CDN) components performing below acceptable thresholds, leading to intermittent errors, timeouts, and latency spikes. While the initial trigger event was a Distributed Denial-of-Service (DDoS) attack, which activated our DDoS protection mechanisms, initial investigations suggest that an error in the implementation of our defenses amplified the impact of the attack rather than mitigating it.

    How did we respond?

    Customer impact began at 11:45 UTC and we started investigating. Once the nature of the usage spike was understood, we implemented networking configuration changes to support our DDoS protection efforts, and performed failovers to alternate networking paths to provide relief. Our initial network configuration changes successfully mitigated the majority of the impact by 14:10 UTC. Some customers reported less than 100% availability, which we began mitigating at around 18:00 UTC. We proceeded with an updated mitigation approach, first rolling this out across regions in Asia Pacific and Europe. After validating that this revised approach successfully eliminated the side effect impacts of the initial mitigation, we rolled it out to regions in the Americas. Failure rates returned to pre-incident levels by 19:43 UTC. After monitoring traffic and services to ensure that the issue was fully mitigated, we declared the incident mitigated at 20:48 UTC. Some downstream services took longer to recover, depending on how they were configured to use AFD and/or CDN.
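
Microsoft's timeline above is given in UTC, while the earlier updates on this page are stamped in BST. As a rough aid for reading the two together, here is a minimal sketch that converts the stated milestones to BST and measures elapsed time from the start of impact. The fixed UTC+1 offset is an assumption that holds for July, and the helper `at` and the milestone labels are illustrative only.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
# Assumption: a fixed UTC+1 offset stands in for BST, which is valid in July.
BST = timezone(timedelta(hours=1), "BST")


def at(hhmm: str) -> datetime:
    """Build a UTC timestamp on 30 July 2024 from an 'HH:MM' string."""
    hour, minute = map(int, hhmm.split(":"))
    return datetime(2024, 7, 30, hour, minute, tzinfo=UTC)


# Milestones as stated in Microsoft's final update.
milestones = {
    "impact began": at("11:45"),
    "majority of impact mitigated": at("14:10"),
    "failure rates back to pre-incident levels": at("19:43"),
    "incident declared mitigated": at("20:48"),
}

start = milestones["impact began"]
for label, moment in milestones.items():
    local = moment.astimezone(BST).strftime("%H:%M %Z")
    print(f"{label}: {local} ({moment - start} after impact began)")

impact_duration = milestones["failure rates back to pre-incident levels"] - start
```

On these figures the impact window from 11:45 UTC to 19:43 UTC lasts 7 hours 58 minutes, and it starts at 12:45 BST.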

    What happens next?

    Our team will be completing an internal retrospective to understand the incident in more detail. We will publish a Preliminary Post Incident Review (PIR) within approximately 72 hours, to share more details on what happened and how we responded. After our internal retrospective is completed, generally within 14 days, we will publish a Final Post Incident Review with any additional details and learnings. To get notified when that happens, and/or to stay informed about future Azure service issues, make sure that you configure and maintain Azure Service Health alerts – these can trigger emails, SMS, push notifications, webhooks, and more: https://aka.ms/ash-alerts. For more information on Post Incident Reviews, refer to https://aka.ms/AzurePIRs. Finally, for broader guidance on preparing for cloud incidents, refer to https://aka.ms/incidentreadiness.

  • Update

    Latest update from Microsoft

    30 Jul 2024, 16:51 BST

    Telemetry shows that the service has remained stable. We're continuing to monitor the service for an extended period of time to confirm resolution.

  • Monitoring

    Latest update from Microsoft

    30 Jul 2024, 15:31 BST

    We've implemented a networking configuration change and some Microsoft 365 services have performed failovers to alternate networking paths to provide relief. Monitoring telemetry shows improvement in service availability, and we're continuing to monitor to ensure full recovery.

  • Identified

    Latest update from Microsoft

    30 Jul 2024, 14:14 BST

    We've identified a potential networking issue and we're investigating this further.

    Next update by:

    Tuesday, 30 July 2024 at 16:00 BST

  • Update

    Latest update from Microsoft

    30 Jul 2024, 13:54 BST

    Our investigation is currently focused on networking infrastructure. This quick update is designed to give the latest information on this issue.

  • Update

    Latest update from Microsoft

    Issue ID: MO842351
    Affected services: Microsoft 365 suite
    Status: Service degradation
    Issue type: Incident
    Start time: Jul 30, 2024, 1:21 PM GMT+1

    User impact
    Users may be unable to access multiple Microsoft 365 services.

    More info
    Users who are able to access Microsoft 365 services may experience latency or degraded feature performance.

    Scope of impact
    Impact is specific to some users who are served through the affected infrastructure, attempting to utilize multiple Microsoft 365 services or features.

    Current status
    Jul 30, 2024, 1:28 PM GMT+1
    We're reviewing service monitoring telemetry to determine our next troubleshooting steps.
    Next update by:
    Tuesday, July 30, 2024 at 3:30 PM GMT+1

    History of updates
    Jul 30, 2024, 1:24 PM GMT+1
    We're investigating a potential issue and checking for impact to your organization. We'll provide an update within 30 minutes.

  • Investigating

    Microsoft have published that they're seeing issues with access across multiple Microsoft 365 services. Microsoft incident ID MO842351.

Global Crowdstrike Outage
  • Resolved

    All BCN customers have been brought back online following the Crowdstrike outage. BCN will continue to monitor.

  • Update

    BCN to monitor over the weekend to ensure no further issues arise.

  • Update

    The vast majority of customers are now back up and running following the Crowdstrike outage. BCN will continue to remediate remaining customers and monitor the situation over the weekend.

  • Monitoring

    There have been no further issues noted from Crowdstrike, with remediation work continuing. 72% of BCN customers impacted by the outage are now fully operational.

  • Update

    BCN engineers continue to work with impacted customers, with 40% of those now fully resolved.

    ******IMPORTANT******

    Please do not engage with any other 3rd parties claiming they are able to help remediate the issue. If you have any concerns about the outage, please contact BCN directly on 0345 095 7001 if you are not already in dialogue with us regarding this matter.

  • Identified

    Crowdstrike are continuing to provide further clarity on how to remediate the impact, including which specific versions of the files are problematic (those that caused this issue) and which are not. Our technical teams are using this insight and continuing to work with impacted customers.

  • Monitoring

    BCN have identified the customers impacted, and engineering resource from across all teams is reaching out to affected customers to assist with mitigation and remediation.

  • Identified

    Crowdstrike have advised the following:

    Tech Alert | Windows crashes related to Falcon Sensor | 2024-07-19

    Cloud: US-1, EU-1, US-2

    Published Date: Jul 19, 2024



    Summary

    • CrowdStrike is aware of reports of crashes on Windows hosts related to the Falcon Sensor.


    Details

    • Symptoms include hosts experiencing a bugcheck\blue screen error related to the Falcon Sensor.

    • This issue is not impacting Mac- or Linux-based hosts.

    • Channel file "C-00000291*.sys" with timestamp of 0527 UTC or later is the reverted (good) version.


    Current Action

    • CrowdStrike Engineering has identified a content deployment related to this issue and reverted those changes.

    • If hosts are still crashing and unable to stay online to receive the Channel File Changes, the following steps can be used to work around this issue:

    Workaround Steps:

    • Reboot the host to give it an opportunity to download the reverted channel file.  If the host crashes again, then:

      • Boot Windows into Safe Mode or the Windows Recovery Environment

      • Navigate to the C:\Windows\System32\drivers\CrowdStrike directory

      • Locate the file matching “C-00000291*.sys”, and delete it. 

      • Boot the host normally.

    Note:  Bitlocker-encrypted hosts may require a recovery key
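
The "locate and delete" step in the workaround above comes down to a filename pattern match inside the CrowdStrike driver directory. The following is a minimal sketch of that matching logic only, with hypothetical function names and a dry-run default. It assumes a Python interpreter is available, which is normally not the case in Safe Mode or the Windows Recovery Environment, where the step is carried out by hand. It is not CrowdStrike's tooling, and it does not tell the reverted (good) file apart from the problematic one.

```python
from pathlib import Path

# Pattern quoted in the CrowdStrike Tech Alert.
CHANNEL_FILE_PATTERN = "C-00000291*.sys"


def find_channel_files(driver_dir: Path) -> list[Path]:
    """Return the files in driver_dir whose names match the channel file pattern."""
    return sorted(p for p in driver_dir.glob(CHANNEL_FILE_PATTERN) if p.is_file())


def remove_channel_files(driver_dir: Path, dry_run: bool = True) -> list[Path]:
    """List matching channel files, deleting them only when dry_run is False."""
    matches = find_channel_files(driver_dir)
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Directory named in the workaround steps; a dry run only reports matches.
    target = Path(r"C:\Windows\System32\drivers\CrowdStrike")
    for match in remove_channel_files(target, dry_run=True):
        print(f"would delete: {match}")
```

Keeping the default as a dry run means the matches can be reviewed before anything is removed.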


    Latest Updates

    • 2024-07-19 05:30 AM UTC | Tech Alert Published.

    • 2024-07-19 06:30 AM UTC | Updated and added workaround details.

    • 2024-07-19 08:08 AM UTC | Updated

  • Investigating

    BCN are aware of global issues with Crowdstrike. We are in the process of contacting impacted customers to discuss a remediation plan. Further updates will be posted here.

Jun 2024

No notices reported this month

Jun 2024 to Aug 2024
