GUEST BLOG | December 13, 2022

Interdependencies – Handshakes Between Critical Infrastructures | Ernie Hayden

As of this writing, the United States has recently been threatened by a major railroad union strike. The railroads are a major element of the country’s critical infrastructure. Their shutdown could lead to multiple, cascading impacts on the delivery of goods and services, not only in the US but also in Canada and Mexico. Shipping lines could also be impacted by a railroad strike, since they would not be able to receive or offload containers and cargo to and from rail cars.

Per a CNN article, a rail strike would cost the US economy about $1 billion in the first week, according to the Association of American Railroads. The ripple effect of a shipping stoppage of that magnitude could lead to global food shortages and more: the American Chemistry Council said that around $2.8 billion in chemical shipments would be impacted each week. A separate article by the Association of American Railroads also outlines how, should a service interruption occur, other transportation and logistics partners are not positioned to take up the slack and keep goods moving efficiently across the nation. Specifically, in the first half of 2022, more than 75,000 rail shipments began their journey each day; in the event of a shutdown, these shipments would sit idle.

This scenario demonstrates the fundamental concept of cascading failures: the loss of one element of the country’s critical infrastructure ripples outward, exposing the interdependencies in play between transportation, manufacturing, chemical production, and other vital systems.

What is Critical Infrastructure?

In the United States, formal policy discussion of infrastructure as a distinct concept began around 1983. The infrastructure under discussion included highways, public transit, wastewater treatment, water resources, air traffic control, airports, and municipal water supplies. However, these were not yet given the attribute of “critical.”

From 1983 until 2013, the US government formed policies and held discussions regarding infrastructure and “critical infrastructure.” There were multiple iterations that are beyond the scope of this article; however, the US government perspective on “critical infrastructure” in use today, in 2022, was laid out by the Obama Administration in Presidential Policy Directive 21 and Executive Order 13636.

These policy documents identified a list of US infrastructure that should be considered “…so vital to the United States that the incapacity or destruction of such systems and assets would have a debilitating impact on security; national economic security; national public health or safety; or any combination of those matters.”[1]

The following are the 16 specifically identified sectors in the US Critical Infrastructure hierarchy:

  • Chemical Sector
  • Commercial Facilities Sector
  • Communications Sector
  • Critical Manufacturing Sector
  • Dams Sector
  • Defense Industrial Base Sector
  • Emergency Services Sector
  • Energy Sector
  • Financial Services Sector
  • Food and Agriculture Sector
  • Government Facilities Sector
  • Healthcare and Public Health Sector
  • Information Technology Sector
  • Nuclear Reactors, Materials, and Waste Sector
  • Transportation Systems Sector
  • Water and Wastewater Systems Sector

These appear in graphic format at the US Department of Homeland Security Cybersecurity & Infrastructure Security Agency (CISA) website.

Interdependencies Between Critical Infrastructure Sectors

As the railroad strike scenario above illustrates, a shutdown of the railroads is not a singular impact; it ripples into other aspects of commerce, manufacturing, and modes of transportation. These knock-on effects can be viewed as interdependencies.

Interdependencies and connections between infrastructure sectors or elements mean that damage, disruption, or destruction of one element can cause cascading effects that negatively impact the continued operation of another infrastructure element or sector.

These interdependencies are not always obvious and can be subtle. The current trend toward greater and more rapidly growing infrastructure interdependency is driven by the Internet. Some experts on critical infrastructure analysis have observed that the advent of the Internet has made critical infrastructure more complex and more interdependent, and thus more fragile.

In their research project Keeping America Safe: Toward More Secure Networks for Critical Sectors (2017), the MIT Center for International Studies observed that “…no one currently understands the extent to which electricity generation is coupled with other sectors, and therefore the risk of ‘catastrophic macroeconomic failure’ in the event of a cyber attack is not adequately known.”

A seminal report by Rinaldi, et al., Identifying, Understanding, and Analyzing Critical Infrastructure Interdependencies, written just before the catastrophic events of September 11, 2001, reported:

“…what happens to one infrastructure can directly and indirectly affect:

  • Other infrastructures,
  • Large geographic regions, and
  • Send ripples throughout the national and global economy…”

This passage underscores how important it is for our country, and for the broader North American continent, to understand and study infrastructure interdependencies.

Another analysis of critical infrastructure interdependencies noted that “…critical infrastructure interdependencies constitute a risk multiplier: they can themselves be a threat or hazard, affect the resilience and protection performance of critical infrastructure, and lead to cascading and escalating failures.”

Interdependencies need to be understood as part of any risk assessment. If they are not considered, the risk assessment could offer the wrong conclusions.
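
One practical way to bring interdependencies into a risk assessment is to make them explicit as a dependency graph and then trace what a single failure can reach. The sketch below is purely illustrative: the sectors and the edges between them are invented for this example and are not drawn from any real assessment.

```python
from collections import deque

# Illustrative dependency graph: an edge A -> B means "B depends on A",
# so a failure of A can cascade to B. Sector names are examples only.
DEPENDS_ON = {
    "rail": ["manufacturing", "chemicals", "ports"],
    "electricity": ["water", "communications", "rail"],
    "communications": ["finance", "emergency_services"],
    "ports": ["retail"],
    "chemicals": ["agriculture"],
}

def cascade(initial_failure: str) -> list[str]:
    """Breadth-first walk of everything reachable from the initial failure."""
    impacted, queue = set(), deque([initial_failure])
    while queue:
        node = queue.popleft()
        for dependent in DEPENDS_ON.get(node, []):
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return sorted(impacted)

if __name__ == "__main__":
    # A rail shutdown ripples to manufacturing, chemicals, ports, and beyond.
    print("rail failure impacts:", cascade("rail"))
```

Even a toy model like this forces the assessor to write down which sectors feed which, and that exercise is often where the non-obvious interdependencies first surface.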

Case Studies of Critical Infrastructure Interdependencies

In addition to our brief discussion of the railroads at the beginning of this article, here are several additional case studies that demonstrate how interdependencies were highlighted by the failure of a single element or piece of critical infrastructure.

Admittedly, the first two cases are dated; however, they are well documented and provide a sense of cascading failures in two industries.

Orbiting Satellites and Critical Infrastructure Interdependencies

In May of 1998, the five-year-old Galaxy IV Communications Satellite failed. The satellite’s computer and backup computer failed, causing the satellite to tilt away from Earth. The owner, PanAmSat, was unable to realign the satellite. PanAmSat was finally able to reroute traffic the next day to another satellite; however, the Galaxy IV was considered permanently out of service, seven years short of its intended 12-year operational life.

What were the impacts of this event? Here are some examples:

  • Pager service (remember that?) was disabled for 80-90% of U.S. customers (about 45 million). This impacted physicians, emergency departments, police, fire, and emergency medical technicians. This failure alone demonstrated how this one item of critical infrastructure can impact a major system used by emergency service providers.
  • Credit card authorization services at around 5,400 Chevron stations and most Wal-Mart stores were shut down.
  • Television transmissions were impacted. These included:
    • CBS
    • Warner Brothers
    • UPN
    • Reuters TV
    • Motor Racing Network
    • CNN Airport Channel
    • Chinese Television Network in Hong Kong
    • US Armed Forces Entertainment Network
  • Radio feeds were affected, including National Public Radio’s national feed
  • Private business TV networks, such as those operated by Aetna, Microsoft, and 3M, were also impacted

Overall, this one satellite failure caused a broad impact across multiple industries relying on the satellite’s communications capabilities.

Seattle Tacoma Airport and Olympic Oil Pipeline Interdependencies

The Olympic Pipeline is a 400-mile interstate pipeline system running from the US-Canada border near Blaine, Washington to Portland, Oregon. It delivers aviation jet fuel, diesel fuel, and gasoline to product terminals at Seattle (Harbor Island), Seattle-Tacoma International Airport (“Sea-Tac”), Renton, Tacoma, Vancouver (WA), and Portland.

Sea-Tac Airport receives jet fuel via a dedicated 16-inch pipeline from Renton, Washington, because the airport cannot accommodate offloading fuel trucks into its jet fuel tank farm.

On Sunday, May 23rd, 2004, a pinhole-sized leak opened in a ¾-inch sample line after an electrical conduit rubbed against the tube. Between 3,300 and 10,000 gallons of fuel escaped and caught fire (most of it burned). The pipeline was immediately shut down, halting delivery of 11 million gallons of fuel per day system-wide. The Olympic Pipeline lost $10,000 in business each hour.

On Tuesday, May 25th, two days after the leak and subsequent pipeline isolation, Sea-Tac refueling operations were in trouble. The airport had 2.9 million gallons of fuel on hand and typically uses 1.2 million gallons per day to fuel airline jets. Sea-Tac operations therefore asked airlines not to refuel their planes at the airport. Alaska Airlines – the primary tenant airline at Sea-Tac – decided to refuel its planes in cities like San Francisco and San Jose to reduce reliance on Sea-Tac’s fuel.

Alaska even considered canceling some flights beginning on Wednesday, May 26th, if the pipeline was not returned to service.

On Wednesday, May 26th, the pipeline flow was restored, and Sea-Tac returned to normal refueling services soon afterwards.

The Sea-Tac story is another example in which the failure of a single component brought down one system and, in turn, negatively impacted multiple elements of critical infrastructure.

Ukraine War and Critical Infrastructure Interdependencies

Between the start of the Russian invasion on February 24th, 2022 and March 24th, 2022, multiple critical infrastructures were seriously damaged or destroyed, including:

  • 92 factories and warehouses
  • 378 schools
  • 138 healthcare institutions
  • 12 airports
  • 7 thermal power and hydroelectric power plants

Ukraine’s only functioning oil refinery was also destroyed.

That was only the first month of the war. As of this writing, the number of destroyed critical infrastructure facilities has grown much larger.

The following table lists the categories of critical infrastructure damaged by the end of the first month of the war.

Infrastructure | Number of Items | Cost in $US Millions
Roads (kilometers) | 8,265 | $27,546
Housing | 4,431 | $13,542
Civilian Airports | 8 | $6,816
Industrial Enterprises, Factories | 92 | $2,921
Healthcare Institutions | 138 | $2,466
Nuclear Power Plants | 1 | $2,416
Railway Stations and Rolling Stock | n/a | $2,205
Bridges | 260 | $1,452
Ports and Port Infrastructure | 2 | $622
Secondary and Higher Education Institutions | 378 | $601

Other impacts to critical infrastructure from the war in Ukraine include the following[2]:

  • The European construction industry has had difficulty sourcing building supplies from Russia and Ukraine. For instance, Ukraine produces steel, timber, pallets, and clay for ceramic tiles. Also, the sanctions against Russia are restricting availability of copper, iron ore, and steel.
  • Ukrainian and Russian nationals provide more than 15% of the global shipping workforce, meaning that the war is leading to a shortage of able-bodied seamen.
  • Russia and Ukraine produce approximately one-third of the world’s ammonia and potassium exports, ingredients necessary for agricultural fertilizer. This alone has caused fertilizer prices to increase by 20 to 50%.
  • Food security has been impacted across the Middle East, North Africa, and Western and Central Asia — Ukraine is often called a “global breadbasket.”

The conclusion here is that a war between two countries has a major impact on critical infrastructures around the globe. It is important that the risk analysis includes a review of potential infrastructure interdependencies.

Rogers Telecom Outage, Ontario, Canada

At around 3:45 AM Eastern Time on Friday, July 8, 2022, Rogers Communications — one of the largest Internet service providers in Canada — experienced a significant disruption that rendered its network unavailable for long periods of time over the course of approximately 24 hours.

This failure impacted over 10 million wireless subscribers and 2.25 million retail internet subscribers. The outage caused problems for payment systems, Automated Teller Machines (ATMs), and phone connections, primarily in eastern Canada.

For instance, the outage affected all financial institutions across Canada, including the Royal Bank of Canada. It caused a shutdown or interruption of the INTERAC payment system used by all Canadian banks, thus affecting debit card and funds transfer services. Even Toronto Dominion Bank and Bank of Montreal suffered some service and capability interruptions.

Air Canada, which is headquartered in Toronto, suffered a major impact to its call centers. Even Vancouver International Airport – thousands of miles west of Toronto – was affected. There, travelers could not pay for parking, use terminal ATMs, or purchase items at airport retailers. Cash was king!

Even Canadian government offices, such as the Canadian Radio-television and Telecommunications Commission, lost telephone connections.

Although this is not a deep dive into the cause of the outage, the point for the reader is that the failure of a single telecommunications network resulted in nationwide impacts to commerce, retail sales, and the provision of services such as automobile parking. Once again, critical infrastructure interdependencies were demonstrated on that fateful day.

Recent News: Substations Shot in North Carolina

In early December 2022, two substations in North Carolina were shot at by unknown assailants. Approximately 45,000 homes lost power, and some water systems were shut down due to lack of electricity. Again, this is an example of how elements of critical infrastructure are interrelated.

Thoughts to Ponder

In his paper on critical infrastructure interdependencies, Dr. Steven Rinaldi observed: “Interdependencies, however, are a complex and difficult problem to analyze.” This author agrees. If you look at the cases summarized above, and even consider those examples you have witnessed where the failure of one piece of critical infrastructure has affected one or more other sectors, you will realize that some extraordinary critical thinking is necessary to best understand the interdependencies.

Would your analysis have predicted the downstream failures observed with the failure of a single satellite? A tiny pipeline leak? A small regional war in Eastern Europe? Or the failure of one phone system?

It makes one wonder.

Overall, this perspective is intended to get you thinking, especially if you’re performing risk analysis of critical infrastructures and associated systems. Thinking “out of the box” is appropriate when trying to best understand the broader impacts that can surface with the failure of a single critical infrastructure.

Ernie Hayden
MIPM, CISSP, GICSP (Gold), PSP

About the Author

Ernie Hayden is a highly experienced technical consultant and thought leader in the areas of cyber/physical security. He has authored a book, Critical Infrastructure Risk Assessment – The Definitive Threat Identification and Reduction Handbook (published by Rothstein), which was awarded the ASIS International 2021 Security Book of the Year. Ernie is the Founder/Principal of 443 Consulting, LLC, and can be reached at enhayden1321(_at_)gmail.com.


1 2001 USA Patriot Act, Page 401, Definition of Critical Infrastructure: https://www.congress.gov/107/plaws/publ56/PLAW-107publ56.pdf

2 Many facts from McKinsey & Company analysis dated May 9, 2022

GUEST BLOG | October 26, 2022

Remote Writing Trailer Air Brakes with RF | Ben Gardiner, NMFTA

Over the course of a few years and a pandemic, we (AIS and NMFTA) tested several tractor-trailers for the security properties of the trailer databus, J2497 aka PLC4TRUCKS. What we discovered was that 1) this traffic could be read remotely with SDRs and active antennas but, more importantly, 2) that valid J2497 traffic could be induced on the trailer databus using SDRs, power amplifiers and simple antennas. In this blog post we will introduce you to some concepts and the discoveries overall – for the full technical details please get the whitepaper.

J2497 aka PLC4TRUCKS is a Power Line Carrier (PLC) scheme designed and implemented by Intellon in its SSC P485 transceiver IC as a bridge between UARTs over powerlines. For years this patented chip was the only way to realize the J2497 standard. With the recent expiration of the patent, this has changed, but the as-implemented behavior of the Intellon chip is still the de facto standard, and the J2497 specification itself has SSC P485-specific components to it. J2497 was developed and deployed as an alternative physical layer to J1708 for the J1587 protocol that sits on top of it; the SSC P485 converts bi-directionally between J1708 and J2497.

Offering this as a ‘conversion chip’ allowed suppliers to provide PLC4TRUCKS trailer controllers based on previously fielded J1708/J1587 solutions. This was very beneficial for the rapid deployment of solutions for the impending trailer ABS fault lamp regulations of the time, but it also meant that the trailer equipment inherited legacy features from the J1708/J1587 code. We were very interested in what diagnostic features were implemented in the trailer controllers using J1708/J1587 mechanisms, and we eventually found during the ‘Powermaster’ project that all trailer controllers, and even some tractor controllers, did respond to the J1708 ‘data link escape’ means of executing proprietary diagnostics.

The ‘Powermaster’ project really kicked off in 2019. That same year, Baker, et al.[1] demonstrated that Intellon’s (then Atheros’, now Qualcomm’s) HomePlug GreenPhy (HPGP) Power Line Communications (PLC) can be received at distances of several feet using Software Defined Radios (SDRs); simultaneously, our own testing found the same remote read capability for J2497. We eventually published our remote read findings at the Car Hacking Village at DEF CON 29 SAFE MODE[2] in 2020. Our results were the same: the much earlier (perhaps original) Intellon PLC scheme in J2497 can be read remotely, just like its modern incarnation in HPGP.

The remote read issue was reported in 2020, but during this time we were also testing for remote write – that part didn’t go so smoothly; more on that later. What we eventually confirmed was that it is possible to write remotely to J2497 via induced RF, depending on the equipment configuration (again, just like Baker et al., who reported a wireless disruption issue in HPGP in February 2022[3]). We found that the most susceptible equipment is tanker trailers and 3x road train trailers. The equipment from all trailer and tractor brake suppliers is affected, and the maximum distance can be up to 12 feet. Furthermore, the equipment needed to make it work is not expensive: as cheap as $300 USD for the most susceptible trailer equipment configurations. For details on the confirmed results and the testing methods we followed, please consult the tables in NMFTA’s Disclosure of Confirmed Remote Write.[4]

What we’ve confirmed is that all three of today’s trailer brake suppliers implement diagnostics over J1587 on J2497 using Data Link Escapes (DLEs). We have encountered no diagnostics features there that require any authentication or authorization, meaning they are susceptible to replay attacks. In the ‘Powermaster’ project we eventually gravitated to testing with transmission of solenoid test commands, because we found we were unable to measure the induced waveform voltages even though the valid J2497 messages were clearly being received, and the solenoid test commands produce an audible response from the brake controllers, which makes the tests easy to confirm. It also made the tests easy to film, which we did; we presented a couple of videos at our DEF CON 30 talk, which is now available on the media server.[5]
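
To make the framing concrete, here is a minimal sketch of how a J1587 message carrying a data link escape is assembled. It assumes only the widely documented J1708 checksum (the two's complement of the byte sum) and treats PID 254 as the data link escape; the MIDs and payload bytes are placeholders chosen for illustration and deliberately do not form a real solenoid test command. On J2497, the same byte stream is simply carried over the powerline instead of the J1708 twisted pair.

```python
def j1708_checksum(frame: bytes) -> int:
    """J1708 checksum: two's complement of the 8-bit sum of MID + data bytes."""
    return (-sum(frame)) & 0xFF

def build_j1587_dle(source_mid: int, target_mid: int, payload: bytes) -> bytes:
    """Assemble an illustrative J1587 'data link escape' message.

    PID 254 (Data Link Escape) carries proprietary diagnostics addressed to a
    target MID; the payload here is a placeholder, not a real command.
    """
    DLE_PID = 254
    body = bytes([source_mid, DLE_PID, target_mid]) + payload
    return body + bytes([j1708_checksum(body)])

if __name__ == "__main__":
    # MID values and payload are placeholders for illustration only.
    frame = build_j1587_dle(source_mid=172, target_mid=136, payload=b"\x00\x01")
    assert sum(frame) & 0xFF == 0  # a receiver validates by summing to zero
    print(frame.hex(" "))
```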

We still aren’t sure exactly how the attack induces valid messages (we do present some theories[6] in the whitepaper), but we do know where it works and where it doesn’t:

  • Tankers, which are large metal shells and whose wiring typically runs out along their side, are very susceptible
  • Dry vans with wooden decking and metal beams are not very susceptible, as compared to trailers with the same dimensions but with metal decking. In these, the wiring runs under the decking and through the beams. In the metal decking trailers we tested, the wiring ran inside an extruded channel.
  • Even the least susceptible dry-van trailers with wooden decking are susceptible when in a 3x road train configuration.

In addition to the Powermaster project with AIS, we also developed mitigation techniques and technologies, which we released into the public domain.[7] We then worked with CISA VDP on a coordinated disclosure and were eventually able to share the attack and mitigation details with the ATA TMC task forces that are responsible for defining the next-generation tractor-trailer interface and are actively doing so today. The industry is focused on newer communications methods because J2497 never delivered on even its modest bandwidth promises. However, there’s the problem of 20 years’ worth of tractors and trailers using J2497. More than half of them will continue to be used for another 15 years. In addition to requesting that diagnostics on J2497 be excluded from the NGTTI, we have also requested that all new tractors include mitigations against the J2497 remote write attack so that new tractors can protect older trailers. At the Sept 2022 ATA TMC meeting, the Task Force accepted our recommendations with a vote of 14 in favor to 4 against.

We will keep working with the ATA TMC task forces to ensure that the next-generation tractor-trailer interface does not inherit the issues we’ve seen in J2497. We are also looking for more testing opportunities to confirm these results on other equipment, as well as opportunities to test new concepts.

Don’t forget to access the whitepaper for the full details of the research.

If you would like to host us for some testing[8], please contact ben.gardiner@nmfta.org.

Ben Gardiner


Ben Gardiner is a Senior Cybersecurity Research Engineer contractor presently working to secure commercial transportation at the National Motor Freight Traffic Association (NMFTA). With more than ten years of professional experience in embedded systems design and a lifetime’s worth of hacking experience, he has deep knowledge of the low-level functions of operating systems and the hardware with which they interface. Prior to partnering with the NMFTA team in 2019, he held security assurance and reversing roles at a global corporation, as well as embedded software and systems engineering roles at several organizations. He holds a Master of Science in Engineering in Applied Math & Stats from Queen’s University. He is a DEF CON Hardware Hacking Village (DC HHV) volunteer, is GIAC GPEN certified, serves on the GIAC advisory board, chairs the SAE TEVEES18A1 Cybersecurity Assurance Testing TF (drafting J3061-2), and is a voting member of the SAE Vehicle Electronic Systems Security Committee.


[1] Baker, Richard, and Ivan Martinovic. “Losing the Car Keys: Wireless {PHY-Layer} Insecurity in {EV} Charging.” 28th USENIX Security Symposium (USENIX Security 19), 2019.
[2] Poore, Chris, and Gardiner, Ben. “Power Line Truck Hacking: 2TOOLS4PLC4TRUCKS.” DEF CON 30 Car Hacking Village 2019. http://www.nmfta.org/documents/ctsrp/Power_Line_Truck_Hacking_2TOOLS4PLC4TRUCKS.pdf?v=1
[3] Köhler, Sebastian, Richard Baker, Martin Strohmeier, and Ivan Martinovic. “Brokenwire: Wireless Disruption of CCS Electric Vehicle Charging.” 2022. https://arxiv.org/pdf/2202.02104.pdf
[4] Gardiner, Ben, NMFTA Inc. “2021 Disclosure of Confirmed Remote Write.” http://www.nmfta.org/documents/ctsrp/Disclosure_of_Confirmed_Remote_Write_v4_DIST.pdf?v=1
[5]https://media.defcon.org/DEF%20CON%2030/DEF%20CON%2030%20video%20and%20slides/DEF%20CON%2030%20-%20Ben%20Gardiner%20-%20Trailer%20Shouting%20-%20Talking%20PLC4TRUCKS%20Remotely%20with%20an%20SDR.mp4
[6] The theory that we induce more voltage on the chassis, and hence the receiver picks up a single-ended voltage, seems promising considering that CharIN has adopted differential PLC for all HPGP connections in its megawatt charging interface.
[7] Gardiner, Ben. “Mitigations Options to J2497 Attacks” March 3rd 2022. http://www.nmfta.org/documents/ctsrp/Actionable_Mitigations_Options_v9_DIST.pdf?v=1
[8] Gardiner, Ben. “NFMTA CTSRP Heavy Vehicle Testing Plan” March 1st 2022. https://github.com/nmfta-repo/nmfta-vehicle_cybersecurity_requirements/blob/main/resources/heavy_vehicle_testing_plan.md

GUEST BLOG | June 14, 2022

The Battle of Good versus Evil: Regulations and Cybersecurity | Urban Jonson

Okay, so the title might be a little over the top, but we are at a very critical stage in the balance between government regulations and cybersecurity. With fifteen years in the transportation industry, I am seeing an important trend in transportation regulations, especially pertaining to heavy vehicles and trucking. This trend is also observable across other industries but it is in trucking that we are seeing clear and immediate issues.

The purpose behind most government regulations is to try to improve safety, protect the environment, reduce pollution, and in general make the world a better place. We can argue that the intentions of these regulations are actually good. I mean, we really don’t want to pollute the environment or have massive safety issues. Right? Well, as has been pointed out to me repeatedly, the road to hell is paved with good intentions. Unfortunately, the technocrats who create regulations are usually not qualified in computer science or more specifically cybersecurity. In the case of cybersecurity, we have seen a lot of unintended consequences over the past few years. 

When the original vehicle Controller Area Network (CAN bus) was designed, it was intended to be a closed and trusted system. Therefore, security concerns like authentication, authorization, secure software development lifecycles, etc. were not considered. Let’s be honest, in the 1980s when the CAN bus network was unveiled at the Society of Automotive Engineers (SAE), cybersecurity was not much of a thing anywhere. (This is the time I was getting into computers, and I can attest to the lack of computer security… never you mind how.) We are now in a technology transition period where we are taking systems that were designed to be closed and trusted and adding internet connectivity with little forethought. This is true of many CAN-based systems found on planes, trains, trucks, and automobiles, and in water treatment plants, power stations, etc. This transitional period is fraught with risk as we bring these old system designs online, and the pace is accelerating.

The first wave of connectivity came in the form of productivity and enhanced performance. For example, telematics was a thing in trucking long before any regulator thought to mandate direct connectivity to trucks. Some of these telematics systems connected directly to the vehicle and some did not. Some were read-only and some were not. Some were outright terrifying from a cybersecurity perspective. That being said, there was a choice of what to connect to your truck and how to do it.

Most people who are familiar with my work at NMFTA with the Commercial Transportation Security Research Program are probably familiar with the work that I and IOActive have done bringing attention to the cybersecurity issues surrounding the rollout of the US Department of Transportation (US DOT) Federal Motor Carrier Safety Administration (FMCSA) Electronic Logging Device (ELD) regulations. Yeah, that’s a lot of acronyms! Corey Thuen did some presentations on the topic, including at DEF CON, and I have written and talked about it a lot. What made the ELD regulations so novel and problematic was that they mandated that an electronic logging device, without any material cybersecurity controls, be connected to the vehicle CAN bus with read and write capabilities (at the time no OEM broadcast engine hours, so they had to be requested by sending a message on the bus) and that it be connected to the internet. What that really meant was that a read-write internet bridge to a system originally designed as trusted and closed was now required under the force of law. Yeah, I know. What could possibly go wrong?
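
To see why the mandate implied write access, it helps to look at what “requesting engine hours” means on a J1939 vehicle network: the ELD has to transmit a Request frame onto the bus. The sketch below is illustrative only; the PGN numbers are taken from the public J1939 tables, the source address is an assumed diagnostic-tool address, and the python-can SocketCAN channel name is a placeholder.

```python
import can  # python-can; assumes a SocketCAN interface named "can0"

# J1939 "Request" PGN 59904 (0xEA00): ask an ECU to broadcast another PGN.
# Engine Hours is PGN 65253 (0xFEE5). Source address 0xF9 (off-board
# diagnostic tool) and the channel name are illustrative assumptions.
PRIORITY = 6
REQUEST_PGN = 0xEA00
DEST_GLOBAL = 0xFF
SOURCE_ADDR = 0xF9
ENGINE_HOURS_PGN = 0xFEE5

can_id = (PRIORITY << 26) | ((REQUEST_PGN | DEST_GLOBAL) << 8) | SOURCE_ADDR
payload = ENGINE_HOURS_PGN.to_bytes(3, "little")  # requested PGN, LSB first

msg = can.Message(arbitration_id=can_id, data=payload, is_extended_id=True)
print(f"request frame: id=0x{can_id:08X} data={payload.hex(' ')}")

# Actually putting this frame on the wire is a *write* to the vehicle network.
bus = can.interface.Bus(channel="can0", bustype="socketcan")
bus.send(msg)
```

Once OEMs began broadcasting engine hours on their own, a purely passive, read-only listener became sufficient for ELD compliance, which is exactly the change described later in this paragraph's discussion.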

Since the original regulations came out, there have been addendums and follow-ups including references to cybersecurity which were missing from the original regulations. Many people in the industry worked tirelessly to implement the required functionality in existing ELD systems and make sure they were as secure as the provider could manage. A large number of smaller providers popped up as there was a relatively low-cost barrier to entry. Some of these new, low-cost providers had interesting solutions that included almost no security at all. Think hardware with debug enabled, registration links with passwords in clear text. A total mess. So as a result, some of the new ELD systems are great and some of them are nothing short of terrifying from a cybersecurity perspective. While I was at NMFTA, we developed a whole matrix to evaluate the cybersecurity posture of telematics systems. OEMs have started broadcasting engine hours so write capabilities to the CAN bus are not needed to have a compliant ELD device. Telematics providers have started migrating from being connected directly to the on-board diagnostics (OBD) port—many times as spliced-in connections—to being connected to a connector intended for permanent and semi-permanent aftermarket equipment installation (RP 1226) that is increasingly being firewalled in truck designs.

So, that’s it right? We learned our lesson and we won’t do this again. Yeah, that would be wishful thinking. The trend for regulations mandating real-time vehicle information is increasing at a rapid pace. In the EU, regulators are working on the next set of vehicle emissions regulations, Euro VII, which will reportedly include real-time connectivity and reporting requirements for heavy vehicles, including information such as emissions readings and so forth. China, which generally follows the Euro standards, has expanded on Euro VII with China 7b, which includes real-time connectivity and monitoring of exhaust information for trucks. In the US, the California Air Resources Board (CARB) is finalizing a new set of Heavy-Duty Inspection and Maintenance (HD I/M) regulations. The full details regarding this regulation effort can be found at https://ww2.arb.ca.gov/rulemaking/2021/hdim2021.

These regulations include the concept of a Continuously Connected Remote On-Board Diagnostic (CC-ROBD) device which is to be semi-permanently installed into heavy vehicles operating in California. The purpose is to get more frequent and accurate vehicle emissions information, to encourage vehicle operators to fix malfunctioning trucks more quickly, and to improve maintenance programs to avoid faults. The idea is that fixing faulty trucks and improving their operating efficiency will reduce emissions overall. Most of the fleets and OEMs that I work with already have extensive predictive maintenance capabilities and are already incentivized by fuel costs to ensure their trucks are in good working order. The regulations do not expand on what is already expected from an OBD port, but regulators would like the data reported remotely with a much higher degree of frequency. So, on the surface, everything should be aligned. Well… due to what is, in my opinion, a lack of overall industry and regulatory coordination and facilitation, the CARB regulations specify that the CC-ROBD needs to be connected to the OBD port and have read/write access to obtain the necessary emissions information.

Hey, CARB invented and regulated the OBD port into existence. Why shouldn’t they be using it? My friends at Geotab have published an excellent history of the OBD port. Ever since the OBD port was put in place, it has been reused for a number of different use cases apart from just emissions. It’s how we diagnose vehicle trouble codes, upgrade firmware, and even add optional features. It has become an all-powerful connection which bridges many vehicle networks. Now, I am not saying that you can’t connect to the OBD port in a secure and responsible manner. All I’m saying is that it is very hard to do, and it requires significant organizational commitment to cybersecurity and investment in technology and processes. I’ve seen telematics providers who do an excellent job. (Hint: the $50 ELD device at Walmart or the “free” dongle from your insurance company are probably not of that caliber.) The HD I/M regulations do reference some basic cybersecurity requirements for the CC-ROBD devices, including SAE J3005-1 and SAE J3005-2, but there is a lot of wiggle room in there for issues to develop. For example, SAE J3005-2 specifies that only diagnostic messages should be allowed, but it is commonly known that diagnostic messages can be abused. Just look at the research done by Ben Gardiner, et al., on J2497. At least CARB is not allowing self-certification, so maybe we are learning after all.

So, as the trucking industry is moving from OBD and spliced-in wiring harnesses to the RP 1226 connector, they are being pushed back onto the OBD port. Given that this new regulation is scheduled to become effective in 2024, this could cause some serious problems. It takes most large fleets about two years to find, select, test, and deploy a new telematics device to their fleets. The engine designs for 2024 are already pretty much set in stone, so there is not much time to effect any changes to new tractors, never mind the existing ones. Add to this a global supply chain which is completely out of whack, and we have the potential for a rather interesting convergence of events. And by interesting, I mean a total disaster.

As we introduce regulations to improve air quality, we may be setting ourselves up for bare shelves if we can’t field compliant trucks in California, which receives most of the inbound freight from China, Taiwan, etc. Remember, trucks move most goods from ports to inland destinations. As we are fond of saying, “If you bought it, a truck brought it.” If we can’t field compliant trucks in California this is not just an issue for California, but for the nation as a whole. Remember, trucks are a critical part of the supply chain delivering food, fuel, parts, raw materials and practically everything we need to keep everything running.

Obviously, my hope is that by raising awareness, the industry and regulators can come up with a solution that works for everyone and keeps our nation’s freight moving. While I am frustrated at our inability to learn our lessons, I am also hopeful and confident that the ever-creative and resilient transportation sector will find a way to manage. Organizations such as IOActive are helping telematics vendors and vehicle OEMs assess the cybersecurity of their products. I am working within the industry to help them understand their cybersecurity posture and upcoming regulations which may impact their operations, and to help them come up with plans to reconcile conflicting requirements and mitigate risk. More importantly, we still have time to avoid this potential disaster…but not much.

Regards,
Urban Jonson

You can now find me at SERJON providing advisory services in cybersecurity and general information technology. Please feel free to reach out to me at ujonson@serjon.com.  I want to thank IOActive for being gracious enough to host this blog entry, as well as John Sheehy of IOActive and Ben Gardiner at Yellow Flag Security for their contributions to this post.

GUEST BLOG | October 6, 2021

The Risk of Cross-Domain Sharing with Google Cloud’s IAM Policies | Chris Cuevas and Erik Gomez, SADA

We’re part of the security resources at SADA, a leading Google Cloud Premier Partner. Our backgrounds are notably diverse, and we appreciate the need for visibility into your core access controls.

If you’re involved in securing your enterprise’s Google Cloud Platform (GCP) environment, ideally the organization policy for Domain Restricted Sharing (DRS) is already a well-regarded part of your security toolbox. In the event DRS hasn’t made its way into your arsenal, please take a moment after reading this post to review these docs.

While we’re not covering DRS in-depth here, we will be discussing related concepts. We believe it is crucial for an enterprise to maintain full visibility into which identities have access to its GCP resources. DRS is intended to prevent external or non-enterprise-managed identities from obtaining or being provided Identity and Access Management (IAM) role bindings within your GCP environment.

If we take this one step further, we believe an enterprise should also maintain visibility into the use of its managed identities within external GCP environments. This is the basis of this post, in which we’ll raise a number of concerns.

The SADA security team has found a feature of IAM that presents challenges with detection and mitigation. We’ll refer to this IAM feature as Cross-Domain Sharing (XDS).

Introduction to XDS

Today, external parties with GCP environments can provide IAM role bindings to your enterprise’s managed identities. These IAM policies can be set and made effective without your knowledge or awareness, resulting in GCP resources being accessed beyond the boundaries of your enterprise. While we agree there are a number of valid use cases for these XDS IAM policies, we are not comfortable with the lack of enterprise visibility.

Malicious actors are constantly seeking new avenues to gain any type of foothold within a targeted organization. Targeting Cloud DevOps engineers and SREs with social engineering attacks yields high rewards, as these employees have elevated privileges and trusted relationships.

Acknowledging this mindset, let’s consider the following:

Alice (alice@external.org) views internal.org as a prime target for a social engineering campaign combined with her newly discovered XDS bug. She quickly spins up a new GCP project called “Production Secrets” and adds a GCP IAM role binding to it for Bob (bob@internal.org) (see the diagram below).

Alice then initiates a social engineering campaign targeting Bob, informing him of the new “Production Secrets” project. As Alice is not part of the internal.org organization, the “Production Secrets” project is presented in Bob’s list of available GCP Projects without an organization association. And, if Bob searches for “Production Secrets” using the search bar of the GCP cloud console, the project will again be presented with no clear indicators it’s not actually affiliated with the internal.org GCP organization. With Bob not wanting to miss any team deadlines related to adopting the new “Production Secrets” project, he migrates secrets over and begins creating new ones within the “Production Secrets” project. Alice rejoices as internal.org’s secrets are now fully disclosed and available for additional attacks.

[Figure: Cross-Domain Sharing (XDS) example]

If your organization’s identities are being used externally, would you be able to prevent, or even detect, this type of activity? If Bob connects to this external project, what other attacks could he be vulnerable to in this scenario?

Keeping in mind that Google Cloud IAM identities, or “members,” in IAM policies can include users, groups, and domains, bad actors can easily expand their target scope from a single user identity to your entire enterprise. Once the nefarious GCP project “Production Secrets” is in place and accessible by everyone in your enterprise with GCP environment access, the bad actors can wait for unintended or accidental access while developing more advanced phishing ruses.

Now, the good news!

The team at Google Cloud has been hard at work and recently released a new GCP Organization Policy constraint specifically to address this concern. Once enabled, the constraint “constraints/resourcemanager.accessBoundaries” removes this concern as a broad phishing vector by not presenting external and no-organization GCP projects within the Cloud Console and associated APIs. While this approach does not address all risks related to XDS, it does reduce the effective target scope.

Before you run off and enable this constraint, remember there are valid use cases for XDS, and we recommend identifying all XDS projects and assessing whether they are valid or whether they may be adversely affecting your enterprise’s managed identities. This exercise may help you identify external organizations that are contractors, vendors, partners, etc. and should be included in the Organization Policy constraint.
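
One low-effort way to start that inventory is to enumerate the projects visible to a given set of credentials and flag anything that is not parented under your own organization. The sketch below is a rough illustration rather than production tooling; it assumes the google-api-python-client library, Application Default Credentials, and a placeholder organization ID.

```python
from googleapiclient import discovery  # pip install google-api-python-client

# Illustrative sketch: list every project visible to the caller's credentials
# and flag those that do not sit under our own organization. The org ID is a
# placeholder; authentication relies on Application Default Credentials.
OUR_ORG_ID = "123456789012"

crm = discovery.build("cloudresourcemanager", "v1")
request = crm.projects().list()
while request is not None:
    response = request.execute()
    for project in response.get("projects", []):
        parent = project.get("parent", {})  # absent for no-organization projects
        external = not (parent.get("type") == "organization"
                        and parent.get("id") == OUR_ORG_ID)
        if external:
            print(f"review: {project['projectId']} (parent: {parent or 'none'})")
    request = crm.projects().list_next(request, response)
```

Run per identity (or against a service account used by automation), the output gives you a starting list of external or no-organization projects to assess before turning on the constraint.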

To further reduce the chances of successful exfiltration of your enterprise’s sensitive data from existing GCP resources via XDS abuse, consider also implementing Google Cloud’s VPC Service Controls (VPC-SC).

Is your GCP environment at risk, or do you have security questions about your GCP environment? SADA and IOActive are here to help. Contact SADA for a Cloud Security Assessment and IOActive for a Cloud Penetration Test.

Chris Cuevas, Sr Security Engineer, SADA
Erik Gomez, Associate CTO, SADA


Note: This concern has been responsibly reported to the Google Cloud Security team.

GUEST BLOG | June 9, 2021

Cybersecurity Alert Fatigue: Why It Happens, Why It Sucks, and What We Can Do About It | Andrew Morris, GreyNoise

Introduction

“Although alert fatigue is blamed for high override rates in contemporary clinical decision support systems, the concept of alert fatigue is poorly defined. We tested hypotheses arising from two possible alert fatigue mechanisms: (A) cognitive overload associated with amount of work, complexity of work, and effort distinguishing informative from uninformative alerts, and (B) desensitization from repeated exposure to the same alert over time.”

Ancker, Jessica S., et al. “Effects of workload, work complexity, and repeated alerts on alert fatigue in a clinical decision support system.” BMC Medical Informatics and Decision Making, vol. 17, no. 1, 2017.

My name is Andrew Morris, and I’m the founder of GreyNoise, a company devoted to understanding the internet and making security professionals more efficient. I’ve probably had a thousand conversations with Security Operations Center (SOC) analysts over the past five years. These professionals come from many different walks of life and a diverse array of technical backgrounds and experiences, but they all have something in common: they know that false positives are the bane of their jobs, and that alert fatigue sucks.

The excerpt above is from a medical journal focused on drug alerts in a hospital, not a cybersecurity publication. What’s strangely refreshing about seeing these issues in industries outside of cybersecurity is being reminded that alert fatigue has numerous and challenging causes. The reality is that alert fatigue occurs across a broad range of industries and situations, from healthcare facilities to construction sites and manufacturing plants to oil rigs, subway trains, air traffic control towers, and nuclear plants.

I think there may be some lessons we can learn from these other industries. For example, while there are well over 200 warning and caution situations for Boeing aircraft pilots, the company has carefully prioritized their alert system to reduce distraction and keep pilots focused on the most important issues to keep the plane in the air during emergencies.

Many cybersecurity companies cannot say the same. Often these security vendors will oversimplify the issue and claim to solve alert fatigue, but frequently make it worse. The good news is that these false-positive and alert fatigue problems are neither novel nor unique to our industry.

In this article, I’ll cover what I believe are the main contributing factors to alert fatigue for cybersecurity practitioners, why alert fatigue sucks, and what we can do about it.

Contributing Factors

Alarm fatigue or alert fatigue occurs when one is exposed to a large number of frequent alarms (alerts) and consequently becomes desensitized to them. Desensitization can lead to longer response times or missing important alarms.

https://en.wikipedia.org/wiki/Alarm_fatigue

Technical Causes of Alert Fatigue

Overmatched, misleading or outdated indicator telemetry

Low-fidelity alerts are the most obvious and common contributor to alert fatigue. This results in over-alerting on events with a low probability of being malicious, or matching on activity that is actually benign.

One good example of this is low-quality IP block lists – these lists identify “known-bad IP addresses,” which should be blocked by a firewall or other filtering mechanism. Unfortunately, these lists are often under-curated or completely uncurated output from dynamic malware sandboxes.

Here’s an example of how a “known-good” IP address can get onto a “known-bad” list: A malicious binary being detonated in a sandbox attempts to check for an Internet connection by pinging Google’s public DNS server (8.8.8.8). This connection attempt might get mischaracterized as command-and-control communications, with the IP address incorrectly added to the known-bad list. These lists are then bought and sold by security vendors and bundled with security products that incorrectly label traffic to or from these IP addresses as “malicious.”

Low-fidelity alerts can also be generated when a reputable source releases technical indicators that can be misleading without additional context. Take, for instance, the data accompanying the United States Cybersecurity and Infrastructure Security Agency (CISA)’s otherwise excellent 2016 Grizzly Steppe report. The CSV/STIX files contained a list of 876 IP addresses, including 44 Tor exit nodes and four Yahoo mail servers, which, if loaded blindly into a security product, would raise alerts every time the organization’s network attempted to route an email to a Yahoo email address. As Kevin Poulsen noted in his Daily Beast article calling out the authors of the report, “Yahoo servers, the Tor network, and other targets of the DHS list generate reams of legitimate traffic, and an alarm system that’s always ringing is no alarm system at all.”
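
Basic curation goes a long way with these feeds. The sketch below shows the kind of filter that keeps connectivity-check targets and shared infrastructure out of a sandbox-derived blocklist before it ever reaches a detection product; the allowlist entries are placeholders, not a real benign-infrastructure feed.

```python
import ipaddress

# Hypothetical allowlist of infrastructure that malware routinely touches but
# that should never land on a "known-bad" feed (public resolvers, big mail
# providers tracked separately, etc.). Entries are examples only.
BENIGN_INFRA = [
    "8.8.8.8/32",        # Google public DNS, pinged by samples as a connectivity check
    "1.1.1.1/32",        # Cloudflare public DNS
    "198.51.100.0/24",   # illustrative mail-provider range (documentation prefix)
]
BENIGN_NETS = [ipaddress.ip_network(n) for n in BENIGN_INFRA]

def curate(candidate_ips):
    """Drop candidates that fall inside known-benign infrastructure."""
    for raw in candidate_ips:
        ip = ipaddress.ip_address(raw)
        if any(ip in net for net in BENIGN_NETS):
            continue  # sandbox artifact, not command-and-control
        yield raw

if __name__ == "__main__":
    sandbox_output = ["8.8.8.8", "203.0.113.77", "1.1.1.1"]
    print(list(curate(sandbox_output)))  # -> ["203.0.113.77"]
```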

Another type of low-fidelity alert is the overmatched or over-sensitive heuristic, as seen below:

Alert: “Attack detected from remote IP address 1.2.3.4: IP address detected attempting to brute-force RDP service.”
Reality: A user came back from vacation and got their password wrong three times.

Alert: “Ransomware detected on WIN-FILESERVER-01.”
Reality: The file server ran a scheduled backup job.

Alert: “TLS downgrade attack detected by remote IP address: 5.6.7.8.”
Reality: A user with a very old web browser attempted to use the website.

It can be challenging for security engineering teams to construct correlation and alerting rules that accurately identify attacks without triggering false positives due to overly sensitive criteria.
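
As a concrete illustration of that trade-off, a less trigger-happy version of the RDP rule above might require many failures across several distinct accounts from a single source within a short window before alerting. The sketch below is a toy correlation rule with invented thresholds, not a recommendation for any particular product.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Illustrative correlation rule: only alert on RDP brute forcing when one
# source IP racks up many failures across several distinct usernames inside a
# short window. Thresholds are invented for the example and need local tuning.
WINDOW = timedelta(minutes=10)
MIN_FAILURES = 20
MIN_DISTINCT_USERS = 5

def rdp_bruteforce_alerts(events):
    """events: iterable of (timestamp, source_ip, username) failed-logon tuples."""
    per_source = defaultdict(list)
    alerts = []
    for ts, src, user in sorted(events):
        per_source[src].append((ts, user))
        # keep only events inside the sliding window for this source
        bucket = [(t, u) for t, u in per_source[src] if ts - t <= WINDOW]
        per_source[src] = bucket
        if len(bucket) >= MIN_FAILURES and len({u for _, u in bucket}) >= MIN_DISTINCT_USERS:
            alerts.append((src, ts))
    return alerts

if __name__ == "__main__":
    # Three wrong passwords after vacation: no alert under this rule.
    back_from_vacation = [
        (datetime(2021, 6, 1, 9, 0, i), "10.0.0.5", "alice") for i in range(3)
    ]
    print(rdp_bruteforce_alerts(back_from_vacation))  # -> []
```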

Legitimate computer programs do weird things

Before I founded GreyNoise, I worked on the research and development team at Endgame, an endpoint security company later acquired by Elastic. One of the most illuminating realizations I had while working on that product was just how many software applications are programmed to do malware-y looking things. I discovered that tons of popular software applications were shipped with unsigned binaries and kernel drivers, or with sketchy-looking software packers and crypters.

These are all examples of a type of supply chain integrity risk, but unlike SolarWinds, which shipped compromised software, these companies are delivering software built using sloppy or negligent software components.

Another discovery I made during my time at Endgame was how common it is for antivirus software to inject code into other processes. In a vacuum, this behavior should (and would) raise all kinds of alerts to a host-based security product. However, upon investigation by an analyst, this was often determined to be expected application behavior: a false positive.

Poor security product UX

For all the talent that security product companies employ in the fields of operating systems, programming, networking, and systems architecture, they often lack skills in user-experience and design. This results in security products often piling on dozens—or even hundreds—of duplicate alert notifications, leaving the user with no choice but to manually click through and dismiss each one. If we think back to the Boeing aviation example at the beginning of this article, security product UIs are often the equivalent of trying to accept 100 alert popup boxes while landing a plane in a strong crosswind at night in a rainstorm. We need to do a better job with human factors and user experience.

Expected network behavior is a moving target

Anomaly detection is a strategy commonly used to identify “badness” in a network. The theory is to establish a baseline of expected network and host behavior, then investigate any unplanned deviations from this baseline. While this strategy makes sense conceptually, corporate networks are filled with users who install all kinds of software products and connect all kinds of devices. Even when hosts are completely locked down and the ability to install software packages is strictly controlled, the IP addresses and domain names with which software regularly communicates fluctuate so frequently that it’s nearly impossible to establish any meaningful or consistent baseline.

There are entire families of security products that employ anomaly detection-based alerting with the promise of “unmatched insight” but often deliver mixed or poor results. This toil ultimately rolls downhill to the analysts, who either open an investigation for every noisy alert or numb themselves to the alerts generated by these products and ignore them. As a matter of fact, a recent survey by Critical Start found that 49% of analysts turn off high-volume alerting features when there are too many alerts to process.

Home networks are now corporate networks

The pandemic has resulted in a “new normal” of everyone working from home and accessing the corporate network remotely. Before the pandemic, some organizations were able to protect themselves by aggressively inspecting north-south traffic coming in and out of the network, on the assumption that all intra-company traffic was inside the perimeter and “safe.” Today, however, the entire workforce is outside the perimeter, and aggressive inspection tends to generate alert storms and lots of false positives. If this perimeter-only security model wasn’t dead already, the pandemic has certainly killed it.

Cyberattacks are easier to automate

A decade ago, successfully exploiting a computer system involved a lot of work. The attacker had to profile the target computer system, go through a painstaking process to select the appropriate exploit for the system, account for things like software version, operating system, processor architecture and firewall rules, and evade host- and system-based security products.

Today, there are countless automated exploitation and phishing frameworks, both open source and commercial. As a result, exploitation of vulnerable systems is now cheaper, easier, and requires less operator skill.

Activity formerly considered malicious is being executed at internet-wide scale by security companies

“Attack Surface Management” is a cybersecurity sub-industry whose companies identify vulnerabilities in their customers’ internet-facing systems and alert them accordingly. This is a good thing, not a bad thing, but the issue is not what these companies do; it’s how they do it.

Most Attack Surface Management companies constantly scan the entire internet to identify systems with known vulnerabilities, and they organize the returned data by vulnerability and network owner. In previous years, an unknown remote system checking for vulnerabilities on a network perimeter was a powerful indicator of an oncoming attack. Now, alerts raised from this activity provide less actionable value to analysts and happen more frequently as more of these companies enter the market.

The internet is really noisy

Hundreds of thousands of devices, malicious and benign, are constantly scanning, crawling, probing, and attacking every single routable IP address on the entire internet for various reasons. The more benign use cases include indexing web content for search engines, searching for malware command-and-control infrastructure, the above-mentioned Attack Surface Management activity, and other internet-scale research. The malicious use cases are similar: take a reliable, common, easy-to-exploit vulnerability, attempt to exploit every single vulnerable host on the entire internet, then inspect the successfully compromised hosts to find accesses to interesting organizations.

At GreyNoise, we refer to the constant barrage of Internet-wide scan and attack traffic that every routable host on the internet sees as “Internet Noise.” This phenomenon causes a significant amount of pointless alerts on internet-facing systems, forcing security analysts to constantly ask “is everyone on the internet seeing this, or just us?” At the end of the day, there’s a lot of this noise: over the past 90 days, GreyNoise has analyzed almost three million IP addresses opportunistically scanning the internet, with 60% identified as benign or unknown, and only 40% identified as malicious.
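
That triage question can be automated as the first step of an investigation. The sketch below assumes GreyNoise's public Community API endpoint and response fields as documented at the time of writing (treat both the URL and the field names as assumptions to verify); it simply asks whether a source IP is part of the background noise before an analyst spends any time on it.

```python
import requests  # pip install requests

# Sketch of the "is everyone on the internet seeing this, or just us?" check.
# Endpoint and response fields are assumptions based on GreyNoise's public
# Community API documentation; verify them before relying on this in production.
def is_internet_noise(ip: str) -> bool:
    resp = requests.get(f"https://api.greynoise.io/v3/community/{ip}", timeout=10)
    if resp.status_code == 404:
        return False  # GreyNoise has not observed this IP scanning the internet
    resp.raise_for_status()
    data = resp.json()
    # "noise" = opportunistic internet-wide scanner; "riot" = known benign service
    return bool(data.get("noise") or data.get("riot"))

if __name__ == "__main__":
    print(is_internet_noise("203.0.113.7"))  # documentation-range example IP
```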

Non-Technical Causes of Alert Fatigue

Fear sells

An unfortunate reality of human psychology is that we fear things that we do not understand, and there is absolutely no shortage of scary things we do not understand in cybersecurity. It could be a recently discovered zero-day threat, or a state-sponsored hacker group operating from the shadows, or the latest zillion-dollar breach that leaked 100 million customer records. It could even be the news article written about the security operations center that protects municipal government computers from millions of cyberattacks each day. Sales and marketing teams working at emerging cybersecurity product companies know that fear is a strong motivator, and they exploit it to sell products that constantly remind users how good of a job they’re doing.

And nothing justifies a million-dollar product renewal quite like security “eye candy,” whether it’s a slick web interface containing a red circle with an ever-incrementing number showing the amount of detected and blocked threats, or a 3D rotating globe showing “suspicious” traffic flying in to attack targets from many different geographies. The more red that appears in the UI, the scarier the environment, and the more you need their solution. Despite the fact that these numbers often serve as “vanity metrics” to justify product purchases and renewals, many of these alerts also require further review and investigation by the already overworked and exhausted security operations team.

The stakes are high

Analysts are under enormous pressure to identify cyberattacks targeting their organization, and stop them before they turn into breaches. They know they are the last line of defense against cyber threats, and there are numerous stories about SOC analysts being fired for missing alerts that turn into data breaches.

In this environment, analysts are always worried about what they missed or what they failed to notice in the logs, or maybe they’ve tuned their environment to the point where they can no longer see all of the alerts (yikes!). It’s not surprising that analyst worry of missing an incident has increased. A recent survey by FireEye called this “Fear of Missing Incidents” (FOMI). They found that three in four analysts are worried about missing incidents, and one in four worry “a lot” about missing incidents. The same goes for their supervisors – more than six percent of security managers reported losing sleep due to fear of missing incidents.

Is it any wonder that security analysts exhibit serious alert fatigue and burnout, and that SOCs have extremely high turnover rates?

Everything is a single pane of glass

Security product companies love touting a “single pane of glass” for complete situational awareness. This is a noble undertaking, but the problem is that most security products are really only good at a few core use cases and then trend towards mediocrity as they bolt on more features. Once an organization has surpassed twenty “single panes of glass,” the problem has only gotten worse.

More security products are devoted to “preventing the bad thing” than “making the day to day more efficient”

There are countless security products that generate new alerts and few security products that curate, deconflict or reduce existing alerts. There are almost no companies devoted to reducing drag for Security Operations teams. Too many products measure their value by their customers’ ability to alert on or prevent something bad, and not by making existing, day-to-day security operations faster and more efficient.

Product pricing models are attached to alert/event volume

Like any company, security product vendors are profit-driven. Many product companies are heavily investor-backed and have large revenue expectations. As such, Business Development and Sales teams often price products with scaling or tiered pricing models based on usage-oriented metrics like gigabytes of data ingested or number of alerts raised. The idea is that, as customers adopt and find success with these products, they will naturally increase usage, and the vendor will see organic revenue growth as a result.

This pricing strategy is often necessary when the cost of goods sold increases with heavier usage, like when a server needs additional disk storage or processing power to continue providing service to the customer.

But an unfortunate side effect of this pricing approach is that it creates an artificial vested interest in raising as many alerts or storing as much data as possible. And it reduces the incentive to build the capabilities for the customer to filter and reduce this “noisy” data or these tactically useless alerts.

If the vendor’s bottom line depends on as much data being presented to the user as possible, then they have little incentive to create intelligent filtering options. As a result, these products will continue to firehose analysts, further perpetuating alert fatigue.

False positives drive tremendous duplication of effort

Every day, something weird happens on a corporate network and some security product raises an alert to a security analyst. The alert is investigated for some non-zero amount of time, is determined to be a false positive caused by some legitimate application functionality, and is dismissed. The information on the incident is logged somewhere deep within a ticketing system and the analyst moves on.

The implications of this are significant. This single security product (or threat intelligence feed) raises the same time-consuming false-positive alert on every corporate network where it is deployed around the world when it sees this legitimate application functionality. Depending on the application, the duplication of effort could be quite staggering. For example, for a security solution deployed across 1000 organizations, an event generated from unknown network communications that turns out to be a new Office 365 IP address could generate 500 or more false positives. If each takes 5 minutes to resolve, that adds up to a full week of effort.
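
To make the arithmetic explicit, here is the same back-of-the-envelope calculation in a few lines of Python; the figures are the illustrative ones above, not measured data:

```python
# Illustrative duplication-of-effort math using the example figures above.
deployments = 1000          # organizations running the same security product
false_positive_share = 0.5  # assume half of them see the benign change and alert on it
minutes_per_triage = 5      # analyst time to investigate and dismiss each alert

duplicate_alerts = deployments * false_positive_share       # 500 duplicate false positives
wasted_hours = duplicate_alerts * minutes_per_triage / 60   # ~41.7 analyst-hours
print(f"{duplicate_alerts:.0f} duplicate false positives, roughly "
      f"{wasted_hours:.1f} analyst-hours burned on one benign change")  # about a work week
```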

Nobody collaborates on false positives

Traditional threat intelligence vendors only share information about known malicious software. Intelligence sharing organizations like Information Sharing and Analysis Centers (ISACs), mailing lists, and trust groups have a similar focus. None of these sources of threat intelligence focus on sharing information related to confirmed false-positive results, which would aid others in quickly resolving unnecessary alerts. Put another way: there are entire groups devoted to reducing the effectiveness of a specific piece of malware or threat actor between disparate organizations. However, no group supports identifying cases when a benign piece of software raises a false positive in a security product.

Security products are still chosen by the executive, not the user

This isn’t unusual. It is a vestige of the old days. Technology executives maintain relationships with vendors, resellers and distributors. They go to a new company and buy the products they are used to and with which they’ve had positive experiences.

Technologies like Slack, Dropbox, Datadog, and other user-first technology product companies disrupted and dominated their markets quickly because they allowed enterprise prospects to use their products for free. They won over these prospects with superior usability and functionality, allowing users to be more efficient. While many technology segments have adopted this “product-led” revolution, it hasn’t happened in security yet, so many practitioners are stuck using products they find inefficient and clunky.

Why You Should Care

The pain of alert fatigue can manifest in several ways:

  1. Death (or burnout) by a thousand cuts, leading to stress and high turnover
  2. Lack of financial return to the organization
  3. Compromises or breaches missed by the security team

There is a “death spiral” pattern to the problem of alert fatigue: at its first level, analysts spend more and more time reviewing and investigating alerts that provide diminishing value to the organization. Additional security products or feeds are purchased that generate more “noise” and false positives, increasing the pressure on analysts. The increased volume of alerts from noisy security products causes the SOC to need a larger team, with the SOC manager trying to grow a highly skilled team of experts while many of them are overwhelmed, burned out, and at risk of leaving.

From the financial side of things, analyst hours spent investigating pointless alerts are a complete waste of security budget. The time and money spent on noisy alerts and false positives are often badly needed in other areas of the security organization to support new tools and resources. Security executives face a difficult challenge in cost-justifying the investment when good analysts are being fed bad data.

And worst of all, alert fatigue contributes to missed threats and data breaches. In terms of human factors, alert fatigue can create a negative mindset leading to rushing, frustration, mind not on the task, or complacency. As I noted earlier, almost 50% of analysts who are overwhelmed will simply turn off the noisy alert sources. All of this contributes to an environment where threats are more easily able to sneak through an organization’s defenses.

What can we do about it?

The analyst

Get to “No” faster. To some extent, analysts are victims of the security infrastructure in their SOC. The part of the equation they control is their ability to triage alerts quickly and effectively. So, from a pragmatic viewpoint, find ways to use analyst expertise and time as effectively as possible. In particular, find tools and resources that help you rule out alerts as fast as possible.

The SOC manager

Tune your alerts. There is significant positive ROI in investing in tuning, diverting, and reducing your alerts. Tune your alerts to reduce over-alerting, and leverage your Purple Team to assist and validate your alert “sensitivity.” Focus on the critical TTPs of the threat actors your organization faces, audit your attack surface, and automatically filter out what doesn’t matter. These kinds of actions can take a tremendous load off your analyst teams and help them focus on the things that do matter.
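
As a minimal sketch of the “automatically filter out what doesn’t matter” step (the field names, IP addresses, and signature labels below are hypothetical, not any particular product’s schema), a pre-triage suppression pass might look like this:

```python
# Hypothetical pre-triage suppression: drop alerts sourced from known benign internet
# scanners or matching signatures that have been deliberately tuned out, and pass
# everything else along for human review.
KNOWN_BENIGN_SCANNERS = {"192.0.2.10", "198.51.100.7"}  # e.g. search engines, ASM vendors
TUNED_OUT_SIGNATURES = {"generic-port-scan", "low-confidence-ssh-bruteforce"}

def triage_queue(alerts):
    """alerts: iterable of dicts with 'src_ip' and 'signature' keys (assumed schema)."""
    for alert in alerts:
        if alert["src_ip"] in KNOWN_BENIGN_SCANNERS:
            continue   # internet noise: everyone on the internet sees this, not just us
        if alert["signature"] in TUNED_OUT_SIGNATURES:
            continue   # deliberately tuned out after Purple Team validation
        yield alert    # everything else still gets analyst attention
```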

The CISO

More is not always better. Analysts are scarce, valuable resources. They should be used to investigate the toughest, most sophisticated threats, so use the proper criteria for evaluating potential products and intelligence feeds, and make sure you understand the potential negatives (false positives, over-alerting) as well as the positives. Be skeptical when you hear about a single pane of glass. And focus on automation to resolve as many of the “noise” alerts as possible.

Security vendors

Focus on the user experience. Security product companies need to accept the reality that they cannot solve all of their users’ security problems unilaterally, and think about the overall analyst experience. Part of this includes treating integrations as first-class citizens, and deprioritizing dashboards. If everything is a single pane of glass, nothing is a single pane of glass—this is no different than the adage that “if everyone is in charge, then no one is in charge.” Many important lessons can be learned from others who have addressed UI/UX issues associated with alert fatigue, such as healthcare and aviation.

The industry

More innovation is needed. The cybersecurity industry is filled with some of the smartest people in the world, but lately we’ve been bringing a knife to a gunfight. The bad guys are scaling their attacks tremendously via automation, dark marketplaces, and advanced technologies like artificial intelligence and machine learning. The good guys have been stuck in a painfully fragmented and broken security environment, with all their time focused on identifying the signal and none on reducing the noise. This has left analysts struggling to manually muscle through overwhelming volumes of alerts. We need some of security’s best and brightest to turn their amazing brains to the problem of reducing the noise in the system, and drive innovation that helps analysts focus on what matters the most.

Conclusion

Primary care clinicians became less likely to accept alerts as they received more of them, particularly as they received more repeated (and therefore probably uninformative) alerts.

–  Ancker, et al.

Our current approach to security alerts, requiring analysts to process ever-growing volumes, just doesn’t scale, and security analysts are paying the price with alert fatigue, burnout, and high turnover. I’ve identified a number of the drivers of this problem, and our next job is to figure out how to solve it. One great area to start is to figure out how other industries have improved their approach, with aviation being a good potential model. With some of these insights in mind, we can figure out how to do better in our security efforts by doing less.

Andrew Morris
Founder of GreyNoise

GUEST BLOG | November 19, 2020

Hiding in the Noise | Corey Thuen

Greetings! I’m Corey Thuen. I spent a number of years at Idaho National Laboratory, Digital Bond, and IOActive (where we affectionately refer to ourselves as pirates, hence the sticker). At these places, my job was to find 0-day vulnerabilities on the offensive side of things.

Now, I am a founder of Gravwell, a data analytics platform for security logs, machine data, and network data. It’s my background in offensive security that informs my new life on the defensive side of the house. I believe that defense involves more than looking for how threat actor XYZ operates; it requires an understanding of the environment and maximizing the primary advantage that defense has—this is your turf and no one knows it like you do. One of my favorite quotes about security comes from Dr. Eugene Spafford at Purdue (affectionately known in the cybersecurity community as “Spaf”), who said, “A system is good if it does what it’s supposed to do, and secure if it doesn’t do anything else.” We help our customers use data to make their systems good and secure, but for this post let’s talk bad guys, 0-days, and hiding in the noise.

A very important part of cybersecurity is threat actor analysis and IOC generation. Security practitioners benefit from knowledge about how adversary groups and their toolkits behave. Having an IOC for a given vulnerability is a strong signal, useful for threat detection and mitigation. But what about for one-off actors or 0-day vulnerabilities? How do we sort through the million trillion events coming out of our systems to find and mitigate attacks?

IOActive has a lot of great folks in the pirate crew, but at this point I want to highlight a pre-pandemic talk from Jason Larsen about actor inflation. His thesis and discussion are both interesting and hilarious; the talk is certainly worth a watch. But to tl;dr it for you, APT groups are not the only attackers interested in, or capable of, compromising your systems. Not by a long shot. 

When I was at IOActive (also Digital Bond and Idaho National Laboratory), it was my job as a vulnerability researcher to find 0-days and provide detailed technical write-ups for clients so vulnerabilities could be discovered and remediated. It’s this work that gives me a little different perspective when it comes to event collection and correlation for security purposes. I have a great appreciation for weak signals. As an attacker, I want my signals to the defenders to be as weak as possible. Ideally, they blend into the noise of the target environment. As a defender, I want to filter out the noise and increase the fidelity of attacker signals.

Hiding in the Data: Weak Signals

What is a weak signal? Let’s talk about an example vulnerability in some ICS equipment. Exploiting this particular vulnerability required sending a series of payloads to the equipment until exploitation was possible. Actually exploiting the vulnerability did not cause the device to crash (which often happens with ICS gear), nor did it cause any other obvious functionality issues. However, the equipment would terminate the network communication with an RST packet after each exploit attempt. Thus, one might say an IOC would be an “unusual number of RST packets.” 

Now, any of you readers who have actually been in a SOC are probably painfully aware of the problems that occur when you treat weak signals as strong signals. Computers do weird shit sometimes. Throw in users and weird shit happens a lot of the time. If you were to set up alerts on RST packet indicators, you would quickly be inundated; that alert is getting switched off immediately. This is one area where AI/ML can actually be pretty helpful, but that’s a topic for another post.

[Figure: RST packet spike]

The fact that a given cyber-physical asset has an increased number of RST packets is a weak signal. Monitoring RST packet frequency itself is not that helpful and, if managed poorly, can actually cause decreased visibility. This brings us to the meat of this post: multiple disparate weak signals can be fused into a strong signal.

Let’s add in a fictitious attacker who has a Metasploit module to exploit the vulnerability I just described. She also has the desire to participate in some DDoS, because she doesn’t actually realize that this piece of equipment is managing really expensive and critical industrial processes (such a stretch, I know, but let’s exercise our imaginations). Once exploited, the device attempts to communicate with an IP address to which it has never communicated previously—another weak signal. Network whitelisting can be effective in certain environments, but an alert every time a whitelist is violated is going to be way way too many alerts. You should still collect them for retrospective analysis (get yourself an analytics platform that doesn’t charge for every single event you put in), but every network whitelist change isn’t going to warrant action by a human.

As a final step in post-exploitation, the compromised device initiates multiple network sockets to a Slack server, above the “normal” threshold for connection counts coming out of this device on a given day. Another weak signal.

So what has the attacker given a defender? We have an increased RST packet count, a post-exploitation download from a new IP address, and then a large uptick in outgoing connections. Unless these IP addresses trigger on some threat list, they could easily slide by as normal network activity hidden in the noise.

An analyst who has visibility into NetFlow and packet data can piece these weak signals together into a strong signal that actually warrants getting a human involved; this is now a threat hunt. This type of detection can be automated directly in an analytics platform or conducted using an advanced IDS that doesn’t rely exclusively on IOCs. When it comes to cybersecurity, the onus is on the defender to fuse these weak signals in their environment. Out-of-the-box security solutions are going to fail in this department because, by nature, they are built for very specific situations. No two organizations have exactly the same network architecture or exactly the same vendor choices for VPNs, endpoints, collaboration software, etc. The ability to fuse disparate data together to create meaningful decisions for a given organization is crucial to successful defense and successful operations.
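
To make that concrete, here is a rough sketch of weak-signal fusion; the signal names, weights, window, and threshold are invented for illustration and are not a feature of any particular platform:

```python
# Hypothetical weak-signal fusion: individually ignorable indicators against the same
# host, seen close together in time, add up to something worth a human look.
from collections import defaultdict

WEIGHTS = {
    "rst_spike": 1,         # unusual number of RST packets
    "new_peer_ip": 1,       # talked to an IP it has never talked to before
    "conn_count_spike": 1,  # outbound connection count above the daily baseline
}
THRESHOLD = 3               # all three together => escalate to a threat hunt
WINDOW_SECONDS = 3600       # only fuse signals that land within the same hour

def fuse(events):
    """events: iterable of (timestamp, host, signal_name); yields hosts worth escalating."""
    recent = defaultdict(list)
    for ts, host, signal in sorted(events):
        recent[host].append((ts, signal))
        # keep only the events that fall inside the sliding window
        recent[host] = [(t, s) for t, s in recent[host] if ts - t <= WINDOW_SECONDS]
        distinct_signals = {s for _, s in recent[host]}
        if sum(WEIGHTS.get(s, 0) for s in distinct_signals) >= THRESHOLD:
            yield host, sorted(distinct_signals)
```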

Hiding Weak Signals in Time

One other very important aspect of weak signals is time. As humans, we are particularly susceptible to this problem, but the detection products on the market face challenges here too. A large amount of activity or a large change in a short amount of time becomes very apparent. The frame of reference is important for human pattern recognition and for anomaly detection algorithms. Just about any sensor will throw a “port scan happened” alert if you run `nmap -T5 -p- 10.0.0.1-255`. What about if you only send one packet per second? What about one per day? Detection sensors encounter significant technical challenges keeping context over long periods of time when you consider that some organizations generate many terabytes of data every day.

An attacker willing to space activity out over time is much less likely to be detected unless defenders have the log retention and analytics capability to make time work for defense. Platforms like Gravwell and Splunk were built for huge amounts of time-series data, and there are open-source time-series databases, like InfluxDB, that can provide these kinds of time-aware analytics. Search stacks like ELK can also work, but they weren’t explicitly built for time series, and a time-series-first design is probably necessary at scale. It’s also possible to do these kinds of time-based data science activities using Python scripts, but I wouldn’t recommend it.
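
As a toy illustration of making time work for the defender (the record schema is assumed, and this is plain Python rather than a query in any particular platform), a long-window aggregation will still catch a scanner that sends only one probe per hour:

```python
# Hypothetical low-and-slow scan detection: count distinct destination ports per source
# over a long window; a scan spread across days still accumulates into a clear signal.
from collections import defaultdict
from datetime import timedelta

def slow_scanners(flow_records, window=timedelta(days=7), port_threshold=100):
    """flow_records: iterable of dicts with 'ts' (datetime), 'src', and 'dst_port' keys."""
    records = list(flow_records)
    if not records:
        return {}
    newest = max(r["ts"] for r in records)
    ports_by_src = defaultdict(set)
    for r in records:
        if newest - r["ts"] <= window:
            ports_by_src[r["src"]].add(r["dst_port"])
    # Sources that touched an unusually wide spread of ports over the whole window.
    return {src: len(ports) for src, ports in ports_by_src.items()
            if len(ports) >= port_threshold}
```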

Conclusion

Coming from a background in vulnerability research, I understand how little it can take to compromise a host and how easily that compromise can be lost in the noise when there are no existing, high-fidelity IOCs. This doesn’t just apply to exploits, but also lost credentials and user behavior.

Relying exclusively on pre-defined IOCs, APT detection mechanisms or other “out-of-the-box” solutions causes a major gap in visibility and gives up the primary advantage that you have as a defender: this is your turf. Over time, weak signals are refined and automated analysis can be put in place to make any attacker stepping foot into your domain stand out like a sore thumb.

Data collection is the first step to detecting and responding to weak signals. I encourage organizations to collect as much data as possible. Storage is cheap and you can’t know ahead of time which logfile or piece of data is going to end up being crucial to an investigation.

With the capability to fuse weak signals into a high-fidelity, strong signal on top of pure strong signal indicators and threat detection, an organization is poised to be successful and make sure their systems “do what they’re supposed to do, and don’t do anything else.”

-Corey Thuen

GUEST BLOG | November 3, 2020

Low-hanging Secrets in Docker Hub and a Tool to Catch Them All | Matías Sequeira

TL;DR: I coded a tool that scans Docker Hub images and matches a given keyword in order to find secrets. Using the tool, I found numerous AWS credentials, SSH private keys, databases, API keys, etc. It’s an interesting tool to add to the bug hunter / pentester arsenal, not only for the possibility of finding secrets, but for fingerprinting an organization. On the other hand, if you are a DevOps or Security Engineer, you might want to integrate the scan engine to your CI/CD for your Docker images.

GET THE TOOL: https://github.com/matiassequeira/docker_explorer

The idea for this work came up when I was opening the source code for a project on which I was collaborating. Apart from migrating the source code to an open GitHub, we prepared a ready-to-go VM that was uploaded to an S3 bucket and a few Docker images that we pushed to Docker Hub. A couple of days later, a weekend to be more specific, we got a warning from AWS stating that our SNS resources were being (ab)used – more than 75k emails had been sent in a few hours. Clearly, our AWS credentials were exposed.

Once we deleted all the potentially exposed credentials and replaced them in our environments, we started to dig into the cause of the incident and realized that the credentials were pushed to GitHub along with the source code due to a miscommunication within the team. As expected, the credentials were also leaked to the VM, but the set of Docker images were fine. Anyhow, this got me thinking about the possibility of scanning an entire Docker images repository, the same way hackers do with source code repositories. Before starting to code something, I had to check whether it was possible.

Analyzing Feasibility

Getting a list of images

The first thing I had to check was if it was possible to retrieve a list of images that match a specific keyword. By taking a look at the API URLs using a proxy, I found:

https://hub.docker.com/v2/search/repositories?query={target}&page={page}&page_size={page_size}

The only limitation I found with API V2 was that it wouldn’t retrieve anything beyond page number 100. So, given the maximum page size of 100, I wouldn’t be able to scan more than 10k containers per keyword. This API also allows you to sort the list by pull count, so, if we retrieve the repositories with fewer pulls (in other words, the newest repositories), we have a greater chance of finding something fresh. Although there’s a V1 of the API that has many other interesting filters, it wouldn’t allow me to retrieve more than ~2.5k images.
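
A minimal sketch of that paging loop (not the actual tool; the response field names follow what the v2 search endpoint returned at the time and may have changed since) could look like this:

```python
# Page through the Docker Hub v2 search API for a keyword, up to the page-100 ceiling
# mentioned above. The 'results' / 'repo_name' JSON fields reflect the observed
# response format, not a documented contract.
import requests

def search_repositories(keyword, page_size=100, max_pages=100):
    for page in range(1, max_pages + 1):
        resp = requests.get(
            "https://hub.docker.com/v2/search/repositories",
            params={"query": keyword, "page": page, "page_size": page_size},
            timeout=30,
        )
        if resp.status_code != 200:   # the API stops serving results past page 100
            break
        results = resp.json().get("results", [])
        if not results:
            break
        for repo in results:
            yield repo["repo_name"]
```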

After getting the image name, and since not all of the images had a `latest` tag, I had to make a second request to the following endpoint to get the list of available tags (versions) for each image:

https://hub.docker.com:443/v2/repositories/{image}/tags/?page_size=25&page=1

Once I had the `image:version`, the only thing left was to pull the image, create a temporary container, and dump its filesystem.
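
In Docker CLI terms, that pull/create/dump step can be approximated as below; this is a simplified sketch, not the tool’s actual code:

```python
# Pull an image, create (but never start) a container from it, and export the
# container's flattened filesystem as a tar archive for offline secret scanning.
import subprocess
import tempfile

def dump_image_filesystem(image_ref: str) -> str:
    """Pull 'image_ref' (e.g. 'nginx:latest'), export its filesystem, return the tar path."""
    subprocess.run(["docker", "pull", image_ref], check=True)
    # 'docker create' builds a container from the image without running it.
    container_id = subprocess.run(
        ["docker", "create", image_ref],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    with tempfile.NamedTemporaryFile(suffix=".tar", delete=False) as tar_file:
        tar_path = tar_file.name
    try:
        # 'docker export' writes the container's filesystem as a tar archive, ready to be
        # scanned by a secret-hunting engine such as Whispers.
        with open(tar_path, "wb") as fh:
            subprocess.run(["docker", "export", container_id], check=True, stdout=fh)
    finally:
        # Remove the container and image so bulk scanning doesn't fill up the disk.
        subprocess.run(["docker", "rm", container_id], check=True)
        subprocess.run(["docker", "rmi", image_ref], check=False)
    return tar_path
```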

Analyzing the image dump

This was one of the most important parts of my research, because it involved the engine I would use for scanning the images. Rather than trying to create a scan engine myself, which is a big enough problem on its own, and with so many options to choose from, I started to look for the most suitable existing tool: one that could work on an entire filesystem and look for a wide variety of secrets. After doing some research, I found many options, but sadly, most of them were oriented toward GitHub secrets, had a high rate of false positives, or evaluated secrets as isolated strings (without considering the var=value format).

While discussing this with one of my colleagues, he mentioned a tool called Whispers that had been published less than a week earlier – a great tool by Artëm Tsvetkov and Christian Martorella. By doing some tests, I found this tool very convenient for several reasons:

  • Contains many search rules (AWS secrets, GitHub secrets, etc.) that assess potential secrets in var=value format
  • Findings are classified by type, impact (minor, major, critical, blocker), file location, etc.
  • Allows you to add more rules and plugins
  • Written in Python3 and thus very easy to modify

Developing the tool

Once I had everything in place, I started to work on a script to automate the process so I could scan thousands of images. By using and testing the tool, I came up with additional requirements, such as allowing the user to choose the number of processor cores to use, limiting Whispers’ execution time, storing logs separately for each container, and deleting containers and images to avoid filling up the disk space.

Also, in order to maximize the number of findings, minimize the number of false positives, and ease data triage, I made a couple of modifications to the standard Whispers:

  • Added a rule for Azure stuff
  • Excluded many directories and files
  • Saved files with potential secrets into directories

Running the tool

With the tool pretty much ready to analyze bulk images, I signed up for two different DigitalOcean accounts and claimed $100 in credit for each. Later, I spun up two servers, set up the tool in each environment, and ran the tool using a large set of keywords/targets.

The keywords/images I aimed to scan were mainly related to technologies that handle or have a high probability of containing secrets, such as:

  • DevOps software (e.g. Kubernetes / K8s / Compose / Swarm / Rancher)
  • Cloud services (e.g. AWS / EC2 / CloudFront / SNS / AWS CLI / Lambda)
  • CI/CD software (e.g. Jenkins / CircleCI / Shippable)
  • Code repositories (e.g. GitLab / GitHub)
  • Servers (e.g. NGINX / Apache / HAProxy)

After a month of running the tool, I found myself with a total of 20 GB of zipped data ready to triage, for which I had to develop an extra set of tools that cleaned all of the data by applying the same criteria consistently. Among the rules and considerations, the most important were (a few of these are sketched in code after the list):

  • Created a list of false-positive strings that were reported as AWS access keys
  • Deleted AWS strings containing the string “EXAMPLE”
  • Discarded all the potential passwords that were not alphanumeric and shorter than 10 chars
  • Discarded passwords containing the string “password”
  • Discarded test, dummy, or incorrect SSH private keys
  • Deleted duplicate keys/values for each image to lessen the amount of manual checking
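
A hedged sketch of what a few of those filters might look like follows; the exact rules, thresholds, and helper names in the real cleanup tooling are the author’s, not these:

```python
# Illustrative triage filters in the spirit of the rules above (names and thresholds
# are hypothetical, not the actual cleanup scripts).
import re

def keep_aws_key(key: str) -> bool:
    # Drop documentation/sample keys and anything containing "EXAMPLE".
    return "EXAMPLE" not in key.upper()

def keep_password(pw: str) -> bool:
    # Drop short or self-describing strings that were reported as passwords.
    has_alpha = re.search(r"[A-Za-z]", pw) is not None
    has_digit = re.search(r"[0-9]", pw) is not None
    return len(pw) >= 10 and has_alpha and has_digit and "password" not in pw.lower()

def dedupe(findings):
    # Collapse duplicate (image, key, value) triples to cut down on manual review.
    seen = set()
    for image, key, value in findings:
        if (image, key, value) not in seen:
            seen.add((image, key, value))
            yield image, key, value
```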

Results

After many weeks of data triage, I found a wide variety of secrets, such as 200+ AWS accounts (of which 64 were still alive and 14 of these were root), 1,500+ valid SSH keys, Azure keys, several databases, .npmrc tokens, Docker Hub accounts, PyPI repository keys, many SMTP servers, reCAPTCHA secrets, Twitter API keys, Jira keys, Slack keys, and a few others.

Among the most notable findings was the whole infrastructure (SMTP servers, AWS keys, Twitter keys, Facebook keys, Twilio keys, etc.) of a US-based software company with approximately 300 employees. I reached out to the company, but, unfortunately, I did not hear back. I also identified an Argentinian software company focused on healthcare that had a few proofs-of-concept with valid AWS credentials.

The most commonly overlooked files were ‘~/.aws/credentials’, Python scripts with hardcoded credentials, the Linux bash_history, and a variety of .yml files.

So, what can I use the tool for?

If you are a bug bounty hunter or a pentester, you can use the tool with different keywords, such as the organization name, platform name, or developers’ names (or nicknames) involved in the program you are targeting. The impact of finding secrets can range from a data breach to the unauthorized use of resources (sending spam campaigns, Bitcoin mining, DDoS attack orchestration, etc.).

Security recommendations for Docker images creation

If you work in DevOps, development, or system administration and currently use cloud infrastructure, these tips might come in handy for Docker image creation:

  • Always try to use a fresh, clean Docker image.
  • Delete your SSH private key and SSH authorized keys:
    • sudo shred myPrivateSSHKey.pem authorized_keys
    • rm myPrivateSSHKey.pem authorized_keys
  • Delete ~/.aws/credentials using the above method.
  • Clean bash history:
    • history -c
    • history -w
  • Don’t hardcode secrets in the code. Instead, use environment variables and inject them at the moment of container creation. Also, when possible, use mechanisms provided by container orchestrators to store/use secrets and don’t hardcode them in config files.
  • Perform a visual inspection of your home directory, and don’t forget hidden files and directories:
    • ls -la
  • If you are using a free version of Docker Hub, assume that everything within the image is public.

Update, September 2020: While writing this blog post, Docker Hub announced that by November 2020 it will start to rate-limit image pulls. But, there’s always a way ;).

GET THE TOOL: https://github.com/matiassequeira/docker_explorer

-Matías Sequeira

Matías Sequeira is an independent security researcher. He started his career in cybersecurity as an infosec consultant, working for clients in the financial and medical software fields. Concurrently, he began conducting research into ransomware and its defenses as part of the AntiRansomware Team. In recent years, Matías has been focused on the R&D of cybersecurity tools, which have been presented at conferences such as Black Hat, Ekoparty, and Hack In The Box, amongst others. Currently, he is pursuing an MSc in Cybersecurity at Northeastern University under a Fulbright scholarship and likes playing CTFs during his free time.

GUEST BLOG | August 13, 2020

IOActive Guest Blog | Urban Jonson, Heavy Vehicle Cyber Security Program, NMFTA

Hello,

My name is Urban Jonson, and I’m the Chief Technology Officer and Program Manager, Heavy Vehicle Cyber Security Program, with the National Motor Freight Traffic Association, Inc. (NMFTA).

I’m honored that IOActive has afforded me this guest blogging opportunity to connect with you. The research at IOActive is always innovative and they have done some really good work in transportation, including aviation, truck electronic logging devices, and even satellites. Being among such technical experts really raises the stakes of the conversation. Luckily, I can lean on some of my awesome collaborators like Ben Gardiner at NMFTA, as well as countless industry experts who I’m privileged to call friends.

I feel a profound sense of loss of technical progress in our field this year. All of my favorite technical events, where I can connect with people to discuss and share ideas, have been canceled (heavy sigh). The cancellation of the NMFTA HVCS meetings has been the hardest for me, as those meetings pull together an incredible community in the motor freight industry. Many of the attendees are now my friends and I miss them.

The cancellation of my other favorite industry events – Black Hat/DEF CON, CyberTruck Challenge, and ESCAR – has been hard as well. While I do enjoy many of the presentations at these conferences, the biggest benefit for me is meeting one-on-one with some of the brightest minds in the industry. Where else do I get to sit down with Craig Smith and casually discuss the state of the automotive industry? I remember having great conversations with Colin O’Flynn about wily new ideas on power fault injection at many different events. These one-on-one opportunities for conversation, collaboration, and information sharing are invaluable to me.

This year I had wanted to talk to some of my friends about Triton malware and vehicle safety systems such as lane departure assist, crash avoidance, and adaptive cruise control. Alas, no such luck this year. So, I’m going to dump this discussion out in the open here.

The Triton Malware

First, for the uninitiated, a quick review of the Triton malware. The Triton malware intrusion was a sophisticated attack that targeted a petrochemical plant in the Middle East in 2017. How the attackers first got into the network is a bit of a mystery, but it was most likely the result of a misconfigured firewall or a spearphishing attack. The first piece was a Windows-based remote access tool that gave the attackers a foothold on an engineering workstation. What came next was very interesting: according to reports[1], a highly specific secondary attack was mounted from the compromised engineering workstation against a specific Schneider Electric Triconex[2] safety controller and select firmware versions (10.0 – 10.4) using a zero-day vulnerability. The safety controllers in question are designed to take direct action to initiate shutdown operations for the plant without user intervention in the case of serious safety issues.

Stop and think about that for a second—someone had taken the time to figure out which specific controller and firmware versions were running at the plant, obtain similar hardware to research, find a zero-day vulnerability, then research and compromise the plant’s IT infrastructure, just to install this malware. That is not an insignificant effort, and not as easy as they make it out to be in the hacker movies.

An unplanned “accidental” shutdown of the plant revealed the presence of the malware and the intrusion. It has been theorized that the attackers wanted to obtain the capability but not use it, and that the shutdown was an accidental reveal[3]. Cyber-physical attacks are usually broken into separate cyber and physics packages. Given the effort put into the attack, it is extremely unlikely the attackers would have intended such a dumb physics package. If you want an example of a well-thought-out cyber-physical attack, read up on Operation Olympic Games, which targeted Iranian uranium centrifuges[4]. This goes to show that if you play around with bombs, physical or digital, they can go off unintentionally.

Another interesting tell occurred as the response team was trying to secure and clean up the intrusion, when the attackers fought back to try to maintain a foothold. Actively engaging blue-team efforts in real-time is risky, as it can quickly lead to full attribution and unwanted consequences. This tells us that the attackers considered this capability a high priority; they had made a large investment in resources to be able to compromise the safety controllers and they were determined to keep it. A great deal of information about this intrusion is still murky and closely guarded, but it is generally considered to have potentially been one of the deadliest malware attacks so far, had the capability been leveraged.

Safety Controllers

The safety controller concept of a contained device taking decisive action without user intervention sounds eerily familiar. The concept is virtually everywhere in the new safety technologies in modern cars and trucks in the form of crash avoidance, lane departure assist, and other features for which we have all seen the ads and literature. FMCSA is even studying how to retrofit existing trucks with some of these promising new safety technologies, which can help reduce accidents and save lives.

These automotive safety systems rely on sensors such as cameras and LIDAR to get the input they need to make decisions affecting steering, braking, and other actions. This brings up some interesting questions. How secure are these components? How diverse is the marketplace; that is, do we have risk aggregation through the deployment of just a few models/versions of sensors? Is there a specific sensor model that is ubiquitous in the industry? Do we have our own version of a Triconex safety controller that we need to worry about?

The short answer seems to be yes. I read an interesting paper on Fault Detection, Isolation, Identification and Recovery (FDIIR) for automotive perception sensors by Goelles, Schlager, and Muckenhuber from Virtual Vehicle Research[5]. (Note: the paper is worth reading in its own right, and it discusses a sensor fault classification system that can be applied to other domains, such as aviation and maritime.) The conclusion of the paper is that, for the most part, systems such as LIDAR are treated as black boxes with little or no knowledge of their internal firmware or interfaces. This is mostly due to a small number of companies in fierce competition working hard to protect their intellectual property. In my opinion, that is not a good sign. If we need multiple sensors to cooperatively decide on safety-critical actions, transparency is going to be crucial to designing a trusted system. The present lack of transparency in these systems almost certainly implies a lack of security assurance for their interfaces. This sort of inscrutable interface (aka attack surface) is a hacker’s delight.

All of this is not really new—our own Ben Gardiner discussed similar points in 2017[6]. So, what other truck-specific safety system black boxes can we discuss through the filter of the Triton attack that might not be common knowledge to you? Enter RP 1218.

RP 1218 – Remote Disablement of Commercial Vehicles

First, a little background: the American Trucking Associations’ (ATA) Technology & Maintenance Council (TMC) develops recommended practices (RPs) for the trucking industry. The council is composed of representatives from motor carriers, OEMs, and Tier 1 suppliers for the truck industry. They generally do great work and mostly focus on physical truck maintenance-related issues, but they also work on other recommended practices such as Telematics-Tractor connectors (RP 1226). These are, strictly speaking, only recommendations for the industry, but many of them end up being de facto standards, especially in vehicle electronics.

The TMC has recently decided to take up RP 1218 and develop an updated version, which is how it came to our attention. Now, why has this RP drawn our attention and ire at the NMFTA HVCS? The title of RP 1218 is “Guidelines for Remote Disablement of Commercial Vehicles.” It consists of a recommended practice on how to implement a remote shutdown and/or limp mode for a heavy truck. The current version is rather old, dating from 2005, and the problem is that cybersecurity was not at the forefront of the trucking industry’s thinking at that time.

The core security premise of RP 1218 was based around “secret” CAN message instructions sent to the engine controller. Uh-oh. CAN doesn’t include encryption, so there’s no such thing as a secret CAN message. Well, not for very long anyway. Even the existence of the RP was enough to give us the jitters.

We immediately set out to determine if anyone had implemented RP 1218 and did a basic survey of remote disablement technology with the assistance of our friends at CanBusHack. The good news was that we could not find anyone who had implemented RP 1218 as specified. The bad news was that we found plenty of other ways to do it, including messing around with diesel exhaust fluid (DEF) messages and derate limits, among others. I’m not going to dig into those details here.

We also discovered a robust global market for both Remote Vehicle Shutdown (RVS) and Remote Vehicle Disablement (RVD). Luckily for me, most of that market is outside of North America, my primary area of concern. The methods by which the various vendors achieved RVS/RVD varied significantly, but were not as simple as sending a message to the engine using RP 1218. That’s good, but the problem is that companies are building in a full remote stop button on their entire fleet. It seems that the sensitivity of RVS/RVD is well understood, and due to this concern, there’s not a great deal of transparency into these systems; we found it difficult to get even basic information. Another inscrutable black box.

While you can certainly make the case that it’s necessary to be able to disable a vehicle from a national security perspective, to prevent truck hijackings and terrorists turning trucks into battering rams, such a system would need to be absolutely bulletproof. While there are some ideas on how to mitigate such risks using things like Consequence-driven, Cyber-informed Engineering (CCE)[7], that’s a very hard thing to accomplish when it involves black-box technology with unknown interfaces. It’s worth repeating: black boxes with unknown interfaces are huge flashing targets for threat actors.

If we look at this through the lens of the Triton intrusion, how much effort do you think someone would go through to obtain the ability to affect motor transportation at scale? Do you think they would conduct the same level of research on infrastructure and components, and attempt to compromise these systems so that they can be hit at the most critical time? I certainly do. This whole set of problems is pressing, and I really needed to get some perspective and ideas.

This brings us back to the value of meeting in person with industry experts who have deep expertise in industrial control systems (ICS), automotive, and many other areas. This year, when I’m looking at this massive problem, I don’t get to ask important questions of my friends, many of whom I only see once a year at these events. Are there any lessons from the Triton ICS attack that we can leverage in designing active safety systems for vehicles? Can we develop an attack tree for someone attempting a sophisticated nation-state attack against vehicle safety control systems or remote disablement vendors? How do we best defend against someone who would like to own our infrastructure and unleash disruption on our transportation sector? How do we improve our designs, resiliency, processes, and general security posture against this type of threat?

Multi-mode Transportation Sharing and Support

Unfortunately, in today’s world, I can’t ask my friends in a quiet corner over a drink and strategize on a way to mitigate this risk. I’m sure that I’m not the only one feeling bereft of such opportunities.

By the way, informal collaborations at security events are exactly how the NMFTA Heavy Vehicle Cyber Security program came to be in the first place. After a Miller and Valasek presentation at Black Hat 2014, I sat down with a bright guy from Cylance at the Minus5 Ice Bar at the Mandalay Bay and we “doodled” the attack vectors on a truck. After taking a look at the finished napkin, we were both horrified. When I returned to Alexandria, Virginia, I started doing the research that eventually became our first white paper on heavy vehicle cybersecurity, which we “published” in September 2015[8]. Okay, honestly, we sat on it for a couple of years before we made it public.

So how do I move past this obstacle to fun, as well as to progress and work on keeping trucks secure and moving? In my case, I’ve been endeavoring to create a multi-mode transportation sharing and support group. This is an informal monthly gathering of a few select folks from various transportation modes, sharing resources and connections and generally supporting each other’s missions. Additionally, I’m trying (and mostly still failing) to reach out to those wonderful smart and talented friends to connect with them and see how they are doing personally, and to share whatever resources, references, articles, papers, connections, or technology I can provide to help them be successful in their missions. I ask whether they might be able to give me some advice or novel take on my problem and discuss possible solutions and mitigations. Like most tech geeks, I like technology because it’s easier to understand and deal with than most people. However, the lack of camaraderie this year is a little much, even for me.

So let’s help everyone connect. Think of the people you see so infrequently at these canceled conferences, and call them to check in. Don’t text or email; just pick up the phone and give them a call. I am sure they’d appreciate hearing from you, and you’ll probably find that you have some interesting technical topics to discuss. Maybe you could invite the “usual gang/CTF team” to a Zoom happy hour.

Don’t stop connecting just because you’re stuck in the basement like me. You never know, maybe you’ll solve an interesting problem, find a new, really evil way to hack something cool, help someone find a resource, or just maybe make the world a slightly safer place. Most importantly, if you discover the solution to the problems I’ve discussed here, please let me know.

Urban Jonson


[1] Blake Sobczak, “The inside story of the world’s most dangerous malware,” E&E News, March 2019.
[2] NSA/CISA Joint Report Warns on Attacks on Critical Industrial Systems, July 27, 2020.
[3] Tara Seals, “Triton ICS Malware Hits a Second Victim,” SAS 2019, published on Threatpost.com, April 2019.
[4] Pierluigi Paganini, “‘Olympic Games’ and boomerang effect, it isn’t sport but cyber war,” June 2012.
[5] Goelles, T.; Schlager, B.; Muckenhuber, S., “Fault Detection, Isolation, Identification and Recovery (FDIIR) Methods for Automotive Perception Sensors Including a Detailed Literature Survey for Lidar,” Sensors, 2020, Volume 20, Issue 13.
[6] Ben Gardiner, “Automotive Sensors: Emerging Trends With Security Vulnerabilities And Solutions,” MEMS Journal, February 2017.
[7] For more information on CCE, please see the INL website.
[8] National Motor Freight Traffic Association, Inc., A Survey of Heavy Vehicle Cyber Security, September 2015.