Stop shitting on Cisco ACI when it’s not the problem.

Liam Keegan
10 min read · Aug 26, 2024


Photo by Todd Morris on Unsplash

First, a disclaimer. The manufacturers in the data center networking business are all very good. At the same time, each and every one of them finds unique and aggravating ways to annoy their customers. I believe that whether you’re a Cisco, Juniper, Arista, SONiC, Cumulus or whitebox shop, the grass is greenest where you water it. The best manufacturer is the one that you adapt to be the most relevant to your line of business.

Ok, so what is your point?

Recently, Juniper published an article (linked) encouraging potential customers to look at the Apstra platform, which it touts as easy and efficient. (Shockingly), Juniper characterizes Cisco ACI as closed and difficult. “Buy our product and all your problems will be fixed!” says every network vendor, ever. Ahh, if things were only that simple.

Instead of a Juniper (good)/Cisco (bad) comparison, I think this article could be re-titled. Here’s my hot take:

Marketing is my passion!

It’s actually an interesting read, but not from the Product A vs. Product B perspective. I think it’s interesting because it shows the difference between an ecosystem that has realized its full potential by enabling the teams that support it vs. just airdropping in a technology platform.

There are three pillars of IT: people, process and technology. For too long, manufacturers have sold, partners have implemented, and customers have operated technology solutions without fully embracing the people and process aspects of the solution. I have had exactly ZERO people come to me and say their budget for headcount and training of teams is expanding. Industry-wide, the problem is airdropping in technology platforms without fully enabling the teams that support them, and sprinkling “AI” on everything that plugs into the wall just makes this problem worse.

Different Horses for Different Courses

One last note before I jump into the meat of this article. I do not believe that ACI (or any SD-controlled data center) is the right choice for every organization. In smaller, static networks, I don’t even think VXLAN is the right choice. These controller-based platforms incur technical debt and require operational changes to be able to support them. Having upfront, honest discussions about the organizational willingness to embrace these changes is key prior to a PO being signed.

When you’re talking about scaled-out, global footprints, or highly dynamic environments, I think software-defined is the way to go, but only if an organization is willing to accept the operational changes required to make it successful. If not, don’t bother.

Old Ways vs. New Ways

If I could get in the time machine and go back 10 years to the initial product launch (as an aside, happy 10th birthday to ACI), the one message that I wish Cisco had pounded into every TME, TSA, PSS and SE in the company would have been:

Cisco ACI is not just an evolution of Nexus switches. It is the manifestation of cloud networking in an on-premise environment. It requires an evolution of network engineering skills that directly map to cloud providers such as AWS and Azure. By applying your network domain expertise to ACI, you will be able to design, implement and support cloud and on-prem software-defined networking better than anyone that reads a book and passes an Azure exam. To achieve this, you will need to reverse your thinking by understanding exactly what you want to achieve at the beginning of the process, and this isn’t easy for people who like to type ‘conf t’ and type a bunch of crap in the console. The benefit will be that you can let the application owners own their networking without you having to get involved. Oh, by the way, you’ll have to upgrade it more than once every five years.

To be fair, this really isn’t ACI-specific. It’s the difference between software-defined networking and traditional VLANs+SVIs. It’s why you chuckle when you look at someone’s Azure console and see a vNet with a 10.0.0.0/8 network allocation. No network person in their right mind would have set that up, but the app owner that clicked next, next, next didn’t think twice. It’s also why you can do a ‘show ver’ on a legacy data center switch and see an uptime of 7+ years.
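To put a number on why that 10.0.0.0/8 vNet makes network people chuckle, here is a minimal sketch using Python's standard `ipaddress` module. The on-prem prefixes are made up for illustration; swap in your own.

```python
import ipaddress

# The "next, next, next" vNet allocation from the example above.
vnet = ipaddress.ip_network("10.0.0.0/8")

# Hypothetical on-prem prefixes, purely illustrative.
onprem = [ipaddress.ip_network(n) for n in ("10.20.0.0/16", "172.16.5.0/24")]


def overlapping(candidate, existing):
    """Return the existing prefixes that collide with the candidate network."""
    return [net for net in existing if candidate.overlaps(net)]


# A /8 claims 2**24 addresses, i.e. the entire 10.x.x.x private range.
print(vnet.num_addresses)

# Any on-prem 10.x prefix now overlaps the vNet, which gets painful
# the day someone wants to route between the two.
print(overlapping(vnet, onprem))
```

The point is not the arithmetic. A check like this is trivial to run before anyone clicks "create", but only if somebody on the team knows to ask the question.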

I posit that without well-trained teams that understand software-defined networking, and organizational processes that change long-standing departmental silos, your organization is going to find itself in the same mess on a different platform. I also don’t think you can “AI yourself” into good decision making.

Let’s go through the article!

I really like how Juniper has laid out each of these categories in their comparison article, since it provides a good framework to talk about the organizational changes needed.

Putting on my marketing hat once again, here’s my column header edit:

I’m going to go through each section and talk about what needs to change with the people and processes of an organization to achieve this technical nirvana. I may comment on the tech from an ACI perspective as well.

Operational Experience

I think the biggest difference here is the sphere of responsibility around the networking of an application. How many network teams are willing to give up control of the network side to an application owner? I believe that the network teams should get rid of this as fast as possible, but this seems to be a Layer 8 problem more than anything technical.

If these DevOps teams are so concerned about pace of change, there’s no reason why they shouldn’t be slipstreaming the networking into their Ansible/Terraform/<insert another tool here> pipeline and letting the ACI platform handle it. The network team advises on the setup and overall platform health, then the app owner runs the day to day in their domain.

If you require a network engineer to be involved in the loop with your next gen network, you’re always going to bottleneck at the human. Period.

Automation

Both vendors should be commended for their commitment to openness and API control of their platforms. I think this is just a difference of tooling preference. I don’t know enough about Apstra to comment on their Day 0/1/2 templates, but generally I like templates as a starting point to focus the implementation/operations scopes. This is something that Cisco has done with Nexus Dashboard Fabric Controller, but it’s not part of the ACI ecosystem.

Cisco has done an excellent job with Ansible and Terraform support on ACI (see the Nexus As Code site on DevNet), not to mention a Python library and a well-documented REST API. I think Cisco’s take is that organizations have already invested in these toolsets, so they should leverage what they already have in use.
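For a feel of what "API control" means in practice, here is a rough sketch of the kind of JSON document the ACI REST API accepts for a tenant with one VRF, one bridge domain, and one application profile containing a single EPG. The function name and all object names are hypothetical, and the sketch only builds the payload; it does not log in or send anything. In a real pipeline you would authenticate to the APIC first and POST the result (typically to `/api/mo/uni.json`), or more likely let the Ansible or Terraform modules do this for you.

```python
import json


def tenant_payload(tenant, vrf, bridge_domain, app_profile, epg):
    """Build a nested ACI object tree (hypothetical helper, names are examples).

    Uses the ACI object model classes: fvTenant, fvCtx (VRF), fvBD,
    fvAp (application profile) and fvAEPg (endpoint group).
    """
    return {
        "fvTenant": {
            "attributes": {"name": tenant},
            "children": [
                {"fvCtx": {"attributes": {"name": vrf}}},
                {
                    "fvBD": {
                        "attributes": {"name": bridge_domain},
                        # Tie the bridge domain to the VRF.
                        "children": [
                            {"fvRsCtx": {"attributes": {"tnFvCtxName": vrf}}}
                        ],
                    }
                },
                {
                    "fvAp": {
                        "attributes": {"name": app_profile},
                        "children": [
                            {
                                "fvAEPg": {
                                    "attributes": {"name": epg},
                                    # Tie the EPG to the bridge domain.
                                    "children": [
                                        {"fvRsBd": {"attributes": {"tnFvBDName": bridge_domain}}}
                                    ],
                                }
                            }
                        ],
                    }
                },
            ],
        }
    }


payload = tenant_payload("demo-tenant", "demo-vrf", "demo-bd", "demo-app", "web")
print(json.dumps(payload, indent=2))
```

Notice that the whole intent is declared up front as one tree. That is the "reverse your thinking" part: you describe the end state rather than typing commands in sequence.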

But, regardless of platforms, you need teams that understand automation. Cisco offers the DevNet Associate. Juniper offers JNCIA-DevOps. Organizations need to upskill their network staff to understand these concepts and then enable collaboration between the network, systems and security groups. Everyone has to play in everyone else’s sandbox now and it’s uncomfortable for people.

Yeah, the native GUI can be rough.

Multivendor Compatibility

I don’t know how much this is a concern for organizations, or where the perimeter of multivendor starts and stops. For instance, is this referring to having a single VXLAN fabric that encompasses Juniper and SONiC? To me, that sounds like the seventh circle of support finger pointing hell and I wouldn’t recommend that to anyone. They are 100% correct that ACI is Cisco-only, but Cisco has other solutions (such as the excellent NSO and Crosswork platforms) to handle multi-vendor service enablement.

This may be a larger issue in the service provider space, but that’s not where I focus.

Training and Certification

I think this is a bit silly. If you learn VXLAN (or any standards-based feature/functionality), you can apply the knowledge and understanding to any vendor’s equipment. Yes, there are implementation differences, but it’s not like you have to go back to network school to learn what a VNI is on Aruba vs. Juniper. You don’t start at zero.

ACI is cloud networking. If you have a CCNP-level understanding of networking, you can then use that base to layer on the knowledge of software-defined networking, then figure out the cloud-specific mapping. It’s the same if you have a JNCIP. A blanket statement saying that knowing a data model can turn every network engineer into an expert is simply fantasy.

If Apstra was so easy, why does Juniper have a five-day class? The same reason Cisco has a five-day ACI class. If an organization doesn’t enable its network engineers to understand, use and stay proficient on ACI (or Apstra, or <insert your platform of choice here>), knowledge will atrophy.

Buying the technology is the easy part. It’s ensuring that your organization has the people trained who can support it and not just have it be one of seven thousand “other duties as assigned” that need to be done yesterday.

Unrelatedly, the phrase “many network operators” has a very strong “Canadian girlfriend” ring to it.

Upgrades

If you embrace software-defined anything, you must proactively keep the software platform up to date. Full stop. For early releases of ACI, I don’t think customers understood this operational requirement. It’s no different for Apstra — they have their own end-of-life/release schedule.

In ACI-land, if you were an early adopter, the upgrade paths were plentiful. Too plentiful. If you had an original fabric running 1.0(2) that was released on August 2, 2014, the ACI Upgrade Path Tool shows that you’d need to do seven (!) distinct upgrades.

Cisco heard the feedback. If you’re on a relatively recent version (4.2(7), released in March 2021, or later), you can move to any current version without pitstops.
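If you want to flag old fabrics in an inventory script, the rule of thumb above can be sketched in a few lines of Python. This is an illustration only: the 4.2(7) floor is simply the figure quoted in this article, the function names are made up, and the sketch is no substitute for checking the ACI Upgrade Path Tool for your actual source and target releases.

```python
import re

# ACI releases look like "4.2(7)" or "5.2(8e)": major.minor(maintenance + optional patch letter).
_VERSION = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")


def parse_version(text):
    """Split an ACI version string into (major, minor, maintenance, patch)."""
    match = _VERSION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized ACI version: {text!r}")
    major, minor, maintenance, patch = match.groups()
    return (int(major), int(minor), int(maintenance), patch)


# Assumption: taken from the article text, not from Cisco's upgrade matrix.
DIRECT_UPGRADE_FLOOR = parse_version("4.2(7)")


def needs_intermediate_hops(running):
    """True if the running release is older than the floor quoted above."""
    # Compare only the numeric parts; the patch letter is ignored here.
    return parse_version(running)[:3] < DIRECT_UPGRADE_FLOOR[:3]


for release in ("1.0(2)", "4.2(6h)", "4.2(7)", "5.2(8e)"):
    print(release, needs_intermediate_hops(release))
```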

It’s not like Apstra doesn’t have limitations either, but every platform does! You need to keep the platforms up to date and adopt a philosophy of small, incremental changes.

Stepping back from ACI fanboying, I did a podcast with the Nexus Dashboard team where we brought up how much you have to trust your fabric controller. Upgrades need to follow the DevOps style of small, incremental steps. This shouldn’t need to be done in an outage window. (When was the last time you got an email from AWS saying your workload would be unavailable because they were doing switch upgrades?) To achieve this network nirvana, you need to TRUST your controller.

Ask yourself: is your organization willing to upgrade its fabric in the middle of the day? Has your automation controller earned that trust? Will it ever?

AI-Enabled Operations

On this point, I don’t have much to opine on. Cisco has Day 2 Operations, Juniper has the Marvis VNA. I agree with the point that teams need more flow visibility and more automated insights into their data centers, and more “AI-enabled” assistance. I think both Juniper and Cisco see the world the same way and are approaching the solutions from a relatively similar perspective.

But, once again, all the AI in the world isn’t going to help an organization that doesn’t have teams that can understand the output! This goes back to training and enablement.

Final Thoughts

If you’ve made it this far, I applaud you for reading my tongue-in-cheek hot take on the Juniper marketing article. If I leave you with anything, it’s that no technology manufacturer sells a bag of magic beans. Organizations need to invest in the people and processes that support these next-gen networks, just like they’re forced to do if they adopt any public cloud strategy.

Yes, there is plenty of room for improvements in ACI implementations. If you struggle with your ACI implementation, I want to hear about it. You can reach me on Twitter or LinkedIn. If you’re willing to make your environment available for anonymous feedback, I’ll make a YouTube video on three simple changes to make your life easier and address your pain points. I’ll even send you the tools. It’d be a fun project, so reach out!

But what about Nexus Dashboard?

Cisco is having a lot of customer discussions about moving from ACI to Nexus Dashboard. After all the comparison here between Apstra and ACI, I actually think a more relevant comparison is Apstra to NDFC. By the same token, I’m not convinced that blindly moving from ACI to NDFC actually solves anything. Unless the root causes of some of these issues are addressed holistically, not much will change, for the same reasons that moving blindly from ACI to Apstra wouldn’t change much either.

Now, I’m off to see my Canadian girlfriend. She’s really into software-defined networking.

Written by Liam Keegan

Data center/security/collab hack, CCIE #5026, focusing on automation, programmability, operational efficiency and getting rid of technical debt.
