From SSH Scripts to a Real Source of Truth with NetBox

July 1, 2026

Every network automation journey starts the same way: a folder of scripts that SSH into devices, and a spreadsheet — or if you’re feeling advanced, a folder of YAML files — that lists what those scripts should run against. It works, right up until it doesn’t. The spreadsheet drifts from reality. Two scripts disagree about which VLAN lives where. Someone renames a device and three playbooks break. The data describing your network becomes less trustworthy than the network itself.

NetBox fixes the root problem: it gives you a single, structured, queryable source of truth for what your network is *supposed* to be. This post is about why that matters, and — more usefully — how to migrate onto it gradually, one device type at a time, without a heroic big-bang cutover that you’ll never actually schedule.

## The problem isn’t your scripts. It’s your data.

Automation people obsess over the execution layer — Ansible, Nornir, raw Netmiko — but the execution layer is rarely where things rot. What rots is the *data* the execution reads from. A folder of YAML has no schema, no referential integrity, and no validation. Nothing stops you from typing a VLAN ID of 5000, referencing a site that doesn’t exist, or assigning the same IP twice.

```yaml
# devices.yml: looks fine, is quietly wrong
- name: core-sw-01
  site: lond1            # this site was renamed to "ldn1" last quarter
  mgmt_ip: 10.0.0.5
  vlans: [10, 20, 5000]  # 5000 is out of range; nothing caught it
- name: core-sw-02
  mgmt_ip: 10.0.0.5      # duplicate IP; nothing caught it
```

That file will happily generate configs. The errors surface on the device, at deploy time, which is the most expensive possible place to find them.

A source of truth is a database that *refuses* to hold data like that. NetBox knows a VLAN ID is 1–4094, that a device belongs to a site that must exist, and, with uniqueness enforcement switched on, that the same IP can’t be entered twice. It turns a whole category of outages into form-validation errors you see before anything ships.
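To make that category of checks concrete, here is a minimal sketch of the same three rules applied to the YAML data above, written as plain Python. The names `validate_inventory` and `known_sites` are hypothetical; NetBox enforces its rules inside its own data model, not with code like this.

```python
from collections import Counter

def validate_inventory(devices, known_sites):
    """Return a list of human-readable problems found in a device list."""
    problems = []
    # Count how often each management IP appears across all devices
    ip_counts = Counter(d.get("mgmt_ip") for d in devices if d.get("mgmt_ip"))
    for d in devices:
        name = d.get("name", "<unnamed>")
        site = d.get("site")
        if site not in known_sites:
            problems.append(f"{name}: unknown site {site!r}")
        for vid in d.get("vlans", []):
            if not 1 <= vid <= 4094:
                problems.append(f"{name}: VLAN {vid} outside 1-4094")
        ip = d.get("mgmt_ip")
        if ip and ip_counts[ip] > 1:
            problems.append(f"{name}: management IP {ip} is used {ip_counts[ip]} times")
    return problems

# The same records as devices.yml, already parsed into dicts
devices = [
    {"name": "core-sw-01", "site": "lond1", "mgmt_ip": "10.0.0.5", "vlans": [10, 20, 5000]},
    {"name": "core-sw-02", "mgmt_ip": "10.0.0.5"},
]
for problem in validate_inventory(devices, known_sites={"ldn1"}):
    print(problem)
```

Every problem in the file surfaces before a config is rendered, which is the property the database gives you without having to maintain a validator of your own.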

## What NetBox actually models

NetBox isn’t a monitoring tool and it isn’t a config manager. It’s a data model of your intended state: sites, racks, devices, interfaces, IP addresses, prefixes, VLANs, circuits, cables, and the relationships between them. The relationships are the point. An interface belongs to a device; an IP is assigned to an interface; a prefix lives in a VRF at a site. You can’t express nonsense, because the model won’t let you.

Crucially, everything in it is reachable through a REST API and a GraphQL API. Your automation stops reading flat files and starts asking questions:

```text
GET /api/dcim/devices/?site=ldn1&role=core-switch&status=active
```

That query is *always* current, because it reads the same database your team edits through the web UI. No sync step, no drift between “the spreadsheet” and “the automation’s copy of the spreadsheet.”
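As a rough sketch, the same query can be built and sent from Python with only the standard library. The host `netbox.internal` is the one used in the examples later in this post; `device_query_url` and `fetch_devices` are hypothetical helper names, and the `Authorization: Token ...` header assumes classic NetBox API tokens, so check what your version expects.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

def device_query_url(base_url, **filters):
    """Build a /api/dcim/devices/ URL from keyword filters."""
    return f"{base_url.rstrip('/')}/api/dcim/devices/?{urlencode(filters)}"

def fetch_devices(base_url, token, **filters):
    """Return the 'results' list from the first page of the device listing."""
    req = Request(
        device_query_url(base_url, **filters),
        # Assumption: token-style auth header; adjust for your NetBox version
        headers={"Authorization": f"Token {token}", "Accept": "application/json"},
    )
    with urlopen(req) as resp:
        return json.load(resp)["results"]

# Building the URL needs no network access, so this line is safe to run anywhere
print(device_query_url("https://netbox.internal", site="ldn1", role="core-switch", status="active"))
```

In real automation a client library such as `pynetbox` handles pagination and authentication for you; the point here is only that the inventory question is an ordinary HTTP request.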

## The migration principle: additive, not big-bang

Here’s the mistake that kills NetBox rollouts: treating it as an all-or-nothing cutover. “We’ll model the entire network, then flip every playbook to read from NetBox.” That project is too big, so it never finishes, and NetBox becomes a half-populated museum piece everyone ignores.

The approach that works is boring and incremental. Pick **one device type or one site**, make NetBox authoritative for *just that slice*, and leave everything else exactly as it is. Prove the loop end to end on something small, then widen it. NetBox and your old YAML coexist happily during the transition — nothing forces you to choose globally on day one.

A migration order I’ve had good luck with:

1. **IP address management.** It’s the highest-pain, lowest-risk place to start. Everyone already fights over a spreadsheet of IPs; replacing that spreadsheet with NetBox’s IPAM helps immediately and touches no config generation.
2. **A single, homogeneous device role.** Take, say, access switches at one site, generate their config from NetBox, and compare it to what’s live before trusting it.
3. **Every remaining role, one at a time.** Repeat the second step for each role until the YAML folder is empty and no one’s edited it in a month.
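One way to let the two systems coexist is an explicit list of migrated slices that every playbook consults before choosing where to read from. A minimal sketch, assuming each slice is a (site, role) pair; `MIGRATED_SLICES` and `data_source_for` are hypothetical names, not part of NetBox or pynetbox, and `nyc2` is an illustrative site slug.

```python
# Slices for which NetBox is already authoritative. None acts as a wildcard.
MIGRATED_SLICES = {
    ("ldn1", "access-switch"),  # one role at one site
    (None, "core-switch"),      # a role migrated at every site
}

def data_source_for(site, role, migrated=MIGRATED_SLICES):
    """Return 'netbox' if the device's slice has been migrated, else 'yaml'."""
    for m_site, m_role in migrated:
        if m_site in (None, site) and m_role in (None, role):
            return "netbox"
    return "yaml"

print(data_source_for("ldn1", "access-switch"))  # netbox
print(data_source_for("nyc2", "access-switch"))  # yaml
```

Widening the migration then means adding one tuple to the set, and rolling a slice back means removing it.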

## Step 1: seed NetBox from what you already have

You don’t type your network into NetBox by hand. You import it — ideally from the devices themselves, so the source of truth starts out matching reality. A pragmatic seeding script pulls facts from devices and creates the corresponding NetBox objects through the `pynetbox` client:

```python
import pynetbox
from netmiko import ConnectHandler

nb = pynetbox.api("https://netbox.internal", token="not-in-git")

def seed_device(host, name, site_slug, role_slug, dtype_slug):
    # Assumes the site, role, and device type already exist in NetBox.
    # Idempotent: get-or-create so re-runs don't duplicate
    device = nb.dcim.devices.get(name=name)
    if device is None:
        device = nb.dcim.devices.create(
            name=name,
            site={"slug": site_slug},
            role={"slug": role_slug},
            device_type={"slug": dtype_slug},
            status="active",
        )
        print(f"created device {name}")
    else:
        print(f"device {name} already present, skipping")
    return device

seed_device("core-sw-01", "core-sw-01", "ldn1", "core-switch", "c9300-48")
```

The important property is idempotency: running it twice does not create duplicates. That single discipline lets you re-run seeding as you refine it, which you will.
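The get-or-create pattern recurs for every object type you seed, so it can be worth factoring out. A minimal sketch, assuming only that the endpoint object offers `.get(**kwargs)` returning `None` when nothing matches and `.create(**kwargs)`, which is the shape of a pynetbox endpoint; `get_or_create` and `FakeEndpoint` are hypothetical names, and the fake exists only so the example runs without a NetBox server.

```python
def get_or_create(endpoint, lookup, defaults=None):
    """Fetch the record matching `lookup`, creating it only if absent.

    Returns (record, created). `defaults` are extra fields used on creation.
    """
    record = endpoint.get(**lookup)
    if record is not None:
        return record, False
    return endpoint.create(**lookup, **(defaults or {})), True


class FakeEndpoint:
    """In-memory stand-in for an API endpoint such as nb.dcim.devices."""

    def __init__(self):
        self.rows = []

    def get(self, **lookup):
        for row in self.rows:
            if all(row.get(k) == v for k, v in lookup.items()):
                return row
        return None

    def create(self, **fields):
        self.rows.append(fields)
        return fields


devices = FakeEndpoint()
for _ in range(2):  # the second pass must be a no-op
    _, created = get_or_create(devices, {"name": "core-sw-01"}, {"status": "active"})
    print("created" if created else "already present")
print(len(devices.rows))
```

Keeping the lookup keys separate from the defaults matters: the lookup decides identity, while the defaults are only applied the first time, so a later re-run never overwrites cleanup done in the UI.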

For interfaces and IPs, pull from the box and reconcile:

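```python
def seed_interfaces(device, host):
    conn = ConnectHandler(
        device_type="cisco_ios", host=host,
        username="netops", password="not-in-git",
    )
    # Field names below ("interface", "ipaddr") depend on the installed
    # ntc-templates version; adjust them to match your parsed output.
    rows = conn.send_command("show ip interface brief", use_textfsm=True)
    conn.disconnect()

    for row in rows:
        iface = nb.dcim.interfaces.get(device_id=device.id, name=row["interface"])
        if iface is None:
            iface = nb.dcim.interfaces.create(
                # Placeholder type; correct it per interface during cleanup
                device=device.id, name=row["interface"], type="1000base-t",
            )
        if row["ipaddr"] not in ("unassigned", ""):
            # "show ip interface brief" omits the mask, so /32 is a placeholder
            address = f"{row['ipaddr']}/32"
            # Get-or-create here too, so re-runs don't duplicate the address
            if not list(nb.ipam.ip_addresses.filter(address=address)):
                nb.ipam.ip_addresses.create(
                    address=address,
                    assigned_object_type="dcim.interface",
                    assigned_object_id=iface.id,
                )
```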

Now NetBox holds a first draft that reflects the live network. You clean it up in the UI, and from that point forward NetBox — not the device, not the spreadsheet — is what “should be true.”

## Step 2: generate config from NetBox, verify before trusting

The payoff arrives when config generation reads from the API instead of YAML. The query is live, so the template can never drift from inventory:

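```python
import pynetbox
from jinja2 import Template

nb = pynetbox.api("https://netbox.internal", token="not-in-git")

tpl = Template("""
hostname {{ device.name }}
!
{% for iface in interfaces %}
interface {{ iface.name }}
 description {{ iface.description or "managed-by-netbox" }}
{% if iface.ip %} ip address {{ iface.ip }}{% endif %}
!
{% endfor %}
""")

device = nb.dcim.devices.get(name="access-sw-07")

# Interface records carry no address field of their own, so look up each
# interface's assigned IPs and hand the template plain dicts.
interfaces = []
for iface in nb.dcim.interfaces.filter(device_id=device.id):
    ips = list(nb.ipam.ip_addresses.filter(interface_id=iface.id))
    interfaces.append({
        "name": iface.name,
        "description": iface.description,
        # CIDR form as stored in NetBox; convert to address + mask for IOS
        "ip": ips[0].address if ips else None,
    })

rendered = tpl.render(device=device, interfaces=interfaces)
print(rendered)
```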

The rule during migration is: **generate, diff, human-approve, then deploy.** Don’t let NetBox push config the first time you trust it. Render the intended config, diff it against the running config, and eyeball the difference. When the diff is empty (or only shows changes you meant), you’ve earned the right to automate the push. This is how you catch modelling mistakes on your terms instead of the device’s.

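```python
# Sketch of the safety loop. render_from_netbox, fetch_running_config,
# approved_by_human, and deploy are placeholders for your own helpers.
from difflib import unified_diff

rendered = render_from_netbox(device)
running = fetch_running_config(device)
# unified_diff returns a generator, which is always truthy, so build a list
diff = list(unified_diff(running.splitlines(), rendered.splitlines(), lineterm=""))

if not diff:
    print("in sync")
elif approved_by_human(diff):
    deploy(device, rendered)
else:
    print("held for review")
```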
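In practice a raw running config rarely diffs cleanly against a rendered one, because the device adds lines that are not intent. A minimal sketch of a normaliser to run on both sides before diffing; the patterns are header lines that Cisco IOS typically prints and are an assumption to adjust for your platform, and `normalize_config` is a hypothetical helper name.

```python
import re

# Lines the device emits around the config that carry no intent
VOLATILE = [
    re.compile(r"^Building configuration"),
    re.compile(r"^Current configuration\s*:"),
    re.compile(r"^! (Last configuration change|NVRAM config last updated)"),
    re.compile(r"^!\s*$"),  # bare separator lines
]

def normalize_config(text):
    """Return config lines with volatile lines and blank lines removed."""
    kept = []
    for line in text.splitlines():
        line = line.rstrip()  # keep leading indentation, drop trailing space
        if not line or any(p.search(line) for p in VOLATILE):
            continue
        kept.append(line)
    return kept

running = """Building configuration...

Current configuration : 1523 bytes
!
hostname access-sw-07
!
interface GigabitEthernet1/0/1
 description managed-by-netbox
!
"""
print(normalize_config(running))
```

Feeding both the rendered and the running config through the same function means a non-empty diff reflects a real difference in intent rather than formatting noise.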

## Step 3: close the loop with drift detection

Once a slice is authoritative, the last piece flips the question around. Instead of only asking “what config does NetBox say this device should have,” you ask “does the live device still match NetBox?” That’s drift detection, and it’s what turns a source of truth into an *enforced* source of truth:

```python
from difflib import unified_diff

# render_from_netbox, fetch_running_config, and alert are placeholders
def check_drift(device_name):
    device = nb.dcim.devices.get(name=device_name)
    intended = render_from_netbox(device)
    running = fetch_running_config(device)
    diff = list(unified_diff(
        running.splitlines(), intended.splitlines(), lineterm="",
    ))
    return diff  # empty list == in sync

drift = check_drift("access-sw-07")
if drift:
    alert("access-sw-07 has drifted from NetBox:\n" + "\n".join(drift))
```

Run that hourly and you’ll know within the hour when someone makes a change on the box that isn’t reflected in the source of truth, which is exactly the moment your data starts lying to you. Catching it early is what keeps NetBox trustworthy long-term.
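A sketch of what the scheduled job might look like, assuming a `check` callable with the same contract as `check_drift` above (a list of diff lines, empty when in sync). `drift_report` is a hypothetical name, `access-sw-06` is an illustrative device, and the stub check exists only so the example runs offline.

```python
def drift_report(device_names, check):
    """Run `check` on each device; return {name: diff_lines} for drifted ones.

    A device whose check raises is reported too, so an unreachable box
    is not mistaken for one that is in sync.
    """
    report = {}
    for name in device_names:
        try:
            diff = check(name)
        except Exception as exc:  # broad on purpose: one failure must not stop the sweep
            diff = [f"check failed: {exc}"]
        if diff:
            report[name] = diff
    return report

def stub_check(name):
    # Stand-in for check_drift: pretend only access-sw-07 has drifted
    if name == "access-sw-07":
        return ["-ntp server 10.0.0.1", "+ntp server 10.0.0.9"]
    return []

print(drift_report(["access-sw-06", "access-sw-07"], stub_check))
```

Call it from cron or whatever scheduler you already run, and alert only when the returned dict is non-empty.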

## Why the database wins

The case for NetBox over a folder of YAML comes down to three things a file can’t give you. It gives you **validation**: the model rejects impossible data before it reaches a device. It gives you **relationships**: you can ask “every device on this prefix” or “every interface in this VLAN” and get a real answer, because the objects are linked, not just listed. And it gives you **one place to look**: the web UI, the REST API, and your automation all read the same rows, so there is no split between an authoritative copy and stale ones, just the truth and views onto it.

A YAML folder can imitate the first of those with enough tooling, and the second with a lot of pain, but never the third. The whole value of a source of truth is that there is exactly *one* of it.

## Start small, and start this week

If there’s a single takeaway, it’s that you don’t need permission for a giant project to begin. Stand up NetBox, import your IP addresses, and stop editing the IP spreadsheet — that alone pays for itself. Then take one device role, generate its config from NetBox, and diff before you deploy. Each slice is small, reversible, and useful on its own, and the old system keeps running beside it the whole time.

The folder of SSH scripts got you here, and there’s no shame in that. But scripts execute intent; they can’t *store* it reliably. NetBox is where the intent lives, and once it does, everything downstream — generation, deployment, drift detection — becomes a straightforward read against data you can finally trust.

Next-Hop.dev
